- we can use a type assertion to check for the type instead of adding a
dependency on reflect
- lint complains about redundant types, so get rid of that for peace of mind
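
A minimal sketch of the first point. `Node`, `Text` and `Headline` are hypothetical stand-ins, not the actual go-org types; the idea is only that a type switch answers the "what type is this" question without importing reflect.

```go
package main

import "fmt"

// Node, Text and Headline are hypothetical types used for illustration only.
type Node interface{}

type Text struct{ Content string }

type Headline struct{ Title string }

// kindOf reports the kind of a node using a type switch instead of reflect.TypeOf.
func kindOf(n Node) string {
	switch n.(type) {
	case Text:
		return "text"
	case Headline:
		return "headline"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(kindOf(Text{Content: "hi"}))
	fmt.Println(kindOf(Headline{Title: "hi"}))
	fmt.Println(kindOf(42))
}
```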
Since writers are normally only used synchronously (i.e. to write one document
at a time), we don't guard modifications to their internal
state (e.g. temporarily replacing the strings.Builder in WriteNodesAsString)
against race conditions.
The package global `orgWriter` and corresponding use cases of it (`org.String`,
`$node.String`) break that pattern - the writer is potentially used from
multiple goroutines at the same time. This results in race conditions that
manifest as error messages such as
could not write output: runtime error: invalid memory address or nil pointer dereference. Using unrendered content.
Additionally, since we catch panics in `Document.Write`, the corresponding
stack trace is lost and dependents of go-org never know what hit them.
As using a writer simultaneously across goroutines is not a standard
pattern, we'll sync the use of the global `orgWriter` instead of trying to make
the actual writer thread-safe; less code noise for the common use case.
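
Roughly what syncing the global looks like, as a sketch. The `writer` type, `globalWriter`, `globalWriterMu` and this `String` function are hypothetical names for illustration and not the go-org code: the writer keeps its unguarded internal state, and only the shared entry point takes a lock.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// writer is a hypothetical stand-in for a writer with unguarded internal state.
type writer struct {
	sb strings.Builder
}

// render mutates internal state and is not safe for concurrent use on its own.
func (w *writer) render(s string) string {
	w.sb.Reset()
	w.sb.WriteString(s)
	return w.sb.String()
}

var (
	globalWriter   = &writer{}
	globalWriterMu sync.Mutex
)

// String serializes access to the shared writer instead of making the
// writer itself thread-safe.
func String(s string) string {
	globalWriterMu.Lock()
	defer globalWriterMu.Unlock()
	return globalWriter.render(s)
}

func main() {
	var wg sync.WaitGroup
	results := make([]string, 8)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = String(fmt.Sprintf("node %d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```

Writers created per document never touch the mutex, so the common synchronous path pays nothing for this.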
Allows consumers to set `TopLevelHLevel` on `HTMLWriter`, which
works identically to Org's official [`:html-toplevel-hlevel` /
`org-html-toplevel-hlevel`](https://orgmode.org/manual/Publishing-options.html) property.
Fixes #94.
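
A sketch of the mapping, assuming the option names the `<hN>` level used for level 1 headlines (as Org's `org-html-toplevel-hlevel` does). `headingTag` is a hypothetical helper, not part of the `HTMLWriter` API.

```go
package main

import "fmt"

// headingTag returns the HTML tag for an Org headline of the given level,
// assuming topLevelHLevel is the <hN> level used for level 1 headlines.
func headingTag(headlineLevel, topLevelHLevel int) string {
	return fmt.Sprintf("h%d", headlineLevel+topLevelHLevel-1)
}

func main() {
	fmt.Println(headingTag(1, 2)) // level 1 headline with top level 2
	fmt.Println(headingTag(2, 3)) // level 2 headline with top level 3
}
```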
Support for latex fragments was inline only so far, i.e. lines containing block
elements (e.g. a line starting with `* `, i.e. a headline) were not parsed
as part of the latex fragment but as the respective block element. Parsing latex
blocks at the block level should fix that. Note that in any case we don't do
any processing and just emit the raw latex (leaving the rendering to e.g. js).
Before this commit, if an org document was titled "Title here", the
first line of HTML output would be as follows:
<h1 class="title"><p>Title here\n</p></h1>
This commit changes the HTML writer to instead output the following:
<h1 class="title">Title here</h1>
I conservatively modified the code, so there might be more cases where
elements should be omitted from the title.
org mode supports [1] setting the value attribute [2] of ordered list items to
change the numbering for the current and following items. Let's do the same. As
the attribute has no meaning for other types of lists [2] we'll just not
support it for those cases [3].
[1] https://orgmode.org/manual/Plain-Lists.html#Plain-Lists
[2] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/li#attributes
[3]
Org mode seems to instead set the id attribute for e.g. unordered lists
starting with `[@\d+]\s` - but I don't really see the value in that and will
skip that for now.
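
For illustration, a sketch of picking up the counter on an ordered list item. `renderOrderedItem` and `counterRegexp` are hypothetical names, and the sketch skips HTML escaping of the item content.

```go
package main

import (
	"fmt"
	"regexp"
)

// counterRegexp matches a leading counter cookie such as "[@20] ".
var counterRegexp = regexp.MustCompile(`^\[@(\d+)\]\s+`)

// renderOrderedItem is a hypothetical helper: it turns a leading [@N] cookie
// into the value attribute of the li element and strips it from the content.
func renderOrderedItem(content string) string {
	if m := counterRegexp.FindStringSubmatch(content); m != nil {
		return fmt.Sprintf(`<li value="%s">%s</li>`, m[1], content[len(m[0]):])
	}
	return fmt.Sprintf("<li>%s</li>", content)
}

func main() {
	fmt.Println(renderOrderedItem("first"))
	fmt.Println(renderOrderedItem("[@20] twentieth"))
}
```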
Org mode allows rendering the toc anywhere in the html document using the `TOC`
keyword [1]. There are more options but `#+TOC: headlines $n` should be enough
for starters. Note that org mode still requires setting `#+OPTIONS: toc:nil` to
disable the default toc.
[1] https://orgmode.org/manual/Table-of-Contents.html
Hugo defaults to serving files with pretty urls [1] - this means
`/posts/foo.org` is served at `/posts/foo/`. This works because servers
default to serving index.html when a directory is specified and hugo renders
the post to `/posts/foo/index.html` instead of `/posts/foo.html`. To make
relative links work we need to (1) remove the fake `foo/` subdirectory from
unrooted links and (2) replace any `.org` suffix with `/`.
[1] https://gohugo.io/content-management/urls/#pretty-urls
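
The two steps can be sketched as below. `rewriteLink` is a hypothetical helper, and treating anything containing `://` as an external link is a simplifying assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteLink adapts a link found in /posts/foo.org for a page that is
// served at /posts/foo/ (i.e. rendered to /posts/foo/index.html).
func rewriteLink(link string) string {
	if strings.Contains(link, "://") {
		return link // external link (assumption: detected by "://")
	}
	if !strings.HasPrefix(link, "/") {
		link = "../" + link // (1) step out of the fake foo/ subdirectory
	}
	if strings.HasSuffix(link, ".org") {
		link = strings.TrimSuffix(link, ".org") + "/" // (2) .org -> pretty url
	}
	return link
}

func main() {
	fmt.Println(rewriteLink("bar.org"))
	fmt.Println(rewriteLink("/posts/bar.org"))
	fmt.Println(rewriteLink("image.png"))
}
```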
Turns out Org mode supports image links natively and we don't have to go out of
spec!
From https://orgmode.org/manual/Images-in-HTML-export.html:
[...] if the description part of the Org link is itself another link, such as
‘file:’ or ‘http:’ URL pointing to an image, the HTML export back-end in-lines
this image and links to [...]
html does not support table separator rows as Org mode does. Emacs org export
simulates rows as defined by separators by wrapping all the rows between 2
separators into a separate tbody. The html spec is fine with that [0] so we
follow.
[0] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/tbody
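
A sketch of the grouping, using a hypothetical representation where a nil row stands for a separator. `renderTable` is made up for illustration and ignores headers and column alignment.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// renderTable wraps the rows between two separators (nil rows) into their
// own tbody element.
func renderTable(rows [][]string) string {
	var sb strings.Builder
	sb.WriteString("<table>\n")
	open := false
	for _, row := range rows {
		if row == nil { // separator: close the current group
			if open {
				sb.WriteString("</tbody>\n")
				open = false
			}
			continue
		}
		if !open {
			sb.WriteString("<tbody>\n")
			open = true
		}
		sb.WriteString("<tr>")
		for _, cell := range row {
			sb.WriteString("<td>" + html.EscapeString(cell) + "</td>")
		}
		sb.WriteString("</tr>\n")
	}
	if open {
		sb.WriteString("</tbody>\n")
	}
	sb.WriteString("</table>\n")
	return sb.String()
}

func main() {
	fmt.Print(renderTable([][]string{{"a", "b"}, nil, {"c", "d"}}))
}
```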
All tags are put on a line by themselves to help with visual
diffing. Apparently this extra cosmetic whitespace causes problems inside p
tags for people who want to use `white-space: pre`. Not much hurt for visual
diffing in removing cosmetic whitespace for just p tags and can't think of
anything that would break because of this right now. So let's do it and wait
for things to break.
If we define a custom LINK for file we run into index problems because it's
already trimmed before - this fixes that. Shouldn't ever happen but whatever,
fuzzing found it.
To support code block directives like `:exports none` we need context - i.e. we
need to have the block and its results at once and can't just render them
independently.
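
Why both pieces are needed at once, as a sketch. `renderBlock` is a hypothetical helper, and falling back to exporting only the code for unknown values is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// renderBlock needs the source block and its results together, because the
// :exports value decides which of the two is emitted.
func renderBlock(exports, code, results string) string {
	var parts []string
	switch exports {
	case "none":
		// emit neither the code nor the results
	case "results":
		parts = append(parts, results)
	case "both":
		parts = append(parts, code, results)
	default: // "code" and anything unrecognized (assumed fallback)
		parts = append(parts, code)
	}
	return strings.Join(parts, "\n")
}

func main() {
	fmt.Printf("%q\n", renderBlock("none", "echo hi", "hi"))
	fmt.Printf("%q\n", renderBlock("both", "echo hi", "hi"))
}
```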
While example blocks do not render inline markup and are thus parsed raw in
some way, their contents are not literal html and thus still need to be html
escaped.
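
In other words, something along these lines. `renderExample` is a hypothetical helper and the `pre` wrapper is only illustrative; `html.EscapeString` is the standard library call.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// renderExample leaves Org markup such as *bold* untouched but escapes
// characters that are special in HTML.
func renderExample(lines []string) string {
	return "<pre class=\"example\">\n" + html.EscapeString(strings.Join(lines, "\n")) + "\n</pre>"
}

func main() {
	fmt.Println(renderExample([]string{"<b>*not bold*</b>"}))
}
```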
The org mode toc OPTION does not just support true/false - it also allows
specifying the max headline level [1] to be included in the toc.
[1] headline level as seen in org mode - not the html tag level
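
A sketch of reading the option value. `parseTOCOption` is a hypothetical helper; using 0 for "no limit" and treating unrecognized values as disabled are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseTOCOption interprets the value of toc: in an OPTIONS line.
// maxLevel 0 means "no limit" (assumption of this sketch).
func parseTOCOption(value string) (enabled bool, maxLevel int) {
	switch value {
	case "t":
		return true, 0
	case "nil":
		return false, 0
	}
	if n, err := strconv.Atoi(value); err == nil && n > 0 {
		return true, n // include headlines up to org level n
	}
	return false, 0 // unrecognized values are treated as disabled (assumption)
}

func main() {
	for _, v := range []string{"t", "nil", "2"} {
		enabled, maxLevel := parseTOCOption(v)
		fmt.Println(v, enabled, maxLevel)
	}
}
```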
WriteNodesAsString is simple enough to implement, but exposing it helps when
implementing extending writers. We don't aim to keep writer a small interface
anyway, so let's expose it.
Extension of the org & html writers is made possible by creating circular
references between the extending and extended writer - that way the extending
writer can forward all methods it doesn't implement to the extended writer and the
extended writer can use the extending writer as the root for method calls to
make sure methods overridden in the extending writer are used even for nested
method calls.
This circular reference leads to problems when cloning writers - cloning the
extended writer merely copies the pointer to the extending writer - i.e. the
extending writer does not get cloned with an updated reference to the extended
writer. Thus method calls to the extending writer act as if no cloning took
place and things break.
The easiest solution is to just get rid of cloning. We could also clone the
ExtendingWriter and replace its reference to the extended writer with the just
cloned one but that's harder so we just remove it.
As there are a lot of "extending writer" and "extended writer" in the above
paragraphs and I'm too lazy to write up something better here's another attempt
at a TLDR:
Cloning is broken as ExtendingWriter is a reference to a writer that has
a reference to the writer we are cloning - that writer would have to have its
reference updated but that's hard. So we solve it by not cloning at all.
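
The breakage in a few lines, as a sketch. `BaseWriter`, `CustomWriter`, `Indent` and `cloneWithIndent` are hypothetical and much smaller than the real writers; only the `ExtendingWriter` back reference mirrors the description above.

```go
package main

import "fmt"

type Writer interface {
	WriteText(s string) string
}

type BaseWriter struct {
	ExtendingWriter Writer
	Indent          string
}

func (w *BaseWriter) WriteText(s string) string { return w.Indent + s }

// WriteLine makes a nested call through ExtendingWriter.
func (w *BaseWriter) WriteLine(s string) string {
	return w.ExtendingWriter.WriteText(s) + "\n"
}

// CustomWriter overrides nothing; WriteText is forwarded to the embedded base.
type CustomWriter struct{ *BaseWriter }

func newCustomWriter() *CustomWriter {
	w := &CustomWriter{BaseWriter: &BaseWriter{}}
	w.ExtendingWriter = w
	return w
}

// cloneWithIndent copies only the base writer; ExtendingWriter still points
// at the original CustomWriter, which embeds the original base writer.
func cloneWithIndent(w *BaseWriter, indent string) *BaseWriter {
	clone := *w
	clone.Indent = indent
	return &clone
}

func main() {
	custom := newCustomWriter()
	clone := cloneWithIndent(custom.BaseWriter, "  ")
	// The nested call ends up in the original base writer, so the clone's
	// Indent is ignored.
	fmt.Printf("%q\n", clone.WriteLine("x"))
}
```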
Go does not support inheritance, just composition. While composition with type
embedding (i.e. forwarding method calls to the embedded type) can replace
inheritance for most use cases this is not one of them. We really want to
override methods so that method calls from inside the base writer also use the
custom methods of our extending writer - naive embedding does not work here
as the `this` in `this.WriteText` refers to the embedded type rather than the outer
extending type (see open recursion).
A simple solution is to make a reference of the extending type
available from the extended type and use that for nested method calls. We'll go
with that one as it does not require huge code changes. Another solution would
be to flatten the writing process and not use nested method calls - this is
what blackfriday does. Assuming the current solution works I feel it's cleaner
and keeps the ugliness of simulating inheritance with composition contained to
a small portion of the code while blackfriday's approach requires all write
methods to be written in a flat style (i.e. no nested calls to write; each
method is instead called twice, on entering and on leaving). The current
solution becomes ugly if we want to do multiple levels of extending but I don't
expect that to be a valid use case - if it turns out to be one we can always
adapt to it
later. YAGNI.
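
A stripped down sketch of the approach. `BaseWriter` and `ShoutWriter` are hypothetical and far smaller than the real writers: the base writer routes nested calls through `ExtendingWriter`, so an override in the outer type is used even from inside base methods.

```go
package main

import (
	"fmt"
	"strings"
)

type Writer interface {
	WriteText(s string) string
	WriteParagraph(s string) string
}

// BaseWriter routes nested calls through ExtendingWriter.
type BaseWriter struct {
	ExtendingWriter Writer
}

func NewBaseWriter() *BaseWriter {
	w := &BaseWriter{}
	w.ExtendingWriter = w // by default the base writer is its own root
	return w
}

func (w *BaseWriter) WriteText(s string) string { return s }

func (w *BaseWriter) WriteParagraph(s string) string {
	// Calling w.WriteText here would always use BaseWriter.WriteText.
	return "<p>" + w.ExtendingWriter.WriteText(s) + "</p>"
}

// ShoutWriter embeds the base writer and overrides WriteText only.
type ShoutWriter struct{ *BaseWriter }

func NewShoutWriter() *ShoutWriter {
	w := &ShoutWriter{BaseWriter: NewBaseWriter()}
	w.ExtendingWriter = w // circular reference back to the extending writer
	return w
}

func (w *ShoutWriter) WriteText(s string) string { return strings.ToUpper(s) }

func main() {
	fmt.Println(NewBaseWriter().WriteParagraph("hi"))
	fmt.Println(NewShoutWriter().WriteParagraph("hi"))
}
```

With plain embedding and a direct `w.WriteText` call in `WriteParagraph`, the second line would come out lowercase.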
Now that I'm already looking at it due to the bug leenzhu found, why not put the
</dt> on a separate line to match the convention - looks better to me; doesn't
change anything.
writer.footnotes must be a pointer as we copy the writer in nodesAsString() and
can thus end up modifying the footnotes.list slice without it being reflected in
the original writer (i.e. when the backing array of the slice changes).
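
The slice behaviour in isolation, as a sketch with hypothetical `valueWriter` / `pointerWriter` types: appending through a copied struct updates only the copy's slice header, while a pointer field is shared by both copies.

```go
package main

import "fmt"

type footnotes struct{ list []string }

type valueWriter struct{ footnotes footnotes }

type pointerWriter struct{ footnotes *footnotes }

// appendViaValueCopy appends through a copy of the writer and returns how
// many footnotes the original writer sees afterwards.
func appendViaValueCopy() int {
	original := valueWriter{}
	copied := original // copies the slice header as well
	copied.footnotes.list = append(copied.footnotes.list, "fn:1")
	return len(original.footnotes.list)
}

// appendViaPointerCopy does the same with a pointer field.
func appendViaPointerCopy() int {
	original := pointerWriter{footnotes: &footnotes{}}
	copied := original // copies only the pointer
	copied.footnotes.list = append(copied.footnotes.list, "fn:1")
	return len(original.footnotes.list)
}

func main() {
	fmt.Println(appendViaValueCopy(), appendViaPointerCopy()) // 0 1
}
```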