The input can end right after an explicit line break, i.e. after the second \.
This currently leads to an out-of-range index into input, as the for loop
starts at start+2 (input[start] and input[start+1] are the \\) without checking
that this index is still inside the input.
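
A minimal sketch of the bounds check, assuming a hypothetical helper
explicitLineBreakLen (not the parser's actual function). Treating end of input
like a newline is a choice made for this sketch only:

```go
package main

import "fmt"

// explicitLineBreakLen is a hypothetical helper for illustration.
// It returns how many bytes an explicit line break consumes at start
// (`\\`, optional blanks, then a newline or end of input), or 0 if
// there is no explicit line break at that position.
func explicitLineBreakLen(input string, start int) int {
	// Both backslashes must be inside the input.
	if start+1 >= len(input) || input[start] != '\\' || input[start+1] != '\\' {
		return 0
	}
	// Bounding the loop by len(input) is what prevents the
	// out-of-range index when the input ends after the second \.
	for i := start + 2; i < len(input); i++ {
		switch input[i] {
		case ' ', '\t':
			continue
		case '\n':
			return i + 1 - start
		default:
			return 0
		}
	}
	// Assumption of this sketch: end of input terminates the line break.
	return len(input) - start
}

func main() {
	fmt.Println(explicitLineBreakLen(`a\\`, 1)) // input ends after the second \
	fmt.Println(explicitLineBreakLen("\\\\ \nx", 0))
	fmt.Println(explicitLineBreakLen(`\\x`, 0))
}
```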
The regexps are meant to extract a match immediately following the cursor, so
they should have been anchored from the beginning.
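
To illustrate why the anchor matters, here is a small sketch. The two patterns
and the matchAtCursor helper are made up for this example and are not the
regexps used by the parser:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration only.
var (
	unanchored = regexp.MustCompile(`\[\[(\w+)\]\]`)
	anchored   = regexp.MustCompile(`^\[\[(\w+)\]\]`)
)

// matchAtCursor applies re to the input from cursor onwards. Only an
// anchored regexp guarantees that a match starts exactly at the cursor.
func matchAtCursor(re *regexp.Regexp, input string, cursor int) []string {
	return re.FindStringSubmatch(input[cursor:])
}

func main() {
	input := "ab [[link]]"
	// The unanchored regexp finds a match further along the input,
	// even though nothing matches at the cursor position 0.
	fmt.Println(matchAtCursor(unanchored, input, 0) != nil)
	// The anchored regexp only matches where the cursor actually is.
	fmt.Println(matchAtCursor(anchored, input, 0) != nil)
	fmt.Println(matchAtCursor(anchored, input, 3) != nil)
}
```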
Also, an empty sub/superscript doesn't make sense. Nested sub/superscript does
make sense, but YAGNI.
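
One way to express both constraints in a single anchored pattern, as a sketch
only. The pattern and the parseScript helper are assumptions made for
illustration, not necessarily what the parser uses:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern: `_` or `^`, then braces around one or more
// characters that are not braces. `+` rejects empty content and the
// `[^{}]` class rejects nesting.
var scriptRegexp = regexp.MustCompile(`^([_^])\{([^{}]+)\}`)

// parseScript is a hypothetical helper. It returns the marker and the
// content of a sub/superscript at the start of input.
func parseScript(input string) (marker, content string, ok bool) {
	m := scriptRegexp.FindStringSubmatch(input)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	for _, s := range []string{"^{2}", "_{}", "_{a_{b}}"} {
		marker, content, ok := parseScript(s)
		fmt.Printf("%q: marker=%q content=%q ok=%v\n", s, marker, content, ok)
	}
}
```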
Until now we expected the .org file to print back to itself. We can't do that
when the input is not already pretty printed, and with the introduction of
blocks with unindented content that will be the case.