Re: [code] [textadept] Lexing markdown: folding sections, lexing fenced code blocks.

From: Mitchell
Date: Sun, 3 Jun 2018 18:05:57 -0400 (EDT)

Hi Nicholas,

On Sun, 3 Jun 2018, Nicholas Ochiel wrote:

> # 1. Lexing fenced code blocks
> In markdown documents, I might have arbitrary languages in fenced blocks e.g.
> ``` haskell
> -- code here
> ```
> ``` tex
> \equation{ ... }
> ```
> I would like the markdown lexer to embed the appropriate lexers.
> html.lua seems to initialise every lexer that might be embedded in a
> html document. As I receive and generate documents with a wide and
> varied set of languages, doing the same thing for markdown seems a bit
> excessive and possibly inefficient (In vim, I'd have to load the
> relevant parsing even if the current document doesn't embed any code).
> Might it be possible to "dynamically" embed the lexers depending on
> the label on the fence?

It's an interesting idea, but I don't think this is possible. The lexer framework assumes all possible styles (e.g. comment, keyword, string, etc.) are defined when the lexer is initially loaded (this includes sub-lexer styles too). Modifying the lexer after load may result in strange behavior (if it's even possible).

> Naively (I don't know what I'm doing), might it be possible to have
> the code look something like this:
> ---8<--
> local fence_end_rule = #(lexer.newline *^0 * P('```')^-1)
> * P('```')
> local fence_start_rule = #(P('```') *^0 * P(function(input, index)
> _, _, fence_lang = input:find('^%s*(%w+)', index)
> ui.print ("fence_lang = " .. fence_lang)
> if fence_lang then
> local fence_lexer = lexer.load(fence_lang)
> lex:embed(fence_lexer, fence_start_rule, fence_end_rule)
> return index
> end
> end)) * P('```') * lexer.nonnewline^0 * lexer.newline^-1
> ---8<--

It's an interesting idea that would probably fail in interesting ways. I've never thought of doing something like this and did not take this into consideration when writing the lexer framework.

> #2. Folding '#' headers
> - How might I fold markdown sections with '#'-style headers, with the
> fold level determined by the number of hashes? Perhaps someone has
> already written the relevant snippet?

I don't think this has been done before. I think you'd have to write a custom folding function.

   function lex:fold(text, start_pos, start_line, start_level)
     -- ...
   end

You would probably iterate over the lines in `text` and check for lines that start with `#`. Each such line would be marked as a fold header. You would then increment the fold level if the line has more `#`s than the previous header, or decrement it if the line has fewer.

This would be hard to get right. You can reference *lexers/lexer.lua*'s `M.fold()` function for iterating over lines and setting fold levels and headers.
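As a rough starting point, the line iteration described above could be sketched in plain Lua like this. It is only a sketch, not tested against the lexer framework: `header_fold_levels` is a hypothetical helper name, the two constants are assumed to mirror Scintilla's `SC_FOLDLEVELBASE` and `SC_FOLDLEVELHEADERFLAG` (inside a lexer you would presumably use the framework's own fold constants instead), and `start_level` is assumed to be a plain level with no flags set. Rather than incrementing and decrementing, it derives the level directly from the number of `#`s, which should come to the same thing.

```lua
-- Hypothetical helper; not part of Textadept's lexer API.
-- Values assumed to match Scintilla's SC_FOLDLEVELBASE and
-- SC_FOLDLEVELHEADERFLAG.
local FOLD_BASE = 0x400
local FOLD_HEADER = 0x2000

-- Returns a table mapping line numbers (starting at `start_line`) to fold
-- levels. A line with n leading '#'s becomes a fold header at depth n - 1,
-- and the lines that follow it sit at depth n until the next header.
local function header_fold_levels(text, start_line, start_level)
  local folds = {}
  local line_num = start_line
  local current = start_level -- level for lines that are not headers
  -- Make sure the last line is newline-terminated so the pattern sees it.
  if text:sub(-1) ~= '\n' then text = text .. '\n' end
  for line in text:gmatch('(.-)\r?\n') do
    -- A header is 1 to 6 '#'s followed by whitespace or the end of the line.
    local hashes = line:match('^(#+)%s') or line:match('^(#+)$')
    if hashes and #hashes <= 6 then
      -- Plain addition works as a bitwise OR here because the depth never
      -- reaches the flag bits.
      folds[line_num] = FOLD_BASE + (#hashes - 1) + FOLD_HEADER
      current = FOLD_BASE + #hashes
    else
      folds[line_num] = current
    end
    line_num = line_num + 1
  end
  return folds
end
```

A custom fold function could then hand `text`, `start_line`, and `start_level` to this helper and return its result, assuming the framework expects a table of fold levels keyed by line number. Note that the sketch does not skip lines starting with `#` inside fenced code blocks (e.g. shell comments), which is one of the things that makes this hard to get right.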

