Re: [code] [scintillua] Match patterns between embedded lexer start/end?

From: Claire Lewis
Date: Fri, 4 Oct 2013 23:04:48 +0930


> Due to Scintillua's internals, this technique suffers in the instance
> where you have two or more embedded ranges with different states and jump
> between editing them. The state may not update properly and your end_rule
> may not match properly as a result.

Thanks for the info. This is more or less as I feared when thinking about
the (potentially) non-linear nature of Scintilla's lexing.

> The second method depends on how many possibilities there are. If there
> are only a few, say `n`, then you could duplicate your embedded lexer `n`
> times with `n` different _NAMEs and embed for each possible start and end
> combination rule. Any more than a few and you risk hitting the maximum
> number of patterns allowed in the final grammar (~32700, imposed by LPeg).
> As you can imagine, this method is "safe" and would not rely on state.

Hmm, this might work for this specific case, where I'm attempting to allow a
lexer to handle variations of long-strings in Lua.

One such application is to lex LuaJIT FFI cdef declarations:

    ffi.cdef [[

But I also want to allow:

    ffi.cdef [=[

... and so forth. I'd be happy enough to limit it to a few levels. Really,
those two are probably enough.
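
For what it's worth, the start/end delimiter pairs for a handful of levels can be generated rather than typed out by hand. Below is a minimal sketch in plain Lua. Note that `long_bracket_pairs` is a hypothetical helper name of my own, not anything provided by Scintillua or LPeg, and it only builds the delimiter strings; turning each pair into an actual embed start/end rule is not shown.

```lua
-- Hypothetical helper (not part of Scintillua): build the opening and
-- closing long-bracket delimiters for levels 0 through max_level.
local function long_bracket_pairs(max_level)
  local result = {}
  for level = 0, max_level do
    local eq = string.rep('=', level)
    result[#result + 1] = {
      level = level,
      open = '[' .. eq .. '[',   -- level 0 has no '=' between the brackets
      close = ']' .. eq .. ']',  -- the close must use the same number of '='
    }
  end
  return result
end

-- Levels 0 and 1 cover the two forms mentioned above.
local delims = long_bracket_pairs(1)
for _, d in ipairs(delims) do
  print(('ffi.cdef %s ... %s'):format(d.open, d.close))
end
```

With only two levels this yields two start/end combinations, so the "one embedded lexer per combination" approach would need just two copies.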

The question is, how do I rename the embedded lexer? In this case I'm using
the existing cpp.lua lexer. Am I right in thinking I would have to
replicate that multiple times, as opposed to some method where my "meta"
lexer, which uses lua by default and embeds cpp, could load the same lexer
from cpp.lua but with multiple names?

- Claire

Received on Fri 04 Oct 2013 - 09:34:48 EDT
