Re: [code] [scintillua] Match patterns between embedded lexer start/end?

From: Claire Lewis <claire_lewis.att.live.com.au>
Date: Wed, 9 Oct 2013 19:54:46 +1030

Thanks for the patch; unfortunately, I think it's going to be somewhat more
complex than that.

As far as I can see from looking at the source, the lexer table returned is
the module itself. So changing _NAME via alt_name doesn't really help, as
the same package/table is of course returned from require in either case.
The name is simply overridden the second time, and so only the second
version works.

In fact, I'm not even sure the _NAME is set unless there's an error with
require.
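To illustrate what I mean in plain Lua (not the lexer API; the module path
here is just a guess at how cpp.lua would be required):

    -- require caches the loaded module in package.loaded, so a second
    -- require returns the very same table rather than a fresh copy.
    local cpp1 = require('cpp')
    local cpp2 = require('cpp')
    assert(cpp1 == cpp2)    -- one shared table

    cpp1._NAME = 'cpp'
    cpp2._NAME = 'cpp2'     -- overwrites the field on the shared table
    print(cpp1._NAME)       --> cpp2; the first name is lost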

Anyway, I'll have a bit more of a dig and let you know what I come up with.

Thanks,
- Claire.

-----Original Message-----
From: Mitchell
Sent: Friday, October 04, 2013 11:21 PM
To: code.att.foicica.com
Subject: Re: [code] [scintillua] Match patterns between embedded lexer
start/end?

Hi Claire,

On Fri, 4 Oct 2013, Claire Lewis wrote:

> [snip]
>
>> The second method depends on how many possibilities there are. If there
>> are only a few, say `n`, then you could duplicate your embedded lexer `n`
>> times with `n` different _NAMEs and embed for each possible start and end
>> combination rule. Any more than a few and you risk hitting the maximum
>> number of patterns allowed in the final grammar (~32700, imposed by LPeg).
>> As you can imagine, this method is "safe" and would not rely on state.
>
> Hmm, this might work for this specific case, where I'm attempting to
> allow a lexer to handle variations of long-strings in Lua.
>
> One such application is to lex LuaJIT FFI cdef declarations:
>
> ffi.cdef [[
> ...
> ]]
>
> But I also want to allow:
>
> ffi.cdef [=[
> ...
> ]=]
>
> .. and so forth. I'd be happy enough to limit it to a few versions.
> Really those two are probably enough.
>
> The question is, how do I rename the embedded lexer? In this case I'm
> using the existing cpp.lua lexer - am I right in thinking I would have
> to replicate that multiple times, as opposed to some method where my
> "meta" lexer, which uses lua by default and embeds cpp could load the
> same lexer from cpp.lua but with multple names?

Try the attached patch. You should be able to do something like this from
your "meta" lexer:

    local cpp = l.load('cpp')
    local cpp2 = l.load('cpp', 'cpp2')
    ...

    l.embed_lexer(M, cpp, P('[['), P(']]'))
    l.embed_lexer(M, cpp2, P('[=['), P(']=]'))
    ...

You need to wrap your start and end rules in tokens, but you get the idea.
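For instance, wrapping them could look something like this (just a sketch
building on the snippet above; I'm reusing the predefined STRING token, but
any token name with a style of its own would work):

    local token = l.token
    local cpp_start = token(l.STRING, P('[['))
    local cpp_end = token(l.STRING, P(']]'))
    local cpp2_start = token(l.STRING, P('[=['))
    local cpp2_end = token(l.STRING, P(']=]'))

    l.embed_lexer(M, cpp, cpp_start, cpp_end)
    l.embed_lexer(M, cpp2, cpp2_start, cpp2_end)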

If you manage to get that to work, let me know and I'll commit the patch
to hg so others can take advantage of this technique.

Cheers,
Mitchell

-- 
You are subscribed to code.att.foicica.com.
To change subscription settings, send an e-mail to code+help.att.foicica.com.
To unsubscribe, send an e-mail to code+unsubscribe.att.foicica.com.