Re: [code] Factor lexer for Textadept

From: Michael Richter <ttmrichter.att.gmail.com>
Date: Thu, 4 Apr 2013 14:31:38 +0800

Here's an almost-functioning version for nested lexing:

local P, B = lpeg.P, lpeg.B
local ws = l.space
local some = function(p) return p^1 end        -- one or more
local maybe_some = function(p) return p^0 end  -- zero or more
local all_but = function(p) return P(1) - p end
local with_ws = function(p) return p * #ws end
local ws_with = function(p) return B(ws) * p end
local ws_with_ws = function(p) return B(ws) * p * #ws end
local text_to = function(t) return maybe_some(all_but(t)) end
local stack_declaration = with_ws(P{
                            'stack';
                            stack = with_ws(P'(') * lpeg.V'open_words'
                                  * ws_with_ws(P'--') * lpeg.V'close_words'
                                  * ws_with(P')'),

                            open_words = ( some(ws)
                                          * lpeg.V'colon_word'
                                          * lpeg.V'open_words' )
                                        + text_to(P'--'),
                            close_words = lpeg.V'colon_word'
                                        + text_to(P')'),
                            colon_word = text_to(P':') * some(ws) * lpeg.V'stack',
                          })
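
(A few throwaway checks of the helpers on their own, in case it helps anyone
reading along; lpeg.match returns the position just past the match, or nil:)

print(lpeg.match(with_ws(P'('), '( word'))             --> 2   whitespace after '('
print(lpeg.match(with_ws(P'('), '(word'))              --> nil no whitespace after '('
print(lpeg.match(text_to(P'--'), 'word word -- rest')) --> 11  stops just before '--'
print(lpeg.match(ws_with_ws(P'--'), 'a -- b', 3))      --> 5   whitespace on both sides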

This will correctly match things like:

( word word -- word word )
( -- word )
( word -- )
( -- )
( word -- word: ( word -- word ) )
( word word: ( word -- word ) -- )

It fails, however, on things like:

( word: ( word -- word ) *failed_word* -- *failed_word* )
( word: ( word -- word ) -- *failed_word* )
( word -- word: ( word -- word ) *failed_word* )

The starred *failed_word* words above are not being properly highlighted as
part of the expression. (The -- and ) still come out highlighted, but only
because they're caught by another "operator" rule; the specified *failed_word*
instances are being treated as plain identifiers.)
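
For completeness, the kind of throwaway harness that would exercise the
pattern outside Textadept (plain Lua plus LPeg, with the definitions above in
scope; the emphasis asterisks are dropped from the samples, and each one gets
a trailing space because the outer with_ws() looks ahead for whitespace):

local function try(patt, samples)
  for _, s in ipairs(samples) do
    local stop = lpeg.match(patt, s)
    if stop then
      print(('matched %2d of %2d chars: %s'):format(stop - 1, #s, s))
    else
      print(('no match at all:          %s'):format(s))
    end
  end
end

try(stack_declaration, {
  '( word word -- word word ) ',
  '( word -- word: ( word -- word ) ) ',
  '( word: ( word -- word ) failed_word -- failed_word ) ',
  '( word -- word: ( word -- word ) failed_word ) ',
})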

To me it looks as if the lexer, after matching a *colon_word* and its nested
*stack*, isn't going back to try the rest of the rules after the recursive
call. Yet the open_words rule, as I read it, says "some whitespace followed by
either a colon_word and then more open_words, or text up to the double-dash",
and I'm very explicitly having open_words call back into itself recursively
when it matches a colon_word.

This is mystifying me pretty badly.
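
In case it's relevant, the two LPeg behaviours I keep reminding myself of
while staring at this are the possessive repetition and the committed ordered
choice; a couple of toy patterns (nothing to do with the lexer itself) show
what I mean:

-- ^0 is greedy and never gives characters back, so the trailing P'a'
-- has nothing left to match against.
print(lpeg.match(P'a'^0 * P'a', 'aaa'))             --> nil

-- Ordered choice commits to the first alternative that succeeds; the later
-- failure of P'c' does not reopen the choice to try P'ab' instead.
print(lpeg.match((P'a' + P'ab') * P'c', 'abc'))     --> nil
print(lpeg.match((P'ab' + P'a') * P'c', 'abc'))     --> 4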

On 4 April 2013 11:54, Michael Richter <ttmrichter.att.gmail.com> wrote:

> It will indeed. Nothing like working code to show how to get code
> working! :D Thanks again, Mitchell.
>
>
> On 4 April 2013 11:42, Mitchell <m.att.foicica.com> wrote:
>
>> Hi Michael,
>>
>>
>> On Thu, 4 Apr 2013, Michael Richter wrote:
>>
>>> OK, I've got stack declarations much more solid now:
>>>
>>> -- some building blocks
>>> local maybe_some = function(p) return p^0 end
>>> local ws = l.space
>>> local with_ws = function(p) return p * #ws end
>>> local ws_with = function(p) return B(#ws) * p end
>>> local ws_with_ws = function(p) return B(l.space) * p * #ws end
>>>
>>> local stack_declaration = with_ws(P'(')
>>> * maybe_some(1 - P'--')
>>> * ws_with_ws(P'--')
>>> * maybe_some(1 - P')')
>>> * ws_with(P')')
>>>
>>> This works for non-nested stack declarations exactly as expected. It, of
>>> course, breaks down on stack declarations like "( value quote: ( word --
>>> word ) -- word )". Attempts to build in nested stack declarations,
>>> however, have failed because I'd have to define stack declarations in
>>> terms
>>> of stack declarations recursively which makes the lexer get all upset and
>>> throw up its hands while refusing to recognize *anything* at all.
>>>
>>>
>>> Am I correct in reading http://www.inf.puc-rio.br/~roberto/lpeg/#grammar to
>>> think that I now have to bite the bullet and figure out how to use LPEG
>>> grammar tables?
>>>
>>
>> Nested and recursive patterns generally involve grammars, yes. The
>> hypertext lexer uses lpeg.P{} and lpeg.V() when defining attributes
>> recursively. Perhaps that may help.
>>
>>
>> Cheers,
>> Mitchell

-- 
"Perhaps people don't believe this, but throughout all of the discussions
of entering China our focus has really been what's best for the Chinese
people. It's not been about our revenue or profit or whatnot."
--Sergey Brin, demonstrating the emptiness of the "don't be evil" mantra.