[code] Re: Debugging language modules

From: Arnel <jalespring.att.gmail.com>
Date: Tue, 29 Mar 2016 00:42:37 +0800

On Mon, 28 Mar 2016 09:15:46 -0400 (EDT), Mitchell <m.att.foicica.com> wrote:
> >>> - Is there a better way to debug lexer modules? The Racket lexer I'm working on
> >>> was based off the Scheme lexer file provided with TA. I've read somewhere in
> >>> the API manual that troubleshooting lexers can be tricky and it's recommended
> >>> to run TA in the terminal to get the error messages. I tried this but I didn't
> >>> get any. Those who have written lexers for other languages before - any
> >>> pointers? Anything on seeing what's actually captured by the LPEG expressions
> >>> would be great.
> >>
> >> If you don't see any error messages in the terminal by default, then that
> >> means your lexer is well formed and is processing text just fine. However,
> >> that doesn't mean your lexer is processing text as you'd expect! Robert
> >> already mentioned using Scintillua as a library (which is an idea I hadn't
> >> thought of!). Normally I just use:
> >>
> >> P(function(input, index)
> >>   _G.print(input, index)
> >>   return index
> >> end)
> >>
> >> and put that in a pattern I'm debugging. The "return index" line ensures
> >> that the debug function "matches" so that text matching can continue.
> >
> > Could you elaborate further how I can add this to any pattern? Say I have
> > something like:
> >
> > local keywords = token(l.KEYWORD, word_match({
> > '#%app', '#%datum', '#%declare', '#%expression', '#%module-begin',
> > -- ...
> > }, '!#%*+-./:=>?_'))
> >
> > How do I use that to print the captured text (or index as it were)?
> >
> > (I tried the example given in Scintillua for using it as a library, but for
> > some reason I'm getting an error message about not seeing 'lpeg' even though
> > I've installed it via luarocks, so I thought I'd try this instead.)
> local keywords = token(l.KEYWORD, word_match({
> '#%app', '#%datum', '#%declare', '#%expression', '#%module-begin',
> -- ...
> }, '!#%*+-./:=>?_')) * P(function(input,index)
> print('keyword end:', index)
> return index
> end)
> That will print the index of the end of a matched keyword.
> As for your issue with LPeg, it may help to check where LuaRocks installed
> lpeg (`luarocks list`) and open a Lua interactive session (`lua`) and type
> `=package.cpath`. If lpeg isn't in the cpath, then you may be running the
> incorrect version of Lua. For example on my Ubuntu machine, I have Lua
> 5.0, Lua 5.1, and Lua 5.2 installed, but only Lua 5.1 can see LuaRocks
> modules by default. Since `lua` points to Lua 5.2, I have to use `lua5.1`
> in order to 'require' lpeg.
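
For anyone else following along, here is a minimal standalone version of the
debug trick above (assuming lpeg is installed through LuaRocks; the 'define'
keyword is just an illustrative stand-in for a real pattern):

```lua
-- Standalone sketch: embed a match-time debug function in an LPeg pattern.
-- Assumes lpeg is installed (e.g. `luarocks install lpeg`).
local lpeg = require('lpeg')
local P = lpeg.P

-- Match 'define', then report where the match ended.
local keyword = P('define') * P(function(input, index)
  -- `index` is the position just past the text matched so far.
  print('keyword end:', index, 'matched:', input:sub(1, index - 1))
  return index  -- returning index makes this a successful (empty) match
end)

print(keyword:match('define foo'))  -- prints the debug line, then 7
```

Running the interpreter that actually sees lpeg (lua5.1 in Mitchell's case)
is still required, of course.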

Finally figured out what was causing my Racket lexer not to work.

The lists of built-in keywords and function names were originally generated
by a Racket script given to me by someone on the Racket IRC channel. I suspect
it was a slightly modified version of the one provided with Emacs.

Anyway, it turned out that a couple of the function names contained Unicode
characters (U+2200 and U+018E), and for some reason the original script
emitted their names as a pair of "u'new-/c'" strings. When I commented out
those two strings, the lexer started working properly.

There are a small number of functions in Racket whose names use Unicode
characters (including the lambda symbol). Should I leave all of them out for
now? I'm not completely sure they will display properly on Windows (unless
the user installs a font with Unicode symbol support, such as DejaVu Sans
Mono).
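
One thing worth noting: LPeg matches bytes, so keywords containing UTF-8
sequences should still match at the byte level regardless of the user's font;
whether the symbol renders nicely on Windows is a display concern, not a lexer
concern. A minimal sketch, again assuming lpeg is installed:

```lua
-- Sketch: LPeg works on bytes, so a UTF-8 keyword is just a byte sequence.
-- Assumes lpeg is installed via LuaRocks.
local lpeg = require('lpeg')
local P = lpeg.P

-- U+03BB (the lambda symbol) encodes to the two bytes 0xCE 0xBB in UTF-8.
local lambda = P('\206\187')  -- same as P('λ') in a UTF-8 source file

print(lambda:match('λx'))  -- 3: the match ends after the two-byte sequence
```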

Apologies for all the Scintillua/LPEG noise.

Thank you,
You are subscribed to code.att.foicica.com.
To change subscription settings, send an e-mail to code+help.att.foicica.com.
To unsubscribe, send an e-mail to code+unsubscribe.att.foicica.com.
Received on Mon 28 Mar 2016 - 12:42:37 EDT

This archive was generated by hypermail 2.2.0 : Tue 29 Mar 2016 - 06:54:33 EDT