Re: [code] Oddity in trying to make a 7.0 lexer

From: Mitchell <m.att.foicica.com>
Date: Sun, 3 Nov 2013 23:12:27 -0500 (EST)

Hi Michael,

On Mon, 4 Nov 2013, Michael Richter wrote:

> I have this rule for keywords:
>
> local keyword = token(l.KEYWORD, word_match{
> 'module', 'end_module', 'interface', 'implementation', 'pred', 'func',
> 'mode', 'det', 'semidet', 'nondet', 'multi', 'pragma', 'foreign_proc',
> 'impure', 'semipure', 'promise_pure', 'promise_semipure', 'foreign_type',
> 'foreign_decl', 'type', 'import_module', 'include_module', 'cc_multi',
> 'initialise', 'finalise', 'initialize', 'finalize', 'foreign_enum',
> })
>
> It works, but perhaps a bit too eagerly. For while it will highlight, as
> expected, "*module*" and "*import_module*" and "*include_module*", it will
> also highlight "general_*module*" (just the "module" part in case the bold
> didn't show through).
>
> How would I get around this? I would think that word_match would match
> whole words only, not word fragments.

It's hard to tell for sure without seeing your entire lexer, but are you
matching identifiers properly (e.g. words with underscores in them)? Keep
in mind that any text not matched by a rule in `M._rules` is given a
"default" token, and the lexer then tries the rules again starting at the
next character. In your case, '_' is likely being tokenized as "default",
after which "module" is matched as a keyword on its own.
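
As a rough sketch of what I mean (I'm assuming your lexer follows the usual
layout, with `l`, `token`, `M`, and your `keyword` pattern already defined;
the rule names are just placeholders), an identifier rule placed after the
keyword rule might look something like this:

```lua
-- Sketch only: assumes `l`, `token`, `M`, and `keyword` are already defined
-- as in your lexer. The rule names below are placeholders.

-- l.word matches a letter or '_' followed by letters, digits, or '_', so
-- "general_module" is consumed as a single identifier token.
local identifier = token(l.IDENTIFIER, l.word)

M._rules = {
  {'whitespace', token(l.WHITESPACE, l.space^1)},
  {'keyword', keyword},       -- tried first so real keywords still win
  {'identifier', identifier}, -- catches every other word in full
}
```

With a rule like that in place, the lexer should never fall back to
"default" in the middle of a word, so the keyword rule no longer gets a
chance to start at the "module" fragment.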

Cheers,
Mitchell
