Re: [code] [textadept] Lexer rule for function names

From: Mitchell
Date: Wed, 16 Jan 2019 09:22:32 -0500 (EST)

Hi Joshua,

On Tue, 15 Jan 2019, Joshua Krämer wrote:

> Dear list,
> I would like to extend the ansi_c lexer to highlight function names in
> function definitions. As a test, in the following two lines, I would
> like only the word "function" to be highlighted:
> struct test *function () {};
> struct test function () {};
> if () {} else if () {};
> I have tried the following rule (inserted on line 13), but it does not work:
> lex:add_rule('function', -P('}') * (^0 * lexer.word)^1 *
> lpeg.S(' \t*')^1 * token(lexer.FUNCTION, lexer.word) *^0 *
> lexer.delimited_range('()', false, true, true))
> It highlights "struct test *function" on the first line.
> Could somebody please help me with this rule?

Long story short: all text your rule's pattern matches should be "tokenized". As written, Textadept considers everything up to and including your `lexer.FUNCTION` token to be a single token. That is why you're seeing multiple words highlighted.

Rather than tokenizing each part of your complex pattern, I'd do something much simpler:

   lex:add_rule('function', token(lexer.FUNCTION, lexer.word) *
     #(S(' \t')^0 * lexer.delimited_range('()') * S(' \t')^0 * '{'))

I would also put that rule just above the 'identifier' rule. Rules are tried in the order they were added, so the earlier keyword rule gets to claim control structures like `if`, `switch`, etc. before this rule sees them. The pattern simply considers a word to be a function name if it is followed by optional spaces or tabs, then an argument list, then optional spaces or tabs, then the start of a block. It only checks for that suffix; it doesn't consume it (note the '#'). Thus, the lexer can still tokenize the suffix properly with its other rules.
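To see what the '#' changes, here is a minimal sketch in plain LPeg, outside of Textadept. The names `word`, `args`, and `ws` are simplified stand-ins I made up for `lexer.word`, `lexer.delimited_range('()')`, and the whitespace pattern; they are not the real lexer definitions.

```lua
local lpeg = require('lpeg')
local P, R, S = lpeg.P, lpeg.R, lpeg.S

-- Stand-in for lexer.word (assumption): a letter or underscore, then
-- letters, digits, or underscores.
local word = (R('az', 'AZ') + '_') * (R('az', 'AZ', '09') + '_')^0
-- Stand-in for lexer.delimited_range('()') (assumption): a simple,
-- non-nested parenthesized argument list.
local args = P('(') * (1 - P(')'))^0 * ')'
local ws = S(' \t')^0

-- Without '#': the match consumes the word AND the suffix.
local consuming = word * ws * args * ws * '{'
-- With '#': the suffix is only checked; the match ends right after the word.
local lookahead = word * #(ws * args * ws * '{')

-- lpeg.match returns the position just past the matched text.
print(lpeg.match(consuming, 'function () {};')) --> 14
print(lpeg.match(lookahead, 'function () {};')) --> 9
print(lpeg.match(lookahead, 'function ();'))    --> nil (no block follows)
print(lpeg.match(lookahead, 'if () {}'))        --> 3
```

The last line shows why rule order matters: on its own, the pattern would happily accept `if` as a function name, so the keyword rule needs to run first.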

You may have to tweak it to allow for newlines if that is your style, but it does the job for the sample code you provided.
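If you do put the opening brace on its own line, one possible tweak is to let the lookahead skip line breaks as well. This is an untested sketch: it assumes `lex`, `token`, and `lexer` are already defined as in the rule above, and the local name `ws` is only for illustration.

```lua
-- Assumes this runs inside the lexer file, where `lpeg` is available.
local S = lpeg.S
-- Optional spaces, tabs, and line breaks.
local ws = S(' \t\r\n')^0

-- Same rule as before, but the suffix check may span multiple lines.
lex:add_rule('function', token(lexer.FUNCTION, lexer.word) *
  #(ws * lexer.delimited_range('()') * ws * '{'))
```

Whether the lookahead can actually see past the end of the current line depends on how much text the lexer is handed at a time, so try it on your own code first.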

