Re: [code] [scintillua] Markdown lexer

From: Mitchell <m.att.foicica.com>
Date: Thu, 31 Jan 2013 12:22:45 -0500 (Eastern Standard Time)

Robert,

On Wed, 30 Jan 2013, Robert wrote:

> Hi,
>
> in the examples below (from the Markdown docs[1], but I found this with
> a real world markdown file) the indented lines get lexed as blockcode.
> I am not sure if there is a way around this as long as lexing is done by line.
>
> [snip]

Background: Scintill(u)a stores a "last styled" position; lexing is
accurate up to that point. For languages like C or Lua, this position
could be in the middle of a string or comment. In order for the LPeg
lexers to work, they need to "backtrack" from this "last styled" position
until they recognize a style change. In such cases, the lexers would
probably come to whitespace before the string or comment. Starting from
that point (instead of from the middle of the string or comment), the
lexers are able to lex the text correctly.
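To make the backtracking idea concrete, here is a minimal sketch in Python.
It is not the actual Scintillua code: the function name `backtrack_start`,
the per-character `styles` list, and the style names are all assumptions
made up for illustration. It walks back from the "last styled" position to
the first character of the current style run, so that lexing restarts at a
style boundary rather than inside a string or comment.

```python
def backtrack_start(styles, last_styled):
    """Return a position from which re-lexing can safely start.

    `styles` is a hypothetical list holding one style name per character.
    `last_styled` is the index up to which styling is known to be accurate.
    """
    if not styles:
        return 0
    pos = max(0, min(last_styled, len(styles) - 1))
    style = styles[pos]
    # Walk backwards until the style changes (or the start of the text).
    while pos > 0 and styles[pos - 1] == style:
        pos -= 1
    return pos


# Styles for the text: x = "hello world"
# (identifier, whitespace, operator, whitespace, then a 13-character string)
styles = ["identifier", "whitespace", "operator", "whitespace"] + ["string"] * 13

# The "last styled" position (10) falls inside the string, so the sketch
# backs up to index 4, the opening quote right after the whitespace.
start = backtrack_start(styles, 10)
```

The point of restarting at a boundary is that the lexer never has to guess
whether it is inside a string or comment.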

Unfortunately, I do not see a solution for your problem even if the
Markdown lexer were a normal one rather than a line-based one. If it were
a normal lexer, it would have to style entire chunks of text as one style
(in your example, the entire numbered paragraph would need to be styled
the same, including whitespace) to be accurate. This rules out inline
elements like *emphasis* and `code`, since the lexer would not always know
what state it was in before the inline elements. It may be possible to take
advantage of the Scintilla SCI_*LINESTATE API, but I'm not terribly
optimistic :(
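For what it is worth, the line-state idea might look roughly like the
following Python sketch. This is only a toy model and not a working lexer:
a plain list stands in for the per-line integer that Scintilla's
SCI_SETLINESTATE and SCI_GETLINESTATE messages store, and the names
`IN_LIST`, `LIST_ITEM`, and `classify_lines` are hypothetical. It also
simplifies Markdown's rules heavily (no lazy continuation lines, no nested
lists, no inline elements).

```python
import re

IN_LIST = 1  # hypothetical line-state value meaning "inside a list item"

# Up to three leading spaces, then "1." or a bullet, then whitespace.
LIST_ITEM = re.compile(r"^ {0,3}(\d+\.|[*+-])\s+")


def classify_lines(lines):
    """Return a (kind, state) pair for each line, carrying state forward."""
    result = []
    prev_state = 0
    for line in lines:
        if LIST_ITEM.match(line):
            kind, state = "list_item", IN_LIST
        elif not line.strip():
            # Blank lines inherit the state of the previous line.
            kind, state = "blank", prev_state
        elif line.startswith("    ") or line.startswith("\t"):
            # Indented text is only block code outside of a list item.
            if prev_state == IN_LIST:
                kind = "list_continuation"
            else:
                kind = "blockcode"
            state = prev_state
        else:
            kind, state = "paragraph", 0
        result.append((kind, state))
        prev_state = state
    return result


example = [
    "1.  First paragraph.",
    "",
    "    Second paragraph of the same item.",
    "",
    "Plain text.",
    "",
    "    code()",
]
kinds = [kind for kind, _ in classify_lines(example)]
```

With the state carried from line to line, the indented third line is
classified as part of the list item, while the indented last line is still
block code. Whether this holds up once inline elements are involved is
exactly the open question.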

Mitchell

Received on Thu 31 Jan 2013 - 12:22:45 EST
