Re: [code] Enabling ta to edit large files

From: Mitchell <m.att.foicica.com>
Date: Tue, 12 Jun 2018 17:13:26 -0400 (EDT)

Hi Nicholas,

On Fri, 8 Jun 2018, Nicholas Ochiel wrote:

> I attempted to open a 40MB log file and a compiled react app (one
> large js file) and noticed that ta couldn't open/edit these files in a
> reasonable amount of time and without thrashing the cpu the whole time
> the buffer remained open. I notice the "large file" issue has been
> mentioned before but a more detailed technical breakdown of why this
> happens has never been provided as far as I can tell.

Do your files have long lines? Scintilla[1] (Textadept's editing component) has been known to take its time rendering long lines. Try opening your files in another Scintilla-based editor like SciTE[2] and see if you experience the same performance. (You may have to select JS or HTML as your React lexer, which is not perfect, but it's good enough from a benchmarking perspective.)

If another Scintilla-based editor handles your files fine, then Textadept is to blame. If it struggles too, the problem is more likely in Scintilla itself.
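
To check whether a file actually contains long lines before comparing editors, a small script can report the line count and the longest line. The following is a minimal Python sketch, not part of Textadept or Scintilla; the function name `line_length_stats` and the 1 MiB chunk size are assumptions chosen for illustration. It reads in fixed-size chunks so that a file consisting of one enormous line (as minified JS often is) does not have to be held in memory at once.

```python
import sys


def line_length_stats(path, chunk_size=1 << 20):
    """Return (line_count, longest_line_in_bytes) for the file at `path`.

    The file is read in binary chunks, so memory use stays bounded even
    when the whole file is a single line. Line lengths exclude the "\n".
    """
    longest = 0   # longest line seen so far, in bytes
    current = 0   # length of the line currently being scanned
    lines = 0     # number of lines seen so far
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                nl = chunk.find(b"\n", start)
                if nl == -1:
                    # No more newlines in this chunk; the line continues.
                    current += len(chunk) - start
                    break
                current += nl - start
                longest = max(longest, current)
                lines += 1
                current = 0
                start = nl + 1
    if current > 0:
        # Final line without a trailing newline.
        longest = max(longest, current)
        lines += 1
    return lines, longest


if __name__ == "__main__" and len(sys.argv) > 1:
    count, longest = line_length_stats(sys.argv[1])
    print(f"{count} lines, longest line is {longest} bytes")
```

A longest line in the hundreds of kilobytes or more would point toward the long-line rendering cost rather than the file size as such.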

> Recently, vscode and atom both fixed their problems in this regard by
> implementing piece-table styled data structures for buffers:
> - http://blog.atom.io/2017/10/12/atoms-new-buffer-implementation.html
> - https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation
> - vis (C/Lua) also uses a "piece chain"
> https://github.com/martanne/vis/wiki/Text-management-using-a-piece-chain
> and opens large files instantly.

Textadept loads the entire file into memory at once and then displays it. While this initial load may be slow, subsequent edits should not be (unless there are really long lines). Scintilla offers a way to load a file piecemeal, but Textadept does not take advantage of this.
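
The difference between loading everything at once and loading piecemeal can be sketched in a few lines of Python. This is a generic illustration only: `DocumentSink`, `read_in_chunks`, and `load_piecemeal` are hypothetical names, and the sink is a stand-in for whatever buffer an editor component exposes, not Scintilla's or Textadept's actual API.

```python
def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield the file's contents as successive byte chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return
            yield chunk


class DocumentSink:
    """Hypothetical stand-in for an editor buffer that accepts data in pieces."""

    def __init__(self):
        self._parts = []
        self.size = 0  # total bytes received so far

    def add_data(self, data):
        self._parts.append(data)
        self.size += len(data)

    def to_bytes(self):
        return b"".join(self._parts)


def load_piecemeal(path, sink, chunk_size=64 * 1024, on_progress=None):
    """Feed the file into `sink` one chunk at a time.

    `on_progress`, if given, is called with the running byte count after
    each chunk, which is where a UI could repaint or report progress.
    """
    for chunk in read_in_chunks(path, chunk_size):
        sink.add_data(chunk)
        if on_progress is not None:
            on_progress(sink.size)
    return sink
```

The point of the chunked form is that control returns to the caller between chunks, so an editor could stay responsive or show progress while a large file is still being read.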

Textadept's Lua lexers may also be to blame. Performance varies from lexer to lexer depending on how complex each one is. React is HTML embedded in JS, right? Perhaps Textadept's React lexer is not very efficient in some circumstances. Simple lexers are as fast as Scintilla's native C++ ones, though.

> I'd like to ask:
> - Please could a more technical description of the ta/scintilla
> buffer data structure and its performance characteristics be provided
> so that the problem can be understood by plebeians such as myself?

Scintilla uses a gap buffer, I think. I don't know much about it, but you may be able to find out more by asking on the Scintilla mailing list[3].
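
For readers unfamiliar with the term, a gap buffer keeps the text in one contiguous array with a block of unused slots (the "gap") positioned at the current edit point. Inserting or deleting at the gap is cheap; editing elsewhere first requires moving the gap, which costs time proportional to the distance moved. The following is a minimal Python sketch of the general technique, not Scintilla's actual implementation; the class name `GapBuffer` and its methods are made up for illustration.

```python
class GapBuffer:
    """Toy gap buffer: text stored in a list with a movable gap of free slots."""

    def __init__(self, text="", gap_size=16):
        self._buf = list(text) + [None] * gap_size
        self._gap_start = len(text)      # first free slot
        self._gap_end = len(self._buf)   # one past the last free slot
        self._min_gap = gap_size

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos):
        """Move the gap so it starts at text position `pos` (cost ~ distance)."""
        if pos < self._gap_start:
            n = self._gap_start - pos
            # Shift the n characters before the gap to just after it.
            self._buf[self._gap_end - n:self._gap_end] = self._buf[pos:self._gap_start]
            self._gap_start = pos
            self._gap_end -= n
        elif pos > self._gap_start:
            n = pos - self._gap_start
            # Shift the n characters after the gap to just before it.
            self._buf[self._gap_start:self._gap_start + n] = self._buf[self._gap_end:self._gap_end + n]
            self._gap_start += n
            self._gap_end += n

    def _grow(self, needed):
        """Enlarge the gap so it can hold at least `needed` characters."""
        gap = self._gap_end - self._gap_start
        if gap >= needed:
            return
        extra = max(needed - gap, len(self._buf), self._min_gap)
        self._buf[self._gap_end:self._gap_end] = [None] * extra
        self._gap_end += extra

    def insert(self, pos, text):
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        self._grow(len(text))
        self._buf[self._gap_start:self._gap_start + len(text)] = list(text)
        self._gap_start += len(text)

    def delete(self, pos, count):
        if pos < 0 or count < 0 or pos + count > len(self):
            raise IndexError("delete range out of range")
        self._move_gap(pos)
        self._gap_end += count  # deleted characters simply join the gap

    def text(self):
        return "".join(self._buf[:self._gap_start]) + "".join(self._buf[self._gap_end:])
```

A short usage example:

```python
gb = GapBuffer("hello world")
gb.insert(5, ",")   # gap moves to position 5, then one slot is filled
gb.delete(0, 7)     # gap moves to position 0 and swallows 7 characters
print(gb.text())    # world
```

The design trades cheap localized edits for an occasional large move when the edit point jumps far away, which is one reason the piece-table designs cited above behave differently on very large files.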

> - What is the best solution to enable ta to perform as well as the
> above mentioned editors? Is there a reason why it shouldn't?
>
> - If a solution has already been proposed/considered, would anyone be
> willing to provide mentorship for me to implement the solution? If
> one hasn't been discussed, please could I be pointed in the right
> direction on
> 1) where in the codebase of ta/scintilla to focus my attention.
> 2) The best approach to profiling large file performance.

We can perhaps answer these after identifying which component is to blame.

> - Would such a patch be accepted?

It depends. If Scintilla needs to be patched, then Neil (Scintilla's author) would have the final say. If it's a Lua lexer issue, then yes, I would accept a patch. If it's something else, we'll see.

Cheers,
Mitchell

[1]: https://scintilla.org
[2]: https://scintilla.org/SciTE.html
[3]: http://groups.google.com/group/scintilla-interest
