Re: improvements for opening large text files

From: Alexandru Draghina <alexdragh....at.gmail.com>
Date: Tue, 31 May 2011 00:11:36 -0700 (PDT)

On May 30, 7:46 pm, mitchell <c....at.caladbolg.net> wrote:
> Hi,
>
> On Tue, 24 May 2011, Alexandru Draghina wrote:
>
> >> On May 24, 1:25 pm, Robert <ro....at.web.de> wrote:
> >> On Tue, May 24, 2011 at 12:16 PM, Alexandru Draghina
> >> <alexdragh....at.gmail.com> wrote:
> >>> Thanks Steve for the info.
>
> >>> It's a little strange, because Notepad++ is built on Scintilla too,
> >>> if I remember correctly; but it probably has a different maximum
> >>> size limit.
>
> >>> I agree, a text editor (a programmer's editor) is not supposed to
> >>> open such large files. On the other hand, I need to analyze these
> >>> logs, and I need syntax highlighting, regular expressions, split
> >>> views and so on: things that should be done by a programmer's
> >>> editor, not a hex editor.
>
> >>> I'll try to see if I can change the memory limit in Textadept, at
> >>> the risk of running out of virtual memory or slowing down. Since it
> >>> stops itself with the "not enough memory" message, I'm sure there is
> >>> a size limit hardcoded somewhere. Otherwise, maybe I can find a way
> >>> to load large files in smaller chunks (I haven't done this before,
> >>> but.. who knows?)
>
> >>> Regards,
> >>> Alex
>
> >>> On May 24, 12:51 pm, steve donovan <steve.j.dono....at.gmail.com> wrote:
> >>>> On Tue, May 24, 2011 at 11:14 AM, Alexandru Draghina
> >>>> <alexdragh....at.gmail.com> wrote:
> >>>>> Does anyone have any ideas about how to make some improvements (in
> >>>>> what areas..) so I'll be able to open such large files?
>
> >>>> It is mostly a limitation of the underlying Scintilla editing control.
>
> >>>> Here's Neil Hodgson (Scintilla author) on the subject:
>
> >>>> http://www.mail-archive.com/scite-inter...@lyra.org/msg01376.html
>
> >>>> steve d.
>
> >> I tried opening a ~500 MB file with Scite and Textadept. Scite
> >> displays it after maybe 25 seconds. Trying it with TA almost freezes
> >> my machine.
> >> Textadept loads the file with a call to io.read('*all'); I don't know
> >> what Scintilla does.
> >> You can definitely use Textadept to build your own customized tools
> >> that load just a few lines.
>
> >> Robert
>
> > Yes, I saw that the 'out of memory' error is raised by
> > io.read("*all") when Lua reaches about 780 MB of RAM (according to
> > the Windows Task Manager). I also did a test with Lua alone, and the
> > limitation is in Lua itself. I'm looking for a workaround..
>
> One thing that might be worth investigating is to call file:read()
> until EOF, appending each read to the Scintilla buffer. file:read('*l')
> does not include the EOL, so file:read(number) should probably be used
> instead. I don't know what a good number is, though.
>
> mitchell

Hello,

I've managed to open a large file (~350 MB) by using file:read() with
2 MB chunks until EOF, without any crash. The first chunk is used to
check the encoding and line endings; the following chunks are just
appended with ADD_TEXT. I tried different chunk sizes (from 128 KB up
to 6 MB), but on my system 2 MB was just right. Even so, it took quite
a while to load.
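
Roughly, the loop looks like this. This is only a sketch:
detect_encoding_and_eols() stands in for my detection code (it is not
a real Textadept function), and 'buffer' is Textadept's Lua handle to
the Scintilla control, which I'm assuming exposes ADD_TEXT as
add_text():

  local CHUNK = 2 * 1024 * 1024  -- 2 MB reads were the sweet spot here

  local function load_large_file(path)
    local f = assert(io.open(path, 'rb'))  -- binary mode: no newline translation
    local first = f:read(CHUNK)
    if first then
      detect_encoding_and_eols(first)  -- placeholder: sniff encoding/EOLs in chunk 1
      buffer:add_text(first)           -- SCI_ADDTEXT via the Lua API
    end
    while true do
      local chunk = f:read(CHUNK)  -- read(n) keeps EOLs, unlike read('*l')
      if not chunk then break end
      buffer:add_text(chunk)
    end
    f:close()
  end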

Before reading the file, I also had to add a few Scintilla calls to
mimic what Scite does: SCI_BEGINUNDOACTION, SCI_CLEARALL and
especially SCI_ALLOCATE.
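
In Lua those messages surface as buffer methods (at least, I'm
assuming the names map one-to-one onto the Scintilla messages), so
just before the read loop above, with the file already open as 'f', it
is roughly:

  local size = f:seek('end')  -- total bytes, so Scintilla can preallocate
  f:seek('set')               -- rewind before the chunked reads
  buffer:begin_undo_action()  -- SCI_BEGINUNDOACTION
  buffer:clear_all()          -- SCI_CLEARALL
  buffer:allocate(size)       -- SCI_ALLOCATE: grow the buffer once, up front
  -- ... chunked file:read()/buffer:add_text() loop goes here ...
  buffer:end_undo_action()    -- SCI_ENDUNDOACTION

SCI_ALLOCATE seems to matter the most: it lets the buffer grow once
instead of repeatedly reallocating as the chunks arrive.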

I could browse up and down in the file, but the moment I made any
change (I just tried to erase a line), it crashed.

I'll investigate further in the coming days, but currently I'm quite busy at