Re: [code] [textadept] Encoding and display

From: Gabriel Bertilson <arboreous.philologist.att.gmail.com>
Date: Thu, 25 Jul 2019 12:57:35 -0500

I think CP1252 would be a better default single-byte code page in Textadept
than ISO-8859-1, given that web pages labeled ISO-8859-1 are actually decoded
as CP1252[1] and ISO-8859-1 is the most common single-byte encoding on
websites. The most significant difference is that CP1252 has printable
characters in the range 0x80-0x9F, where the original ISO-8859-1 (and the
ISO-8859-1 in Textadept) has control codes or no characters.
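
As a quick illustration of that difference, here is a minimal sketch in
Python (not Textadept's Lua) that decodes the same four bytes with both
code pages. The sample bytes and the helper names are made up for the
example; the codec names are the ones Python's standard library uses.

```python
# Four bytes from the 0x80-0x9F range, chosen only for illustration.
SAMPLE = bytes([0x80, 0x93, 0x94, 0x99])


def code_points(text):
    """Format each character as U+XXXX so the output stays ASCII-only."""
    return [f"U+{ord(ch):04X}" for ch in text]


def decode_both(data):
    """Return (CP1252 decoding, ISO-8859-1 decoding) of the same bytes."""
    return data.decode("cp1252"), data.decode("iso-8859-1")


as_cp1252, as_latin1 = decode_both(SAMPLE)

# CP1252: euro sign, left/right double quotation marks, trade mark sign.
print(code_points(as_cp1252))
# ISO-8859-1 (as Python implements it): C1 control characters.
print(code_points(as_latin1))
```

Note that Python's ISO-8859-1 codec maps every byte to a code point and so
never fails, while its CP1252 codec rejects the few bytes CP1252 leaves
undefined (0x81, for example).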

— Gabriel

[1] https://encoding.spec.whatwg.org/#ref-for-windows-1252%E2%91%A0

On Wed, Jul 24, 2019 at 8:38 PM Mitchell <m.att.foicica.com> wrote:

> Hi,
>
> On Wed, 24 Jul 2019, Qwerky wrote:
>
> > Hi Mitchell,
> >
> > Thanks. Don't know how I missed the status bar display of encodings!
> >
> > Anyway, I added the line " io.encodings[#io.encodings + 1] = 'CP1252' " to my
> > init.lua, following the example in the api, and reset, but when opening my
> > file, it wasn't auto-detected?
>
> You can try making it the first encoding tried after UTF-8:
>
> table.insert(io.encodings, 2, 'CP1252')
>
> Textadept tries detecting encodings in order. As soon as it finds an
> encoding that doesn't throw an error, it assumes the encoding is "correct".
> A human reader may see that the result is not correct, but Textadept
> wouldn't know it :)
>
> Encoding detection is a very tricky thing that is hard to get right. Sure,
> Textadept could employ the help of a multi-megabyte library whose sole job
> is to identify the encoding of a chunk of text thrown at it, but that is
> not very minimalist! Anyway, we've mostly settled on UTF-8 and UTF-16 for
> most text, and ASCII and ISO-8859-1 for programming languages. That's why
> Textadept looks for them primarily.
>
> Cheers,
> Mitchell
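
The "first encoding that doesn't throw an error wins" rule described in
the quoted reply can be modeled in a few lines. This is a rough sketch in
Python, not Textadept's actual Lua implementation: detect_encoding is a
hypothetical helper, and BASE_ORDER is only an assumed stand-in for the
default io.encodings list.

```python
def detect_encoding(data, encodings):
    """Return the first name in `encodings` that decodes `data` without error."""
    for name in encodings:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate failed; try the next one
        return name
    return None  # nothing in the list could decode the data


# Assumed default order, for illustration only.
BASE_ORDER = ["utf-8", "ascii", "iso-8859-1", "utf-16"]

# CP1252 appended at the end, as in the original question.
appended = BASE_ORDER + ["cp1252"]

# CP1252 tried right after UTF-8, as in the reply (Lua index 2 is
# Python index 1).
inserted = list(BASE_ORDER)
inserted.insert(1, "cp1252")

# CP1252 bytes: curly quotes (0x93, 0x94) around plain ASCII text.
sample = b"\x93quoted\x94 text"

print(detect_encoding(sample, appended))  # iso-8859-1
print(detect_encoding(sample, inserted))  # cp1252
```

In this model, the appended CP1252 entry is never reached because the
ISO-8859-1 decoder accepts any byte sequence. If Textadept's converter
behaves the same way, that would explain why appending had no visible
effect while inserting at position 2 does.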

Received on Thu 25 Jul 2019 - 13:57:35 EDT
