Re: [code] [textadept] Could you please give me advice to configure encoding menu correctly?

From: Mitchell <m.att.foicica.com>
Date: Sun, 23 Jul 2017 14:15:08 -0400 (EDT)

Hi Yuki,

On Sat, 22 Jul 2017, Outlook Yuki wrote:

> Hi, I'm Yuki
>
>
> I'm trying to add CP932 and CP936 in encoding menu, but I can not correctly load files of these encoding by using the menus.
> Could you please give me advice to configure encoding menu correctly?
>
> I configured textadept like following.
>
>
> 1. ~\.textadept\init.lua
> Added encoding to io.encodings table
>
> table.insert(io.encodings, 3, 'CP932')
> table.insert(io.encodings, 3, 'CP936')
> ui.set_theme('light', {font = 'Monospace', fontsize = 10})
>
> 2. ~\.textadept\modules\textadept\menu.lua
> Added items under Encoding menu
>
> {
> title = _L['E_ncoding'],
> {_L['_UTF-8 Encoding'], function() set_encoding('UTF-8') end},
> {_L['_ASCII Encoding'], function() set_encoding('ASCII') end},
> {_L['_ISO-8859-1 Encoding'], function() set_encoding('ISO-8859-1') end},
> {_L['UTF-1_6 Encoding'], function() set_encoding('UTF-16LE') end},
> {_L['CP932 Encoding'], function() set_encoding('CP932') end},
> {_L['Shift_JIS Encoding'], function() set_encoding('Shift_JIS') end},
> {_L['CP936 Encoding'], function() set_encoding('CP936') end},
> {_L['GB2312 Encoding'], function() set_encoding('GB2312') end}
> },
>
>
>
> As a result, I can open a CP936 file without garbled characters, but I get garbled characters in a CP932 file.

Thanks for your detailed message. Based on your configuration, Textadept appears to try the following encodings, in this order, when it opens a file:

   * UTF-8
   * ASCII
   * CP936
   * CP932
   * ISO-8859-1
   * ...

When you open a CP932 file, Textadept tries UTF-8 and then ASCII, both of which should fail. It then tries CP936, which "works" in the sense that the conversion raises no error, so Textadept stops there and never gets to CP932. (This is why Textadept shows "CP936" in the statusbar.) If CP932 came before CP936 in the encoding list, your CP932 file would be detected correctly. However, it looks like bytes that are valid in one of these encodings are often valid in the other as well, so your CP936 file would then be misdetected as CP932.
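
To illustrate why the order matters, here is a minimal sketch of first-match detection in plain Lua. It is not Textadept's actual code: `detect_encoding` and `fake_try_decode` are hypothetical names, and the stand-in decoder simply pretends the bytes are valid in both CP932 and CP936 but not in UTF-8 or ASCII. In Textadept the real check would presumably be a protected call to its iconv binding, but that is an assumption on my part.

```lua
-- Return the first encoding in `encodings` that accepts `bytes`.
-- `try_decode` is a caller-supplied function (hypothetical) that returns
-- true when `bytes` converts without error from the given encoding.
local function detect_encoding(bytes, encodings, try_decode)
  for _, encoding in ipairs(encodings) do
    if try_decode(bytes, encoding) then return encoding end
  end
  return nil -- no candidate accepted the bytes
end

-- Stand-in decoder for illustration only: it ignores the bytes and claims
-- success for CP932 and CP936, failure for everything else.
local function fake_try_decode(bytes, encoding)
  return encoding == 'CP932' or encoding == 'CP936'
end

-- Sample bytes; only used as an opaque argument here.
local bytes = '\147\250\150\123'

-- Order produced by the configuration above: CP936 is reached first.
print(detect_encoding(bytes, {'UTF-8', 'ASCII', 'CP936', 'CP932'}, fake_try_decode))
-- Swapped order: CP932 is reached first.
print(detect_encoding(bytes, {'UTF-8', 'ASCII', 'CP932', 'CP936'}, fake_try_decode))
```

With an ambiguous file, the result depends entirely on which of the two encodings appears earlier in the list, which is the behavior you are seeing.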

I do not know the best way to handle this automatically. Textadept was designed as an editor for source code, not plain text, so it's a bit weak when it comes to properly handling encodings, especially if they are ambiguous.

What you can try is to set `buffer.encoding = 'CP932'` manually via the command entry, and then select the "CP932" encoding from the menu in order to "reset" the display encoding. I cannot verify this myself because the characters in your attachments did not come through properly.
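
If most of your files are CP932, another option is to swap the two insertions in your init.lua so that CP932 is tried first. This is only a sketch based on the lines you posted. I have not tested it, and the trade-off described above still applies: CP936 files would then likely be detected as CP932.

```lua
-- ~\.textadept\init.lua
-- Each call inserts at index 3, so the encoding inserted last ends up
-- earlier in the list and is tried first.
table.insert(io.encodings, 3, 'CP936')
table.insert(io.encodings, 3, 'CP932') -- now tried before CP936
```

After these two calls the list should begin with UTF-8, ASCII, CP932, CP936.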

Cheers,
Mitchell
