Re: [code] spellchecker and accented character

From: Olivier Guibé <olivier.guibe.att.univ-rouen.fr>
Date: Thu, 29 Sep 2016 09:03:08 +0200

Hi Mitchell,

Thanks for your answer and for testing.
Even though I have zero knowledge of Lua, C++, makefiles, etc.,
yesterday I tried to understand the code (hunspell.cxx, for example).

To save space, all (common) words are stored in lower case. If
"Définition" is sent to hunspell, the word is first transformed to its
lower-case version, and then hunspell checks whether it is in the
dictionary, etc.
I modified hunspell.cxx and noticed that with the Textadept spell checker:
1) for "Maison", the spell-checking process is OK;
2) for "Définition", the word is not transformed into lower case, and
is then not verified (there are cases NOCAP, CAP, etc.).
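
One possible explanation, sketched below in Python, is that the UTF-8
bytes of the word are read as a single-byte encoding, so that "é"
becomes two characters, one of which counts as a capital letter. This is
only a simplified model and not hunspell's actual code: the functions
cap_type, lookup and as_latin1, the class name "MIXED", and the choice
of Latin-1 as the wrong encoding are all assumptions made for
illustration.

```python
def as_latin1(text: str) -> str:
    """Simulate reading UTF-8 bytes as if they were Latin-1 (assumed wrong encoding)."""
    return text.encode("utf-8").decode("latin-1")


def cap_type(word: str) -> str:
    """Very simplified capitalization classes (hypothetical, not hunspell's)."""
    upper = sum(1 for ch in word if ch.isupper())
    lower = sum(1 for ch in word if ch.islower())
    if upper == 0:
        return "NOCAP"
    if upper == 1 and word[0].isupper():
        return "INITCAP"
    if lower == 0:
        return "ALLCAP"
    return "MIXED"


def lookup(word: str, dictionary: set) -> bool:
    """Try the lower-case form only for words that look capitalized."""
    if cap_type(word) in ("INITCAP", "ALLCAP"):
        return word in dictionary or word.lower() in dictionary
    return word in dictionary


# Dictionary with lower-case entries, decoded correctly and incorrectly.
good = {"maison", "définition"}
bad = {as_latin1(w) for w in good}

for w in ("Maison", "définition", "Définition"):
    print(w, lookup(w, good), lookup(as_latin1(w), bad))
```

In this model "Maison" and "définition" are found in both cases, while
"Définition" is found only when the bytes are decoded as UTF-8, which
matches what I observe.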

I also tested a compiled version of hunspell, launching
hunspell_modifiy -D my_dic, to verify that "Définition" is correctly
transformed there.

When you add "Définition" to the personal dictionary, everything is
fine (there is then a matching word in the dictionary).

Another fact: with the Textadept spell checker, the encoding of the
dictionary is not detected correctly. Indeed, fr_utf8.dic contains a
TRY string (TRY aeioàéèêîâsinrtlcdugmphbyfvkwôûëöïù½):
1) Textadept spell checker: UTF-8 is not detected, and instead of
aeioàéèêîâsinrtlcdugmphbyfvkwôûëöïù½ I get some strange characters
(the accented characters àéè are replaced);
2) hunspell (modified): UTF-8 is detected and it prints
aeioàéèêîâsinrtlcdugmphbyfvkwôûëöïù½
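
The kind of replacement seen in case 1 can be reproduced with a short
Python sketch. It assumes the UTF-8 bytes are being decoded as Latin-1;
the actual encoding used in that case is not known, and the variable
names are only illustrative.

```python
# First characters of the TRY string, as stored in a UTF-8 file.
try_chars = "aeioàéè"
raw = try_chars.encode("utf-8")

# Decoding the same bytes as Latin-1 (assumed wrong encoding) turns each
# accented character into two unrelated characters.
garbled = raw.decode("latin-1")

# Reversing the wrong step recovers the original text.
recovered = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(recovered == try_chars)
```

The plain ASCII letters survive unchanged, which is consistent with
words without accents being checked correctly.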

So it is an encoding issue...

Regards
O.G.

On Sat, 24 Sep 2016 18:48:43 -0400 (EDT),
Mitchell <m.att.foicica.com> wrote:

> Hi Olivier,
>
> On Sat, 24 Sep 2016, Olivier Guibe wrote:
>
> >
> > Hello
> >
> > Today I tested Textadept v8.7 (and also the hg source version) with
> > the spellchecker (from the Wiki), under Linux Debian Sid 64 bits.
> > With a UTF-8 dictionary (I converted it from isolatin1 to utf8 to
> > get correct spell checking), I see a strange behavior: a correct
> > word that starts with a capital letter and contains an accented
> > character is marked as misspelled. For example, 'Maison' is
> > accepted and 'définition' is accepted, but
> > 'Définition' (a correct word) is marked as misspelled. Moreover,
> > with F7, the suggestions contain "Définition". Choosing it and then
> > pressing S-F7 again still gives a misspelled "Définition".
>
> Hmm, it sounds like there's still an encoding issue.
>
> I added both 'définition' and 'Définition' (in UTF-8) to my user
> dictionary and neither shows up as misspelled in a buffer. I then
> appended an 'a' to 'définition' (to make 'définitiona'). That showed
> up as misspelled, so I triggered suggestions and replaced with
> 'Définition'. Still no misspelling.
>
> Cheers,
> Mitchell

-- 
Olivier Guibé
Laboratoire de Mathématiques Raphaël CNRS -- Université de Rouen 
Avenue de l'Université, BP.12
76801 Saint-Étienne du Rouvray (France)
tel: (33) [0]2 32 95 52 56
fax: (33) [0]2 32 95 52 86
-- 
Received on Thu 29 Sep 2016 - 03:03:08 EDT