Re: Encodings

From: mitchell
Date: Fri, 13 Mar 2009 07:36:46 -0700 (PDT)


> It is very important, though, NOT to treat something as UTF8 unless
> you are very confident it is in fact UTF8 - remember TA crashing when
> opening binary files? That's bad. So, the current solution is looking
> for null bytes, and it works. Apparently it is "easy" to detect UTF8
> without using BOM - perhaps this technique could be used to tell if a
> file is UTF8 instead of telling that it is NOT UTF8 by looking for
> null bytes?

The only encoding detection I have is for the UTF16, UTF32, and UTF8
BOMs, as per your earlier suggestion. Otherwise the file is checked for
null bytes and treated as binary if any are found; if there are none,
it is assumed to be UTF8. I don't add BOM to new UTF8
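
For illustration only, the detection order described above (BOM
first, then null bytes, then the UTF8 fallback) could be sketched in
Python roughly as follows. This is not the actual implementation; the
names detect_encoding and is_valid_utf8 are hypothetical. The second
function is the stricter "is this really UTF8?" test raised in the
quoted question, which this message does not say is implemented.

```python
import codecs

# BOMs are checked longest-first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data: bytes):
    """Return (encoding, bom_length); encoding is None for binary data."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    if b"\x00" in data:
        return None, 0  # a null byte without a BOM: treat as binary
    return "utf-8", 0  # no BOM and no null bytes: assume UTF8


def is_valid_utf8(data: bytes) -> bool:
    """Stricter check (an assumption, not the current behaviour):
    accept the data only if it decodes as well-formed UTF8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

With this sketch, a caller would open the file as text only when
detect_encoding returns an encoding, and could additionally require
is_valid_utf8 to pass before trusting the UTF8 fallback.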
Received on Fri 13 Mar 2009 - 10:36:46 EDT