[ic] Non-US keys = UTF-8 issue?

Jordan Adler jordan at endpoint.com
Thu Feb 7 18:48:10 EST 2008


On Thu, February 7, 2008 18:22, Grant wrote:
> I have a Swedish address stored in mysql and it is full of strange
> characters.  Some stars, some paragraph symbols, etc.  Should I be
> declaring UTF-8 somehow in dbconf/mysql/orders.mysql to avoid this?
> Is there any way to recover the correct data at this point?
>

Well, first you need to work out whether the data is actually wrong in the
database itself, by querying MySQL directly rather than going through
Interchange.
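
One way to check the database directly is to pull the raw bytes out with
something like SELECT HEX(column) in the mysql client and see what they
decode as.  A rough sketch in Python follows; classify_hex and its labels
are made up for illustration, and the result is only a heuristic guess,
not proof of how the data was written.

```python
def classify_hex(hex_string):
    """Guess how the bytes shown by MySQL's HEX(column) are encoded."""
    raw = bytes.fromhex(hex_string)
    if all(b < 0x80 for b in raw):
        return "ascii"  # no non-US characters, so nothing to diagnose
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, e.g. latin1 where an o-umlaut is the single byte F6.
        return "single-byte"
    try:
        # If the decoded text itself hides valid UTF-8, it was encoded twice.
        text.encode("latin-1").decode("utf-8")
        return "double-encoded utf-8"
    except (UnicodeEncodeError, UnicodeDecodeError):
        return "utf-8"


# Hex strings as HEX() might print them for the word "Malmö".
print(classify_hex("4D616C6DC3B6"))      # stored once as UTF-8
print(classify_hex("4D616C6DF6"))        # stored as latin1
print(classify_hex("4D616C6DC383C2B6"))  # UTF-8 bytes re-encoded as UTF-8
```

If the bytes look right here but wrong on the web page, the problem is more
likely in the connection or in Interchange than in the stored data.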

MySQL has two settings with regard to characters: character sets and
collations.  A character set defines how characters are encoded as bytes,
whereas a collation defines how those characters are compared and sorted.
See http://dev.mysql.com/doc/refman/5.0/en/charset-general.html
for more information.
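
To make the distinction concrete, here is a small Python sketch (not
MySQL itself).  It shows the same word encoded under two character sets,
what happens when UTF-8 bytes are read as Latin-1, and a toy Swedish sort
order.  The word list and swedish_key are invented for the example, and
the real utf8_swedish_ci collation has more rules than this.  Note that
byte B6 is the paragraph symbol in Latin-1, which may be where those
symbols in the address come from, though that is only a guess.

```python
# "Malmö", written with an escape so the file's own encoding cannot interfere.
word = "Malm\u00f6"

utf8_bytes = word.encode("utf-8")      # o-umlaut becomes two bytes: C3 B6
latin1_bytes = word.encode("latin-1")  # o-umlaut becomes one byte: F6

# Reading UTF-8 bytes as Latin-1 turns each byte into its own character,
# so C3 B6 shows up as A-tilde followed by a paragraph sign.
misread = utf8_bytes.decode("latin-1")

# Collation: Swedish places a-ring, a-umlaut, o-umlaut (in that order) after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"


def swedish_key(text):
    """Sort key following Swedish alphabet order; unknown characters sort first."""
    return [SWEDISH_ALPHABET.find(ch) for ch in text.lower()]


words = ["\u00f6l", "\u00e4pple", "zebra", "\u00e5l"]  # öl, äpple, zebra, ål
by_code_point = sorted(words)                 # plain code point order
by_swedish = sorted(words, key=swedish_key)   # Swedish alphabet order

print(utf8_bytes.hex(), latin1_bytes.hex())
print(ascii(misread))
print([ascii(w) for w in by_code_point], [ascii(w) for w in by_swedish])
```

The two sorted lists differ only in where the a-umlaut and a-ring words
land, which is exactly the kind of decision a collation makes.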

As far as Interchange goes, I'm not entirely sure what the state of UTF-8
support is, but from what I understand, it's less than ideal.  It may not
be functional at all.

If UTF-8 encoded data was saved in MySQL under a different character set,
it may not be recoverable at all.  Your first tactic should be to change
the character set declared for that field (or table) alone, and hope MySQL
has not already converted the stored bytes.  Something like:

ALTER TABLE foo CHARACTER SET utf8 COLLATE utf8_swedish_ci;

should suffice.  (Be aware that this only sets the table's default;
existing columns keep the character set they were created with, so the
column itself may need to be altered too.)  Note also that your client's
encoding will need to be utf8 (SET NAMES 'utf8';), as will that of any
terminal emulator you may be using and any software, such as screen or
ssh, that transmits the data.  I'm actually not sure that ssh needs
specific instructions to encode characters, but your terminal emulator
(PuTTY, xterm, rxvt, etc.) will definitely need proper configuration.  Oh,
and two more things:  MySQL will need to be compiled with support for
utf8, and you will need to have a UTF-8 locale available (assuming Linux).
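
If it turns out the UTF-8 bytes were misread as Latin-1 somewhere along the
way, the damage can sometimes be reversed outside MySQL.  Below is a
minimal Python sketch of that idea; repair_mojibake is a made-up helper,
and it assumes plain Latin-1 was the wrong encoding.  MySQL's latin1 is
really cp1252, so text containing bytes in the 0x80-0x9F range may need
"cp1252" instead.  Try it on a copy of the data, not the live table.

```python
def repair_mojibake(text, wrong_encoding="latin-1"):
    """Undo UTF-8 text that was mistakenly decoded as a single-byte encoding.

    Returns the input unchanged when it does not look like misread UTF-8.
    """
    try:
        # Recover the original bytes, then decode them the way they were meant.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


damaged = "Malm\u00c3\u00b6"   # "Malmö" after its UTF-8 bytes were read as Latin-1
repaired = repair_mojibake(damaged)
untouched = repair_mojibake("Malm\u00f6")  # already correct, left alone

print(ascii(damaged), ascii(repaired), ascii(untouched))
```

Falling back to the original string on failure keeps rows that were already
correct from being mangled a second time.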

Gee, isn't Unicode fun?!  (To be fair, it's the software support for it
that is lacking.  Don't believe me?  Try to edit a utf8 encoded file in
vim in screen on OpenBSD.  Just *try* to.)

-- 
Thanks,

Jordan M. Adler
End Point Corporation
jordan at endpoint.com

