[ic] IC 5.7.4 , GDBM & UTF8

Gert van der Spoel gert at 3edge.com
Sat Mar 26 20:11:37 UTC 2011


> Does anybody have a working setup with Interchange, GDBM and UTF8?
>
> I have the following situation:
> - locale.txt  -> UTF8
> - pages with includes (textfiles) -> UTF8
>
> Scenario 1)
>
> catalog.cfg  contains:
> Variable MV_HTTP_CHARSET UTF-8
> Variable MV_UTF8 1
> DatabaseDefault GDBM_ENABLE_UTF8 1
>
> - remove locale.gdbm
> - start interchange
> - creates locale.gdbm
>
> Result: includes are displayed correctly .. data from locale.gdbm  is not
>
> Scenario 2)
> catalog.cfg  contains:
> Variable MV_HTTP_CHARSET UTF-8
> Variable MV_UTF8 1
>
> - remove locale.gdbm
> - start interchange
> - creates locale.gdbm
>
> Result: includes are displayed correctly .. data from locale.gdbm  is not
>
> Scenario 3)
> catalog.cfg contains:
> Variable MV_HTTP_CHARSET UTF-8
> Variable MV_UTF8 1
> DatabaseDefault GDBM_ENABLE_UTF8 1
>
> - keep locale.gdbm from Scenario 2)
> - start interchange
> - NO new locale.gdbm created, old one still there
>
> Result: includes and data from locale.gdbm are displayed correctly
>
>
> So in order to get my GDBM data displayed correctly after an update to my
> .txt file, I have to disable DatabaseDefault GDBM_ENABLE_UTF8 1, start
> interchange, then change my cfg to enable DatabaseDefault GDBM_ENABLE_UTF8 1
> and restart interchange.
>
> Can anybody reproduce this?
> 

Fun oh fun to be migrating your sites and running into the same thing as
over a year ago (by now at 5.7.6 ... ).

I've solved my problem by commenting out line 69 in lib/Vend/Table/GDBM.pm:

apply_utf8_filters($dbm) if $config->{GDBM_ENABLE_UTF8};

My theory is the following:
- The TXT files have to be UTF-8 encoded already, otherwise they are not
usable when they contain Dutch, English, Greek, etc.
- When the above line is in the code (it is in 'sub create'), it actually
double-encodes the UTF-8 that is in the TXT files.

So I believe this line should not be there.
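
The double-encoding effect from the theory above can be reproduced outside
Interchange. The following is only a minimal sketch in Python, not Interchange
or Perl code; the sample word and the variable names are my own assumptions,
and it assumes the extra filter treats the already-encoded bytes as Latin-1
characters before encoding them again:

```python
# Illustration only: not Interchange code.
# Shows what happens when text that is already UTF-8 goes through a
# UTF-8 encoding step a second time.

original = "Gr\u00fc\u00dfe"  # "Gruesse" with u-umlaut and sharp s (sample text, an assumption)

# Bytes as they would sit in a UTF-8 encoded .txt file.
utf8_bytes = original.encode("utf-8")

# A filter that takes each byte for a Latin-1 character and encodes it to
# UTF-8 ends up encoding every non-ASCII byte a second time.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

# Reading the stored value back as UTF-8 no longer yields the original text.
shown = double_encoded.decode("utf-8")

print(len(utf8_bytes), len(double_encoded))
print(shown == original)
```

The stored value grows because every non-ASCII byte becomes two bytes, which
would fit the symptom of correct includes but garbled data from locale.gdbm.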

PS: In the demo site you will also notice an encoding problem when picking
German. The locale file is UTF-8 (because of the Slovenian / Croatian
translations in it), but the demo does not run in UTF-8 yet.

This causes the German text to display incorrectly when it is selected.
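
The demo-site symptom is a different case from the double encoding: UTF-8
bytes shown under a non-UTF-8 charset. A small Python sketch of that, assuming
the demo serves pages as ISO-8859-1 (an assumption on my part, as is the
sample word):

```python
# Illustration only: UTF-8 bytes interpreted as ISO-8859-1.

german = "f\u00fcr"  # "fuer" with u-umlaut (sample German word, an assumption)

# The locale file stores the word as UTF-8 bytes.
file_bytes = german.encode("utf-8")

# A page declared as ISO-8859-1 makes the browser read each byte as one
# character, so the two-byte u-umlaut shows up as two unrelated characters.
as_seen = file_bytes.decode("iso-8859-1")

print(as_seen)
```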

CU,

Gert















