[ic] Call for testers
Gert van der Spoel
gert at 3edge.com
Wed Jun 17 20:47:58 UTC 2009
> -----Original Message-----
> From: interchange-users-bounces at icdevgroup.org [mailto:interchange-
> users-bounces at icdevgroup.org] On Behalf Of David Christensen
> Sent: Wednesday, April 29, 2009 5:27 PM
> To: interchange-users at icdevgroup.org
> Subject: Re: [ic] Call for testers
>
>
> On Apr 29, 2009, at 8:18 AM, Jon Jensen wrote:
>
> > On Wed, 29 Apr 2009, Stefan Hornburg (Racke) wrote:
> >
> >> The most recent change bails out with:
> >>
> >> Unrecognized/unsupported MV_HTTP_CHARSET: 'utf-8'.
> >> ulisses config error: Unrecognized/unsupported MV_HTTP_CHARSET:
> >> 'utf-8'.
> >>
> >> Any idea why?
> >
> > Wild guess, but have you tried "utf8" instead of "utf-8"? They're
> > not the
> > same in Perl. But if you were using "utf-8" before and it worked, I
> > don't
> > know.
>
>
> No, this definitely is a regression. I suspect this may be due to the
> Global::UTF8 variable logic introduced when I merged upstream CVS, but
> I'll have to hunt it down to be sure. Either utf-8 or utf8 are
> acceptable here, one gets resolved to "strict" utf8, but both are
> valid. (And any aliasable encoding works here; this message is what
> appears when we can't resolve the alias. (An artifact of require/
> import vs use, perhaps?)
I have the following variables in my catalog.cfg:
Variable MV_HTTP_CHARSET UTF-8
Variable MV_UTF8 1
Now if I open the file:
http://dev.allcarmodels.com/acm/locale.html
it will tell me that it has the following Content-type/charset:
Content-Type: text/html; charset=utf-8-strict
However, Firefox 3.0 displays it as Cyrillic, and IE shows it as garbage.
The same file:
http://dev.allcarmodels.com/locale.html
not passed through Interchange will show me the following
Content-type/charset:
Content-Type: text/html; charset=UTF-8
And this is displayed correctly as UTF-8 in both Firefox and IE.
The place this happens is in Vend/Config.pm:
    # check MV_HTTP_CHARSET against a valid encoding
    if (my $enc = $C->{Variable}->{MV_HTTP_CHARSET}) {
        if (my $norm_enc = Vend::CharSet::validate_encoding($enc)) {
            if ($norm_enc ne $enc) {
                config_warn("Provided MV_HTTP_CHARSET '$enc' resolved to '$norm_enc'. Continuing.");
                $C->{Variable}->{MV_HTTP_CHARSET} = $norm_enc;
            }
        }
        else {
            config_error("Unrecognized/unsupported MV_HTTP_CHARSET: '%s'.", $enc);
            delete $C->{Variable}->{MV_HTTP_CHARSET};
        }
    }
So:
IN: Variable MV_HTTP_CHARSET UTF-8
OUT: Variable MV_HTTP_CHARSET utf-8-strict
As far as I know, the following page lists the character set names that can
be used on the internet:
http://www.iana.org/assignments/character-sets
utf-8-strict does not appear in this list.
See also: http://search.cpan.org/dist/Encode/Encode.pm

    Finding IANA Character Set Registry names

    The canonical name of a given encoding does not necessarily agree with IANA
    Character Set Registry, commonly seen as Content-Type: text/plain;
    charset=whatever. For most cases canonical names work but sometimes it does
    not (notably 'utf-8-strict').

    Therefore as of Encode version 2.21, a new method mime_name() is added.

        use Encode;
        my $enc = find_encoding('UTF-8');
        warn $enc->name;      # utf-8-strict
        warn $enc->mime_name; # UTF-8

    See also: Encode::Encoding
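
To illustrate the idea, here is a minimal sketch of how a charset label could be mapped to its IANA name before it goes into the Content-Type header. The helper name http_charset_name is hypothetical; it is not part of Vend::CharSet and not necessarily what the attached patch does. It assumes only Encode's find_encoding(), name() and mime_name(), and falls back to the canonical name on Encode versions older than 2.21, where mime_name() does not exist.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (an illustration, not Interchange code): return the
# charset name suitable for "Content-Type: ...; charset=" for a given label.
sub http_charset_name {
    my ($label) = @_;

    # find_encoding() returns false when the label cannot be resolved.
    my $enc = Encode::find_encoding($label)
        or return;

    # mime_name() was added in Encode 2.21; older versions only have name().
    my $mime = $enc->can('mime_name') ? $enc->mime_name : undef;

    # Some encodings have no registered MIME name; use the canonical one then.
    return (defined $mime && length $mime) ? $mime : $enc->name;
}

print http_charset_name('UTF-8'), "\n";
```

With a helper like this, an unknown label yields nothing (so the caller can still raise the config error), while a resolvable label yields the registry name when Encode knows one.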
Anyway, patch attached. It has not been thoroughly tested today; tomorrow is
another day.
The patch also adds a no strict subs, but I am not sure whether that is the
cure or just removes the symptoms.
CU,
Gert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CharSet.pm.patch
Type: application/octet-stream
Size: 776 bytes
Desc: not available
Url : http://www.icdevgroup.org/pipermail/interchange-users/attachments/20090617/270a6e00/attachment.obj