[ic] Call for testers

Gert van der Spoel gert at 3edge.com
Wed Jun 17 20:47:58 UTC 2009


> -----Original Message-----
> From: interchange-users-bounces at icdevgroup.org [mailto:interchange-
> users-bounces at icdevgroup.org] On Behalf Of David Christensen
> Sent: Wednesday, April 29, 2009 5:27 PM
> To: interchange-users at icdevgroup.org
> Subject: Re: [ic] Call for testers
> 
> 
> On Apr 29, 2009, at 8:18 AM, Jon Jensen wrote:
> 
> > On Wed, 29 Apr 2009, Stefan Hornburg (Racke) wrote:
> >
> >> The most recent change bails out with:
> >>
> >> Unrecognized/unsupported MV_HTTP_CHARSET: 'utf-8'.
> >> ulisses config error: Unrecognized/unsupported MV_HTTP_CHARSET:
> >> 'utf-8'.
> >>
> >> Any idea why?
> >
> > Wild guess, but have you tried "utf8" instead of "utf-8"? They're
> > not the
> > same in Perl. But if you were using "utf-8" before and it worked, I
> > don't
> > know.
> 
> 
> No, this definitely is a regression.  I suspect this may be due to the
> Global::UTF8 variable logic introduced when I merged upstream CVS, but
> I'll have to hunt it down to be sure.  Either utf-8 or utf8 are
> acceptable here, one gets resolved to "strict" utf8, but both are
> valid.  (And any aliasable encoding works here; this message is what
> appears when we can't resolve the alias.  (An artifact of require/
> import vs use, perhaps?)

I have the following variables in my catalog.cfg:
Variable MV_HTTP_CHARSET UTF-8
Variable MV_UTF8 1


Now if I open the file:
http://dev.allcarmodels.com/acm/locale.html

it will tell me that it has the following Content-type/charset:
Content-Type: text/html; charset=utf-8-strict

However, my FF 3.0 displays it as Cyrillic, and IE shows it garbled.

The same file:
http://dev.allcarmodels.com/locale.html

not passed through Interchange will show me the following
Content-type/charset:
Content-Type: text/html; charset=UTF-8

And this is shown accurately as UTF-8 in both FF and IE.


The place this happens is in Vend/Config.pm:
    # check MV_HTTP_CHARSET against a valid encoding
    if (my $enc = $C->{Variable}->{MV_HTTP_CHARSET}) {
        if (my $norm_enc = Vend::CharSet::validate_encoding($enc)) {
            if ($norm_enc ne $enc) {
                config_warn("Provided MV_HTTP_CHARSET '$enc' resolved to '$norm_enc'.  Continuing.");
                $C->{Variable}->{MV_HTTP_CHARSET} = $norm_enc;
            }
        }
        else {
            config_error("Unrecognized/unsupported MV_HTTP_CHARSET: '%s'.", $enc);
            delete $C->{Variable}->{MV_HTTP_CHARSET};
        }
    }


So:
IN: Variable MV_HTTP_CHARSET UTF-8
OUT: Variable MV_HTTP_CHARSET utf-8-strict
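
To see which aliases get rewritten like this, here is a minimal sketch. It
assumes the normalization goes through Encode::find_encoding(), which the
snippet above does not confirm; canonical_name() is a hypothetical helper,
not part of Interchange.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not Interchange code): resolve an alias to
# Encode's canonical name, or return undef if Encode cannot resolve it.
sub canonical_name {
    my ($alias) = @_;
    my $obj = Encode::find_encoding($alias);
    return $obj ? $obj->name : undef;
}

# Print what each alias resolves to; an alias whose canonical name
# differs from the input is one that would be rewritten in the config.
for my $alias ('UTF-8', 'utf8', 'latin1', 'no-such-charset') {
    my $name = canonical_name($alias);
    printf "%-16s => %s\n", $alias, defined $name ? $name : '(unresolved)';
}
```

Under that assumption, 'UTF-8' comes back as 'utf-8-strict' while 'utf8'
stays 'utf8', and an unresolvable alias is what would end up in the
config_error branch.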

Now as far as I know the following page shows the character sets that can be
used on the internet:
http://www.iana.org/assignments/character-sets

utf-8-strict does not exist in this list.

See also: http://search.cpan.org/dist/Encode/Encode.pm

Finding IANA Character Set Registry names

The canonical name of a given encoding does not necessarily agree with IANA
Character Set Registry, commonly seen as Content-Type: text/plain;
charset=whatever. For most cases canonical names work but sometimes it does
not (notably 'utf-8-strict').

Therefore as of Encode version 2.21, a new method mime_name() is added.

  use Encode;
  my $enc = find_encoding('UTF-8');
  warn $enc->name;      # utf-8-strict
  warn $enc->mime_name; # UTF-8

See also: Encode::Encoding
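
Based on the documentation quoted above, one way to get a header-safe
charset label could look like the sketch below. http_charset_name() is a
hypothetical helper for illustration only; it is not the attached patch and
not an Interchange API. It prefers mime_name() when the installed Encode
provides it and falls back to the canonical name otherwise.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not Interchange code): pick a charset label
# suitable for a Content-Type header.
sub http_charset_name {
    my ($alias) = @_;

    # Unknown alias: return nothing so the caller can report an error.
    my $obj = Encode::find_encoding($alias) or return;

    # mime_name() was added in Encode 2.21; older versions only have name().
    my $mime = $obj->can('mime_name') ? $obj->mime_name : undef;

    # Fall back to the canonical name if no MIME name is registered.
    return defined $mime ? $mime : $obj->name;
}

print "Content-Type: text/html; charset=", http_charset_name('UTF-8'), "\n";
```

The internal canonical name can still be used for the actual encoding and
decoding; only the label sent to the browser needs to be the IANA one.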


Anyway, patch attached, not thoroughly tested today .. tomorrow is another
day.
The patch also contains a no strict subs ... but I am not sure whether that
one is the cure or just removes symptoms ...


CU,

Gert











-------------- next part --------------
A non-text attachment was scrubbed...
Name: CharSet.pm.patch
Type: application/octet-stream
Size: 776 bytes
Desc: not available
Url : http://www.icdevgroup.org/pipermail/interchange-users/attachments/20090617/270a6e00/attachment.obj 

