[ic] Call for testers

Peter peter at pajamian.dhs.org
Fri Mar 13 04:06:20 UTC 2009


On 03/12/2009 08:32 PM, David Christensen wrote:
> I've done something like this before, and it ends up looking something  
> like this:
> 
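> my $decoded_data;
> for my $enc (qw/utf-8 cp-1252 latin-1/) {
>      # decode() takes the encoding first; FB_CROAK makes bad input die
>      $decoded_data = eval { decode($enc, $data, Encode::FB_CROAK | Encode::LEAVE_SRC) };
>      last if defined $decoded_data;
> }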
> 
> We could abstract out the list of encodings so we could potentially  
> return different values if needed depending on context, but I'd guess  
> for the most part the pages/components/templates, etc. will be in a  
> single encoding, so this would boil down to ("utf8",  
> $fallback_encoding).  (Hey, I'm optimistic... :-D)

I like that code above.  We could have a directive, say PageEncoding,
that lists the primary encoding followed by each fallback, to be tried
in order by your loop above.  This could default to "utf8 latin-1".
Again, we completely ignore this directive if MV_UTF8 is not set.
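
Roughly what I have in mind, as a sketch only.  The helper names
parse_page_encoding and decode_page are made up for illustration and are
not existing Interchange functions, and MV_UTF8 is passed in as a plain
boolean here rather than looked up from the catalog.  The only real API
used is the Encode module:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: split a PageEncoding value such as "utf8 latin-1"
# into a list of encoding names, dropping any that Encode does not know.
sub parse_page_encoding {
    my ($value) = @_;
    $value = 'utf8 latin-1' unless defined $value && length $value;
    return grep { defined Encode::find_encoding($_) } split ' ', $value;
}

# Hypothetical helper: try each encoding in order and return the decoded
# string plus the name of the encoding that worked.
sub decode_page {
    my ($octets, $mv_utf8, @encodings) = @_;

    # With MV_UTF8 unset the directive is ignored entirely.
    return ($octets, undef) unless $mv_utf8;

    for my $enc (@encodings) {
        # FB_CROAK makes decode() die on malformed input so eval can
        # catch it; LEAVE_SRC keeps $octets intact for the next attempt.
        my $text = eval {
            Encode::decode($enc, $octets,
                Encode::FB_CROAK() | Encode::LEAVE_SRC());
        };
        return ($text, $enc) if defined $text;
    }

    # Nothing matched: hand back the octets untouched.
    return ($octets, undef);
}

my @encodings = parse_page_encoding('utf8 latin-1');
my ($text, $used) = decode_page("caf\xE9 au lait", 1, @encodings);
```

Note that latin-1 accepts every byte sequence, so anything listed after
it would never be tried.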

> My thought here was that if MV_UTF8 was set but the data  
> failed to decode as utf8 and no "fallback" encoding was provided, then  
> turn off the utf-8 flag, i.e., treat the data as raw octets 0-255.

We can have this be the final default mechanism if we fall off the end
of the PageEncoding list.  To make raw octets the only fallback after
utf8, one can simply set PageEncoding to utf8.
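
To illustrate the fall-off-the-end case with PageEncoding set to just
utf8 (again only a sketch; decode_or_octets is a made-up name, not
anything in the tree today):

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode with the listed encodings, else keep the
# data as raw octets with the utf-8 flag off.
sub decode_or_octets {
    my ($octets, @encodings) = @_;
    for my $enc (@encodings) {
        my $text = eval {
            Encode::decode($enc, $octets,
                Encode::FB_CROAK() | Encode::LEAVE_SRC());
        };
        return $text if defined $text;
    }
    # Fell off the end of the list: no character semantics, just bytes.
    return $octets;
}

my $raw  = "caf\xE9 au lait";                # not valid UTF-8
my $kept = decode_or_octets($raw, 'utf8');   # PageEncoding utf8
print utf8::is_utf8($kept) ? "characters\n" : "octets\n";
```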

> I agree that they'd work better as directives.  I have a patch in the  
> ic-utf8 tree that handles verification/resolution of the  
> MV_HTTP_CHARSET variable (based on a suggestion on the list some time  
> ago).  We could presumably support both the variables and the  
> directives simultaneously for a while, prioritizing the directive if  
> it exists over the variable.

I don't think these variables have been around that long (they were
introduced towards the end of 5.5, iirc, which would be last year).  I
think it may be best to just nip them in the bud now and make a note in
the UPGRADE file.  This is one that I would want Mike's input on, though,
as he may well feel differently.

> We could also make the database-specific  
> encodings directives as well, so as to hook into the same encoding  
> validation functionality.

The database-specific encodings are already under the Database and
DatabaseDefault directives, which is where they belong, imo.


Peter



