[ic] Call for testers

David Christensen david at endpoint.com
Wed Apr 29 16:16:38 UTC 2009

On Apr 29, 2009, at 10:23 AM, Mike Heins wrote:

> Quoting David Christensen (david at endpoint.com):
>> On Apr 29, 2009, at 9:33 AM, Mike Heins wrote:
>>> Quoting David Christensen (david at endpoint.com):
>>>> On Apr 29, 2009, at 8:18 AM, Jon Jensen wrote:
>>>>> On Wed, 29 Apr 2009, Stefan Hornburg (Racke) wrote:
>>>>>> The most recent change bails out with:
>>>>>> Unrecognized/unsupported MV_HTTP_CHARSET: 'utf-8'.
>>>>>> ulisses config error: Unrecognized/unsupported MV_HTTP_CHARSET:
>>>>>> 'utf-8'.
>>>>>> Any idea why?
>>>>> Wild guess, but have you tried "utf8" instead of "utf-8"? They're not
>>>>> the same in Perl. But if you were using "utf-8" before and it worked,
>>>>> I don't know.
>>>> No, this definitely is a regression.  I suspect this may be due to the
>>>> Global::UTF8 variable logic introduced when I merged upstream CVS, but
>>>> I'll have to hunt it down to be sure.  Either utf-8 or utf8 is
>>>> acceptable here; one gets resolved to "strict" utf8, but both are
>>>> valid.  (Any aliasable encoding works here; this message is what
>>>> appears when we can't resolve the alias.  An artifact of require/import
>>>> vs. use, perhaps?)
>>> Yes, I think it is having problems because of the Encode::PERLQQ and
>>> other quasi-constant subroutines. This is a bit maddening, because it
>>> appears there is no way to have a conditional namespace and use those
>>> types of methods.
>> Perhaps a string-eval "use Encode" would be in order?
> I have never done that. If it works and gives us all the namespace
> equivalents, I am all for it.
>>> If Encode didn't pollute regexes, it would be fine. Is there some
>>> trigger for that, something like the old &sawampersand?
>> Can you explain the regexp-polluting behavior?  I'm not sure I
>> understand.
> The regex polluting behavior is attaching Encode behavior to /i
> and other modifiers that might determine case-sensitivity (which you
> have to UTF8-ify, of course). This is what kills Safe.

I solved the Safe issue entirely in one of my commits to the ic-utf8
repo: I created a wrapper class, Vend::Safe, replaced all instances of
"new Safe" in the codebase with "new Vend::Safe", and put the common
initialization behavior in the wrapper class.  So is the Safe
*breakage* the issue, or is the concern the performance cost of utf8?
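
For anyone who has not looked at the repo, the wrapper idea is roughly
the following.  This is only a minimal sketch and not the actual
Vend::Safe code: the package name My::SafeWrapper, the "permit" option,
and the reval passthrough are all made up for illustration.

```perl
use strict;
use warnings;

package My::SafeWrapper;    # hypothetical name, NOT the real Vend::Safe
use Safe;

sub new {
    my ($class, %opt) = @_;
    my $cpt = Safe->new;

    # Common initialization lives here, once, instead of being repeated
    # at every place that used to call "new Safe" directly.
    my @extra_ops = @{ $opt{permit} || [] };    # hypothetical option
    $cpt->permit(@extra_ops) if @extra_ops;

    return bless { cpt => $cpt }, $class;
}

# Pass evaluation through to the wrapped compartment.
sub reval {
    my ($self, $code) = @_;
    return $self->{cpt}->reval($code);
}

package main;

my $safe   = My::SafeWrapper->new;
my $result = $safe->reval('2 + 3');
print "result: $result\n";
```

Call sites then only change the class name, and any future fix to the
compartment setup happens in one place.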

> In 5.4 and earlier versions, there was a behavior associated
> with the & atom (results of previous search). It set a variable called
> sawampersand, which would greatly slow down certain types of regex
> substitutions.

Okay, I know what you're talking about here: the global penalty that
every regex pays once any code uses $&.
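
Coming back to the string-eval idea from earlier in the thread, here is
a rough sketch of what I had in mind.  It is untested against the
Interchange code itself, and $want_utf8 is just a stand-in for whatever
setting (something like Global::UTF8) would gate the load.  The point is
that a string eval defers "use Encode" to run time while still running
the import, and that the quasi-constants then have to be called with
parentheses, because the rest of the file was compiled before they
existed.

```perl
use strict;
use warnings;

my $want_utf8 = 1;    # hypothetical stand-in for a setting like Global::UTF8

my $have_encode = 0;
if ($want_utf8) {
    # "use" inside a string eval is compiled and run only when we get
    # here, so Encode is loaded (and imported) conditionally.
    $have_encode = eval q{ use Encode; 1 } ? 1 : 0;
    warn "Encode unavailable: $@" unless $have_encode;
}

my (@names, $perlqq);
if ($have_encode) {
    for my $alias ('utf-8', 'utf8') {
        # find_encoding was imported by the eval above; the parentheses
        # let this compile even though the sub did not exist yet.
        my $enc = find_encoding($alias);
        push @names, $enc ? $enc->name : 'unresolved';
    }

    # Constants such as PERLQQ need the explicit call form too; a bare
    # "Encode::PERLQQ" would be rejected under strict at compile time.
    $perlqq = Encode::PERLQQ();
}

print join(', ', @names), "\n";
```

Both aliases should resolve to an encoding object; an alias that
find_encoding cannot resolve shows up as "unresolved" here, which is the
situation the config error is reporting.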


David Christensen
End Point Corporation
david at endpoint.com
