[ic] Filters with UTF-8 body

David Christensen david at endpoint.com
Thu Mar 12 19:28:37 UTC 2009


On Mar 12, 2009, at 2:10 PM, Peter wrote:

> On 03/12/2009 06:12 AM, David Christensen wrote:
>> On Mar 12, 2009, at 5:31 AM, Peter wrote:
>>
>>> On 03/12/2009 03:17 AM, Peter wrote:
>>>> On 03/12/2009 03:04 AM, Stefan Hornburg wrote:
>>>>> Peter Ajamian suggested that the following code in Interpolate.pm
>>>>> causes the problem:
>>>>>
>>>>> '_filter'               => qr($T{_filter}\s+($Some)\]($Some)),
>>>>> my $Some = '[\000-\377]*?';
>>>> More specifically $Some, $All, $XSome and $XAll will only parse 8 bit
>>>> characters in the range \000-\377.  Not positive about this, but I
>>>> think that changing them to the following will work:
>>>> my $All = '(?:(?s).*)';
>>>> my $Some = '(?:(?s).*?)';
>>>> my $XAll = qr{(?:(?s).*)};
>>>> my $XSome = qr{(?:(?s).*?)};
>>>
>>> On further reflection this would probably work just as well and is
>>> less complex looking:
>>> my $All = '[.\n]*';
>>> my $Some = '[.\n]*?';
>>> my $XAll = qr{[.\n]*};
>>> my $XSome = qr{[.\n]*?};
>>
>>
>> Heh, one problem:
>>
>> $ perl -e 'print "matches!" if "foo" =~ /[.\n]/'
>> $ perl -e 'print "matches!" if "foo" =~ /(.|[\n])/'
>> matches!
>
> Strange.  So may as well go with the (?s) solution as Jon says.  To add
> a few more tests:
> peter at peter-desktop:~$ perl -le 'print $1 if "foo\nbar" =~ /((?:(?s).*))/'
> foo
> bar
> peter at peter-desktop:~$ perl -le 'print $1 if "foo\nbar" =~ /((?:.|\n)*)/'
> foo
> bar
> peter at peter-desktop:~$ perl -le 'print $1 if "foo\nbar" =~ /(.*)/'
> foo
> peter at peter-desktop:~$
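
For anyone following along, here is a minimal sketch of why the three
candidates behave differently.  Inside a bracketed character class "."
is a literal dot, so [.\n] matches only dots and newlines, and the
explicit \000-\377 range cannot match a code point above 255.  The
helper matches_all() and the sample string are made up for
illustration; only the three pattern strings come from this thread.

```perl
use strict;
use warnings;

# Patterns as quoted in this thread, kept as strings like $Some in Interpolate.pm.
my %candidate = (
    old_range  => '[\000-\377]*?',   # explicit 8-bit range
    dot_class  => '[.\n]*?',         # "." is a literal dot inside [...]
    s_modifier => '(?:(?s).*?)',     # "." matches any character, newline included
);

# Hypothetical helper: returns 1 if $pattern can consume all of $text, else 0.
sub matches_all {
    my ($pattern, $text) = @_;
    return $text =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

# Assumed sample body: a decoded character string with U+263A and a newline.
my $body = "smile \x{263A}\nsecond line";

for my $name (sort keys %candidate) {
    printf "%-10s ascii=%d wide=%d\n", $name,
        matches_all($candidate{$name}, "foo\nbar"),
        matches_all($candidate{$name}, $body);
}
```

On a stock perl this should report that only the (?s) form consumes
both strings: the explicit range handles the ASCII text but not the
wide character, and the [.\n] class handles neither.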


I have a commit queued to fix all instances of explicit ranges.
However, I found something that may or may not be a wart.  From
dist/lib/UI/Primitive.pm:

45:$DECODE_CHARS = qq{&[<"\000-\037\177-\377};

This variable is exported, but not used anywhere in the codebase that  
I could see, including further in that same file.  Anyone know if it's  
still needed?
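
If anyone wants to repeat the search against their own tree (catalogs,
local usertags and so on), something along these lines should do.  It
is only a sketch: find_uses() is a made-up helper, not anything in
Interchange, and the directory and name come from the command line.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns "path:line" for every line under $root
# that mentions $name literally.
sub find_uses {
    my ($root, $name) = @_;
    my @hits;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            open my $fh, '<', $path or return;   # skip unreadable files
            while (my $line = <$fh>) {
                push @hits, "$path:$." if $line =~ /\Q$name\E/;
            }
            close $fh;
        },
    }, $root);
    @hits = sort @hits;
    return @hits;
}

# Usage: perl find_uses.pl <directory> <name>
if (@ARGV == 2) {
    print "$_\n" for find_uses(@ARGV);
}
```

An exported variable can also be used by code outside the
distribution, so an empty result here would only cover the tree that
was searched.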

Regards,

David
--
David Christensen
End Point Corporation
david at endpoint.com
212-929-6923
http://www.endpoint.com/





