[ic] Finally fixing ESCAPE_CHARS::std (ATTN; Stefan)

Jon Jensen jon at endpoint.com
Sat Jan 9 04:33:41 UTC 2016


On Fri, 22 May 2015, Peter wrote:

> On 05/22/2015 07:50 AM, Josh Lavin wrote:
>> It seems that a bug was introduced back in 2007 by the addition of
>> "\X" as an 'escape char' for HTML::Entities. When it was added, the
>> string was single-quoted, which is appropriate for "\X", but the
>> string was later changed back to double-quoting, which fixes the "\n"
>> and "\t" it also contains, but also breaks the "\X" and causes the
>> following warning:
>>
>>     unrecognized escape \X
>>
>> I believe the following double-backslash for \X will fix this:
>>
>>     -$ESCAPE_CHARS::std = qq{^\n\t\X !\#\$%\'-;=?-Z\\\]-~};
>>     +$ESCAPE_CHARS::std = qq{^\n\t\\X !\#\$%\'-;=?-Z\\\]-~};
>>
>> It eliminates the warning, but I am not quite sure on the thought behind
>> using \X as an escape char, so before I push this patch, somebody please
>> check me on this.
>
> That was commit #3f45ec14 by Stefan that added the \X and changed from
> double quotes to single quotes.  The git log references ticket #58 from
> the RT system so I would imagine that there is much more details of the
> reasoning behind the changes in there.  I can't seem to find the old RT
> system to look it up anymore but most of the tickets have been moved to
> the github issue tracker since then, unfortunately #58 has not been.
>
> Stefan, can you comment on the reason for the change, or perhaps dig up
> the old RT entry for the ticket?

This question seems to have gotten dropped, and the fix never got 
committed.

The raw \X was fine while the string was single-quoted, but once the
string became double-quoted, \X turned into an invalid string escape. It
seems pretty clear it should be changed to \\X as Josh Lavin suggested.
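
To illustrate the quoting difference, here is a minimal sketch. The
strings are shortened stand-ins I made up, not the real
$ESCAPE_CHARS::std value, and the double-quoted case goes through a
string eval only so the compile-time warning can be captured:

```perl
use strict;
use warnings;

# Capture warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical, shortened versions of the character list.
my $single  = q{^\n\t\X};           # single-quoted: every backslash is kept
my $double  = eval 'qq{^\n\t\X}';   # double-quoted: \X is not a string escape
my $escaped = qq{^\n\t\\X};         # double-quoted with the proposed \\X

printf "single:  length %d\n", length $single;
printf "double:  length %d\n", length $double;
printf "escaped: length %d\n", length $escaped;
print  "warning: $_" for @warnings;
```

In the single-quoted form \n and \t stay as literal backslash sequences,
which is why the string went back to double quotes. In the double-quoted
form the backslash before X is dropped and Perl warns about it; only the
\\X form keeps both the real newline/tab and a literal \X.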

(For the record, \X in a regex is an atom that matches a Unicode
"eXtended grapheme cluster", as per the perlre manpage.)
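
A small sketch of what \X does as a regex atom on its own, outside any
character class (the test string is my own example, not something from
the Interchange code):

```perl
use strict;
use warnings;

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points that
# together form one extended grapheme cluster.
my $cluster = "e\x{301}";

my $code_points = length $cluster;
my $matches_X   = $cluster =~ /\A\X\z/ ? 1 : 0;   # \X consumes the whole cluster
my $matches_dot = $cluster =~ /\A.\z/  ? 1 : 0;   # . consumes one code point only

print "code points: $code_points\n";
print "matched by a single \\X: $matches_X\n";
print "matched by a single . : $matches_dot\n";
```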

If nobody objects soon, I propose you just commit it, Josh.

Jon


-- 
Jon Jensen
End Point Corporation
https://www.endpoint.com/


