[ic] URL encoding bug in Vend::Interpolate::esc

David Christensen david at endpoint.com
Sun May 16 17:53:07 UTC 2010


On May 16, 2010, at 11:59 AM, Rok Ruzic wrote:

> On Sun, 16 May 2010 16:34:29 +0200
> "Stefan Hornburg (Racke)" <racke at linuxia.de> wrote:
>> 
>> This reference is outdated. Please look at this:
>> 
>> http://labs.apache.org/webarch/uri/rfc/rfc3986.html#unreserved
>> 
>> Please adjust your patch accordingly.
> 
> Yes, it makes sense to restrict our non-escaped character class
> strictly to the characters explicitly mentioned as unreserved. I append
> the revised patch.
> 
> I suggest we wait until Monday, when members will probably have more
> comments on this issue.


It's worth noting that URL encoding operates only on octets, so if you're relying on it to encode Unicode characters, you first need to standardize on a character encoding, encode the string to octets in that encoding, and then URL-escape the resulting octets.  As should come as no surprise, I see no reason this should not be UTF-8.  :-)  I haven't looked at the patch, so I can't comment on it specifically, but if it's calling ord() etc. on characters rather than going through the encoding functions, I'd be against doing it that way.
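
To make the encode-then-escape order concrete, here is a minimal sketch in Python. It is for illustration only: it is not Interchange's Perl code and not the patch under discussion, and the names percent_encode and UNRESERVED are hypothetical. It assumes UTF-8 as the agreed encoding and leaves only the RFC 3986 unreserved characters unescaped.

```python
import string

# Unreserved characters per RFC 3986 section 2.3:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
# Stored as a set of octet values (ints), since we compare against octets.
UNRESERVED = frozenset(
    (string.ascii_letters + string.digits + "-._~").encode("ascii")
)


def percent_encode(text, encoding="utf-8"):
    """Hypothetical helper: encode text to octets, then escape each octet."""
    # Step 1: turn characters into octets using the agreed character encoding.
    octets = text.encode(encoding)
    # Step 2: escape every octet that is not in the unreserved set.
    return "".join(
        chr(b) if b in UNRESERVED else "%{:02X}".format(b)
        for b in octets
    )


print(percent_encode("a b~"))  # a%20b~
print(percent_encode("\u00e9"))  # %C3%A9  (two UTF-8 octets for one character)
print(percent_encode("\u20ac"))  # %E2%82%AC  (three UTF-8 octets)
```

By contrast, formatting the code point directly would turn the euro sign (U+20AC) into "%20AC", which a decoder reads as a space followed by the literal text "AC". That is the kind of breakage an ord()-based approach risks for anything outside ASCII.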

Regards,

David
--
David Christensen
End Point Corporation
david at endpoint.com






