[ic] Call for testers

Stefan Hornburg racke at linuxia.de
Fri Mar 13 07:53:22 UTC 2009

Gert van der Spoel wrote:
>> -----Original Message-----
>> From: interchange-users-bounces at icdevgroup.org [mailto:interchange-
>> users-bounces at icdevgroup.org] On Behalf Of Peter
>> Sent: Thursday, March 12, 2009 10:28 PM
>> To: interchange-users at icdevgroup.org
>> Subject: Re: [ic] Call for testers
>> On 03/12/2009 12:30 PM, David Christensen wrote:
>>> On Mar 12, 2009, at 2:15 PM, Peter wrote:
>>>> On 03/12/2009 05:32 AM, David Christensen wrote:
>>>>> <snip>
>>>>>> One thing which also annoys me is the internal server error caused
>>>>>> by non-UTF-8 characters:
>>>>>> ZobI6Yf4: - [12/March/2009:09:24:20 +0100] ulisses
>>>>>> /cgi-bin/ic/ulisses/index Runtime error: Malformed UTF-8 character
>>>>>> (fatal) at /usr/lib/interchange/Vend/Parser.pm line 112.
>>>>> What is the text on the index page?  I'm assuming this was in some
>>>>> legacy encoding and that MV_UTF8 was set to 1.  If MV_UTF8 is off,
>>>>> this is a bug that should be addressed, as breaking legacy encodings
>>>>> when MV_UTF8 is off is a Bad Thing.  One of the consequences of
>>>>> setting MV_UTF8 is that it expects all of your pages, etc. to be in
>>>>> the UTF-8 encoding.
>>>> While this is true, I don't think it's right to bring down a website
>>>> because a page contains an invalid UTF-8 character.  It should be
>>>> logged as an error and dealt with as gracefully as possible.  One
>>>> solution is to use the Encode module to convert invalid characters to
>>>> something like a ? or alternatively to just encode them as (invalid)
>>>> HTML entities and push the problem off to the browser.
>>> Yeah, fatal is a bad result, we could see if there's a more forgiving
>>> IO layer that can just log those and continue.  I believe most of
>>> these cases are ushered through Vend::Util::read_file, so we may be
>>> able to centralize decisions there.
>> Well, if all we wanted to do was log and continue then all that is
>> needed is to wrap a few lines of Perl in an eval.  Unfortunately, we
>> also have to decide how to process the bad text that is causing the
>> problem, since if we just leave it then (1) large chunks of text will
>> be missing from the resulting page and (2) it is likely to fail again
>> on the same string elsewhere.  I think we really need to do something
>> to sanitize the illegal characters (which may just be one or two
>> chars) out of the text, then we can log and continue.
> I think that 'log & continue' is good enough. I do not think Interchange is
> responsible for sanitizing the garbage someone puts in. To prevent filling
> up the error log we could split the issue into a logError and a logDebug
> notification, where the logError entry is smaller with a hint to logDebug for
> more info.

I agree with this. Nothing really fancy needed. 

This will usually happen if someone uploads a file without proper encoding.
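
For illustration only, here is a rough sketch of the 'log & continue' idea with the short error entry and the longer debug entry discussed above. It is written in Python rather than Interchange's Perl, and it is not how Interchange handles this. The names decode_page and page_reader are hypothetical. Substituting a replacement character for the bad bytes is an added assumption, so that the rest of the page still renders.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("page_reader")


def decode_page(raw: bytes, name: str) -> str:
    """Decode page bytes as UTF-8; on malformed input, log and continue."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # Short entry for the error log, pointing at the debug log for detail.
        logger.error("%s: malformed UTF-8 at byte %d, see debug log",
                     name, err.start)
        # Full detail only at debug level, to keep the error log small.
        logger.debug("%s: %s; offending bytes %r",
                     name, err, raw[err.start:err.end])
        # Assumption: swap each malformed sequence for U+FFFD instead of
        # failing the whole request.
        return raw.decode("utf-8", errors="replace")
```

A page uploaded in a legacy encoding such as Latin-1 would then produce one short error line and a page with a few replacement characters, rather than an internal server error.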


LinuXia Systems => http://www.linuxia.de/
Expert Interchange Consulting and System Administration
ICDEVGROUP => http://www.icdevgroup.org/
Interchange Development Team
