[ic] ic-utf8 readfile/writefile patch

Peter peter at pajamian.dhs.org
Sun Mar 22 00:11:30 UTC 2009


On 03/21/2009 05:51 AM, Stefan Hornburg (Racke) wrote:
> Stefan Hornburg (Racke) wrote:
>> Yes, it doesn't crash anymore :-).
>>
>> One problem that might be related to this issue is the delivery of "binary" 
>> content stored in a UTF8 database.
>>
>> Currently the files produced are corrupted, and the data in the db is
>> definitely correct. And it works with UTF8 inactive (as per bug #259).
>>
>> The code is as follows:
>>
>> my $data = $Db{transaction_documents}->field($td_code, 'content');
>> $data = $Tag->filter({op => 'decode_base64', body => $data});
> 
> Putting at this point:
> 
> Encode::_utf8_on($data);
> 
> "solves" the problem.
> 
>> $Tag->deliver({type => 'application/pdf', body => $data});


This may "solve the problem", but it doesn't look like the right solution
to me.  If it's really "binary" data, then the utf8 flag should be off
for it, not on, and the data should not be treated as utf8 in any way.
Presumably this would contain something like a picture or a sound file,
which should certainly not be utf8 encoded.

It almost looks to me like Perl is reading in the raw binary data
(after converting from base64), performing a utf8 conversion on it, and
then *not* setting the utf8 flag.  Alternatively, the data may be
converted to utf8 before it is written to the db in the first place, so
that reading it back yields utf8 data which is not flagged as such.
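
For what it's worth, here is a minimal sketch of that kind of mismatch,
outside of Interchange.  The sample bytes and variable names are made up
for illustration, and the extra utf8::encode() is only my assumption about
what might be happening to the data; I have not confirmed where (or
whether) it happens in the real code path.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for the decoded document: raw bytes, two of them >= 0x80.
my $original = "\x25\x50\xE2\xFF";

# Assumed failure mode: the bytes are treated as Latin-1 characters and
# encoded to UTF-8 octets, but the result is left without the utf8 flag.
my $mangled = $original;
utf8::encode($mangled);    # every byte >= 0x80 becomes two octets

printf "original: %d bytes, utf8 flag %s\n",
    length($original), utf8::is_utf8($original) ? 'on' : 'off';
printf "mangled:  %d bytes, utf8 flag %s\n",
    length($mangled), utf8::is_utf8($mangled) ? 'on' : 'off';

# The workaround from the quoted mail: just switch the flag on.  Perl now
# reads the octets as four characters whose code points equal the original
# byte values, so the string compares equal to the original again.
my $flagged = $mangled;
Encode::_utf8_on($flagged);
print "flag flipped: ", ($flagged eq $original ? "matches" : "differs"), "\n";

# A more explicit repair: decode the octets, then downgrade back to plain
# bytes so the utf8 flag ends up off, as it should be for binary data.
my $repaired = $mangled;
utf8::decode($repaired) or die "data is not well-formed UTF-8\n";
utf8::downgrade($repaired);    # dies if a character above 0xFF is present
print "repaired:     ", ($repaired eq $original ? "matches" : "differs"),
    ", utf8 flag ", (utf8::is_utf8($repaired) ? 'on' : 'off'), "\n";
```

If that is what is going on, _utf8_on only appears to work because
flipping the flag makes Perl read the over-encoded octets back as the
original characters.  Either way, the real fix would be to stop the extra
encoding at its source rather than to patch the flag afterwards.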

...or am I missing something here?


Peter


