[ic] Error while interpolating page etc/report: UTF-8 Issue

Gert van der Spoel gert at 3edge.com
Tue Jun 2 14:12:34 UTC 2009


> -----Original Message-----
> From: interchange-users-bounces at icdevgroup.org
> [mailto:interchange-users-bounces at icdevgroup.org] On Behalf Of IC
> Sent: Tuesday, June 02, 2009 4:45 PM
> To: interchange-users at icdevgroup.org
> Subject: Re: [ic] Error while interpolating page etc/report: UTF-8 Issue
> 
> > This is the byteordermark I think:
> >
> > http://en.wikipedia.org/wiki/Byte-order_mark
> >
> > "While UTF-8 does not have byte order issues, a BOM encoded in UTF-8
> > may nonetheless be encountered. A UTF-8 BOM is explicitly allowed by
> > the Unicode standard[1], but is not recommended[2], as it only
> > identifies a file as UTF-8 and does not state anything about byte
> > order.[3] Many Windows programs (including Windows Notepad) add BOM's
> > to UTF-8 files. However in Unix-like systems (which make heavy use of
> > text files for file formats as well as for inter-process
> > communication) this practice is not recommended, as it will interfere
> > with correct processing of important codes such as the shebang at the
> > start of an interpreted script.[4] It may also interfere with source
> > for programming languages that don't recognise it. For example, gcc
> > reports stray characters at the beginning of a source file, and in
> > PHP, if output buffering is disabled, it has the subtle effect of
> > causing the page to start being sent to the browser, preventing
> > custom headers from being specified by the PHP script. The UTF-8
> > representation of the BOM is the byte sequence EF BB BF, which
> > appears as the ISO-8859-1 characters ï»¿ in most text editors and web
> > browsers not prepared to handle UTF-8."
> >
> > You have to remove this from the files on your Unix machine, as it
> > does not offer anything useful.
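
For reference, a minimal sketch of how the BOM could be checked for and
removed from the command line. It assumes GNU sed (for -i and the \xHH
escapes) plus head and od; strip_bom and has_bom are illustrative helper
names, not part of Interchange.

```shell
#!/bin/sh
# Sketch: detect and strip a leading UTF-8 BOM (bytes EF BB BF).
# Assumes GNU sed; the helper names are made up for this example.

has_bom() {
    # Succeeds when the first three bytes of the file are EF BB BF.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

strip_bom() {
    # Edits the file in place; only the start of line 1 is touched.
    # LC_ALL=C makes sed treat the pattern as raw bytes.
    LC_ALL=C sed -i '1s/^\xEF\xBB\xBF//' "$1"
}
```

Usage would be along the lines of: has_bom report && strip_bom report
(keep a backup copy of the file first).
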
> >
> >
> >
> > > Rest of message looked ok.
> > >
> > > I have now had to encode the templates back to ISO-8859.
> > >
> > > To stop the initial problem of "Error while interpolating page
> > > etc/report:" I have had to remove dofile from SafeUntrap in
> > > interchange.cfg. This now leaves the following error with one
> > > regex, but at least things seem to be working:
> > >
> > > Safe: 'do "file"' trapped by operation mask at
> > > /usr/lib/perl5/5.8.8/utf8_heavy.pl line 182.
> > > Compilation failed in require at /usr/lib/perl5/5.8.8/utf8.pm line 17.
> > >
> > > It seems to be in quite a mess, any suggestions?
> >
> >
> > I think dofile is supposed to be in SafeUntrap. I have not heard of
> > removing it being a solution to anything so far.
> >
> >
> >
> 
> Thanks for the replies. What is the best method to convert the encoding
> on Linux from the command line? I used a Windows program called Edit Pad
> Pro, which added these characters after encoding. I tried iconv, but the
> file still shows as ASCII after running iconv -f ASCII -t UTF-8 report

Perhaps:

iconv -f iso8859-1 -t utf-8 report > report.utf8

That's what I always use.
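
As to why the file still showed as ASCII: ASCII is a subset of UTF-8, so a
file containing only 7-bit characters is byte-for-byte identical after the
conversion, and there is nothing for a detection tool to notice. A minimal
sketch that wraps the command above, assuming the source really is
ISO-8859-1; to_utf8 is an illustrative helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: convert a file from ISO-8859-1 to UTF-8 in place.
# to_utf8 is a made-up helper name; adjust the source encoding as needed.

to_utf8() {
    src=$1
    tmp="$src.utf8"
    # Write to a separate file first: redirecting iconv's output onto
    # its own input would truncate the file before it is read.
    iconv -f ISO-8859-1 -t UTF-8 "$src" > "$tmp" && mv "$tmp" "$src"
}
```

Usage would be along the lines of: to_utf8 report. A file with no
characters outside 7-bit ASCII comes out unchanged.
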


> Andy.
> 
> 
> _______________________________________________
> interchange-users mailing list
> interchange-users at icdevgroup.org
> http://www.icdevgroup.org/mailman/listinfo/interchange-users



