[ic] vlink display of UTF8 characters

David Christensen david at endpoint.com
Mon Aug 15 14:50:14 UTC 2016


> On Aug 15, 2016, at 7:07 AM, Peter <peter at pajamian.dhs.org> wrote:
> 
> We recently found we were having issues with the display of UTF8
> characters (I honestly don't know why this issue is coming up now, this
> shop has been live for years).  At any rate, the problem, after hours
> spent debugging, turned out to be coming from the vlink script.
> 
> I switched over to vlink.pl and the problem persists, but I was able to
> "fix" vlink.pl with the following patch:
> 
> --- a/vlink.pl	2010-08-20 17:29:56.000000000 -0700
> +++ b/vlink.pl	2016-08-15 03:54:37.000000000 -0700
> @@ -141,6 +141,7 @@
> eval { alarm $LINK_TIMEOUT; };
> 
> socket(SOCK, PF_UNIX, SOCK_STREAM, 0)	or die "socket: $!\n";
> +binmode(SOCK, ':utf8');
> 
> my $ok;

What charset was the site in originally, and what were the other settings (Apache’s AddDefaultCharset, MV_HTTP_CHARSET, MV_UTF8, perl/Encode.pm versions, etc.)?

I wonder if this was properly configured on the IC side, as I’d expect vlink to just pass the octets through regardless of encoding.  In any case, the patch doesn’t feel correct to me, so I’d like to see what other information we can gather.
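
To make the pass-through expectation concrete, here is a minimal sketch in C of a relay that copies octets from one descriptor to another without interpreting them. It is not taken from the actual vlink source; the function name relay_octets and the buffer size are assumptions chosen for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, not from the vlink source: copy every byte from
 * in_fd to out_fd unchanged until EOF.
 * Returns 0 on success, -1 on a read or write error. */
int relay_octets(int in_fd, int out_fd)
{
    unsigned char buf[4096];

    for (;;) {
        ssize_t got = read(in_fd, buf, sizeof buf);
        if (got == 0)
            return 0;                 /* EOF: everything was relayed */
        if (got < 0) {
            if (errno == EINTR)
                continue;             /* interrupted, retry the read */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < got) {
            ssize_t put = write(out_fd, buf + off, (size_t)(got - off));
            if (put < 0) {
                if (errno == EINTR)
                    continue;         /* interrupted, retry the write */
                return -1;
            }
            off += put;
        }
    }
}
```

Nothing in this loop knows or cares about the character set: bytes 0xC3 0xA9 go in and bytes 0xC3 0xA9 come out, which is why an encoding problem would more likely originate on either side of the link.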

C-wise, you’d have to write your own equivalent of the PerlIO layer to encode input data as UTF-8, which is another reason I think this is just misconfigured, not fundamentally broken at this layer.  We’ve had quite a few sites use the IC UTF-8 layer without ever having to resort to vlink modifications.
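
As a rough idea of what such a hand-written layer would involve, here is a sketch in C. It assumes the layer's job is to treat each octet as a Latin-1 code point and emit its UTF-8 encoding, which is my understanding of how Perl handles a byte string written through a :utf8 handle. The name latin1_to_utf8 is hypothetical, and none of this comes from the vlink source.

```c
#include <stddef.h>

/* Hypothetical helper: treat each input octet as a Latin-1 code point and
 * write its UTF-8 encoding to dst. dst must have room for 2 * len bytes.
 * Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            dst[out++] = c;                                   /* ASCII is unchanged */
        } else {
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    return out;
}
```

A Latin-1 "é" (0xE9) comes out as 0xC3 0xA9. Input that is already UTF-8 gets encoded a second time, so 0xC3 0xA9 becomes 0xC3 0x83 0xC2 0xA9. A layer like this is only correct when the incoming bytes really are Latin-1, so the site's charset settings need to be known before adding one.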

HTH,

David
--
David Christensen
End Point Corporation
david at endpoint.com
785-727-1171





