[ic] HTML Special Entities

David Christensen david at endpoint.com
Thu May 13 13:35:09 UTC 2010

On May 13, 2010, at 6:28 AM, Peter wrote:

> On 13/05/10 20:02, Stefan Hornburg (Racke) wrote:
>> On 05/12/2010 04:51 PM, Jon Jensen wrote:
>>> On Wed, 12 May 2010, Stefan Hornburg (Racke) wrote:
>>>> we are working on getting a site properly HTML validated. One of the
>>>> problems is menus or categories from the database containing an
>>>> ampersand like Dungeons & Dragons.
>>>> I wrote the following filter for this and the other HTML special
>>>> entities:
>>> Isn't that what the encode_entities filter is for?
>> This filter encodes way too much.
> Perhaps, but how many different filters should we have that encode
> entities?  The next person who comes along may want to code a different
> set of characters maybe, then we have three encode entity filters?
> How about modifying the existing filter so that you can pass a custom
> list of characters to encode instead?

Perhaps an argument specifying named sets would be useful, too; e.g., "basic" for the proposed escapes and "all" for the current ones, with the option to define more sets as we see fit.


For example, [encode_entities basic]<foo bar="baz">naïve</foo>[/encode_entities] would escape the markup characters but leave "naïve" alone.
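
To make the set idea concrete, here is a rough sketch in Python rather than Interchange's own Perl. The function name, the "basic"/"all" set names, and the exact characters in each set are assumptions for illustration only; this is not how the existing filter is implemented.

```python
import html.entities

# Hypothetical "basic" set: only the characters that are special in HTML.
BASIC = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}


def encode_entities(text, charset="all"):
    """Encode entities using a named character set (illustrative only)."""
    if charset not in ("basic", "all"):
        raise ValueError("unknown entity set: " + charset)
    out = []
    for ch in text:
        if ch in BASIC:
            out.append(BASIC[ch])
        elif charset == "all" and ord(ch) > 127:
            # Assumed meaning of "all": also encode every non-ASCII character,
            # by name where one exists, otherwise as a numeric reference.
            name = html.entities.codepoint2name.get(ord(ch))
            out.append("&" + name + ";" if name else "&#%d;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


sample = '<foo bar="baz">na\u00efve</foo>'
basic_result = encode_entities(sample, "basic")  # leaves the i-diaeresis alone
all_result = encode_entities(sample, "all")      # also encodes it as &iuml;
```

Adding another set would then mean adding one more branch or lookup table, without needing a second filter.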

One issue with encode_entities shows up with multibyte characters when MV_UTF8 is not set: it encodes each octet of the character separately, which is not the desired output.  Is this the kind of situation that triggered the change?
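
The per-octet effect can be reproduced outside Interchange. The Python sketch below is only a simulation under stated assumptions: the helper name is made up, and treating undecoded UTF-8 bytes as Latin-1 characters stands in for what happens when the text is not flagged as UTF-8.

```python
import html.entities


def encode_high_chars(text):
    """Replace each non-ASCII character with an HTML entity (illustrative helper)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            name = html.entities.codepoint2name.get(code)
            out.append("&" + name + ";" if name else "&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)


word = "na\u00efve"

# Text handled as characters: the i-diaeresis becomes a single entity.
as_characters = encode_high_chars(word)

# Text handled as raw octets: the two UTF-8 bytes (0xC3, 0xAF) are each
# treated as a separate Latin-1 character and encoded on their own.
as_octets = encode_high_chars(word.encode("utf-8").decode("latin-1"))

print(as_characters)  # na&iuml;ve
print(as_octets)      # na&Atilde;&macr;ve
```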


David Christensen
End Point Corporation
david at endpoint.com
