html2text — transform basic HTML input to plain-text
The filter performs simple replacement of input HTML: it strips the <b>, <i> and <u> tags, and replaces line breaks (<br>) and paragraphs (<p>) with newlines.
Example: Filter example
[filter html2text]
<p>
Perl is <b>a lot</b> of <u>fun</u>!
</p>

<p>
Interesting tricks with <i>the language</i> can be seen at: <br>
MJD's <a href="http://perl.plover.com/">plover.com</a>.
</p>

<p>
Programming is an art.
</p>
[/filter]
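To see what the example produces without a running Interchange catalog, the filter's effect can be approximated in plain Perl. The sketch below is a minimal stand-alone version, assuming the two substitutions from the 5.9.0 source listing; the wrapper name `html2text_sketch` and the variable names are hypothetical and not part of Interchange.

```perl
use strict;
use warnings;

# Hypothetical stand-alone wrapper around the two substitutions
# found in the html2text filter source (Interchange 5.9.0).
sub html2text_sketch {
    my $val = shift;
    # <br> and <p>/</p> tags, plus surrounding whitespace, become one newline.
    $val =~ s%\s*<(?:br\s*/?|/?p[^>]*)>\s*%\n%gi;
    # Every remaining tag is removed.
    $val =~ s%<[/!a-zA-Z].*?>%%gs;
    return $val;
}

# The body of the [filter html2text] example.
my $html = <<'END_HTML';
<p>
Perl is <b>a lot</b> of <u>fun</u>!
</p>

<p>
Interesting tricks with <i>the language</i> can be seen at: <br>
MJD's <a href="http://perl.plover.com/">plover.com</a>.
</p>

<p>
Programming is an art.
</p>
END_HTML

my $text = html2text_sketch($html);
print $text;
```

With this input the sketch prints an empty line, then "Perl is a lot of fun!", an empty line, "Interesting tricks with the language can be seen at:", "MJD's plover.com.", an empty line, and "Programming is an art.". Note that the link text survives while the <a> tag itself is dropped.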
Support for stripping <b>, <i> and <u> tags was added in Interchange 5.5.2.
For more information on Perl Regular Expressions, pattern matching and character classes, see perlre(1).
Interchange 5.9.0:
Source: code/Filter/html2text.filter
Lines: 18
# Copyright 2002-2009 Interchange Development Group and others
# Copyright 1996-2002 Red Hat, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version. See the LICENSE file for details.

CodeDef html2text Filter
CodeDef html2text Description Simple html2text
CodeDef html2text Routine <<EOR
sub {
    my $val = shift;
    $val =~ s%\s*<(?:br\s*/?|/?p[^>]*)>\s*%\n%gi;
    $val =~ s%<[/!a-zA-Z].*?>%%gs;
    return $val;
}
EOR
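The two substitutions in the routine can also be tried one at a time to see what each pattern matches. The following is only an illustrative sketch: the patterns are copied from the listing, while the sample strings and variable names are made up for the purpose.

```perl
use strict;
use warnings;

# Patterns copied from the html2text routine.
my $breaks = qr%\s*<(?:br\s*/?|/?p[^>]*)>\s*%i;  # <br>, <br/>, <p ...>, </p>
my $tags   = qr%<[/!a-zA-Z].*?>%s;               # any other tag or comment

# Hypothetical sample inputs, chosen only for illustration.
my $s1 = 'one <BR /> two<p class="x">three';     # case, self-closing, attributes
my $s2 = 'a < b and <em>c</em> <!-- note -->';   # bare "<" is not a tag
my $s3 = "<a\nhref='x'>link</a>";                # tag spanning two lines
my $s4 = 'x<pre>y';                              # tag name that starts with "p"

(my $r1 = $s1) =~ s/$breaks/\n/g;   # "one\ntwo\nthree"
(my $r2 = $s2) =~ s/$tags//g;       # "a < b and c "
(my $r3 = $s3) =~ s/$tags//g;       # "link"
(my $r4 = $s4) =~ s/$breaks/\n/g;   # "x\ny"

print join("\n---\n", $r1, $r2, $r3, $r4), "\n";
```

Two details are worth noticing. The second pattern removes any tag, not only <b>, <i> and <u>, and the /s modifier lets it cross line boundaries. The first pattern's `/?p[^>]*` alternative matches every tag whose name begins with "p", so a <pre> tag is also turned into a newline.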