[ic] Froogle.google.com anyone using this yet?

Jonathan Clark interchange-users@icdevgroup.org
Fri Dec 20 10:18:01 2002


> > The problem you have is the HTML in the database.  That makes
> > it really hard to reuse.  You might want to consider ways of
> > getting HTML out of your raw data.
> >
> A quick test script for you:
>
> ----------------------------------------------------------------------
> use HTML::TreeBuilder;
> use HTML::FormatText;
> use strict;
>
> my $text =<<'EOB';
>     <body>
>         <p>
>             This is a test blah blah.&nbsp;
>             <a href="foobar.html">What's this, a link?</a>.
>         </p>
>         <p>
>             Let's have some text in <font color="#FF0000">red</font>.
>         </p>
>         <p>
>             Some &quot;entities&quot; will make another test case.
>         </p>
>     </body>
> EOB
>
> my $tree = HTML::TreeBuilder->new;
> $tree->parse($text);
> $tree->eof;    # signal end of input so the tree is complete
>
> my $formatter = HTML::FormatText->new(
>     leftmargin => 4,
>     rightmargin => 74,
> );
> $text = $formatter->format($tree);
> print $text;
> ----------------------------------------------------------------------
>
> The output is:
>
>     This is a test blah blah.  What's this, a link?.
>
>     Let's have some text in red.
>
>     Some "entities" will make another test case.

Excellent! This is just what I need for Helpem's text file output format
(near the top of the todo).
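
For the text file output, here is a rough sketch of how the same two
modules could be wrapped in a reusable sub. It assumes the goal is one
line of plain text per database field. The sub name html_to_text and the
sample description are made up for illustration and are not from Kevin's
script.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: turn an HTML fragment into one line of plain text.
sub html_to_text {
    my ($html) = @_;
    return '' unless defined $html && length $html;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse($html);
    $tree->eof;                   # tell the parser the input is complete

    my $formatter = HTML::FormatText->new(
        leftmargin  => 0,
        rightmargin => 74,
    );
    my $text = $formatter->format($tree);
    $tree->delete;                # free the parse tree explicitly

    $text =~ s/\s+/ /g;           # collapse newlines and runs of whitespace
    $text =~ s/^\s+|\s+$//g;      # trim leading and trailing whitespace
    return $text;
}

# Sample field value (invented for this example).
print html_to_text('<p>Some &quot;entities&quot; in <b>bold</b>.</p>'), "\n";
```

Collapsing the whitespace at the end is only a guess at what a
line-per-record text file would want; drop those two substitutions to
keep the paragraph breaks that HTML::FormatText produces.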

Thanks, Kevin.

Jonathan
Webmaint.