[ic] How would you search, store, and display documents

Stefan Hornburg (Racke) racke at linuxia.de
Fri Aug 12 07:01:58 UTC 2011


On 08/12/2011 10:23 AM, Paul Jordan wrote:
>
> I'm tasked with building a pretty complex training/educational system for one of my clients. This would bring their paper manuals and assets, HTML newsletters, FAQ, videos, how-tos, etc. all into one intuitive "knowledgebase", if you will.
>
> I know how I want it to work and look, but what I don't know at the moment is which format is best to use. My main concern is the searchability and storage of the main body of each article. This text will arbitrarily contain HTML for formatting, images, divs for quotes, tables for data, and the like (everything will be styled with CSS, of course).
>
> It seems to me there are several paths...
>
> #1 Store the text with any HTML needed for the article in a table and, assuming HTML doesn't play well with fulltext searches, work around that by saving a text-only copy into a second field used only for searching.

Fulltext search engines like Lucene are able to parse HTML and adjust the weight according to the position of the words (HTML title, etc).
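
If you do go with the text-only field from #1, a minimal sketch in Python could look like the following. This is not Interchange or Lucene code; the names TextExtractor and split_for_search are made up for illustration, and it only assumes the standard library's html.parser. It strips the markup and keeps title/heading words separate so a search could weight them higher:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text; words inside <title>/<h1> are kept separately."""

    SKIP = {"script", "style"}   # content that should never be searchable
    BOOSTED = {"title", "h1"}    # assumption: these deserve a higher weight

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.boosted = []
        self.body = []
        self._skip_depth = 0
        self._boost_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BOOSTED:
            self._boost_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BOOSTED and self._boost_depth:
            self._boost_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        target = self.boosted if self._boost_depth else self.body
        target.append(text)


def split_for_search(html_text):
    """Return (boosted_text, body_text) as plain strings without markup."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join(parser.boosted), " ".join(parser.body)


if __name__ == "__main__":
    page = "<title>Reset a password</title><div>Click <b>Forgot</b> &amp; wait.</div>"
    print(split_for_search(page))
```

The two strings could then go into separate search-only columns next to the original HTML, so display and search never have to share one field.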

>
> #2 Delve into xml/xsl.
>
> #3 Create some sort of wiki parser to use in conjunction with IC. I *really* would have liked to use Kevin's system and improve upon it, but that doesn't seem likely.

I hacked on Wiki stuff based on Wiki::Toolkit; it is inside the WellWell repository. If you already have Wiki-formatted text, you can use this for HTML formatting and display:

http://git.icdevgroup.org/?p=wellwell.git;a=blob;f=lib/Vend/Wiki.pm;h=8cb8499ef669ed60d26867db04a41cfba3a641e8;hb=HEAD
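
To give a rough idea of what turning Wiki text into HTML involves, here is a small sketch in Python. It is not the Vend::Wiki or Wiki::Toolkit API; the function name wiki_to_html and the tiny markup subset (== headings ==, '''bold''', "* " list items, one paragraph per line) are assumptions chosen purely for illustration:

```python
import html
import re


def wiki_to_html(text):
    """Render a tiny, made-up wiki subset as HTML."""
    out = []
    in_list = False
    for raw in text.splitlines():
        # Escape first so literal < and > in the article cannot inject markup.
        line = html.escape(raw.strip(), quote=False)
        line = re.sub(r"'''(.+?)'''", r"<b>\1</b>", line)
        heading = re.fullmatch(r"==\s*(.+?)\s*==", line)
        is_item = line.startswith("* ")

        # Close an open list as soon as a non-item line shows up.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue

        if heading:
            out.append(f"<h2>{heading.group(1)}</h2>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{line[2:]}</li>")
        else:
            out.append(f"<p>{line}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


if __name__ == "__main__":
    print(wiki_to_html("== Setup ==\n* Install '''Perl'''\n\nDone."))
```

Images, tables and links would each need their own rule, which is where an existing formatter saves a lot of work.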

>
> #4 Have a parser like Kevin's made, extend it to handle images, and create a simple online "editor" for it.
>

An editor for Wiki text shouldn't be hard to come up with.

Regards
	Racke


-- 
LinuXia Systems => http://www.linuxia.de/
Expert Interchange Consulting and System Administration
ICDEVGROUP => http://www.icdevgroup.org/
Interchange Development Team



