nutch-dev mailing list archives

From John X <j...@neasys.com>
Subject Re: servlet Cached.java
Date Wed, 23 Mar 2005 17:12:50 GMT
On Wed, Mar 23, 2005 at 11:19:36AM +0100, Andrzej Bialecki wrote:
> John X wrote:
> >Hi, All,
> >
> >Attached please find servlet Cached.java that serves raw Content
> >of any mime type. Current cached.jsp handles mime type text/* only.
> >If no objection, it is going to be committed in a few days.
> 
> I think this would be quite useful.
> 
> However, what I think is ultimately needed to match the features of 
> other search engines is not the ability to return the cached non-html 
> content (there might even be copyright issues with this function...), 

Would intranet deployments be less concerned about this?

> but an html rendering of non-html content, a la Google's "View as HTML" 
> function.

We already have text.jsp (view as text), which could be turned into a
"view as HTML" page. To do that, though, would we need to introduce a new
ParseHTML (or better, ParseXML), or convert ParseText into one?
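
As a rough illustration of the simplest form "view as html" could take, here is a sketch that wraps already-extracted plain text in a minimal HTML page. The class TextToHtml and its methods are hypothetical and not part of Nutch; it assumes the parsed text is available as an ordinary String and does not touch ParseText or any other Nutch API.

```java
// Hypothetical helper, not part of Nutch: renders extracted plain text as a
// minimal HTML page, escaping characters that are special in HTML.
public final class TextToHtml {
    private TextToHtml() {}

    // Escape the characters that would otherwise be interpreted as markup.
    public static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Treat blank-line-separated runs of text as paragraphs (an assumption
    // about how the extracted text is laid out).
    public static String render(String title, String text) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>")
            .append(escape(title))
            .append("</title></head><body>\n");
        for (String para : text.split("\\n\\s*\\n")) {
            String trimmed = para.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escape(trimmed)).append("</p>\n");
            }
        }
        html.append("</body></html>\n");
        return html.toString();
    }
}
```

This only preserves paragraph breaks; keeping headings, tables or links from the original document would need structure that plain text does not carry, which is where a ParseHTML or ParseXML would come in.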

John

> 
> -- 
> Best regards,
> Andrzej Bialecki
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
__________________________________________
http://www.neasys.com - A Good Place to Be
Come to visit us today!
