nutch-dev mailing list archives

From Ferdy Galema <ferdy.gal...@kalooga.com>
Subject Re: Suitable Nutch 2.0 Project Description
Date Wed, 13 Jun 2012 12:45:24 GMT
Hi,

I would remove the 'experimental' notion. Aside from that it's fine with me.

Ferdy.

On Wed, Jun 13, 2012 at 2:29 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Hi,
>
> Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask
> about a suitable project descriptor.
>
> So far on trunk we have
>
> ** Apache Nutch is an open source web-search software project.
> Stemming from Apache Lucene, it now builds on Apache Solr, adding
> web-specifics such as a crawler, a link-graph database and parsing
> support handled by Apache Tika for HTML and an array of other document
> formats.
>
> This is merely a pot shot, but I was thinking for Nutch 2.0, something like
>
> ** Apache Nutch 2.X is an experimental branch of the Apache Nutch open
> source web-search software project. It builds on Apache Gora for data
> persistence and Apache Solr for indexing, adding web-specifics such as
> a crawler, a link-graph database and parsing support handled by Apache
> Tika for HTML and an array of other document formats.
>
> Although there are not many changes here I just wanted to run it by
> you folks...?
>
> Thanks
> Lewis
>
> --
> Lewis
>
