nutch-dev mailing list archives

From Julien Nioche <lists.digitalpeb...@gmail.com>
Subject Re: Suitable Nutch 2.0 Project Description
Date Wed, 13 Jun 2012 14:40:13 GMT
" and and array other document " looks like a typo, rest is fine

On 13 June 2012 13:45, Ferdy Galema <ferdy.galema@kalooga.com> wrote:

> Hi,
>
> I would remove the 'experimental' notion. Aside from that it's fine with
> me.
>
> Ferdy.
>
>
> On Wed, Jun 13, 2012 at 2:29 PM, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
>> Hi,
>>
>> Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask
>> about a suitable project descriptor.
>>
>> So far on trunk we have
>>
>> ** Apache Nutch is an open source web-search software project.
>> Stemming from Apache Lucene, it now builds on Apache Solr adding
>> web-specifics, such as a crawler, a link-graph database and parsing
>> support handled by Apache Tika for HTML and and array other document
>> formats.
>>
>> This is merely a pot shot, but I was thinking for Nutch 2.0, something
>> like
>>
>> ** Apache Nutch 2.X is an experimental branch of the Apache Nutch open
>> source web-search software project. It builds on Apache Gora for data
>> persistence and Apache Solr for indexing adding web-specifics, such as
>> a crawler, a link-graph database and parsing support handled by Apache
>> Tika for HTML and and array other document formats.
>>
>> Although there are not many changes here I just wanted to run it by
>> you folks...?
>>
>> Thanks
>> Lewis
>>
>> --
>> Lewis
>>
>
>


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
