nutch-dev mailing list archives

From Ferdy Galema <ferdy.gal...@kalooga.com>
Subject Re: VOTE Apache Nutch 2.0 RC1
Date Wed, 13 Jun 2012 09:00:22 GMT
Findings about Nutch-2.0 RC 1.

The Nutch job jar is not present in the binary archive. This means
distributed running of jobs is not supported. I'm not sure if this is a
problem (since users can always build one themselves); I'm merely pointing
it out. The recently released 1.5 also lacks this job jar, so at least
there is no difference there.

Parse text is limited to 100 characters for HTML. We noticed this when our
index wasn't showing enough terms for some documents. This is a pretty
severe bug, and I will commit a fix for it right away.

Building the runtime with the default SqlStore and with HBaseStore works
fine. I will perform some more functionality tests when there is a new RC.

Ferdy.

On Wed, Jun 13, 2012 at 4:24 AM, Mattmann, Chris A (388J) <chris.a.mattmann@jpl.nasa.gov> wrote:

> Hey Guys,
>
> #2 is probably reason enough for a respin.
>
> Lewis if you don't have time to do it before Thursday, I could probably
> give it a whack. Let me know.
>
> Cheers,
> Chris
>
> On Jun 12, 2012, at 3:33 PM, Sebastian Nagel wrote:
>
> > Hi Lewis,
> >
> > my first steps with 2.0 (to be continued, still struggling).
> >
> > Two points (I'll try to give a final vote tomorrow):
> >
> > 1 some guidance would be nice. README.txt points
> > to http://wiki.apache.org/nutch/NutchTutorial which refers to 1.x
> > (I'm using http://sujitpal.blogspot.de/2012/01/exploring-nutch-gora-with-cassandra.html)
> >
> > 2 the package contains your nutch-site.xml:
> >    <name>http.agent.email</name>
> >    <value>lewismc@apache.org</value>
> > I guess that's not intended :)
> >
> > Cheers,
> > Sebastian
> >
> > On 06/12/2012 10:16 PM, Lewis John Mcgibbney wrote:
> >> Hi Everyone,
> >>
> >> I appreciate that most of the core devs are using trunk, however I
> >> would appeal to you guys to at least check out the artifacts and check
> >> sigs, tests, license headers if possible. Although this does not fully
> >> satisfy the requirements of a thoroughly reviewed RC, hopefully the
> >> thorough stuff can be undertaken by those directly using the artifacts
> >> and code in development/production.
> >>
> >> Thanks very much in advance
> >>
> >> Best
> >>
> >> Lewis
> >>
> >> On Fri, Jun 8, 2012 at 3:49 PM, lewis john mcgibbney <lewismc@apache.org> wrote:
> >>> Good Evening Everyone,
> >>>
> >>> A candidate for the Apache Nutch 2.0 RC1 is available at:
> >>>
> >>> http://people.apache.org/~lewismc/nutch-2.0
> >>>
> >>> The release candidate is a src.zip, bin.zip, src.tar.gz and bin.tar.gz
> >>> archive of the sources in:
> >>>
> >>> http://svn.apache.org/repos/asf/nutch/tags/release-2.0rc1
> >>>
> >>> Further, a staged Maven repository of the 2.0 jar, sources.jar and
> >>> javadoc.jar is available here:
> >>>
> >>> https://repository.apache.org/content/repositories/orgapachenutch-215
> >>>
> >>> Please vote on releasing this package as Apache Nutch 2.0.
> >>> The vote is open for the next 72 hours and passes if a majority of at
> >>> least three +1 Nutch PMC votes are cast.
> >>>
> >>> [ ] +1 Release this package as Apache Nutch 2.0
> >>> [ ] -1 Do not release this package because...
> >>>
> >>> Many thanks, and here's to plenty more.
> >>>
> >>> Have a great weekend, Kind Regards,
> >>> Lewis
> >>>
> >>> P.S. Here's my +1.
> >>
> >>
> >>
> >
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
