nutch-dev mailing list archives

From "Mattmann, Chris A (388J)" <>
Subject [ANNOUNCE] Apache Nutch 1.2 released
Date Fri, 24 Sep 2010 22:21:16 GMT
(...apologies for the cross posting...)

The Apache Nutch project is pleased to announce the release of Apache Nutch
1.2. The release contents have been pushed out to the main Apache release
site, so the release should be available as soon as the mirrors sync.

Apache Nutch, one of the six new Apache top-level projects (TLPs) established
as a result of the April 2010 Board Meeting, is an extensible framework for
building out large-scale web-based search. Layered on top of fellow Apache
projects Hadoop, Lucene/Solr, and Tika, Nutch provides an out-of-the-box
platform for fetching web pages, PDF files, Word documents, and more. Nutch
parses the
content and its relevant information, indexes its metadata, and makes it
available for efficient query and retrieval over modern Internet protocols.

Apache Nutch 1.2 contains a number of improvements and bug fixes. Details
can be found in the changes file:

Apache Nutch is available in source and binary form from the following
download page:

In the initial 48 hours, the release may not be available on all mirrors.
When downloading from a mirror site, please remember to verify the downloads
using signatures found on the Apache site:

For more information on Apache Nutch, visit the project home page:

-- Chris Mattmann (on behalf of the Apache Nutch community)

Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
