Hi
> Currently nutch isn't very friendly to windows users as it requires cygwin
> to run and there are a lot of issues with Hadoop 1.x branch, which nutch
> bundles with it, due to the "set tmp permission" issue.
>
> What do you think about doing two things:
> 1. Move to Hadoop 2.4 to support windows/linux and the new map reduce api
>
It already works on Linux. I am pretty sure there is already a JIRA issue for
the port to the new MapReduce API. As for Windows, feel free to contribute an
alternative set of scripts if you want to.
> 2. Create bash scripts to run crawls with
>
What's wrong with src/bin/crawl.sh?
Julien
> Relevant JIRA Issues:
>
>
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble