nutch-dev mailing list archives

From Earl Cahill <>
Subject Re: nutch/mapred tutorial
Date Wed, 07 Sep 2005 08:17:01 GMT
Though my last email was more about documenting the
whole setup process, it looks like the error I
mentioned was fixed by creating a directory and
putting a urls file in that directory.  It also looks
like the name of the file doesn't matter.  So I made a
myurls directory, put a urls file in there, and then ran

bin/nutch crawl myurls -dir crawl.test -depth 3

But, yeah, I would like to put such steps in a tutorial.

It looks like only the front page got hit, and that's
about it, so there is more to do.
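
For the record, here is a rough sketch of those steps as a script. The
seed URL is only a placeholder I made up for illustration, and the
check for bin/nutch assumes you are sitting at the top of a
branches/mapred checkout; the crawl command itself is the one above.

```shell
# Seed directory for the crawl; the name of the file inside it
# does not seem to matter, only that the directory holds one.
mkdir -p myurls

# Hypothetical seed URL (an assumption, not from my actual setup);
# replace it with the site you want to crawl.
printf '%s\n' 'http://example.com/' > myurls/urls

# Only attempt the crawl when bin/nutch is actually present.
if [ -x bin/nutch ]; then
    bin/nutch crawl myurls -dir crawl.test -depth 3
else
    echo "bin/nutch not found; run this from the top of the checkout" >&2
fi
```

Note that the argument to crawl is the directory (myurls), not the
urls file itself.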


--- Earl Cahill <> wrote:

> howdy,
> I have been looking around for a nutch/mapred tutorial
> and haven't had much luck.  I found this one
> which did help me get a crawl going on trunk, but no
> such luck in branches/mapred.  I set the urls file and
> the filter in the same way that I did for trunk and I
> get
> 050907 013817 parsing
> No input files in:
> [;@32b0bad7
>         at
>         at
>         at
> Guess I am wondering if a detailed tutorial for mapred
> exists.  Seems like doug was saying that it didn't.  I
> would be up for walking through getting a crawl going
> and documenting my steps, but won't dive in if one
> already exists.  Also wondering if I would/could put
> my doc on the wiki.
> Thanks,
> Earl