nutch-dev mailing list archives

From Stefan Groschupf ...@media-style.com>
Subject Re: [Nutch-dev] Plugins problems
Date Thu, 03 Mar 2005 10:01:28 GMT
Do you have many different processes running, or did you stop a
previous process?
Just remove the webdb.new folder and try again.
In general you will have more control when you start the processes
manually with the nutch command than when you use the all-in-one crawl
command. :-)
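
For illustration only, the cleanup step could be scripted roughly as below. This is a minimal sketch, not part of Nutch: the helper name remove_stale_webdb_new is hypothetical, and it assumes the leftover directory is literally named webdb.new and sits directly under the db directory, as the path in the quoted exception suggests. It should only be run when no other Nutch process is still writing to that db.

```python
import shutil
from pathlib import Path


def remove_stale_webdb_new(db_dir):
    """Delete a leftover webdb.new directory under db_dir, if one exists.

    Hypothetical helper (not a Nutch API). Returns True if a directory
    was removed and False if there was nothing to remove.
    """
    stale = Path(db_dir) / "webdb.new"
    if not stale.is_dir():
        return False
    # Remove the whole leftover tree, including pagesByURL inside it.
    shutil.rmtree(stale)
    return True
```

With the paths from the quoted log, the call would be remove_stale_webdb_new("/nutch-0.6/agoria.2mar/db"), followed by re-running the update.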

HTH
Stefan

PS. Please move the discussion to the apache list.

On 03.03.2005 at 09:52, Christophe Noel wrote:

> Hello,
>
> I need to know more about the parse-ext plugin. What can it do, for
> example?
>
> Then, I get the following error when I crawl with the index-more plugin:
>
> 050302 183540 Updating /nutch-0.6/agoria.2mar/db
> 050302 183540 Updating for /nutch-0.6/agoria.2mar/segments/20050302183116
> 050302 183540 Processing document 0
> 050302 183541 Finishing update
> 050302 183542 Processing pagesByURL: Sorted 2931 instructions in 0.915 seconds.
> 050302 183542 Processing pagesByURL: Sorted 3203.27868852459 instructions/second
> Exception in thread "main" java.io.IOException: already exists: /nutch-0.6/agoria.2mar/db/webdb.new/pagesByURL
>        at net.nutch.io.MapFile$Writer.<init>(MapFile.java:67)
>        at net.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:536)
>        at net.nutch.db.WebDBWriter.close(WebDBWriter.java:1531)
>        at net.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:301)
>        at net.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:351)
>        at net.nutch.tools.CrawlTool.main(CrawlTool.java:128)
>
> Thanks for help.
>
> Christophe.
>
-----------information technology-------------------
company: http://www.media-style.com
forum:   http://www.text-mining.org
blog:    http://www.find23.net

