nutch-dev mailing list archives

From Briggs <>
Subject Lock file problems...
Date Thu, 07 Jun 2007 15:20:36 GMT
I am getting these lock file errors all over the place when indexing
or even creating crawldbs.  It doesn't happen all the time, but
sometimes it happens continuously.  So, I am not quite sure how these
locks are getting in there, or why they aren't getting removed.

I am not sure where to go from here.

My current application is designed for crawling individual domains.
So, I have multiple custom crawlers that work concurrently.  Each one
basically does:

1) fetch
2) invert links
3) segment merge
4) index
5) deduplicate
6) merge indexes

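In case it helps to compare notes, the per-domain sequence above maps roughly onto the stock Nutch command-line tools. The paths below are placeholders for my layout, and I'm assuming the 0.9-era tool names:

```shell
# Hypothetical per-domain crawl directory; each crawler gets its own.
CRAWL=crawl/example.com
SEG=$CRAWL/segments/20070607152036   # placeholder segment name

bin/nutch fetch       $SEG                                  # 1) fetch
bin/nutch invertlinks $CRAWL/linkdb -dir $CRAWL/segments    # 2) invert links
bin/nutch mergesegs   $CRAWL/merged -dir $CRAWL/segments    # 3) segment merge
bin/nutch index       $CRAWL/indexes $CRAWL/crawldb \
                      $CRAWL/linkdb $CRAWL/merged/*         # 4) index
bin/nutch dedup       $CRAWL/indexes                        # 5) deduplicate
bin/nutch merge       $CRAWL/index $CRAWL/indexes           # 6) merge indexes
```

So "indexes" is the directory of per-segment part indexes that dedup operates on, which merge then collapses into the single "index" directory.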
Though, I am still not 100% sure of what the "indexes" directory is truly for.

The error I keep hitting:

Lock obtain timed out:
        at org.apache.lucene.index.IndexReader.aquireWriteLock(
        at org.apache.lucene.index.IndexReader.deleteDocument(
        at org.apache.nutch.indexer.DeleteDuplicates.reduce(
        at org.apache.hadoop.mapred.LocalJobRunner$
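When it's a stale lock left behind by a crashed or killed JVM (rather than real contention between my concurrent crawlers), the only workaround I've found is removing the leftover lock file by hand before re-running. Rough sketch, with guessed paths; where the lock lives depends on the Lucene version (inside the index directory, or under java.io.tmpdir):

```shell
# Assumption: a previous run died and left a stale Lucene write lock.
# Newer Lucene keeps it in the index directory; older versions put it
# in java.io.tmpdir with a lucene-*-write.lock name.
ls crawl/example.com/index/write.lock 2>/dev/null
ls /tmp/lucene-*-write.lock 2>/dev/null

# Only safe if no other process is actually writing to that index:
rm -f crawl/example.com/index/write.lock
```

Of course that only treats the symptom; if two crawlers really are hitting the same index concurrently, the timeouts are genuine and removing the lock would corrupt the index.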

So, has anyone seen this come up in their own implementations?
