nutch-dev mailing list archives

From "taknev ivrok (JIRA)" <j...@apache.org>
Subject [jira] Created: (NUTCH-630) Error caused by index-more plugin in the latest svn revision - 652259
Date Wed, 30 Apr 2008 15:21:56 GMT
Error caused by index-more plugin  in the latest svn revision - 652259 
-----------------------------------------------------------------------

                 Key: NUTCH-630
                 URL: https://issues.apache.org/jira/browse/NUTCH-630
             Project: Nutch
          Issue Type: Bug
            Reporter: taknev ivrok


This problem was reported on the user mailing list: http://www.nabble.com/index-more-problem--td16757538.html
Running bin/nutch crawl urls -dir crawl against the latest svn version produces the
following error.

Note: The error does not occur after removing the index-more plugin from plugin.includes in
the conf/nutch-site.xml file.
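For reference, the workaround amounts to dropping the index-more entry from the plugin.includes regex. A sketch of the relevant property in conf/nutch-site.xml (the value shown is illustrative, based on a typical configuration of that era; the actual value should match your existing setting minus index-more):

```xml
<!-- conf/nutch-site.xml: plugin.includes with index-more removed (illustrative value) -->
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

With this setting, indexing completes; re-adding |index-more inside the index-(...) alternation reproduces the failure.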

Fetcher: done
CrawlDb update: starting
CrawlDb update: db: crawlfs/crawldb
CrawlDb update: segments: [crawlfs/segments/20080430051112]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: Merging segment data into db.
CrawlDb update: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawlfs/segments/20080430051126
Generator: filtering: true
Generator: topN: 100000
Generator: jobtracker is 'local', generating exactly one partition.
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=2 - no more URLs to fetch.
LinkDb: starting
LinkDb: linkdb: crawlfs/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: adding segment:
file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
LinkDb: adding segment:
file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
LinkDb: done
Indexer: starting
Indexer: linkdb: crawlfs/linkdb
Indexer: adding segment:
file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051112
Indexer: adding segment:
file:/home/admin/nutch-2008-04-30_04-01-32/crawlfs/segments/20080430051053
IFD [Thread-102]: setInfoStream
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@1cfd3b2
IW 0 [Thread-102]: setInfoStream:
dir=org.apache.lucene.store.FSDirectory@/tmp/hadoop-admin/mapred/local/index/_1406110510
autoCommit=true
mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@1536eec
mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@9770a3
ramBufferSizeMB=16.0 maxBuffereDocs=50 maxBuffereDeleteTerms=-1
maxFieldLength=10000 index=
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:894)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:311)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:144) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

