nutch-dev mailing list archives

From "Furkan KAMACI (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (NUTCH-2271) Solr indexer Failed
Date Wed, 01 Jun 2016 08:09:59 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Furkan KAMACI reassigned NUTCH-2271:
------------------------------------

    Assignee: Furkan KAMACI

> Solr indexer Failed 
> --------------------
>
>                 Key: NUTCH-2271
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2271
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.12
>         Environment: Hadoop 2.7.2 , Solr 6.0.0 , Nutch 1.12 on Single node 
>            Reporter: narendra
>            Assignee: Furkan KAMACI
>
> When I run this command:
>   bin/nutch solrindex http://localhost:8983/solr/#/devel1 crawl_Test1/crawldb -linkdb crawl_Test1/linkdb  crawl_Test1/segments/*
> 16/05/31 22:21:47 WARN segment.SegmentChecker: The input path at * is not a segment... skipping
> 16/05/31 22:21:47 INFO indexer.IndexingJob: Indexer: starting at 2016-05-31 22:21:47
> 16/05/31 22:21:47 INFO indexer.IndexingJob: Indexer: deleting gone documents: false
> 16/05/31 22:21:47 INFO indexer.IndexingJob: Indexer: URL filtering: false
> 16/05/31 22:21:47 INFO indexer.IndexingJob: Indexer: URL normalizing: false
> 16/05/31 22:21:47 INFO plugin.PluginRepository: Plugins: looking in: /tmp/hadoop-unjar8621976524622577403/classes/plugins
> 16/05/31 22:21:47 INFO plugin.PluginRepository: Plugin Auto-activation mode: [true]
> 16/05/31 22:21:47 INFO plugin.PluginRepository: Registered Plugins:
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Regex URL Filter (urlfilter-regex)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Html Parse Plug-in (parse-html)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	HTTP Framework (lib-http)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	the nutch core extension points (nutch-extensionpoints)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Basic Indexing Filter (index-basic)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Anchor Indexing Filter (index-anchor)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Tika Parser Plug-in (parse-tika)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Basic URL Normalizer (urlnormalizer-basic)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Regex URL Filter Framework (lib-regex-filter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Regex URL Normalizer (urlnormalizer-regex)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	CyberNeko HTML Parser (lib-nekohtml)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	OPIC Scoring Plug-in (scoring-opic)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Pass-through URL Normalizer (urlnormalizer-pass)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Http Protocol Plug-in (protocol-http)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	SolrIndexWriter (indexer-solr)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: Registered Extension-Points:
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Content Parser (org.apache.nutch.parse.Parser)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch URL Filter (org.apache.nutch.net.URLFilter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Protocol (org.apache.nutch.protocol.Protocol)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch URL Ignore Exemption Filter (org.apache.nutch.net.URLExemptionFilter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Index Writer (org.apache.nutch.indexer.IndexWriter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
> 16/05/31 22:21:47 INFO plugin.PluginRepository: 	Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
> 16/05/31 22:21:47 INFO indexer.IndexWriters: Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 16/05/31 22:21:47 INFO indexer.IndexingJob: Active IndexWriters :
> SOLRIndexWriter
> 	solr.server.url : URL of the SOLR instance
> 	solr.zookeeper.hosts : URL of the Zookeeper quorum
> 	solr.commit.size : buffer size when sending to SOLR (default 1000)
> 	solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> 	solr.auth : use authentication (default false)
> 	solr.auth.username : username for authentication
> 	solr.auth.password : password for authentication
> 16/05/31 22:21:47 INFO indexer.IndexerMapReduce: IndexerMapReduce: crawldb: crawl_Test1/crawldb
> 16/05/31 22:21:47 INFO indexer.IndexerMapReduce: IndexerMapReduce: linkdb: crawl_Test1/linkdb
> 16/05/31 22:21:48 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
> 16/05/31 22:21:48 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
> 16/05/31 22:21:54 INFO mapred.FileInputFormat: Total input paths to process : 2
> 16/05/31 22:21:54 INFO mapreduce.JobSubmitter: number of splits:3
> 16/05/31 22:21:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1464692893405_0045
> 16/05/31 22:21:55 INFO impl.YarnClientImpl: Submitted application application_1464692893405_0045
> 16/05/31 22:21:55 INFO mapreduce.Job: The url to track the job: http://localhost:9046/proxy/application_1464692893405_0045/
> 16/05/31 22:21:55 INFO mapreduce.Job: Running job: job_1464692893405_0045
> 16/05/31 22:22:16 INFO mapreduce.Job: Job job_1464692893405_0045 running in uber mode : false
> 16/05/31 22:22:16 INFO mapreduce.Job:  map 0% reduce 0%
> 16/05/31 22:22:28 INFO mapreduce.Job:  map 100% reduce 0%
> 16/05/31 22:22:33 INFO mapreduce.Job: Task Id : attempt_1464692893405_0045_r_000000_0, Status : FAILED
> Error: Bad return type
> Exception Details:
>   Location:
>     org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;)Lorg/apache/http/impl/client/CloseableHttpClient; @57: areturn
>   Reason:
>     Type 'org/apache/http/impl/client/SystemDefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
>   Current Frame:
>     bci: @57
>     flags: { }
>     locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/SystemDefaultHttpClient' }
>     stack: { 'org/apache/http/impl/client/SystemDefaultHttpClient' }
>   Bytecode:
>     0x0000000: bb00 0359 2ab7 0004 4cb2 0005 b900 0601
>     0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
>     0x0000020: b600 0a2b b600 0bb6 000c b900 0d02 00b8
>     0x0000030: 000e 4d2c 2bb8 000f 2cb0               
>   Stackmap Table:
>     append_frame(@47,Object[#143])
> 16/05/31 22:22:40 INFO mapreduce.Job: Task Id : attempt_1464692893405_0045_r_000000_1, Status : FAILED
> Error: Bad return type
> Exception Details:
>   Location:
>     org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;)Lorg/apache/http/impl/client/CloseableHttpClient; @57: areturn
>   Reason:
>     Type 'org/apache/http/impl/client/SystemDefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
>   Current Frame:
>     bci: @57
>     flags: { }
>     locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/SystemDefaultHttpClient' }
>     stack: { 'org/apache/http/impl/client/SystemDefaultHttpClient' }
>   Bytecode:
>     0x0000000: bb00 0359 2ab7 0004 4cb2 0005 b900 0601
>     0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
>     0x0000020: b600 0a2b b600 0bb6 000c b900 0d02 00b8
>     0x0000030: 000e 4d2c 2bb8 000f 2cb0               
>   Stackmap Table:
>     append_frame(@47,Object[#143])
> 16/05/31 22:22:46 INFO mapreduce.Job: Task Id : attempt_1464692893405_0045_r_000000_2, Status : FAILED
> Error: Bad return type
> Exception Details:
>   Location:
>     org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
>   Reason:
>     Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
>   Current Frame:
>     bci: @58
>     flags: { }
>     locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
>     stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
>   Bytecode:
>     0x0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
>     0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
>     0x0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
>     0x0000030: b800 104e 2d2c b800 0f2d b0            
>   Stackmap Table:
>     append_frame(@47,Object[#143])
> 16/05/31 22:22:53 INFO mapreduce.Job:  map 100% reduce 100%
> 16/05/31 22:22:53 INFO mapreduce.Job: Job job_1464692893405_0045 failed with state FAILED due to: Task failed task_1464692893405_0045_r_000000
> Job failed as tasks failed. failedMaps:0 failedReduces:1
> 16/05/31 22:22:54 INFO mapreduce.Job: Counters: 37
> 	File System Counters
> 		FILE: Number of bytes read=0
> 		FILE: Number of bytes written=458051
> 		FILE: Number of read operations=0
> 		FILE: Number of large read operations=0
> 		FILE: Number of write operations=0
> 		HDFS: Number of bytes read=17460
> 		HDFS: Number of bytes written=0
> 		HDFS: Number of read operations=12
> 		HDFS: Number of large read operations=0
> 		HDFS: Number of write operations=0
> 	Job Counters 
> 		Failed reduce tasks=4
> 		Launched map tasks=3
> 		Launched reduce tasks=4
> 		Data-local map tasks=3
> 		Total time spent by all maps in occupied slots (ms)=56496
> 		Total time spent by all reduces in occupied slots (ms)=30056
> 		Total time spent by all map tasks (ms)=28248
> 		Total time spent by all reduce tasks (ms)=15028
> 		Total vcore-milliseconds taken by all map tasks=28248
> 		Total vcore-milliseconds taken by all reduce tasks=15028
> 		Total megabyte-milliseconds taken by all map tasks=28925952
> 		Total megabyte-milliseconds taken by all reduce tasks=15388672
> 	Map-Reduce Framework
> 		Map input records=184
> 		Map output records=184
> 		Map output bytes=15037
> 		Map output materialized bytes=15428
> 		Input split bytes=392
> 		Combine input records=0
> 		Spilled Records=184
> 		Failed Shuffles=0
> 		Merged Map outputs=0
> 		GC time elapsed (ms)=758
> 		CPU time spent (ms)=6200
> 		Physical memory (bytes) snapshot=841703424
> 		Virtual memory (bytes) snapshot=5765849088
> 		Total committed heap usage (bytes)=611319808
> 	File Input Format Counters 
> 		Bytes Read=17068
> 16/05/31 22:22:54 ERROR indexer.IndexingJob: Indexer: java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> 	at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
> 	at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
