nutch-dev mailing list archives

From Zara Parst <edotserv...@gmail.com>
Subject Re: [MASSMAIL]Re: Nutch/Solr communication problem
Date Wed, 20 Jan 2016 09:52:58 GMT
Hi,

Everyone, if you check the log file, it does describe the error. Here is a
brief recap of the problem:

1. Solr without any authentication => Nutch works successfully and populates
the Solr core (say, abc).
2. Solr with protection and Nutch solr.auth=false => unauthorized access,
which makes sense.
3. Solr with protection and Nutch solr.auth=true with the correct id and
password in the config file => it spits out an error; I have attached the
log at the bottom of this email.

When I use authentication, Nutch is not able to insert data. However, the
problem does not seem to be on the Solr side: if I try to populate data with
Solr protected by an id and password and Nutch set to solr.auth=false, it
prints "unauthorized access", which makes sense. But with solr.auth=true and
the id and password set in nutch-default, Nutch is still not able to insert
data, and below is the error log. Is there some user role in Solr, like
admin or content-admin, that I am missing? I tried all kinds of users and
always get the same error. If someone can try pushing data to a protected
Solr (or run the standalone sketch below) and does not get this error,
please tell me in detail what configuration you are using. Treat me like a
novice and tell me how to do it, because I have tried every permutation of
configuration on both the Solr and Nutch sides without any luck. Please do
help me; this is a genuine request. I understand you are all pretty busy
with your work, and I am not bothering you without having done my homework.
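
To make this easy to reproduce outside Nutch, here is a minimal standalone
sketch that pushes one test document with HTTP basic auth, using the same
client classes that show up in the stack trace below (SolrJ 4.x
HttpSolrServer on Apache HttpClient). The host, core name (mah), user
(radmin), password and the id field are placeholders from my setup, so
adjust them to yours; it is only a sketch, not how Nutch itself builds its
client.

import java.io.IOException;

import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ProtectedSolrPushTest {
    public static void main(String[] args) throws IOException, SolrServerException {
        // Basic-auth credentials for the SolrJ HttpClient (placeholder user/password).
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set(HttpClientUtil.PROP_BASIC_AUTH_USER, "radmin");
        params.set(HttpClientUtil.PROP_BASIC_AUTH_PASS, "secret");
        HttpClient httpClient = HttpClientUtil.createClient(params);

        // Core URL taken from the log below; change to your own core.
        HttpSolrServer server = new HttpSolrServer("http://127.0.0.1:8983/solr/mah", httpClient);
        try {
            // One minimal document; 'id' is assumed to be the unique key in the schema.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "auth-test-1");
            server.add(doc);
            server.commit();
            System.out.println("Update accepted by the protected core.");
        } finally {
            server.shutdown();
        }
    }
}

If the add is accepted, the credentials and role are fine for updates on
that core; if it fails the same way as the indexer does, the problem is on
the HTTP/auth side rather than in Nutch's field mapping.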


Please see the log

2016-01-20 07:02:15,658 INFO  indexer.IndexWriters - Adding
org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-01-20 07:04:36,366 WARN  util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
2016-01-20 07:04:36,656 INFO  segment.SegmentChecker - Segment dir is
complete:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119163402.
2016-01-20 07:04:36,658 INFO  segment.SegmentChecker - Segment dir is
complete:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119163656.
2016-01-20 07:04:36,673 INFO  segment.SegmentChecker - Segment dir is
complete:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119164952.
2016-01-20 07:04:36,674 INFO  indexer.IndexingJob - Indexer: starting at
2016-01-20 07:04:36
2016-01-20 07:04:36,676 INFO  indexer.IndexingJob - Indexer: deleting gone
documents: false
2016-01-20 07:04:36,676 INFO  indexer.IndexingJob - Indexer: URL filtering:
false
2016-01-20 07:04:36,676 INFO  indexer.IndexingJob - Indexer: URL
normalizing: false
2016-01-20 07:04:37,036 INFO  indexer.IndexWriters - Adding
org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-01-20 07:04:37,036 INFO  indexer.IndexingJob - Active IndexWriters :
SolrIndexWriter
solr.server.type : Type of SolrServer to communicate with (default 'http'
however options include 'cloud', 'lb' and 'concurrent')
solr.server.url : URL of the Solr instance (mandatory)
solr.zookeeper.url : URL of the Zookeeper URL (mandatory if 'cloud' value
for solr.server.type)
solr.loadbalance.urls : Comma-separated string of Solr server strings to be
used (madatory if 'lb' value for solr.server.type)
solr.mapping.file : name of the mapping file for fields (default
solrindex-mapping.xml)
solr.commit.size : buffer size when sending to Solr (default 1000)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication


2016-01-20 07:04:37,039 INFO  indexer.IndexerMapReduce - IndexerMapReduce:
crawldb: yahCrawl/crawldb
2016-01-20 07:04:37,039 INFO  indexer.IndexerMapReduce - IndexerMapReduce:
linkdb: yahCrawl/linkdb
2016-01-20 07:04:37,039 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119163402
2016-01-20 07:04:37,045 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119163656
2016-01-20 07:04:37,046 INFO  indexer.IndexerMapReduce - IndexerMapReduces:
adding segment:
file:/home/rakesh/Desktop/arima/nutch/runtime/local/yahCrawl/segements/20160119164952
2016-01-20 07:04:37,047 WARN  indexer.IndexerMapReduce - Ignoring linkDb
for indexing, no linkDb found in path: yahCrawl/linkdb
2016-01-20 07:04:38,151 WARN  conf.Configuration -
file:/tmp/hadoop-rakesh/mapred/staging/rakesh1643615475/.staging/job_local1643615475_0001/job.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2016-01-20 07:04:38,153 WARN  conf.Configuration -
file:/tmp/hadoop-rakesh/mapred/staging/rakesh1643615475/.staging/job_local1643615475_0001/job.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.attempts;  Ignoring.
2016-01-20 07:04:38,312 WARN  conf.Configuration -
file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local1643615475_0001/job_local1643615475_0001.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2016-01-20 07:04:38,314 WARN  conf.Configuration -
file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local1643615475_0001/job_local1643615475_0001.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.attempts;  Ignoring.
2016-01-20 07:04:39,258 INFO  anchor.AnchorIndexingFilter - Anchor
deduplication is: off
2016-01-20 07:04:40,773 INFO  indexer.IndexWriters - Adding
org.apache.nutch.indexwriter.solr.SolrIndexWriter
*2016-01-20 07:04:40,784 INFO  solr.SolrUtils - Authenticating as: radmin*
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: content
dest: content
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: title dest:
title
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: host dest:
host
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: segment
dest: segment
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: boost dest:
boost
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: digest dest:
digest
2016-01-20 07:04:41,018 INFO  solr.SolrMappingReader - source: tstamp dest:
tstamp
2016-01-20 07:04:41,091 INFO  solr.SolrIndexWriter - Indexing 3 documents
2016-01-20 07:04:41,340 INFO  solr.SolrIndexWriter - Indexing 3 documents
2016-01-20 07:04:41,398 WARN  mapred.LocalJobRunner -
job_local1643615475_0001
java.lang.Exception: java.io.IOException
at
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.io.IOException
at
org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:171)
at
org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:157)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:115)
at
org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
at
org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:502)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:456)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
occured when talking to server at: http://127.0.0.1:8983/solr/mah
at
org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at
org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:153)
... 11 more
*Caused by: org.apache.http.client.ClientProtocolException*
at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
... 15 more
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot
retry request with a non-repeatable request entity.
at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:208)
at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 19 more
2016-01-20 07:04:42,430 ERROR indexer.IndexingJob - Indexer:
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
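
For what it is worth, the final "Caused by" above
(NonRepeatableRequestException) is what HttpClient throws when the server
answers the first, unauthenticated request with a 401 challenge and the
client then has to resend a request body that cannot be replayed. One way to
check whether that is what is happening here is to authenticate
preemptively, so the Authorization header is already on the first request
and nothing needs to be retried. Below is a hedged HttpClient 4.x sketch;
the host, port, core (mah), user (radmin) and password are placeholders from
the log above, not anything Nutch does out of the box.

import java.io.IOException;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.auth.BasicScheme;
import org.apache.http.impl.client.BasicAuthCache;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class PreemptiveAuthCheck {
    public static void main(String[] args) throws IOException {
        HttpHost solr = new HttpHost("127.0.0.1", 8983, "http"); // host/port from the log above

        BasicCredentialsProvider creds = new BasicCredentialsProvider();
        creds.setCredentials(new AuthScope(solr.getHostName(), solr.getPort()),
                new UsernamePasswordCredentials("radmin", "secret")); // placeholder password

        // Pre-populate the auth cache so the very first request already carries the
        // Authorization header: no 401 challenge, so nothing has to be resent.
        BasicAuthCache authCache = new BasicAuthCache();
        authCache.put(solr, new BasicScheme());
        HttpClientContext ctx = HttpClientContext.create();
        ctx.setCredentialsProvider(creds);
        ctx.setAuthCache(authCache);

        try (CloseableHttpClient client = HttpClients.createDefault()) {
            HttpPost post = new HttpPost("/solr/mah/update?commit=true"); // core name from the log
            post.setEntity(new StringEntity(
                    "<add><doc><field name=\"id\">auth-test-1</field></doc></add>",
                    ContentType.create("text/xml", "UTF-8")));
            try (CloseableHttpResponse resp = client.execute(solr, post, ctx)) {
                System.out.println(resp.getStatusLine());
                System.out.println(EntityUtils.toString(resp.getEntity()));
            }
        }
    }
}

A 200 here would mean the user and its role can update the core and the
failure is only the challenge/replay handshake; a 401 would mean the user
really lacks rights on that core; a 400 with a schema message would point to
a field mismatch rather than authentication.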



On Tue, Jan 19, 2016 at 7:44 PM, Roannel Fernández Hernández <roannel@uci.cu
> wrote:

> Hi
>
> I think your problem is not related to Solr authentication. The fields of
> the documents you send to Solr and the fields defined in the Solr schema
> are different. Perhaps the Nutch document has a multivalued field that is
> defined in the Solr schema as a single-valued field, or the schema has a
> required field that Nutch does not send, or the primary key has not been
> sent, or ...
>
> Just to confirm that, if possible remove the Solr protection and try
> again. If you get the same error, then it is not related to Solr
> authentication and you have to check the fields sent to Solr.
>
> Regards
>
>
> ------------------------------
>
> *From: *"Zara Parst" <edotservice@gmail.com>
> *To: *dev@nutch.apache.org
> *Sent: *Monday, January 18, 2016 3:28:29 PM
> *Subject: *[MASSMAIL]Re: Nutch/Solr communication problem
>
>
> I am using solr 5.4 and nutch 1.11
>
> On Tue, Jan 19, 2016 at 1:46 AM, Markus Jelsma <markus.jelsma@openindex.io
> > wrote:
>
>> Hi - it was an answer to your question whether I have ever used it. Yes,
>> I patched and committed it. And that is why I asked whether you're using
>> Solr 5 or not. So again, are you using Solr 5?
>>
>> Markus
>>
>>
>> -----Original message-----
>> From: Zara Parst<edotservice@gmail.com>
>> Sent: Monday 18th January 2016 16:16
>> To: dev@nutch.apache.org
>> Subject: Re: Nutch/Solr communication problem
>>
>> Mind sharing that patch?
>>
>> On Mon, Jan 18, 2016 at 8:28 PM, Markus Jelsma
>> <markus.jelsma@openindex.io> wrote:
>> Yes, I have used it; I made the damn patch myself years ago, and I used
>> the same configuration. Command line or config works the same.
>>
>> Markus
>>
>> -----Original message-----
>>
>> From: Zara Parst <edotservice@gmail.com>
>>
>> Sent: Monday 18th January 2016 12:55
>>
>> To: dev@nutch.apache.org
>>
>> Subject: Re: Nutch/Solr communication problem
>>
>> Dear Markus,
>>
>> Are you just speaking blindly, or what? My concern is: did you ever try
>> pushing an index to a Solr that is password protected? If yes, can you
>> tell me what config you used? Whether you did it in the config file or
>> through the command line, please let me know.
>>
>> thanks
>>
>> On Mon, Jan 18, 2016 at 4:50 PM, Markus Jelsma
>> <markus.jelsma@openindex.io> wrote:
>>
>> Hi - This doesn't look like an HTTP basic authentication problem. Are you
>> running Solr 5.x?
>>
>> Markus
>>
>> -----Original message-----
>>
>> From: Zara Parst <edotservice@gmail.com>
>>
>> Sent: Monday 18th January 2016 11:55
>>
>> To: dev@nutch.apache.org
>>
>> Subject: Re: Nutch/Solr communication problem
>>
>> SolrIndexWriter
>>
>>         solr.server.type : Type of SolrServer to communicate with
>> (default http however options include cloud, lb and concurrent)
>>
>>         solr.server.url : URL of the Solr instance (mandatory)
>>
>>         solr.zookeeper.url : URL of the Zookeeper URL (mandatory if cloud
>> value for solr.server.type)
>>
>>         solr.loadbalance.urls : Comma-separated string of Solr server
>> strings to be used (madatory if lb value for solr.server.type)
>>
>>         solr.mapping.file : name of the mapping file for fields (default
>> solrindex-mapping.xml)
>>
>>         solr.commit.size : buffer size when sending to Solr (default 1000)
>>
>>         solr.auth : use authentication (default false)
>>
>>         solr.auth.username : username for authentication
>>
>>         solr.auth.password : password for authentication
>>
>> 2016-01-17 19:19:42,973 INFO  indexer.IndexerMapReduce -
>> IndexerMapReduce: crawldb: crawlDbyah/crawldb
>>
>> 2016-01-17 19:19:42,973 INFO  indexer.IndexerMapReduce -
>> IndexerMapReduce: linkdb: crawlDbyah/linkdb
>>
>> 2016-01-17 19:19:42,973 INFO  indexer.IndexerMapReduce -
>> IndexerMapReduces: adding segment: crawlDbyah/segments/20160117191906
>>
>> 2016-01-17 19:19:42,975 WARN  indexer.IndexerMapReduce - Ignoring linkDb
>> for indexing, no linkDb found in path: crawlDbyah/linkdb
>>
>> 2016-01-17 19:19:43,807 WARN  conf.Configuration -
>> file:/tmp/hadoop-rakesh/mapred/staging/rakesh2114349538/.staging/job_local2114349538_0001/job.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>
>> 2016-01-17 19:19:43,809 WARN  conf.Configuration -
>> file:/tmp/hadoop-rakesh/mapred/staging/rakesh2114349538/.staging/job_local2114349538_0001/job.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.attempts;  Ignoring.
>>
>> 2016-01-17 19:19:43,963 WARN  conf.Configuration -
>> file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local2114349538_0001/job_local2114349538_0001.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>
>> 2016-01-17 19:19:43,980 WARN  conf.Configuration -
>> file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local2114349538_0001/job_local2114349538_0001.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.attempts;  Ignoring.
>>
>> 2016-01-17 19:19:44,260 INFO  anchor.AnchorIndexingFilter - Anchor
>> deduplication is: off
>>
>> 2016-01-17 19:19:45,128 INFO  indexer.IndexWriters - Adding
>> org.apache.nutch.indexwriter.solr.SolrIndexWriter
>>
>> 2016-01-17 19:19:45,148 INFO  solr.SolrUtils - Authenticating as: radmin
>>
>> 2016-01-17 19:19:45,318 INFO  solr.SolrMappingReader - source: content
>> dest: content
>>
>> 2016-01-17 19:19:45,318 INFO  solr.SolrMappingReader - source: title
>> dest: title
>>
>> 2016-01-17 19:19:45,318 INFO  solr.SolrMappingReader - source: host dest:
>> host
>>
>> 2016-01-17 19:19:45,319 INFO  solr.SolrMappingReader - source: segment
>> dest: segment
>>
>> 2016-01-17 19:19:45,319 INFO  solr.SolrMappingReader - source: boost
>> dest: boost
>>
>> 2016-01-17 19:19:45,319 INFO  solr.SolrMappingReader - source: digest
>> dest: digest
>>
>> 2016-01-17 19:19:45,319 INFO  solr.SolrMappingReader - source: tstamp
>> dest: tstamp
>>
>> 2016-01-17 19:19:45,360 INFO  solr.SolrIndexWriter - Indexing 2 documents
>>
>> 2016-01-17 19:19:45,507 INFO  solr.SolrIndexWriter - Indexing 2 documents
>>
>> 2016-01-17 19:19:45,526 WARN  mapred.LocalJobRunner -
>> job_local2114349538_0001
>>
>> java.lang.Exception: java.io.IOException
>>
>>         at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>
>>         at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>>
>> Caused by: java.io.IOException
>>
>>         at
>> org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:171)
>>
>>         at
>> org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:157)
>>
>>         at
>> org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:115)
>>
>>         at
>> org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
>>
>>         at
>> org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:502)
>>
>>         at
>> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:456)
>>
>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>>
>>         at
>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>
>>         at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>
>>         at java.lang.Thread.run(Thread.java:745)
>>
>> Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
>> occured when talking to server at: http://127.0.0.1:8983/solr/yah
>>
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
>>
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
>>
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
>>
>>         at
>> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
>>
>>         at
>> org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:153)
>>
>>         ... 11 more
>>
>> Caused by: org.apache.http.client.ClientProtocolException
>>
>>         at
>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
>>
>>         at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
>>
>>         at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
>>
>>         at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
>>
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
>>
>>         ... 15 more
>>
>> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot
>> retry request with a non-repeatable request entity.
>>
>>         at
>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:208)
>>
>>         at
>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
>>
>>         at
>> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
>>
>>         at
>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
>>
>>         at
>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
>>
>>         ... 19 more
>>
>> 2016-01-17 19:19:46,055 ERROR indexer.IndexingJob - Indexer:
>> java.io.IOException: Job failed!
>>
>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>>
>>         at
>> org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>>
>>         at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
>>
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>
>>         at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
>>
>> On Mon, Jan 18, 2016 at 4:15 PM, Markus Jelsma
>> <markus.jelsma@openindex.io> wrote:
>>
>> Hi - can you post the log output?
>>
>> Markus
>>
>> -----Original message-----
>>
>> From: Zara Parst <edotservice@gmail.com>
>>
>> Sent: Monday 18th January 2016 2:06
>>
>> To: dev@nutch.apache.org
>>
>> Subject: Nutch/Solr communication problem
>>
>> Hi everyone,
>>
>> I have a situation here: I am using Nutch 1.11 and Solr 5.4.
>>
>> Solr is protected by a username and password, and I am passing the
>> credentials to Solr using the following command:
>>
>> bin/crawl -i -Dsolr.server.url=http://localhost:8983/solr/abc  -D solr.auth=true
>>  -Dsolr.auth.username=xxxx  -Dsolr.auth.password=xxx  url crawlDbyah 1
>>
>> and always the same problem. Please help me with how to feed data to a
>> protected Solr.
>>
>> Below is the error message.
>>
>> Indexer: starting at 2016-01-17 19:01:12
>>
>> Indexer: deleting gone documents: false
>>
>> Indexer: URL filtering: false
>>
>> Indexer: URL normalizing: false
>>
>> Active IndexWriters :
>>
>> SolrIndexWriter
>>
>>         solr.server.type : Type of SolrServer to communicate with
>> (default http however options include cloud, lb and concurrent)
>>
>>         solr.server.url : URL of the Solr instance (mandatory)
>>
>>         solr.zookeeper.url : URL of the Zookeeper URL (mandatory if cloud
>> value for solr.server.type)
>>
>>         solr.loadbalance.urls : Comma-separated string of Solr server
>> strings to be used (madatory if lb value for solr.server.type)
>>
>>         solr.mapping.file : name of the mapping file for fields (default
>> solrindex-mapping.xml)
>>
>>         solr.commit.size : buffer size when sending to Solr (default 1000)
>>
>>         solr.auth : use authentication (default false)
>>
>>         solr.auth.username : username for authentication
>>
>>         solr.auth.password : password for authentication
>>
>> Indexing 2 documents
>>
>> Indexing 2 documents
>>
>> Indexer: java.io.IOException: Job failed!
>>
>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>>
>>         at
>> org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>>
>>         at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
>>
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>
>>         at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
>>
>> I also tried the username and password in nutch-default.xml but got the
>> same error again. Please help me out.
>>
>>
>>
>
>
>
>
