lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Unable to index rich-text documents in Solr Cloud
Date Wed, 18 Mar 2015 15:19:26 GMT
Shot in the dark, but is the PDF file significantly larger than the
others? Perhaps you're simply exceeding the packet limits for the
servlet container?
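
One quick way to test that guess is to compare the sizes of the files
that fail against those that succeed. Below is a minimal sketch, not a
diagnosis: the 2 MB threshold and the name flag_oversized are assumptions
made up for illustration, and the real limit depends on how your servlet
container and solrconfig.xml are configured.

```python
import os

# Assumed threshold for illustration only; substitute whatever limit
# your servlet container or Solr configuration actually enforces.
ASSUMED_LIMIT_BYTES = 2 * 1024 * 1024


def flag_oversized(paths, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return (path, size_in_bytes) for each file above limit_bytes,
    largest first."""
    sizes = [(path, os.path.getsize(path)) for path in paths]
    oversized = [(path, size) for path, size in sizes if size > limit_bytes]
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling flag_oversized on the PDF, Word, .xml and .csv files you tried
would show whether only the failing documents sit above the threshold.
If they do not, the size theory is probably wrong.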

Best,
Erick

On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo
<edwinyeozl@gmail.com> wrote:
> Hi everyone,
>
> I'm having some issues with indexing rich-text documents in Solr
> Cloud. When I try to index a PDF or Word document, I get the following
> error:
>
>
> org.apache.solr.common.SolrException: Bad Request
>
>
>
> request: http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2
>         at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>
>
> I'm able to index .xml and .csv files in Solr Cloud with the same configuration.
>
> I have set up Solr Cloud using the default ZooKeeper in Solr 5.0.0, and
> I have 2 shards with the following details:
> Shard1: 192.168.2.2:8983
> Shard2: 192.168.2.2:8984
>
> Prior to this, I was already able to index rich-text documents without
> Solr Cloud, and I'm using the same solrconfig.xml and schema.xml,
> so my ExtractRequestHandler is already defined.
>
> Are there other settings required in order to index rich-text documents
> in Solr Cloud?
>
>
> Regards,
> Edwin
