lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Unable to index rich-text documents in Solr Cloud
Date Thu, 19 Mar 2015 06:37:26 GMT
Hi Charlee,

I followed the setup from the Solr in Action book and assigned port 8983
to shard1 and port 8984 to shard2. Will this cause any issues?

Regards,
Edwin

On 19 March 2015 at 13:02, Charlee Chitsuk <charlee.ch@gmail.com> wrote:

> In the URL http://192.168.2.2:8984/solr/, port 8984 may be an HTTPS port.
> The HTTP port should be 8983.
>
> Hope this helps.
>
> --
>    Best Regards,
>
>    Charlee Chitsuk
>
> =======================
> Application Security Product Group
> *Summit Computer Co., Ltd.* <http://www.summitthai.com/>
> E-Mail: charlee@summitthai.com
> Tel: +66-2-238-0895 to 9 ext. 164
> Fax: +66-2-236-7392
> =======================
> *@ Your Success is Our Pride*
> ------------------------------------------
>
> 2015-03-19 11:49 GMT+07:00 Damien Kamerman <damienk@gmail.com>:
>
> > It sounds like https://issues.apache.org/jira/browse/SOLR-5551
> > Have you checked the solr.log for all nodes?
> >
> > On 19 March 2015 at 14:43, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> > wrote:
> >
> > > > These are the logs that I got from solr.log. I can't seem to figure out
> > > > what's wrong. Does anyone know?
> > >
> > >
> > >
> > > > ERROR - 2015-03-18 15:06:51.019; org.apache.solr.update.StreamingSolrClients$1; error
> > > > org.apache.solr.common.SolrException: Bad Request
> > > >
> > > > request: http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2
> > > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> > > > at java.lang.Thread.run(Unknown Source)
> > > > INFO  - 2015-03-18 15:06:51.019; org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr path=/update/extract params={literal.id=C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdf&resource.name=C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdf} {add=[C:\Users\edwin\solr-5.0.0\example\exampledocs\solr-word.pdf]} 0 1252
> > > > INFO  - 2015-03-18 15:06:51.029; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > > INFO  - 2015-03-18 15:06:51.029; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
> > > > INFO  - 2015-03-18 15:06:51.029; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
> > > > INFO  - 2015-03-18 15:06:51.039; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> > > > INFO  - 2015-03-18 15:06:51.039; org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr path=/update params={waitSearcher=true&distrib.from=http://192.168.2.2:8983/solr/logmill/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false} {commit=} 0 10
> > > > INFO  - 2015-03-18 15:06:51.039; org.apache.solr.update.processor.LogUpdateProcessor; [logmill] webapp=/solr path=/update params={commit=true} {commit=} 0 10
> > >
> > >
> > >
> > > Regards,
> > > Edwin
> > >
> > >
> > > On 19 March 2015 at 10:56, Damien Kamerman <damienk@gmail.com> wrote:
> > >
> > > > I suggest you check your solr logs for more info as to the cause.
> > > >
> > > > > On 19 March 2015 at 12:58, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
> > > >
> > > > > Hi Erick,
> > > > >
> > > > > > No, the PDF file is a test file that contains only one sentence.
> > > > > >
> > > > > > I've managed to get it to work by removing startup="lazy" from
> > > > > > the ExtractingRequestHandler and adding the following lines:
> > > > > >       <str name="uprefix">ignored_</str>
> > > > > >       <str name="captureAttr">true</str>
> > > > > >       <str name="fmap.a">links</str>
> > > > > >       <str name="fmap.div">ignored_</str>
> > > > > >
> > > > > > Does the presence of startup="lazy" affect how the
> > > > > > ExtractingRequestHandler works, or was it one of the str values
> > > > > > that fixed it?
> > > > >
> > > > > Regards,
> > > > > Edwin
> > > > >
> > > > >
> > > > > > On 18 March 2015 at 23:19, Erick Erickson <erickerickson@gmail.com> wrote:
> > > > >
> > > > > > > Shot in the dark, but is the PDF file significantly larger than the
> > > > > > > others? Perhaps you're simply exceeding the packet limits for the
> > > > > > > servlet container?
> > > > > >
> > > > > > Best,
> > > > > > Erick
> > > > > >
> > > > > > On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo
> > > > > > <edwinyeozl@gmail.com> wrote:
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > > I'm having some issues with indexing rich-text documents in
> > > > > > > > Solr Cloud. When I try to index a PDF or Word document, I get the
> > > > > > > > following error:
> > > > > > >
> > > > > > >
> > > > > > > > org.apache.solr.common.SolrException: Bad Request
> > > > > > > >
> > > > > > > > request: http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2
> > > > > > > >         at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
> > > > > > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> > > > > > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> > > > > > > >         at java.lang.Thread.run(Unknown Source)
> > > > > > >
> > > > > > >
> > > > > > > > I'm able to index .xml and .csv files in Solr Cloud with the same
> > > > > > > > configuration.
> > > > > > > >
> > > > > > > > I have set up Solr Cloud using the default ZooKeeper in Solr 5.0.0,
> > > > > > > > and I have 2 shards with the following details:
> > > > > > > > Shard1: 192.168.2.2:8983
> > > > > > > > Shard2: 192.168.2.2:8984
> > > > > > > >
> > > > > > > > Before this, I was already able to index rich-text documents without
> > > > > > > > Solr Cloud, and I'm using the same solrconfig.xml and schema.xml,
> > > > > > > > so my ExtractingRequestHandler is already defined.
> > > > > > > >
> > > > > > > > Are there other settings required in order to index rich-text
> > > > > > > > documents in Solr Cloud?
> > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > > Edwin
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Damien Kamerman
> > > >
> > >
> >
> >
> >
> > --
> > Damien Kamerman
> >
>
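
For reference, the handler change described in the quoted thread (removing
startup="lazy" and adding four parameters) might look roughly like the
following in solrconfig.xml. This is only a sketch, not the poster's actual
file: the handler name /update/extract is taken from the path in the quoted
log, while the class attribute and the placement of the parameters inside a
"defaults" list are assumptions based on the stock Solr example configuration.

```xml
<!-- Sketch only. The startup="lazy" attribute has been left off, as described
     in the thread. The class name and the "defaults" wrapper are assumptions. -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- Prefix fields that are not defined in the schema with "ignored_" -->
    <str name="uprefix">ignored_</str>
    <!-- Index attributes of the extracted XHTML elements into their own fields -->
    <str name="captureAttr">true</str>
    <!-- Map content of <a> elements to the "links" field -->
    <str name="fmap.a">links</str>
    <!-- Map content of <div> elements to the "ignored_" field -->
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
```

In general, startup="lazy" only delays loading of a handler until its first
request. The thread does not establish which of the two changes resolved the
error. For the uprefix mapping to have any effect, the schema would also need
a field or dynamic field matching the ignored_ prefix, which is assumed here.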
