lucene-solr-user mailing list archives

From "S.L" <>
Subject Re: SolrCloud 4.7 not doing distributed search when querying from a load balancer.
Date Fri, 17 Oct 2014 00:27:36 GMT

Please find the answers to your questions.

1. Java version: java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

2. Operating system: CentOS Linux release 7.0.1406 (Core)

3. Everything is 64-bit: OS, Java, and CPU.

4. Java Args.

5. The Zookeeper ensemble has 3 Zookeeper instances, which are external and
not embedded.

6. Container: I am using Apache Tomcat version 7.0.42.

*Additional Observations:*

I queried all docs on both replicas with distrib=false&fl=id&sort=id+asc,
then compared the two lists. Each list has the same number of documents
(96309 each), but from eyeballing the first few lines of ids, the document
ids in them seem to be *mutually exclusive*. I did not find even a single
common id in those lists; I tried at least 15 manually. It looks to me like
the replicas are disjoint sets.
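
The manual comparison above could be automated rather than eyeballed. What
follows is only a minimal sketch in Python, not something I have run against
this cluster. It assumes the uniqueKey field is named "id" (as in the fl=id
query above), that each core answers on /select with wt=json, and that the
whole id list fits in one response. The function names and the rows limit
are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_query_url(core_url, rows):
    """Build a non-distributed /select URL that returns only ids, sorted."""
    params = urlencode({
        "q": "*:*",
        "distrib": "false",  # query only this core, no fan-out to other shards
        "fl": "id",
        "sort": "id asc",
        "rows": rows,
        "wt": "json",
    })
    return f"{core_url}/select?{params}"


def fetch_ids(core_url, rows=100000):
    """Return the set of ids held by a single core (assumes rows covers all docs)."""
    with urlopen(core_query_url(core_url, rows)) as resp:
        docs = json.load(resp)["response"]["docs"]
    return {doc["id"] for doc in docs}


def compare_replicas(ids_a, ids_b):
    """Split two id sets into the common ids and the ids unique to each side."""
    return {
        "common": ids_a & ids_b,
        "only_a": ids_a - ids_b,
        "only_b": ids_b - ids_a,
    }
```

Usage would be something like compare_replicas(fetch_ids(url_a),
fetch_ids(url_b)), where url_a and url_b are placeholders for the two replica
core URLs of the same shard. Healthy replicas should give empty "only_a" and
"only_b" sets; an empty "common" set would confirm that the lists are
disjoint.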


On Thu, Oct 16, 2014 at 1:41 AM, Shawn Heisey <> wrote:

> On 10/15/2014 10:24 PM, S.L wrote:
>> Yes, I tried those two queries with distrib=false. I get 0 results for the
>> first and 1 result for the second query (i.e. server 3 shard 2 replica 2),
>> consistently.
>> However, if I run the same second query (i.e. server 3 shard 2 replica 2)
>> with distrib=true, I sometimes get a result and sometimes not. Shouldn't
>> this query always return a result when it's pointing to a core that seems
>> to have that document, regardless of distrib=true or false?
>> Unfortunately I don't see anything in particular in the logs to point to
>> any information.
>> BTW you asked me to replace the request handler. I use the select request
>> handler, so I cannot replace it with anything else. Is that a problem?
>
> If you send the query with distrib=true (which is the default value in
> SolrCloud), then it treats it just as if you had sent it to
> /solr/collection instead of /solr/collection_shardN_replicaN, so it's a
> full distributed query. The distrib=false is required to turn that behavior
> off and ONLY query the index on the actual core where you sent it.
>
> I only said to replace those things as appropriate.  Since you are using
> /select, it's no problem that you left it that way. If I were to assume
> that you used /select, but you didn't, the URLs as I wrote them might not
> have worked.
>
> As discussed, this means that your replicas are truly out of sync.  It's
> difficult to know what caused it, especially if you can't see anything in
> the log when you indexed the missing documents.
>
> We know you're on Solr 4.10.1.  This means that your Java is a 1.7
> version, since Java7 is required.
>
> Here's where I ask a whole lot of questions about your setup. What is the
> precise Java version, and which vendor's Java are you using?  What
> operating system is it on?  Is everything 64-bit, or is any piece (CPU, OS,
> Java) 32-bit?  On the Solr admin UI dashboard, it lists all parameters used
> when starting Java, labelled as "Args".  Can you include those?  Is
> zookeeper external, or embedded in Solr?  Is it a 3-server (or more)
> ensemble?  Are you using the example jetty, or did you provide your own
> servlet container?
>
> We recommend 64-bit Oracle Java, the latest 1.7 version.  OpenJDK (since
> version 1.7.x) should be pretty safe as well, but IBM's Java should be
> avoided.  IBM does very aggressive runtime optimizations.  These can make
> programs run faster, but they are known to negatively affect Lucene/Solr.
>
> Thanks,
> Shawn
