lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SOLR-5821) Search inconsistency on SolrCloud replicas
Date Fri, 07 Mar 2014 17:41:43 GMT

     [ https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-5821.
----------------------------------

    Resolution: Invalid

First, please raise issues like this on the user's list before
raising a JIRA, to be sure you are really seeing a bug
rather than simply misunderstanding the expected behavior.

If your hypothesis is true, try specifying a secondary
known ordering. If scores are tied, then Solr/Lucene
will return the documents in internal Lucene ID order,
and you're quite correct that the internal order may be
different on different replicas.

Testing this should be as simple as specifying something
similar to 
&sort=score desc, id asc
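
To illustrate why the tie-breaker helps, here is a minimal Python sketch
that simulates the situation rather than calling Solr. The document ids,
scores, and function names are made up for illustration; list position
stands in for the internal Lucene ID order, and the sketch assumes the
unique key field is named id, as in the sort parameter above.

```python
from urllib.parse import urlencode

# Hypothetical data: the same three documents held by two replicas.
# Each tuple is (id, score); list position stands in for the internal
# Lucene ID order, which can differ from one replica to the other.
replica_a = [("doc1", 1.0), ("doc2", 1.0), ("doc3", 2.0)]
replica_b = [("doc2", 1.0), ("doc1", 1.0), ("doc3", 2.0)]


def by_score_only(docs):
    # Python's sort is stable, so documents with tied scores keep their
    # internal order, which mimics the default behaviour described above.
    return [doc_id for doc_id, _ in sorted(docs, key=lambda d: -d[1])]


def by_score_then_id(docs):
    # Mimics sort=score desc, id asc: ties on score are broken by id.
    return [doc_id for doc_id, _ in sorted(docs, key=lambda d: (-d[1], d[0]))]


# Score-only ordering differs between the two replicas ...
print(by_score_only(replica_a), by_score_only(replica_b))
# ... while the explicit tie-breaker gives the same ordering on both.
print(by_score_then_id(replica_a), by_score_then_id(replica_b))

# URL-encoded form of the suggested parameter (the q value is a placeholder).
query_string = urlencode({"q": "*:*", "sort": "score desc, id asc"})
print(query_string)
```

If the replicas agree once the secondary sort is in place, the differing
results were most likely just tied scores resolved in internal order.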


> Search inconsistency on SolrCloud replicas
> ------------------------------------------
>
>                 Key: SOLR-5821
>                 URL: https://issues.apache.org/jira/browse/SOLR-5821
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1
>         Environment: SolrCloud:
> 1 shard, 2 replicas
> Both instances/replicas have identical hardware/software:
> CPU(s): 4
> RAM: 8Gb
> HDD: 100Gb
> OS: CentOS 6.5
> ZooKeeper 3.4.5
> Tomcat 8.0.3
> Solr 4.6.1
> Servers are utilized to run Solr only.
>            Reporter: Maxim Novikov
>            Priority: Critical
>              Labels: cloud, inconsistency, replica, search
>
> We use the following infrastructure:
> SolrCloud with 1 shard and 2 replicas. The index is built using
> DataImportHandler (importing data from the database). The number of
> items in the index can vary from 100 to 100,000,000.
> After indexing part of the data (not necessarily all the data, it is
> enough to have a small number of items in the search index), we can
> observe that Solr instances (replicas) return different results for
> the same search queries. I believe it happens because some of the
> results have the same scores, and Solr instances return those in a
> random order.
> PS This is a critical issue for us as we use a load balancer to scale
> Solr through replicas, and as a result of this issue, we retrieve
> various results for the same queries all the time. They are not
> necessarily completely different, but even a couple of items that
> differ is a deal breaker.
> The expected behaviour would be to always get identical results for
> the same search queries from all replicas. Otherwise, this "cloud"
> thing works just unreliably.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

