lucene-solr-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: SolrCloud MatchAllDocsQuery returning different number of docs each request
Date Thu, 02 Aug 2012 17:27:31 GMT

On Aug 2, 2012, at 11:08 AM, Timothy Potter <thelabdude@gmail.com> wrote:

> Just starting to get into SolrCloud using 4.0.0-ALPHA and am very
> impressed so far ...
> 
> I have a 12-shard index with ~104M docs with each shard having
> 1-replica (so 24 Solr servers running)
> 
> Using the Query form on the Admin panel, I issue the MatchAllDocsQuery
> (*:*) and each time I send the request the value for numFound in the
> result is different. It's always close but not exactly the same as I
> would expect? Can anyone shed some light on this issue? I also tried a
> real query, such as "#olympics lochte" and same thing - different
> numFound each time. The first page of actual docs returned is the same
> so maybe I should just ignore the numFound issue?
> 
> Note that while experiencing this behavior, I am not adding any docs
> to the index and all docs have been committed with waitFlush=true and
> waitSearcher=true on the commit. Also, not doing soft commits at this
> point. In addition, after having committed all 104M docs, I hit the
> optimize button on the panel so I have only 1 segment. In other words,
> the index is not being updated and has been optimized at this point.


How are you adding docs? E.g., which client and which method in particular (what is the line of
code that actually adds the doc)?

You can find the numFound result for each node by passing the param distrib=false. What does
this tell you? Are your replicas in sync with the leader? What does the count for each shard
add up to?
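
As a rough illustration of the distrib=false check above, here is a minimal Python sketch that asks
each core for its own count and compares replicas within a shard. The core URLs, the shard names,
and the helper names (count_url, fetch_num_found, summarize, check_cluster) are all hypothetical;
only the q, rows, wt and distrib request parameters come from Solr itself. Taking the largest
count per shard for the total is an assumption made for the sketch, not a rule.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url):
    """Build a select URL that asks one core for its own count only."""
    params = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def fetch_num_found(core_url, timeout=10):
    """Return numFound reported by a single core (no distributed fan-out)."""
    with urlopen(count_url(core_url), timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def summarize(counts_by_shard):
    """counts_by_shard maps shard name -> {core_url: numFound}.

    Returns (total, mismatched). total adds one count per shard (the largest
    seen, an assumption of this sketch); mismatched lists the shards whose
    leader and replica report different counts.
    """
    total = 0
    mismatched = []
    for shard, counts in sorted(counts_by_shard.items()):
        values = set(counts.values())
        if len(values) > 1:
            mismatched.append(shard)
        total += max(values)
    return total, mismatched


def check_cluster(cores_by_shard):
    """cores_by_shard maps shard name -> list of core URLs (hypothetical)."""
    counts = {
        shard: {url: fetch_num_found(url) for url in urls}
        for shard, urls in cores_by_shard.items()
    }
    return summarize(counts)
```

Calling check_cluster with a dict such as {"shard1": ["http://host1:8983/solr", "http://host2:8983/solr"]}
(made-up hosts) would return the summed count and any shards whose replicas disagree. If the
total matches none of the numFound values you see from the distributed query, or a shard shows up
as mismatched, that narrows down where to look.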

I would not ignore the issue - something must be off. It may somehow be user error, it may
be a bug that has been fixed since the alpha, or it may be something new.

Are you sure every shard you are issuing the query *from* is active and live according to
ZooKeeper? E.g., when you look at the cloud admin view and the cluster visualization,
are all the nodes green?

- Mark Miller
lucidimagination.com