lucene-solr-user mailing list archives

From Siddhartha Singh Sandhu <sandhus...@gmail.com>
Subject Re: SolrCloud Shard console shows roughly same number of documents?
Date Sat, 28 May 2016 16:58:42 GMT
Still struggling with this. Bump. :)

On Thu, May 26, 2016 at 3:53 PM, Siddhartha Singh Sandhu
<sandhusolr@gmail.com> wrote:

> Hi Erick,
>
> Thank you for the reply. What I meant was suppose I have the config:
>
> 2 shards each with 1 replica.
>
> Hence, on both servers I have
> 1. shard1_replica1
> 2. shard2_replica1
>
> Suppose I have 50 documents then,
> shard1_replica1 + shard2_replica1 = 50 ?
>
> or shard2_replica1 = 50 && shard1_replica1 = 50 ?
>
> Regards,
>
> Sid.
>
> On Thu, May 26, 2016 at 2:30 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> Q1: Not quite sure what you mean. Let's say I have 2 shards, 3
>> replicas each, 16 docs on each. I _think_ you're
>> talking about the "core selector", which shows the docs on that
>> particular core, 16 in our case not 48.
>>
>> Q2: Yes, that's how SolrCloud is designed. It has to be for HA/DR.
>> Every replica in a shard has all the docs, 16 as above. Otherwise, if
>> one of your machines went down, there could be no guarantee against
>> data loss.
>>
>> Q3: Yes, indexing will be slower when there is more than one replica
>> per shard since the raw document is forwarded from the leader to all
>> followers before acking back. In distributed situations, you will have
>> a bunch (potentially) more machines doing indexing so total throughput
>> can be faster.
>>
>> Why do you care? Is there a problem or is this just general background
>> info? There are a number of techniques for speeding up indexing; the
>> first is to use SolrJ and CloudSolrClient and send batches of docs at
>> once rather than one-at-a-time.
>>
>> Best,
>> Erick
>>
>> On Wed, May 25, 2016 at 1:54 PM, Siddhartha Singh Sandhu
>> <sandhusolr@gmail.com> wrote:
>> > Hi,
>> >
>> > I recently moved to a SolrCloud config. I had a few questions:
>> >
>> > Q1. Does a shard show the cumulative number of documents, or only the
>> > documents present in that particular shard, on the admin console of
>> > the respective shard?
>> >
>> > Q2. If 1's answer is non-cumulative, then my shards (on different
>> > servers) are indexing all the documents on each instance of the shard.
>> > Is this natural? I created the shards with compositeId.
>> >
>> > Q3. If the answer to 1 is cumulative, then my indexing was slower than a
>> > single core instance which was on the same machine of which I have 2
>> > now (my shards). What could I be missing while configuring Solr?
>> >
>> >
>> > I am using Solr 6.0.0 on Ubuntu 14.04 with external zookeeper.
>> >
>> > Regards,
>> >
>> > Sid.
>>
>
>
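
To make the document counts discussed above concrete, here is a rough sketch of how per-core counts behave: documents are split across shards, and every replica of a shard holds that shard's full set. It is only a simulation, not Solr code. The function names `route` and `core_doc_counts` are made up for illustration, and `zlib.crc32` is a stand-in for Solr's compositeId routing (which hashes IDs with MurmurHash3 into per-shard hash ranges), so the exact split will differ from a real cluster.

```python
import zlib


def route(doc_id, num_shards):
    # Illustrative stand-in for compositeId routing: Solr itself uses
    # MurmurHash3 and hash ranges, not crc32 modulo the shard count.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def core_doc_counts(doc_ids, num_shards, replicas_per_shard):
    # Each document lands on exactly one shard.
    per_shard = [0] * num_shards
    for doc_id in doc_ids:
        per_shard[route(doc_id, num_shards)] += 1
    # Every replica (core) of a shard reports that shard's full count.
    return {
        f"shard{s + 1}_replica{r + 1}": per_shard[s]
        for s in range(num_shards)
        for r in range(replicas_per_shard)
    }


doc_ids = [f"doc{i}" for i in range(50)]
counts = core_doc_counts(doc_ids, num_shards=2, replicas_per_shard=1)
for core, n in sorted(counts.items()):
    print(core, n)
print("total across cores:", sum(counts.values()))
```

With 2 shards and 1 replica each, the two per-core numbers add up to the 50 indexed documents. With more replicas per shard, each extra replica repeats its shard's count, so summing over all cores overcounts.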

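The batching advice quoted above (send batches of docs rather than one at a time) can be sketched in a client-agnostic way. This is a minimal illustration, not SolrJ: `send_batch` is a hypothetical callable standing in for whatever client call actually ships a list of documents to the cluster, and the default `batch_size` of 1000 is an arbitrary assumption, not a recommended value.

```python
from itertools import islice


def batches(docs, batch_size):
    # Yield successive lists of up to batch_size documents.
    it = iter(docs)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def index_in_batches(docs, send_batch, batch_size=1000):
    # send_batch is a hypothetical stand-in for the real client call;
    # it is invoked once per batch instead of once per document.
    sent = 0
    for chunk in batches(docs, batch_size):
        send_batch(chunk)
        sent += 1
    return sent


if __name__ == "__main__":
    docs = [{"id": f"doc{i}"} for i in range(2500)]
    requests = index_in_batches(docs, send_batch=lambda chunk: None)
    print("requests made:", requests)
```

The point of the design is simply fewer round trips: the number of requests drops from one per document to one per batch.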