lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Taking Solr to production
Date Fri, 22 Jan 2016 23:05:52 GMT
"1 Leader & 3 Replicas"

SolrCloud does not treat leaders and replicas as separate kinds of node -
that's old master-slave thinking. The leader is just one of the replicas.

So, are you really talking about 2 shards with 4 replicas each or 2 shards
with 2 replicas each?
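
To make the difference between the two readings concrete, here is a minimal
sketch that builds a Collections API CREATE request and counts the resulting
cores. The host, port, and collection name ("docs") are placeholders I made
up, and the helper names are hypothetical; the point is only that
replicationFactor counts the leader as one of the replicas.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL. Nothing is sent to Solr."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        # Total copies of each shard, leader included.
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def total_cores(num_shards, replication_factor):
    """Number of replica cores the cluster will have to host."""
    return num_shards * replication_factor


# Placeholder host and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "docs", 2, 2)
print(url)
print(total_cores(2, 2))  # 2 shards x 2 replicas each -> 4
print(total_cores(2, 4))  # 2 shards x 4 replicas each -> 8
```

So "1 leader & 3 replicas" read as four copies per shard means eight cores to
place across your machines, while two copies per shard means four.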

Putting multiple replica instances on each machine isn't buying you
anything, just making it more complicated to manage.

The number of shards is determined by the amount of data and by whether your
target query latency can be achieved - use more shards if latency is too high.

2.5 million (2,500,000) documents is rather small, so unless your queries
are running really slow, it's not clear you even need sharding, but we
don't know your document and query complexity. Heavy faceting or complex
function queries?

The number of replicas is determined by query load - the number of
simultaneous query requests - as well as high-availability (HA) requirements.
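
As a rough back-of-the-envelope sketch of that sizing rule: the 150 QPS
target comes from the original question, but the per-replica throughput (60
QPS here) is a made-up number that you would have to measure with a load test
against your own documents and queries, and the function name and the "one
spare for HA" default are my assumptions, not anything Solr provides.

```python
import math


def replicas_needed(target_qps, qps_per_replica, spare_for_ha=1):
    """Estimate replicas per shard for a given query load.

    qps_per_replica must come from your own load test; spare_for_ha is
    extra headroom so that losing a node does not drop below the target.
    """
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    return math.ceil(target_qps / qps_per_replica) + spare_for_ha


# 150 QPS target; 60 QPS per replica is a hypothetical measurement.
print(replicas_needed(150, 60))  # ceil(2.5) + 1 = 4
```

If one replica can already sustain the full load on its own, the count is
driven purely by how many node failures you want to survive.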




-- Jack Krupansky

On Fri, Jan 22, 2016 at 5:45 PM, Toke Eskildsen <te@statsbiblioteket.dk>
wrote:

> Aswath Srinivasan (TMS) <aswath.srinivasan@toyota.com> wrote:
> > *         About 2.5 million documents in total to be indexed
> > *         Average document size is 512 KB - PDFs and HTML files
>
> > This being said I was thinking I would take the Solr to production with,
> > *         2 shards, 1 Leader & 3 Replicas
>
> > Do you all think this setup will work? Will this serve me 150 QPS?
>
> It certainly helps that you are batch updating. What is missing in this
> estimation is how large the documents are when indexed, as I guess the ½MB
> average is for the raw files? If they are your everyday short PDFs with
> images, meaning not a lot of text, handling 2M+ of them is easy. If they
> are all full-length books, it is another matter.
>
> Your document count is relatively low and if your index data end up being
> not-too-big (let's say 100GB), then you ought to consider having just a
> single shard with 4 replicas: There is a non-trivial overhead going from 1
> shard to more than one, especially if you are doing faceting.
>
> - Toke Eskildsen
>
