lucene-solr-user mailing list archives

From Walter Underwood
Subject Re: Solr 4.1 Solr Cloud Shard Structure
Date Fri, 01 Mar 2013 00:55:55 GMT
100 shards on a node will almost certainly be slow, but at least it would be scalable. 7TB
of data on one node is going to be slow regardless of how you shard it.

I might choose a number with more useful divisors than 100, perhaps 96 or 144.
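One reading of "more useful divisors" is that shards can be spread evenly across N servers only when N divides the shard count. The short sketch below counts the divisors of each candidate; the `divisors` helper is illustrative only and is not part of Solr.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


# Compare the shard counts mentioned in this thread.
for shards in (100, 96, 144):
    ds = divisors(shards)
    print(f"{shards} shards: {len(ds)} even layouts -> {ds}")
```

Under that reading, 96 and 144 allow more evenly balanced server counts (for example 3, 6, 8, or 12 servers) than 100 does.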


On Feb 28, 2013, at 4:25 PM, Mark Miller wrote:

> You will pay some in performance, but it's certainly not bad practice. It's a good choice
> for setting up so that you can scale later. You just have to do some testing to make sure
> it fits your requirements. The Collections API even has built-in support for this: you can
> specify more shards than nodes and it will overload a node. See the documentation. Later you
> can start up a new replica on another machine and kill/remove the original.
> - Mark
> On Feb 28, 2013, at 7:10 PM, Chris Simpson wrote:
>> Dear Lucene / Solr Community-
>> I recently posted this question on Stack Overflow, but it doesn't seem to be going
>> too far. Then I found this mailing list and was hoping perhaps to have more luck:
>> Question-
>> If I plan on holding 7TB of data in a Solr Cloud, is it bad practice to begin with
>> 1 server holding 100 shards and then begin populating the collection, so that once the size
>> grows, each shard is ultimately peeled off onto its own dedicated server (holding ~70GB each,
>> with its own dedicated resources and replicas)?
>> That is, I would start the collection with 100 shards locally, then as data grew,
>> I could peel off one shard at a time and give it its own server, dedicated with plenty of resources.
>> Is this okay to do, or would I somehow incur a massive bottleneck internally by
>> putting that many shards in 1 server to start with while data was low?
>> Thank you.
>> Chris
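
Mark's reply notes that the Collections API lets you specify more shards than nodes. As a rough sketch of what such a request might look like, the snippet below only builds the URL and does not send it. The host, port, and collection name are made up, and the parameter names (`numShards`, `replicationFactor`, `maxShardsPerNode`) follow the Collections API CREATE action as I understand it; check the documentation for your Solr version before relying on them.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards,
                          replication_factor=1, max_shards_per_node=None):
    """Build a Collections API CREATE URL (the request is not sent here)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    # Assumption: this parameter allows several shards of one collection
    # to live on a single node; confirm it exists in your Solr version.
    if max_shards_per_node is not None:
        params["maxShardsPerNode"] = max_shards_per_node
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host and collection name, with all 96 shards on one node.
url = create_collection_url("http://localhost:8983/solr", "bigcollection",
                            num_shards=96, max_shards_per_node=96)
print(url)
```

Building the URL separately from sending it makes it easy to review the parameters before issuing the request against a live cluster.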
