From David Santamauro <>
Subject Re: combining cores into a collection
Date Thu, 02 Jan 2014 18:06:26 GMT
On 01/02/2014 12:44 PM, Chris Hostetter wrote:
> : Not really ... uptime is irrelevant because they aren't in production. I just
> : don't want to spend the time reloading 1TB of documents.
> terminology confusion: you mean you don't want to *reindex* all of the
> documents ... in solr "reloading" a core means something specific &
> different from what you are talking about, and is what michael.boom was
> referring to.

Quite correct, sorry. I meant reindex the core(s), not reload the core(s).

> : I want to bring them all into a cloud collection. Assume I have 3 cores/shards
> :
> :   core1
> :   core2
> :   core3
> You can't convert arbitrary cores into shards of a new collection, because
> the document routing logic (which dictates what shard a doc lives in based
> on its uniqueKey) won't make sense.

I guess this is the heart of the issue.
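
To make the concern concrete, here is a minimal sketch of hash-range routing. It is an illustration only: zlib.crc32 stands in for whatever hash Solr really uses, and shard_for and misrouted are hypothetical names, not Solr APIs. It shows how a document that already lives in one core can be sent to a different shard when a new version of it is indexed.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Pick a shard by splitting a 32-bit hash space into equal ranges."""
    # Stand-in hash for illustration; this is NOT the hash Solr uses.
    h = zlib.crc32(doc_id.encode("utf-8"))
    range_size = 2**32 // num_shards
    # Clamp so the top of the hash space falls into the last shard.
    return min(h // range_size, num_shards - 1)


def misrouted(existing, num_shards):
    """Return ids whose current core index differs from the routed shard.

    `existing` maps each uniqueKey to the index of the core it lives in.
    """
    return [doc_id for doc_id, core in existing.items()
            if shard_for(doc_id, num_shards) != core]
```

If misrouted returns anything for cores that were filled by some other rule, an update to one of those ids would presumably land in a different shard than the old copy instead of replacing it.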

I managed to assign the individual cores to a collection by using the
Collections API to create the collection and then editing solr.xml to define
the core(s) and their collection. This *seemed* to work. I even test-indexed
a set of documents, checking totals and content before and after. Again,
this *seemed* to work.

Did I get lucky that all 5k documents were coincidentally found in the 
appropriate core(s)? Have I possibly corrupted one or more cores? They 
are a working copy so nothing would be lost.
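
One way to check whether the test indexing left stale copies behind is to look for uniqueKey values that now appear in more than one core. A rough sketch, assuming the ids have already been exported from each core into plain lists (the export step and the name ids_in_multiple_cores are hypothetical):

```python
from collections import Counter


def ids_in_multiple_cores(ids_by_core):
    """Return the sorted ids that occur in more than one core.

    `ids_by_core` maps a core name to the list of uniqueKey values in it.
    """
    counts = Counter()
    for ids in ids_by_core.values():
        counts.update(set(ids))  # count each id at most once per core
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)
```

An empty result would not prove the cores are consistent, but a non-empty one would point at documents that were added to a second core rather than replaced in place.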

> : I want to be able to address all three as if they were shards of a collection,
> : something like.
> w/o reindexing, one thing you could do is create a single collection for
> each of your cores, and then create a collection alias over all three of
> these collections...

Yes, this works, but isn't this really just a convenient way to avoid the
shards parameter on /select?
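
For comparison, the two forms I have in mind can be sketched as below. The host, alias name, and core names are placeholders I made up; the CREATEALIAS action and the shards request parameter are the standard Solr ones, and the helper function names are hypothetical.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr"  # placeholder host, adjust as needed


def create_alias_url(alias, collections):
    """Build a Collections API request that aliases several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return BASE + "/admin/collections?" + urlencode(params)


def select_with_shards_url(core, shard_urls, query="*:*"):
    """Build a /select request that lists every shard explicitly."""
    params = {"q": query, "shards": ",".join(shard_urls)}
    return BASE + "/" + core + "/select?" + urlencode(params)
```

With the alias in place, a query against the alias name replaces the explicit shards list; neither form changes where an indexed document ends up.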

> if you want to just be able to shove docs to a single "collection" in solr
> cloud and have them replace docs with the same uniqueKey then you're

Yes, this is what I was hoping I could do.

> going to need to either: re-index using SolrCloud so the default document
> routing is done properly up front; implement a custom doc router that
> knows about whatever rules you used to decide what would be in core1,
> core2, core3.

I was afraid of that, but see my question above about what I've done and
whether the indexes are still consistent.

Thanks for the insight.


