lucene-dev mailing list archives

From "Per Steffensen (JIRA)" <>
Subject [jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server
Date Thu, 06 Dec 2012 13:40:59 GMT


Per Steffensen commented on SOLR-4114:

I verified that removing my "controlled" instance-dir and data-dir from OverseerCollectionProcessor.createCollection
is ok. I needed to investigate how instance-dir and data-dir work. Now I know,
and can see that the "controlled" instance-dir and data-dir were a bad idea. Thanks for being
so thorough, Mark.

During my investigation of instance-dir and data-dir I came up with an additional test for
BasicDistributedZkTest.testCollectionsAPI: a check that, after you have created a lot of
collections, no two (or more) shards end up using the same index-dir. That was actually what
I was afraid would happen when you (Mark) removed the "controlled" instance- and data-dir.
This additional test part runs very fast (200 ms on my local machine), so including it will
not noticeably extend the run-time of the test. Instead of sending a patch I will just explain
what to do to get this additional testing into BasicDistributedZkTest (this description works
on 4.0, but I can't imagine that it wouldn't on 5.x or 4.x):
* Add this method somewhere in BasicDistributedZkTest
  // Needs java.lang.management.ManagementFactory, javax.management.MBeanServer,
  // javax.management.MBeanServerFactory, javax.management.ObjectName,
  // org.apache.solr.core.SolrInfoMBean.Category and the usual java.util collections.
  private void checkNoTwoShardsUseTheSameIndexDir() throws Exception {
    Map<String, Set<String>> indexDirToShardNamesMap = new HashMap<String, Set<String>>();

    List<MBeanServer> servers = new LinkedList<MBeanServer>();
    // Assumption: which servers to scan is not spelled out here; the platform MBean
    // server plus any other registered servers seem the natural choice.
    servers.add(ManagementFactory.getPlatformMBeanServer());
    servers.addAll(MBeanServerFactory.findMBeanServer(null));
    for (final MBeanServer server : servers) {
      Set<ObjectName> mbeans = new HashSet<ObjectName>();
      mbeans.addAll(server.queryNames(null, null));
      for (final ObjectName mbean : mbeans) {
        Object value;
        Object indexDir;
        Object name;
        try {
          // Only look at MBeans that belong to a SolrCore and publish both indexDir and name
          if (((value = server.getAttribute(mbean, "category")) != null && value.toString().equals(Category.CORE.toString())) &&
              ((value = server.getAttribute(mbean, "source")) != null && value.toString().contains(SolrCore.class.getSimpleName())) &&
              ((indexDir = server.getAttribute(mbean, "indexDir")) != null) &&
              ((name = server.getAttribute(mbean, "name")) != null)) {
            if (!indexDirToShardNamesMap.containsKey(indexDir.toString())) {
              indexDirToShardNamesMap.put(indexDir.toString(), new HashSet<String>());
            }
            indexDirToShardNamesMap.get(indexDir.toString()).add(name.toString());
          }
        } catch (Exception e) {
          // ignore, just continue - probably a "category" or "source" attribute not found
        }
      }
    }

    // Guard against the scan silently finding nothing
    assertTrue("Something is broken in the assert for no shards using the same indexDir - "
        + "probably something was changed in the attributes published in the MBean of "
        + SolrCore.class.getSimpleName(), indexDirToShardNamesMap.size() > 0);
    for (Entry<String, Set<String>> entry : indexDirToShardNamesMap.entrySet()) {
      if (entry.getValue().size() > 1) {
        fail("We have shards using the same indexDir. E.g. shards " + entry.getValue().toString()
            + " all use indexDir " + entry.getKey());
      }
    }
  }
* Add a call to this method (checkNoTwoShardsUseTheSameIndexDir();) at the end of BasicDistributedZkTest.testCollectionsAPI
* Add the line "lst.add("indexDir", getIndexDir());" to SolrCore.getStatistics() so that index-dir
will also be part of the information exposed in the MBean of SolrCore

Please consider including the additional test. It scans all SolrCores in the system to see
if any of them share index-dir. I do the scanning by accessing MBean info from SolrCores -
the simplest way I could come up with. It means that SolrCore will now also expose index-dir
through its MBean, but I guess no one would have anything against that.
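
For anyone who wants to try the scanning idea outside Solr, here is a minimal, self-contained sketch under stated assumptions. FakeCore, FakeCoreMBean, the "sketch" JMX domain, the core names and the directory paths are all hypothetical stand-ins, not Solr classes or real paths. It uses plain standard MBeans on a private MBeanServer, so the attribute names come out capitalized ("IndexDir", "Name") rather than the lowercase names the SolrCore MBean publishes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

public class IndexDirScanSketch {

  /** Hypothetical management interface; a real SolrCore publishes its attributes differently. */
  public interface FakeCoreMBean {
    String getName();
    String getIndexDir();
  }

  /** Hypothetical stand-in for a core; it only carries the two attributes the scan needs. */
  public static class FakeCore implements FakeCoreMBean {
    private final String name;
    private final String indexDir;

    public FakeCore(String name, String indexDir) {
      this.name = name;
      this.indexDir = indexDir;
    }

    @Override public String getName() { return name; }
    @Override public String getIndexDir() { return indexDir; }
  }

  /** Groups core names by index directory and keeps only directories used by more than one core. */
  public static Map<String, Set<String>> findSharedIndexDirs(MBeanServer server) throws Exception {
    Map<String, Set<String>> namesByDir = new TreeMap<>();
    for (ObjectName mbean : server.queryNames(new ObjectName("sketch:type=core,*"), null)) {
      // Standard MBean attribute names are derived from the getters, hence the capital letters.
      String indexDir = (String) server.getAttribute(mbean, "IndexDir");
      String name = (String) server.getAttribute(mbean, "Name");
      namesByDir.computeIfAbsent(indexDir, k -> new TreeSet<>()).add(name);
    }
    namesByDir.values().removeIf(names -> names.size() < 2);
    return namesByDir;
  }

  /** Builds a private MBeanServer with three fake cores, two of which deliberately share a directory. */
  public static MBeanServer demoServer() throws Exception {
    MBeanServer server = MBeanServerFactory.newMBeanServer();
    register(server, "collection1_shard1", "/data/a/index");
    register(server, "collection1_shard2", "/data/b/index");
    register(server, "collection2_shard1", "/data/b/index");
    return server;
  }

  private static void register(MBeanServer server, String name, String indexDir) throws Exception {
    server.registerMBean(new FakeCore(name, indexDir), new ObjectName("sketch:type=core,name=" + name));
  }

  public static void main(String[] args) throws Exception {
    System.out.println(findSharedIndexDirs(demoServer()));
  }
}
```

Running main should report only /data/b/index, together with the two fake cores that share it. The real test differs in that it has to filter on "category" and "source" first, because it scans every MBean in the JVM rather than a dedicated domain.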

Regards, Per Steffensen
> Collection API: Allow multiple shards from one collection on the same Solr server
> ---------------------------------------------------------------------------------
>                 Key: SOLR-4114
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0 release
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: collection-api, multicore, shard, shard-allocation
>         Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch,
> We should support running multiple shards from one collection on the same Solr server
- e.g. run a collection with 8 shards on a 4 Solr server cluster (each Solr server running
2 shards).
> Performance tests on our side have shown that this is a good idea, and it is also a good
idea for easy elasticity later on - it is much easier to move an entire existing shard from
one Solr server to another one that just joined the cluster than it is to split an existing
shard among the Solr that used to run it and the new Solr.
> See dev mailing list discussion "Multiple shards for one collection on the same Solr
