lucene-dev mailing list archives

From "Per Steffensen (JIRA)" <>
Subject [jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server
Date Wed, 05 Dec 2012 14:52:58 GMT


Per Steffensen commented on SOLR-4114:

bq. Its not working out: and its exactly due to this attitude (which must be changed)

It is also my impression that it does not work out; we just do not agree on the reason.

bq. you contradict yourself. you say only the "seen from the outside" matters, yet discourage
the unit tests that make it easier to develop new APIs, and the unit tests that "test" actual
usage of the API. This is critical to refactoring.

No, I do not contradict myself - you misunderstand me. If an API can be used "from the outside"
it certainly needs test coverage - whether you see the tests as unit or behavioural tests.
The collection creation/reload/removal features of OverseerCollectionProcessor (the ones I question
here) are not accessible from "the outside". The Collection API is; it is triggered by
sending a request that is handled by CollectionHandler etc., eventually submitting a "job"
to the OverseerCollectionProcessor. This should (also) be tested by sending such a request
"from the outside" to a complete Solr cluster and asserting that the expected collection/shards
end up being created/reloaded/removed; in the case of create, that you can index data into
the new collection and search for it again; and in the case of remove, that the collections/shards
disappear from the system (from ZK, and that the data dirs are deleted if that is what you asked for).

If you want to supplement with tests directly on OverseerCollectionProcessor, that is fine.
But such tests are mainly useful during the development process, not for ensuring that no
one in the future breaks the feature you introduced. The feature seen from "the outside" is
typically unchanged during refactoring, and the feature seen from "the outside" is what matters.
Say we some day decide that collection creation should not really be handled asynchronously
by the Overseer, but that we want to handle it synchronously before responding to whoever
sent the collection creation request. In that case OverseerCollectionProcessor will probably
be deleted (yes, most of the code will remain, but it will probably be moved/restructured
into other classes/units somewhere else), and there will be tests in an OverseerCollectionProcessorTest
that need to be moved and changed. It is not certain that the one doing the refactoring
"gets" the reason for (or the point of) all the tests of OverseerCollectionProcessor, or that he is
able to turn them into "similar" tests of the new components handling the same aspects. The
behavioural test does not need to change, as it ensures that the feature seen "from the outside"
did not change, and that is very important during refactoring.
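
To make the refactoring argument concrete, here is a minimal Java sketch. Everything in it
is hypothetical and is not Solr code: Cluster, AsyncCluster, SyncCluster and behavesAsExpected
are names invented for illustration, and the "cluster" is just an in-memory map. AsyncCluster
stands in for a design where requests become queued jobs, SyncCluster for a design where they
are applied immediately. The point is only that the same check "from the outside" can be run
unchanged against both.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class BehaviouralTestSketch {

    /** What a client can observe "from the outside" (hypothetical, not Solr's API). */
    interface Cluster {
        void createCollection(String name, int numShards);
        void removeCollection(String name);
        /** Number of shards in the collection, or -1 if it does not exist. */
        int shardCount(String name);
    }

    /** Variant 1: each request is queued as a job and applied later. */
    static class AsyncCluster implements Cluster {
        private final Map<String, Integer> state = new HashMap<>();
        private final Queue<Runnable> jobs = new ArrayDeque<>();

        public void createCollection(String name, int numShards) {
            jobs.add(() -> state.put(name, numShards));
        }

        public void removeCollection(String name) {
            jobs.add(() -> state.remove(name));
        }

        public int shardCount(String name) {
            // Stand-in for a background processor: apply pending jobs before reading.
            Runnable job;
            while ((job = jobs.poll()) != null) {
                job.run();
            }
            return state.getOrDefault(name, -1);
        }
    }

    /** Variant 2: the same feature after a refactoring to synchronous handling. */
    static class SyncCluster implements Cluster {
        private final Map<String, Integer> state = new HashMap<>();

        public void createCollection(String name, int numShards) {
            state.put(name, numShards);
        }

        public void removeCollection(String name) {
            state.remove(name);
        }

        public int shardCount(String name) {
            return state.getOrDefault(name, -1);
        }
    }

    /** Behavioural check: uses only the outside interface, never the internals. */
    static boolean behavesAsExpected(Cluster cluster) {
        cluster.createCollection("c1", 8);
        boolean created = cluster.shardCount("c1") == 8;
        cluster.removeCollection("c1");
        boolean removed = cluster.shardCount("c1") == -1;
        return created && removed;
    }

    public static void main(String[] args) {
        System.out.println("async: " + behavesAsExpected(new AsyncCluster()));
        System.out.println("sync:  " + behavesAsExpected(new SyncCluster()));
    }
}
```

A test written directly against the job queue of AsyncCluster would have to be rewritten or
dropped when moving to SyncCluster; behavesAsExpected would not.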

As I said, I am certainly not against "unit" tests, but they are mainly a "working" tool for
the developers. Behavioural tests are the ones that ensure that your system as a whole works
as it is supposed to - and whether you want it or not, it IS what creating a system is all
about. Guess you would realize a thing or two by working on a project for a real customer with
real demands. Those demands are all requirements on the system as a whole, seen from "the
outside". He does not care about the internal workings of the system. Our testers work a lot
with the customer (or the PO representing him) to work out behavioural tests to make sure
we fulfill his needs and requirements. We do lots of unit tests in our project as well, but have
NO problem whatsoever refactoring all the time. So I can tell you that behavioural tests
and daring to do constant refactoring can go hand in hand. It is a little harder when you
have only unit tests.

Start thinking about Solr as a product you need to deliver to some "artificial" customer,
and try to think the way he would think - only system-level behaviour matters to him. "Unit"-tests
are "working"-tools for the developers.

bq. I'm also against more tests with sleeps

As I said, "me too". But I would rather test a feature the way it can be tested than not test it.

bq. you can expect to see me vote on commits that have huge sleeps

Uhh, I hope not; you just said you were against sleeps. But I guess you mean "you cannot..." :-)
> Collection API: Allow multiple shards from one collection on the same Solr server
> ---------------------------------------------------------------------------------
>                 Key: SOLR-4114
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0 release
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: collection-api, multicore, shard, shard-allocation
>         Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch,
> We should support running multiple shards from one collection on the same Solr server
- e.g. to run a collection with 8 shards on a 4 Solr server cluster (each Solr server running
2 shards).
> Performance tests on our side have shown that this is a good idea, and it is also a good
idea for easy elasticity later on - it is much easier to move an entire existing shard from
one Solr server to another one that just joined the cluster than it is to split an existing
shard among the Solr that used to run it and the new Solr.
> See dev mailing list discussion "Multiple shards for one collection on the same Solr

This message is automatically generated by JIRA.