lucene-dev mailing list archives

From "Per Steffensen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server
Date Tue, 04 Dec 2012 12:00:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509687#comment-13509687 ]

Per Steffensen commented on SOLR-4114:
--------------------------------------

bq. We have to do something else other than a one minute sleep in the test.

In checkForCollection we are willing to wait up to 120 secs (it used to be 60 secs) for the collection
to be created correctly. So it seems only natural to also wait at least 60 secs when verifying
that nothing related to a collection was created, in the cases where you want to be sure
that nothing is created. When waiting for a collection to be created correctly you
can stop waiting as soon as you see that it has been created correctly. When you want to verify
that nothing gets created, there is no such early exit - you have to sit out the full period.
I think the 60 sec wait fits well into the way the test already works.
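
To illustrate the asymmetry, here is a minimal sketch - not the actual test code; "collectionExists" is a hypothetical helper standing in for the real ClusterState lookup:
{code}
// Positive case: waiting for the collection to be created correctly - we can stop early.
long timeoutAt = System.currentTimeMillis() + 120 * 1000;
boolean created = false;
while (System.currentTimeMillis() < timeoutAt) {
  if (collectionExists(collectionName)) { // hypothetical helper doing the ClusterState lookup
    created = true;
    break;
  }
  Thread.sleep(500);
}
assertTrue(created);

// Negative case: verifying that nothing was created - there is no condition to break on,
// so the only option is to sit out the full wait before asserting.
Thread.sleep(60 * 1000);
assertFalse(collectionExists(collectionName));
{code}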

bq. I'd like the default data dir to remain /data. If you are not doing anything special that
works fine.

That's also fine, but as soon as you create more than one shard on the same node, don't you
need different data-dirs? You can't have two shards using the same folder for their Lucene index files,
right? Did I get the instance-dir and data-dir concepts wrong? Or is it the instance-dir you
want to vary between different shards? I would say that running only one shard on a node IS
"doing something special".

bq. I'd also like the ability to override the data dir on per shard basis, but I'm not sure
what the best API for that is.

Yes, me too, but that has to be another ticket/issue. For now, until the API allows control
over the data-dir naming, it should be ok for the algorithm to decide on something reasonable
by itself.

bq. So I'd like to support what you want, but I don't want to change the default behavior.

I agree, but if I got the concepts right you cannot use "/data" as the data-dir for more than
one shard on each node. So the default behaviour of using "/data" as data-dir will not work
as soon as you run more than one shard on a node. I probably got something wrong - please
try to explain.

bq. My latest patch - I'll commit this soon and we can iterate from there.

Well, I would prefer that you commit my patch, and we can iterate from there :-) It will also
make it much easier to get SOLR-4120 in, which I hope you will also consider.

Had a quick peek at your patch and have the following comments:
* I see that you removed the "auto-reduce replicas-per-shard to never have more than one replica
of the same shard on the same node" feature and just issue a warning instead in OverseerCollectionProcessor
(the "if (numShardsPerSlice > nodeList.size())" thingy). That is ok with me, even though I
believe it is pointless to replicate data to the same machine and under the same Solr instance.
But then you probably also need to change BasicDistributedZkTest - in checkCollectionExceptations
I believe you'll need to change from
{code}
int expectedShardsPerSlice = (Math.min(clusterState.getLiveNodes().size(), numShardsNumReplicaList.get(1) + 1));
{code}
to
{code}
int expectedShardsPerSlice = numShardsNumReplicaList.get(1) + 1;
{code}
* You removed the following lines (because you just want default values for instance-dir and
data-dir):
{code}
params.set(CoreAdminParams.INSTANCE_DIR, ".");
params.set(CoreAdminParams.DATA_DIR, shardName + "_data");
{code}
I still do not understand how that will work, but hopefully you will explain.
* You didn't like my rename of the variable "replica" to "nodeUrlWithoutProtocolPart" in OverseerCollectionProcessor.createCollection.
As I said on the mailing-list I don't like the term "replica" as a replacement for what we used
to call "shards", because I think it will cause misunderstandings, as "replica" is also (in
my mind) a role played at runtime. But getting the terms right, and reflecting them correctly
in the API, variable names etc. across the code, must be another issue/ticket. In this
specific example, though, "replica" is just a very bad name, because the variable does not even contain
a replica-url - that would require the shard/replica name to be appended to the string.
So this "replica" variable is closest to being a node-url (without the protocol part) - NOT
a shard/replica-url (see the first sketch after this list). I would accept my name-change if I were you, but I have a motto of "carefully
choosing my fights" and this is a fight I will not fight for very long :-)
* I see that you did not include my changes to HttpShardHandler, making the "shardToUrls" map
(renamed) concurrency-protected through the getURLs method (renamed to getFullURLs), so that you
do not have to use the map so carefully outside the class (see the second sketch after this list). I understand that it has very little to do
with this issue SOLR-4114, but it is a nice modification. Please consider committing it -
maybe under another issue/ticket. It is a little bit of a problem that good refactoring
does not easily get in as part of issues/tickets that do not strictly require it. If refactoring
is hard to get in, it might be the reason the code base becomes a mess (and I think it is).
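
Regarding the "replica" naming point above, a tiny illustration of the distinction I am trying to make (the values are made up):
{code}
// What the variable actually holds - a node url without the protocol part:
String nodeUrlWithoutProtocolPart = "host1:8983_solr";
// What the name "replica" suggests it holds - a url pointing at a concrete shard/replica:
String replicaUrl = "http://host1:8983/solr/mycollection_shard1_replica1";
{code}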
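And a minimal sketch of the HttpShardHandler change from the last point (names follow my patch's naming; "buildURLs" is a hypothetical helper, not the actual implementation):
{code}
// Keep the shard->urls map private and hand out copies from one synchronized accessor,
// so callers do not have to coordinate access to the map themselves.
private final Map<String, List<String>> shardToURLs = new HashMap<String, List<String>>();

protected synchronized List<String> getFullURLs(String shard) {
  List<String> urls = shardToURLs.get(shard);
  if (urls == null) {
    urls = buildURLs(shard); // hypothetical helper resolving the shard param to full urls
    shardToURLs.put(shard, urls);
  }
  return new ArrayList<String>(urls); // return a copy so callers cannot mutate the shared map
}
{code}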
 
                
> Collection API: Allow multiple shards from one collection on the same Solr server
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-4114
>                 URL: https://issues.apache.org/jira/browse/SOLR-4114
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0 release
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: collection-api, multicore, shard, shard-allocation
>         Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114_trunk.patch
>
>
> We should support running multiple shards from one collection on the same Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster (each Solr server running 2 shards).
> Performance tests on our side have shown that this is a good idea, and it is also a good idea for easy elasticity later on - it is much easier to move an entire existing shard from one Solr server to another one that just joined the cluster than it is to split an existing shard between the Solr server that used to run it and the new one.
> See dev mailing list discussion "Multiple shards for one collection on the same Solr server"

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

