lucene-dev mailing list archives

From "Timothy Potter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-5473) Make one state.json per collection
Date Thu, 24 Apr 2014 18:19:22 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980053#comment-13980053 ]

Timothy Potter commented on SOLR-5473:
--------------------------------------

Thought I'd add my 2 cents on this one as I've worked on some of this code and want to get
a better sense of how to move forward. Reverting and moving out to a branch sounds like a
good idea.

In general, I think it would be good to split the discussion about this topic into 3 sections:
1) overall design / architecture, 2) implementation and impact on public API, 3) testing.
Moving forward we should start with identifying where we have common ground in these areas
and which aspects are more controversial and need more hashing out between us. 

Here's what I think I know but please correct where I'm off-base:

1) Overall Design / Architecture

It sounds like we're all on board with splitting cluster state into a per-collection state
znode. Do we intend to support both formats or do we intend to just migrate to the split approach?
I think the answer is the latter, that going forward, SolrCloud will keep state in a separate
znode per collection.

Noble's idea is that once the state is split, each core only needs to watch the znode for
the collection/shard it's linked to. In other words, each SolrCore watches a specific state
znode and thus does not receive any state change updates for other collections.
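
To make the per-collection watch idea concrete, here is a minimal in-memory sketch in Python.
It is not Solr or ZooKeeper code: FakeZk, Core, and state_path are hypothetical names made up
for illustration, and the only detail taken from the issue is the
/collections/collectionname/state.json layout. Real ZooKeeper watches are one-shot and must be
re-registered after they fire; the sketch keeps callbacks registered to stay short.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def state_path(collection: str) -> str:
    """Per-collection state znode path, following the layout in the issue description."""
    return f"/collections/{collection}/state.json"


class FakeZk:
    """In-memory stand-in for a ZooKeeper client (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}
        self._watchers: Dict[str, List[Callable[[str, dict], None]]] = defaultdict(list)

    def watch(self, path: str, callback: Callable[[str, dict], None]) -> None:
        # Register a callback for changes to exactly one znode path.
        self._watchers[path].append(callback)

    def set(self, path: str, state: dict) -> None:
        # Store the new state and notify only the watchers of this path.
        self._data[path] = state
        for callback in list(self._watchers[path]):
            callback(path, state)


class Core:
    """Hypothetical core that watches only its own collection's state znode."""

    def __init__(self, name: str, collection: str, zk: FakeZk) -> None:
        self.name = name
        self.collection = collection
        self.seen: List[str] = []  # paths of the state changes this core was told about
        zk.watch(state_path(collection), self._on_change)

    def _on_change(self, path: str, state: dict) -> None:
        self.seen.append(path)


zk = FakeZk()
core_a = Core("a_shard1_replica1", "a", zk)
core_b = Core("b_shard1_replica1", "b", zk)

# A state change in collection "a" reaches core_a but not core_b.
zk.set(state_path("a"), {"shard1": "active"})
```

In this model a change to one collection's state.json never wakes up cores that belong to
other collections, which is the property described above.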

In terms of what's watched and what is not watched, this patch includes code from 5474 (as
they were too intimately tied together to keep separated) which doesn't watch collection state
changes on the client side. Instead the client relies on a _stateVer_ check during request
processing and receives an error from the server if the client state is stale. I too think
this is a little controversial / confusing and maybe we don't have to keep that as part of
this solution. It was our mistake to merge those two into a single patch. We originally were
thinking 5474 was needed to keep the number of watchers on a znode to a minimum in the event
of many clients using many collections. However, I do think this feature can be split out
and dealt with in a better way, if at all. In other words, the split state znodes would be
watched from both the server and the client side.
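
For anyone trying to picture the _stateVer_ check, the following Python sketch models the
general idea only: the client sends the state version it has cached, the server rejects the
request if that version is stale, and the client refreshes and retries. It is not the code in
the patch. Server, Client, StaleStateError, and the integer version counter are all
assumptions made for illustration.

```python
from typing import Dict


class StaleStateError(Exception):
    """Raised by the (hypothetical) server when the client's cached state version is stale."""

    def __init__(self, collection: str, current_version: int) -> None:
        super().__init__(f"stale state for collection {collection}")
        self.collection = collection
        self.current_version = current_version


class Server:
    """Hypothetical server that tracks one state version per collection."""

    def __init__(self) -> None:
        self.versions: Dict[str, int] = {}

    def update_state(self, collection: str) -> None:
        # Every state change bumps the collection's version.
        self.versions[collection] = self.versions.get(collection, 0) + 1

    def handle(self, collection: str, client_state_ver: int) -> str:
        current = self.versions[collection]
        if client_state_ver != current:
            raise StaleStateError(collection, current)
        return f"ok:{collection}@{current}"


class Client:
    """Hypothetical client that sets no watches and refreshes state only when told it is stale."""

    def __init__(self, server: Server) -> None:
        self.server = server
        self.cached: Dict[str, int] = {}
        self.refreshes = 0

    def _refresh(self, collection: str) -> None:
        # Stand-in for re-reading the collection's state znode.
        self.cached[collection] = self.server.versions[collection]
        self.refreshes += 1

    def request(self, collection: str) -> str:
        if collection not in self.cached:
            self._refresh(collection)
        try:
            return self.server.handle(collection, self.cached[collection])
        except StaleStateError:
            # Cached state was stale: refresh once and retry.
            self._refresh(collection)
            return self.server.handle(collection, self.cached[collection])


server = Server()
server.update_state("a")
client = Client(server)
first = client.request("a")   # initial fetch, then a successful request
server.update_state("a")      # state changes; the client is not notified
second = client.request("a")  # stale error, refresh, retry
```

The trade-off this models is fewer watchers on each znode in exchange for an extra round trip
whenever the client's cached state has gone stale.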

Are there any other things design / architecture wise that are controversial?

2) Implementation (and API impact)

This seems like the biggest area of contention right now. The main issue is that the API changes
still give the impression of two state tracking formats, whereas we really only want one format.

The common ground here is that there should be no mention of "external" in any public method
or state format for that matter, right?

Noble: Assuming we're moving forward with stateFormat == 2 and the unified /clusterstate.json
is going away, is it possible to not change any of the existing public methods? In other words,
we're changing the internals of where state is kept, so why does that have to impact the public
API? If not, let's come up with a plan for each change and how we can minimize impact of this.
It seems to me that we need to be more diligent about API impacts of this change and focus
on not breaking the public view of cluster state as much as possible. It would be helpful
to have a bullet list of API impacts that are needed for this so we don't have to scour the
patch looking for all possible changes.

3) Testing

I just wanted to mention that we've been doing a fair amount of integration testing with hundreds
of "external" collections per cluster. So I realize this is a big change but we have been
testing this extensively in our QA labs. I only mention this so that others know that we have
been concentrating on hardening this feature over the past couple of months. Once we sort
out the API problems, I'm confident that this approach will be solid.

To recap, I see a lot of common ground here. To move forward, we need to move this out
to a branch and off trunk, where we'll focus on cleaning up the API impacts of this work and
support only the split format going forward (with a migration plan for existing installations).
We also want to revisit the thinking behind not watching state changes on the client, as that
wasn't clear in the patch to this point.



> Make one state.json per collection
> ----------------------------------
>
>                 Key: SOLR-5473
>                 URL: https://issues.apache.org/jira/browse/SOLR-5473
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>             Fix For: 5.0
>
>         Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)


