lucene-dev mailing list archives

From "Noble Paul (JIRA)" <>
Subject [jira] [Commented] (SOLR-5872) Eliminate overseer queue
Date Mon, 17 Mar 2014 18:39:45 GMT


Noble Paul commented on SOLR-5872:

People suggest changes to the system when and where they think they are required. It is
important that we respond to suggestions on their own merits and demerits.

I'm sure you [] /Sami would have abandoned the idea because of some
real issues. I would love to hear them (when you have time). The issues may not be insurmountable.
But the point is that, looking at the code, the Overseer queue appears to be quite a bottleneck,
and this is the solution that immediately comes to one's mind.

A patch from anyone who can build one would be a good demonstration that such a solution is
possible. People who are testing their systems in real test environments will be able to
provide invaluable feedback on the viability of, and issues with, the solution. As developers, we
need to guide and handhold the users who are pushing the envelope. At some point, when we have
developed enough confidence, we can integrate it into the product itself.

bq. It seems like we still want scalability in both directions (wrt number of collections,
and the size a single collection can be).

Yes, in the current system scaling with multiple collections is much simpler, and it is a first
baby step towards breaking up the monolithic clusterstate.json. Eventually we would like to move
to one state per slice so that we can support very large collections. But these new experiments
need to be tried out first before we venture into larger ones.

> Eliminate overseer queue 
> -------------------------
>                 Key: SOLR-5872
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
> The overseer queue is one of the busiest points in the entire system. The raison d'être
> of the queue is to
>  * Provide batching of operations for the main clusterstate.json so that state updates
> are minimized
> * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main clusterstate.json,
> the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in ZooKeeper.
> The proposed solution is that, whenever an operation needs to be performed on the
> clusterstate, the same thread (and of course the same JVM) should
>  # read the fresh state and version of the zk node
>  # construct the new state
>  # perform a compare and set
>  # if the compare and set fails, go to step 1
> This should be limited to operations performed on external collections, because batching
> would still be required for the others.
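
The four steps in the issue description amount to an optimistic compare-and-set retry loop.
Below is a minimal sketch of that loop in Java, under stated assumptions: VersionedNode is a
hypothetical in-memory stand-in for a single ZooKeeper node (it is not Solr or ZooKeeper code),
and all class and method names are made up for illustration. With a real ZooKeeper client the
read would roughly correspond to getData with a Stat, and the write to setData with an expected
version, which fails with KeeperException.BadVersionException if the node has changed.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class CasUpdateSketch {

    /** Immutable snapshot: the node data plus the version it was read at. */
    public static final class Versioned {
        public final String data;
        public final int version;

        public Versioned(String data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    /** Hypothetical in-memory stand-in for one ZooKeeper node (assumption, not a real API). */
    public static final class VersionedNode {
        private final AtomicReference<Versioned> current;

        public VersionedNode(String initialData) {
            current = new AtomicReference<>(new Versioned(initialData, 0));
        }

        public Versioned read() {
            return current.get();
        }

        /** Writes only if the node is still at expectedVersion; returns false otherwise. */
        public boolean compareAndSet(int expectedVersion, String newData) {
            Versioned seen = current.get();
            if (seen.version != expectedVersion) {
                return false;
            }
            return current.compareAndSet(seen, new Versioned(newData, expectedVersion + 1));
        }
    }

    /** Read the fresh state, build the new state, compare and set, retry on failure. */
    public static Versioned update(VersionedNode node, UnaryOperator<String> change) {
        while (true) {
            Versioned fresh = node.read();                  // step 1: fresh state and version
            String next = change.apply(fresh.data);         // step 2: construct the new state
            if (node.compareAndSet(fresh.version, next)) {  // step 3: compare and set
                return new Versioned(next, fresh.version + 1);
            }
            // step 4: another writer got in first, so go back to step 1
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedNode node = new VersionedNode("0");
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            // Each writer increments a counter stored in the node 1000 times.
            writers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    update(node, s -> String.valueOf(Integer.parseInt(s) + 1));
                }
            });
            writers[i].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        Versioned result = node.read();
        System.out.println("data=" + result.data + " version=" + result.version);
    }
}
```

No queue or lock is involved: a writer that loses the race simply re-reads and retries, so
no update is lost. The sketch does not address batching, which the description says is still
needed for collections kept in the main clusterstate.json.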

This message was sent by Atlassian JIRA
