lucene-dev mailing list archives

From "Alan Woodward (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8862) /live_nodes is populated too early to be very useful for clients -- CloudSolrClient (and MiniSolrCloudCluster.createCollection) need some other ephemeral zk node to know which servers are "ready"
Date Mon, 21 Mar 2016 22:03:25 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205251#comment-15205251
] 

Alan Woodward commented on SOLR-8862:
-------------------------------------

I've tried to dig a bit and see when everything here is run within the Jetty lifecycle, and
it turns out that... it's complicated!
* In a normal Solr setup, running using the Jetty start.jar, the SolrDispatchFilter is instantiated
during startup (Jetty instantiates its Filters, and then its Servlets), and it won't serve
any requests until all filters and servlets are fully constructed and have finished initialising.
 So there could be a significant gap between registering the live_nodes znode and requests
actually being served, particularly if there are other servlets within the container that
take their time in starting up.
* In JettySolrRunner, the SDF is instantiated within a jetty LifecycleListener (of which more
below), which is called *after* Jetty has started listening on its port.  Requests won't be
served via the filter until it has finished instantiating, but the gap here is smaller.

In both cases we have a race.  Ideally, we want to instantiate the filters, and only register
ourselves with the cluster once we know we're serving requests, so we need a way to be notified
that everything is ready to go:
* The standard servlet API exposes ServletContextListeners, but these only get called *before*
startup and shutdown, so these aren't any use.  We need to be notified *after* startup.
* Jetty allows you to register LifecycleListeners that get called before and after startup
and shutdown, which is exactly what we want.  Hurrah!
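To make the "notified *after* startup" idea concrete, here is a minimal JDK-only sketch.  It
does not use Jetty at all: StartupListener, ToyContainer and RegisterWhenReady are hypothetical
names made up for illustration, standing in for Jetty's LifeCycle.Listener (which has
lifeCycleStarting/lifeCycleStarted callbacks) and for whatever would create the live_nodes
znode.  It only shows the ordering we are after, assuming the container flips to "serving"
before it fires the started callback.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a container lifecycle callback; NOT Jetty's real interface.
interface StartupListener {
    void starting(); // invoked before filters and servlets are initialised
    void started();  // invoked once the container is ready to serve requests
}

// Toy container: initialises its components, then notifies listeners.
class ToyContainer {
    private final List<StartupListener> listeners = new ArrayList<>();
    private final List<String> events = new ArrayList<>();
    private boolean serving = false;

    void addListener(StartupListener listener) { listeners.add(listener); }
    boolean isServing() { return serving; }
    List<String> events() { return events; }
    void record(String event) { events.add(event); }

    void start() {
        for (StartupListener l : listeners) l.starting();
        record("filters initialised");
        record("servlets initialised");
        serving = true; // from this point on requests would be handled
        for (StartupListener l : listeners) l.started();
    }
}

// Hypothetical listener that announces the node only from the "started" callback.
class RegisterWhenReady implements StartupListener {
    private final ToyContainer container;
    boolean registeredWhileServing = false;

    RegisterWhenReady(ToyContainer container) { this.container = container; }

    @Override
    public void starting() {
        // Too early: nothing is serving yet, so do not announce the node here.
    }

    @Override
    public void started() {
        registeredWhileServing = container.isServing();
        container.record("live_nodes registered"); // placeholder for the real znode creation
    }
}
```

With this ordering the registration event is always the last one recorded, so a client that
sees the node as live can assume requests are already being served.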

So what we really need to do here is to separate out CoreContainer construction, loading of
cores, and creation of the live_nodes znode.  The container should be constructed and load
its cores during server startup, and only then register itself (i.e. create the live_nodes
znode) from a LifecycleListener.
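As a rough sketch of that separation (not Solr's actual API: SketchNode, its Phase enum and
its method names are all hypothetical), the three steps could be kept as distinct phases, with
registration refused until the cores have been loaded:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a node's startup phases; none of these names exist in Solr.
class SketchNode {
    enum Phase { CONSTRUCTED, CORES_LOADED, REGISTERED }

    // Step 1: construction does nothing beyond setting up empty state.
    private Phase phase = Phase.CONSTRUCTED;
    private final List<String> cores = new ArrayList<>();

    Phase phase() { return phase; }
    int coreCount() { return cores.size(); }

    // Step 2: load cores; in the proposal this happens during server startup.
    void loadCores(List<String> names) {
        if (phase != Phase.CONSTRUCTED) {
            throw new IllegalStateException("cores already loaded");
        }
        cores.addAll(names);
        phase = Phase.CORES_LOADED;
    }

    // Step 3: announce the node to the cluster; only legal once cores are loaded.
    // In the proposal this would be triggered from a lifecycle listener after startup.
    void registerLiveNode() {
        if (phase != Phase.CORES_LOADED) {
            throw new IllegalStateException("cannot register in phase " + phase);
        }
        phase = Phase.REGISTERED; // placeholder for creating the live_nodes znode
    }
}
```

Making the early call fail loudly is just one design choice; the point is only that
registration becomes its own step that the caller can defer until the server is serving.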

It's not ideal that we have two different code paths here, one for 'proper' solr running using
start.jar and xml configuration, and one programmatically, but I guess we can live with that
for a while.

On a separate note, SOLR-8323 should help with waiting for collections to be searchable.

> /live_nodes is populated too early to be very useful for clients -- CloudSolrClient (and
MiniSolrCloudCluster.createCollection) need some other ephemeral zk node to know which servers
are "ready"
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8862
>                 URL: https://issues.apache.org/jira/browse/SOLR-8862
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> {{/live_nodes}} is populated surprisingly early (and multiple times) in the life cycle
of a Solr node startup, and as a result probably shouldn't be used by {{CloudSolrClient}}
(or other "smart" clients) for deciding what servers are fair game for requests.
> we should either fix {{/live_nodes}} to be created later in the lifecycle, or add some
new ZK node for this purpose.
> {panel:title=original bug report}
> I haven't been able to make sense of this yet, but what i'm seeing in a new SolrCloudTestCase
subclass i'm writing is that the code below, which (reasonably) attempts to create a collection
immediately after configuring the MiniSolrCloudCluster gets a "SolrServerException: No live
SolrServers available to handle this request" -- in spite of the fact that (as far as i can
tell at first glance) MiniSolrCloudCluster's constructor is supposed to block until all the
servers are live.
> {code}
>     configureCluster(numServers)
>       .addConfig(configName, configDir.toPath())
>       .configure();
>     Map<String, String> collectionProperties = ...;
>     assertNotNull(cluster.createCollection(COLLECTION_NAME, numShards, repFactor,
>                                            configName, null, null, collectionProperties));
> {code}
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

