lucene-dev mailing list archives

From Greg Pendlebury <>
Subject Re: Building a Solr cluster with Maven
Date Tue, 18 Oct 2016 21:53:33 GMT
Thank you for the replies. Yesterday I finished our build script using the
ZIP via Nexus, but I'd still like to pursue some long-term improvements to
that process. In response to some of the feedback:

"Another option, depending on one's needs, is to pursue Docker..."

>> We had a member of the ops team doing a build somewhat similar to this
maybe 18-24 months ago. He was struggling with some of the issues around
inter-shard communication because nodes write their own addresses into
clusterstate, but the inner docker applications didn't know their external
addresses. He was working through all those problems, but the solutions
were undermining the reasons he chose docker in the first place (letting
the external details bleed into the container). He ultimately backed away
from it all mainly because the rest of the ops team didn't like the overall
approach and the (perceived?) additional complexity it added to our

"Does the scenario you wish to use the assets for relate to testing or some
other use-case?"

>> We run the dashboard on all production hosts, which is probably
redundant but does come in handy. Beyond the dashboard, there are a
couple of files in the same src area (eg. web.xml) that we need to tie
together Jetty and the solr-core classes. The main reason for automating it
is the scale side of things. Our current cluster (5.1.0... not using this
build) is 60 shards and 2 replicas (120 JVMs) across 12 hosts. We configure
it all so that we can distribute the nodes evenly to control SSD
utilisation and CPU loads etc, as well as making sure maintenance
procedures are accounted for (such that X number of hosts can be down for
maintenance and at least one replica is always fully online). The build I
am overhauling right now is for a new cluster coming online later this
year. 96 shards, 2 replicas (184 JVMs) across 16 servers (maybe... we will
test and massage that topology before launch). We aren't particularly
looking forward to manually building/configuring 300 odd JVMs (or 28 server
deployments) every time we bug fix a plugin or do a minor version bump on
Solr, so these scripts are important.

The ops team also wants to make the distribution more tightly controlled to
solve issues they see in production where replication distribution can
sometimes see one host be too strongly a mirror of another host (ie. one
host has all of its replicas on one other host, rather than spread out
through the cluster). This means when that host crashes and comes back
online in recovery it stresses the other host incredibly, rather than
distributing the replication load around the cluster. The added complexity
this new layout brings is something we can solve (have solved... although
it is untested at the moment) by scripting the build of the whole cluster.
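
A minimal sketch of one way to get that kind of spread, using the current
cluster's numbers (60 shards, 2 replicas, 12 hosts). This is an illustration
only, not the build script discussed in this thread: the function names are
hypothetical and the rotating-offset scheme is an assumption chosen because
it is easy to check.

```python
from collections import Counter


def place_replicas(num_shards, num_hosts):
    """Assign two replicas per shard to hosts, rotating the partner offset.

    Returns a list where entry i is the (host_a, host_b) pair for shard i.
    """
    if num_hosts < 2:
        raise ValueError("need at least two hosts to separate replicas")
    placement = []
    for shard in range(num_shards):
        first = shard % num_hosts
        # Each full pass over the hosts uses a different offset, so a host's
        # shards are mirrored on several different partners, not just one.
        offset = 1 + (shard // num_hosts) % (num_hosts - 1)
        second = (first + offset) % num_hosts
        placement.append((first, second))
    return placement


def jvms_per_host(placement):
    """Count how many replicas (JVMs) land on each host."""
    return Counter(host for pair in placement for host in pair)


def max_shared_shards(placement):
    """Largest number of shards that any two hosts have in common."""
    shared = Counter(frozenset(pair) for pair in placement)
    return max(shared.values())


def survives(placement, down_hosts):
    """True if every shard keeps at least one replica on a host that is up."""
    down = set(down_hosts)
    return all(any(h not in down for h in pair) for pair in placement)


placement = place_replicas(60, 12)
load = jvms_per_host(placement)
```

With these inputs every host ends up with 10 JVMs and no two hosts share more
than one shard, so a host coming back in recovery pulls from ten different
peers. The `survives` helper is the maintenance check: it reports whether a
given set of hosts can be taken down while each shard still has a replica up.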

The developers have always used these scripts to build our single-host
Devel environments because we routinely purge them and start again. We
would then package up the server part for the ops team to use in higher
environments which they augmented with puppet to build all the shards...
but the ops team want to start using Maven to build the whole lot now.
There are deeper parts of this that relate to some in-house tooling which
works very well with Maven and Jetty... but they are not of interest to
anyone that doesn't work here :)

"I haven't upstreamed the changes for the ant tasks thinking there wouldn't
be too much interest in that"

>> This is highly opinionated, but I suspect I would agree with you. I
don't think having the ZIP go into Maven Central is a good idea (if it is
even allowed). I felt bad putting it into our local Nexus repo (it is the
largest artifact in there now), but it got the job done. I avoided the
temptation to use it as a complete distro, however. I've set up my main build
to only source things from that ZIP if they are not in the other Maven
artifacts (ie. just the webapp assets), so that if they become available
somewhere else I only have to modify a small part of the build.
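
The selection step described above can be sketched outside Maven as well. The
following is a Python stand-in, not the Maven configuration used in this
build: it pulls only one subtree out of a distribution ZIP and ignores the
rest. The subtree path in `WEBAPP_SUBTREE` is an assumption about the ZIP
layout and the function names are hypothetical.

```python
import zipfile

# Assumed location of the webapp assets below the ZIP's top-level folder.
WEBAPP_SUBTREE = "server/solr-webapp/webapp/"


def select_assets(names, subtree):
    """Pick archive entries under `subtree`, ignoring the top-level folder."""
    selected = []
    for name in names:
        # Drop the first path component (the versioned top-level folder).
        _, _, rest = name.partition("/")
        if rest.startswith(subtree) and not name.endswith("/"):
            selected.append(name)
    return selected


def extract_assets(zip_file, subtree, dest_dir):
    """Extract only the entries under `subtree`; return the extracted names."""
    with zipfile.ZipFile(zip_file) as archive:
        members = select_assets(archive.namelist(), subtree)
        archive.extractall(dest_dir, members=members)
    return members
```

Keeping the subtree in a single constant mirrors the point made above: if the
assets become available from another artifact, only this one piece changes.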

"...it might be helpful to have a lib or "plugin-ins" folder in the zip
that is by default loaded to the classpath as an extension point for users
who are re-building the package?"

>> I agree. We use our own control scripts, but a colleague suggested the
same thing to me yesterday because the ops team's first fumbling
experiments with 5.5.3 and 6.2.1 had them manually unpacking the ZIP and
deploying our plugins on the classpath. The mistakes they were making in
keeping the locations and version numbers all in alignment between builds
are what led us back to Maven to control all this.
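
As a small illustration of the alignment problem, a build can refuse to
proceed when the pieces that must track the Solr version disagree. This is a
hedged sketch rather than anything from the Maven build itself: the component
names are hypothetical and the versions are just the two mentioned above.

```python
def misaligned(solr_version, pinned):
    """Return the pinned components whose version differs from solr_version."""
    return {name: v for name, v in pinned.items() if v != solr_version}


def require_aligned(solr_version, pinned):
    """Raise ValueError if any pinned component is on a different version."""
    bad = misaligned(solr_version, pinned)
    if bad:
        details = ", ".join(f"{n}={v}" for n, v in sorted(bad.items()))
        raise ValueError(f"expected {solr_version} everywhere, found: {details}")


# Hypothetical component names; the server and the UI assets must match.
require_aligned("6.2.1", {"solr-core": "6.2.1", "webapp-assets": "6.2.1"})
```

Failing fast like this is the scripted equivalent of the build breaking as
soon as someone bumps one version but forgets another.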


On 19 October 2016 at 03:15, Timothy Rodriguez (BLOOMBERG/ 120 PARK) <> wrote:

> That'd be a helpful step. I think it'd be even better if there was a way
> to generate somewhat customized versions of solr from the artifacts that
> are published already. Publishing the whole zip would be a start,
> downstream builds could add logic to resolve it, explode, tweak, and
> re-publish. To maintain the strict separation from the war, it might be
> helpful to have a lib or "plugin-ins" folder in the zip that is by default
> loaded to the classpath as an extension point for users who are re-building
> the package?
> -Tim
> From: At: 10/18/16 09:52:42
> To:
> Subject: Re: Building a Solr cluster with Maven
> My team has modified the ant scripts to publish all the jars/poms and the
> zip to our local artifactory when we run our build. We have another project
> which pulls down all of these dependencies including the zip to build our
> actual solr deploy and a maven assembly which unpacks the zip file and
> extracts all of the webapp for our real distribution.
> I haven't upstreamed the changes for the ant tasks thinking there wouldn't
> be too much interest in that, but I could put together a patch if there is.
> The changes do the following:
> - Packages the zip along with the parent pom if a flag is set
> - Allows changing group which the poms are published to. For example
> instead of org.apache you can push it as to avoid shadowing
> conflicts in your local repository.
> On Tue, Oct 18, 2016 at 8:42 AM David Smiley <>
> wrote:
>> Thanks for bringing this up, Greg.  I too have felt the pain of this in
>> the move away from a WAR file in a project or two.  In one of the projects
>> that comes to mind, we built scripts that re-constituted a Solr
>> distribution from artifacts in Maven. For anything that wasn't in Maven
>> (e.g. the admin UI pages, Jetty configs), we checked it into source
>> control.  In hindsight... the simplicity of what you list as (1) -- check
>> the distro zip into a Maven repo local to the organization sounds better...
>> but I may be forgetting requirements that led us not to do this.  I look
>> forward to that zip shrinking once the docs are gone.  Another option,
>> depending on one's needs, is to pursue Docker, which I've lately become a
>> huge fan of.  I think Docker is particularly great for integration tests.
>> Does the scenario you wish to use the assets for relate to testing or some
>> other use-case?
>> ~ David
>> On Mon, Oct 17, 2016 at 7:58 PM Greg Pendlebury <> wrote:
>> Are there any developers with a current working maven build for a
>> downstream Solr installation? ie. Not a build for Solr itself, but a build
>> that brings in the core Solr server plus local plugins, third party plugins
>> etc?
>> I am in the process of updating one of our old builds (it builds both the
>> application and various shard instances) and have hit a stumbling block in
>> sourcing the dashboard static assets (everything under /webapp/web in
>> Solr's source).
>> Prior to the move away from being a webapp I could get them by exploding
>> the war from Maven Central.
>> In our very first foray into 5.x we had a local custom build to patch
>> SOLR-2649. We avoided solving this problem then by pushing the webapp into
>> our local Nexus as part of that build... but that wasn't a very good long
>> term choice.
>> So now I'm trying to work out the best long term approach to take here.
>> Ideas so far:
>>    1. Manually download the required zip and add it into our Nexus
>>    repository as a 3rd party artifact. Maven can source and extract anything
>>    it needs from here. This is where I'm currently leaning for simplicity, but
>>    the manual step required is annoying. It does have the advantage of causing
>>    a build failure straight away when a version upgrade occurs, prompting the
>>    developer to look into why.
>>    2. Move a copy of the static assets for the dashboard into our
>>    project and deploy them ourselves. This has the advantage of aligning our
>>    approach with the resources we already maintain in the project (like
>>, schema.xml, solrconfig.xml, logging etc.). But I am
>>    worried that it is really fragile and developers will miss it during a
>>    version upgrade, resulting in the dashboard creeping out-of-date and
>>    (worse) introducing subtle bugs because of a version mismatch between the
>>    UI and the underlying server code.
>>    3. I'd like to think a long term approach would be for the core Solr
>>    build to ship a JAR (or any other assembly) to Maven Central like
>>    'solr-dashboard'... but I'm not sure how that aligns with the move away
>>    from Solr being considered a webapp. It seems a shame that all of the Java
>>    code ends up in Maven central, but the web layer dead-ends in the ant build.
>> I might be missing something really obvious and there is already a way to
>> do this. Is there some other distribution of the dashboard statics? Other
>> than the downloadable zip that is.
>> Ta,
>> Greg
>> --
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: | Book: http://www.
