lucene-dev mailing list archives

From Shawn Heisey
Subject Re: Lucene/Solr 7
Date Fri, 27 Jan 2017 00:48:54 GMT
On 1/26/2017 2:40 PM, Adrien Grand wrote:
> I don't think this statement is accurate. Why would it affect Lucene
> users if they started using points for new indices when they upgraded
> to Lucene 6?

If that's the situation, then there won't be an issue.  I suspect that
there are quite a few users that have extremely large indexes that are
difficult to reindex, and long-standing config/code using the legacy
types.  Those kinds of users will find that they cannot easily upgrade.

> There are two cases: either their current configuration uses points,
> which means the index was created with Lucene 6.0+, which will be fine
> with Lucene 7. Or the index uses legacy numerics, but that means the
> index was created with Lucene 5 so Lucene 7 cannot read it anyway.

No Solr 6.x users have points, because the capability isn't there yet.

Points were first available (in actual released code) with the release
of 6.0 ... but the legacy types were already deprecated before 6.0 was
released.  There was never any overlap where both types were considered
fully viable.  Perhaps Solr should have added points before the 6.0
release, but that didn't happen.

I don't expect users of Solr 5.x to be able to upgrade directly to
version 7, but users of 6.x should be able to.  Right now, that won't be
possible if there are numeric types in the index.  Most indexes have at
least one numeric type.

On my own installs, I never upgrade with an existing index.  I am able
to do this because I've arranged my Solr servers in such a way that I
can always completely rebuild one copy of my index from scratch while
another copy remains online and serving requests, kept current
independently of the rebuilding copy.  Each copy of the index is not
connected to the others in ANY way -- they can use entirely different
versions and entirely different configs if that's what I need.

Not all users have the luxury that I do.  Users with a typical
replicated SolrCloud 6.x will be faced with a situation where they
cannot do a rolling upgrade of their cloud to 7.x, which is going to
make the upgrade process ugly.  At some point they're going to have to
completely rebuild all of their indexes.  One SolrCloud user that I know
of has *five terabytes* of index data in SolrCloud.  Reindexing is a
logistical nightmare, and something that I'm sure they don't want to
combine with an upgrade.

The situation with ES is probably not quite as bad as what Solr is
facing, but some users will still be impacted.  The impact will be
lessened with one of the two ideas I mentioned.  An operation somewhere
in 6.x that can convert the legacy numeric fields in the index to points
would be REALLY good to have.  The configuration of course will need
changing to match the index.

The other solution, delaying removal until 8.0, doesn't sound like a bad
idea either.  Those old types are very widely used, and quite
fundamental, so keeping them around for an extra major version will
greatly reduce the amount of pain that users must endure when they upgrade.

This might be characterized as Lucene being punished for Solr's lack of
foresight.  I don't really agree with that characterization, but it's
not entirely wrong, either.  How much actual impact on development would
result from keeping the legacy types around a while longer?  Are there
changes planned that would be extremely difficult or impossible with
legacy code still present?

When the decision was made to combine the Lucene and Solr codebases, it
had to be known that there would be a certain amount of shared baggage. 
I'm not sure that I would have supported that decision if I'd had any
opportunity to comment, but it was made before I got involved in the code.  If our
community ever thinks about separating the two projects, I would support
that.  I'm sure it would be an enormous amount of work.

Historically, there was a similar situation in Solr where a whole bunch
of old numeric types were deprecated in 3.x, with preference for the
Trie types available much earlier.  Those older types were not actually
removed until 5.0.  Small difference from the situation we're now facing
-- I don't see any Lucene deprecations involved in that.  It was all
done on the Solr side.

