lucene-dev mailing list archives

From David Smiley <>
Subject Re: Feature: Solr implicitly defined field types?
Date Sun, 30 Dec 2018 06:01:46 GMT
Thanks for your thoughtful response Jörn!
On Sat, Dec 29, 2018 at 4:14 AM Jörn Franke <> wrote:

> I think it is a good idea, but I see some potential complexity for
> “deployment” of collections. For instance, in environments where Solr is
> used as a shared platform amongst several stakeholders, every time you
> deploy/modify a collection you need to take care that the platform types
> exist. If it exists in the Test environment then I need to make sure that
> it exists as well in acceptance/production. The problem is that the
> platform type could have been defined by somebody else who has not yet (e.g.
> due to project/sprint delays) updated the other environments. Another
> issue is if I move to another Solr cluster in the same environment. Then, I
> have to make sure that all platform types move with me.

RE "the platform type could have been defined by somebody else":  I'm not
imagining it'd be configurable, thus the "somebody else" is the Solr

Otherwise, I think I get your point, but perhaps I don't.  It's the same
point for *any* use of some new feature of Solr.  If you use some new
feature, you have to take care that all Solr instances you deploy your
configuration to can handle that new feature.  That's a fairly generic
point that would apply to just about anything in Solr.

> A (minor) issue is that platform types may change (for whatever reasons)
> and that then potentially all collections have to be reindexed or we have
> different versions of the same platform type making things not easier.

Yes, it's possible, though I think that point is separate from the feature I
propose.  You're saying that you might want to use an "int" field and then
one day realize you want some newer/better definition of what an "int" is
(e.g. trie -> points).  Sure.  That's true whether the field type is
explicit or implicit.  There's nothing stopping you from explicitly
defining the field type if you want to; the names would not be reserved.  If
you want to keep your current index while running a newer Solr version,
then you would leave luceneMatchVersion as it was, which would effectively
retain the interpretation of the implicit field types.
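
To make that resolution order concrete, here is a minimal sketch in Python
(not Solr's Java).  Everything in it is an assumption for illustration: the
registry, the resolve_type helper, and the version cutoffs are hypothetical
and not part of any Solr API.  Only the two field type class names are
real.  It shows an explicit definition taking precedence, and
luceneMatchVersion selecting which built-in meaning of "int" applies.

```python
# Hypothetical sketch: the registry and function are illustrative, not Solr APIs.
# Built-in definitions keyed by type name. Each entry is a list of
# (minimum luceneMatchVersion, field type class), oldest first.
# The version cutoffs below are assumed for the example.
IMPLICIT_TYPES = {
    "int": [((6, 0), "solr.TrieIntField"), ((7, 0), "solr.IntPointField")],
}


def resolve_type(name, explicit_types, match_version):
    """Return the class for a field type name, or raise KeyError if unknown."""
    # An explicit definition in the schema always wins; names are not reserved.
    if name in explicit_types:
        return explicit_types[name]
    # Otherwise pick the newest built-in definition not newer than match_version.
    candidates = [cls for since, cls in IMPLICIT_TYPES[name] if since <= match_version]
    if not candidates:
        raise KeyError(name)
    return candidates[-1]
```

Under these assumptions, a config that keeps an older luceneMatchVersion
continues to resolve "int" to the trie-based class after an upgrade, while
a schema that defines "int" itself is unaffected either way.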

> Currently we have all our Schema definitions in a version management
> system (we use the Schema API but the JSON requests are out there) so that
> projects can take inspiration from each other. Needless to say, careful type
> engineering also requires some documentation on technical design and may
> indeed be very collection-specific.
> Another issue could be that a platform type may also imply a certain
> platform solrconfig.xml (eg lib directive etc).

I'm imagining platform types would be basic primitive types (int, boolean,
etc. and some special situations like in the issue I referenced).  They
would not depend on contrib libs... though I could imagine one day an
evolution of this in which a contrib could somehow auto-add implicit field

> I am not sure yet what are the exact benefits of referring to types of
> other collections in the Solr runtime itself instead of having a version
> system and letting projects decide if they want to adapt types of other
> collections, but maybe I am overlooking something here.

The notion of implicit field types is not a cross-config (cross-collection)
thing.  Implicit field types are nothing more than built-in shortcuts.
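
As a rough illustration of "built-in shortcut", here is a minimal Python
sketch (not Solr's Java) of a schema that registers a platform type the
first time a field refers to it.  The Schema class, the PLATFORM_TYPES
registry, and its contents are hypothetical names chosen for the example;
this is not how Solr is implemented.

```python
# Hypothetical sketch: class and registry names are illustrative, not Solr APIs.
PLATFORM_TYPES = {"int": "solr.IntPointField", "boolean": "solr.BoolField"}


class Schema:
    def __init__(self, explicit_types=None):
        # Types held by this schema only; nothing is shared across collections.
        self.field_types = dict(explicit_types or {})
        self.fields = {}

    def add_field(self, field_name, type_name):
        if type_name not in self.field_types:
            if type_name not in PLATFORM_TYPES:
                raise ValueError(f"unknown field type: {type_name}")
            # Register the platform type on first reference.
            self.field_types[type_name] = PLATFORM_TYPES[type_name]
        self.fields[field_name] = type_name
```

In this sketch each schema keeps its own copy of whatever it referenced, so
reading one schema back shows only the types it actually uses, and a second
schema is untouched.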

I recall that one of my very early reactions to Solr's schema was surprise
at seeing primitive types defined in it.  Consider SQL DDL statements that
refer to varchar and such.  Your DDL doesn't need to define what a varchar
is!

Happy New Year,
~ David

On 28.12.2018 at 17:36, David Smiley <> wrote:
> While working on it
> occurred to me that it would be nice if Solr had implicitly defined field
> types.  This would allow you to define a field in your schema that refers
> to a type that is *not* also in your schema -- at least not explicitly
> (need not explicitly be put in your schema.xml if classic, or need not be
> passed to schema manipulation API if you use that).  The idea would be that
> these types would be Solr platform provided field types that need not be
> defined by you.
> There are multiple ways this loose idea might be conceived / imagined into
> a concrete proposal.
> (A) The main idea I'm kicking around right now is that Solr would _not_
> throw an error at the moment of reading your field definition that it
> doesn't see your type... instead it would see it's a platform type (via
> some built-in hard-coded registry) and then register that type on the fly.
> So if you were to read the schema then you'd see it.  In this way, it's
> kind of a shortcut.  Platform field types that you don't actually refer to
> will never end up being put into your schema.
> (B) A schema could pre-initialize with the platform/implicit types.  This
> is the simplest idea but I don't like it because you may not even need some
> of these types.  I'm not going to go down this path now but wanted to
> mention it.
> I'm exploring (A) right now... I'm hoping to do this for at least a
> "_nest_path_"  field in support of nested documents in 8.0, but conceivably
> the idea would be expanded to lots of things in our base schema right now
> (int, str, etc.)
> --
> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
> --
Lucene/Solr Search Committer (PMC), Developer, Author, Speaker