I think it is a good idea, but I see some potential complexity for “deployment” of collections. For instance, in environments where Solr is used as a shared platform among several stakeholders, every time you deploy or modify a collection you need to make sure that the platform types it refers to exist. If a type exists in the Test environment, then I need to make sure that it exists in acceptance/production as well. The problem is that the platform type could have been defined by somebody else who has not yet updated the other environments (e.g. due to project/sprint delays). Another issue arises if I move to another Solr cluster in the same environment: I then have to make sure that all platform types move with me.
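To make the deployment concern concrete, here is a minimal sketch of the check I would end up running before each deployment. It assumes the two dictionaries are the parsed JSON responses of the Schema API's `fields` and `fieldtypes` listings (with their `fields` and `fieldTypes` keys); the function name and the sample data are made up for illustration.

```python
def missing_field_types(fields_response, fieldtypes_response):
    """Return the sorted names of types that fields refer to but the target schema lacks.

    Both arguments are assumed to be already-parsed Schema API JSON responses:
    one listing the fields of the collection being deployed, the other listing
    the field types available in the target environment.
    """
    defined = {ft["name"] for ft in fieldtypes_response.get("fieldTypes", [])}
    referenced = {f["type"] for f in fields_response.get("fields", [])}
    return sorted(referenced - defined)


# Hand-written sample data (not real output from any cluster).
test_fields = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "_nest_path_", "type": "_nest_path_"},
    ]
}
acceptance_types = {
    "fieldTypes": [
        {"name": "string", "class": "solr.StrField"},
    ]
}

# The collection from Test refers to a type that acceptance does not define yet.
print(missing_field_types(test_fields, acceptance_types))
```

With types living inside each collection's own schema, this list is empty by construction; with shared platform types, somebody has to keep it empty in every environment and cluster.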
A (minor) issue is that platform types may change (for whatever reason), and then either all collections potentially have to be reindexed or we end up with different versions of the same platform type, which does not make things easier.
Currently we have all our Schema definitions in a version management system (we use the Schema API, but the JSON requests are stored there) so that projects can draw inspiration from each other. Needless to say, careful type engineering also requires some documentation on the technical design, and may indeed be very Collection specific.
Another issue could be that a platform type may also imply a certain platform solrconfig.xml (e.g. a lib directive).
I am not sure yet what the exact benefits are of referring to types of other collections in the Solr runtime itself, instead of having a version system and letting projects decide if they want to adapt types of other collections, but maybe I am overlooking something here.
While working on https://issues.apache.org/jira/browse/SOLR-12768 it occurred to me that it would be nice if Solr had implicitly defined field types. This would allow you to define a field in your schema that refers to a type that is not also in your schema -- at least not explicitly (it need not be put in your schema.xml if classic, nor passed to the schema manipulation API if you use that). The idea would be that these types would be Solr platform provided field types that need not be defined by you.
There are multiple ways this loose idea might be developed into a concrete proposal.
(A) The main idea I'm kicking around right now is that Solr would _not_ throw an error when it reads a field definition whose type it doesn't find in your schema... instead it would see that it's a platform type (via some built-in hard-coded registry) and then register that type on the fly. So if you were to read the schema afterwards, you'd see the type there. In this way, it's kind of a shortcut. Platform field types that you don't actually refer to will never end up being put into your schema.
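A rough sketch of the lookup logic in (A), written in Python purely for illustration. This is not Solr code: `PLATFORM_TYPES`, the `Schema` class, and the type definitions in the registry are all hypothetical placeholders.

```python
# Hypothetical built-in registry of platform field types.
# The definitions are placeholders, not a proposal for the real ones.
PLATFORM_TYPES = {
    "_nest_path_": {"name": "_nest_path_", "class": "solr.NestPathField"},
    "int": {"name": "int", "class": "solr.IntPointField"},
}


class Schema:
    """Toy schema holding explicitly defined types plus lazily registered ones."""

    def __init__(self, field_types=None):
        self.field_types = dict(field_types or {})
        self.fields = {}

    def add_field(self, name, type_name):
        if type_name not in self.field_types:
            # Instead of failing right away, consult the platform registry.
            platform_def = PLATFORM_TYPES.get(type_name)
            if platform_def is None:
                raise ValueError(f"Unknown field type: {type_name!r}")
            # Register a copy on the fly so it shows up when the schema is read.
            self.field_types[type_name] = dict(platform_def)
        self.fields[name] = {"name": name, "type": type_name}


schema = Schema()
schema.add_field("_nest_path_", "_nest_path_")

# Only the platform type that was actually referenced ends up in the schema.
print(sorted(schema.field_types))
```

A type defined explicitly in the schema is found first in this sketch, so it would take precedence over a platform type of the same name; whether that is the desired behavior is an open design question.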
(B) A schema could pre-initialize with the platform/implicit types. This is the simplest idea but I don't like it because you may not even need some of these types. I'm not going to go down this path now but wanted to mention it.
I'm exploring (A) right now... I'm hoping to do this for at least a "_nest_path_" field in support of nested documents in 8.0, but conceivably the idea would be expanded to lots of things in our base schema right now (int, str, etc.)
Lucene/Solr Search Committer (PMC), Developer, Author, Speaker