lucene-dev mailing list archives
From Michael McCandless <>
Subject Re: Payloads and TrieRangeQuery
Date Sat, 13 Jun 2009 12:58:45 GMT
OK, good points Grant.  I now agree that it's not a simple task,
moving core stuff from Solr -> Lucene.  So summing this all up:

  * Some feel Lucene should only aim to be the core "expert" engine
    used by Solr/Nutch/etc., so things like moving trie to core (with
    consumable naming, good defaults, etc.) are near zero priority.

    While I see & agree that this is indeed what Solr needs of Lucene,
    I still think direct consumability of Lucene is important and
    Lucene should try to have a consumable API, good names for classes
    and methods, good defaults, etc.

    And I don't see those two goals as being in conflict (ie, I don't
    see Lucene having a consumable API as preventing Solr from using
    Lucene's advanced APIs), except for the fact that we all have
    limited time.

  * We have two communities.  Each has its own goal (to make its
    product good), its own committers, etc.  While technically we
    seem to agree that certain things (function queries, NumberUtils,
    highlighters, analyzers, faceted nav, etc.) logically "belong" as
    Lucene modules, the logistics and work required and different
    requirements (both one time, and ongoing) are in fact sizable.

    Perhaps once Lucene "modularizes", in the future, such
    consolidation may be easier, ie if/once there are committers
    focused on "analyzers" I could see them helping out all
    around in pulling all analyzers together.

  * We all are obviously busy and there are more important things to
    work on than "shuffling stuff around".

So now I'm off to scrutinize LUCENE-1313... :)


On Fri, Jun 12, 2009 at 5:33 PM, Grant Ingersoll<> wrote:
> On Jun 12, 2009, at 12:20 PM, Michael McCandless wrote:
>> On Thu, Jun 11, 2009 at 4:58 PM, Yonik Seeley<>
>> wrote:
>>> In Solr land we can quickly hack something together, spend some time
>>> thinking about the external HTTP interface, and immediately make it
>>> available to users (those using nightlies at least).  It would be a
>>> huge burden to say to Solr that anything of interest to the Lucene
>>> community should be pulled out into a module that Solr should then
>>> use.
>> Sure, new and exciting things should still stay private to Solr...
>>> As a separate project, Solr is (and should be) free to follow
>>> what's in its own best interest.
>> Of course!
>> I see your point, that moving things down into Lucene is added cost:
>> we have to get consensus that it's a good thing to move (but should
>> not be hard for many things), do all the mechanics to "transplant" the
>> code, take Lucene's "different" requirements into account (that the
>> consumability & stability of the Java API is important), etc.
> The problem traditionally has been that people only do the work one way.
> That is, they take it from Solr, but then they never submit patches to Solr
> to use the version in Lucene.  And, since many of the Lucene committers are
> not Solr committers, even if they do the Solr work, they can't see it
> through.
> It seems all the pure Lucene devs want the functionality of Solr, but they
> don't want to do any of the work to remove the duplication from Solr.
>  Additionally, it is often the case that by the time it gets into Lucene,
> some Solr user has come along and improved the Solr version.  The Function
> stuff is example numero uno.
> Wearing my PMC hat, I'd say if people are going to be moving stuff around
> like this, then they better be keeping Solr up to date, too, because it is
> otherwise creating a lot of work for Solr to the detriment of it (because
> that time could be spent doing other things).  Still, I don't think that is
> all that worthwhile, as it will just create a ton of extra work.  People who
> want Solr stuff are free to pull what they need into their project.  There
> is absolutely nothing stopping them.
> And the fact is, that no matter how much is pulled out of Solr, people will
> still contribute things to Solr because it is its own community and is
> fairly autonomous, a few committers that cross over notwithstanding.  I'd
> venture a fair number of Solr committers know little about Lucene internals.
>  Heck, given the amount of work you do, Mike, I'd say a fair number of
> Lucene committers know very little about the internals of Lucene anymore.
>  It has been good to see you over in Solr land at least watching what is
> going on there to at least help coordinate when Solr finds Lucene errors.
>> But, there is a huge benefit to having it in Lucene: you get a wider
>> community involved to help further improve it, you make Lucene
>> stronger which improves its & Solr's adoption, etc.
> That is not always the case.  Pushing things into Lucene from Solr makes it
> harder for Solr committers to do their work, unless you are proposing that
> all Solr committers should be Lucene committers.
> As for adoption, most people probably should just be starting with Solr
> anyway.  The fact is that every Lucene committer to the tee will tell you
> that they have built something that more or less looks like Solr.  Lucene is
> great as a low-level Vector Space implementation with some nice contribs,
> but much of the interesting stuff in search these days happens at the layer
> up (and arguably even a layer above that in terms of UI and intelligent
> search, etc).  In Lucene PMC land, that area is Solr and Nutch.  My personal
> opinion is that Lucene should focus on being a really fast, core search
> library and that the outlet for the higher level stuff is in Solr and Nutch.
>  It is usually obvious when things belong in the core, because people bring
> them up in the appropriate place (there are some rare exceptions, that you
> have mentioned).
> -Grant