Wolf Siberski writes (3/31/2005 1:54 AM):
> As some time has passed now since I submitted the Multisearcher
> patch, and no objections have been raised, I would like to ask
> to commit it now. I have put substantial effort into it, and my
> concern is that conflicts with newer patches will emerge if
> the commit is delayed further.
I don't get a vote, but if I did it would be:
+1
I filed the original bug, prepared the first weak attempt at a patch,
and participated in the design discussion that led to Wolf fixing the
problem. I haven't tried running this patch, but did just read it, and
it looks solid.
As we discussed in the design, this seems the best first solution as it
is fairly easy to assert its correctness. Going forward, I would like
to see some performance measurements and expect we'll want to introduce
some optimizations, the most important of which is to cache the
cumulative docFreqs for a large number of terms in a scope much larger
than a single query. This is more difficult, as it would require some
type of coordination with the indexing processes on the remote nodes.
For a large index, though, in most real cases there would be no need
to keep these completely synchronized, as instantaneous changes in
docFreqs are not very important to the relevance ranking; some kind of
periodic synchronization approach, analogous to optimizing indexes,
would be quite sufficient.
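
For illustration only, here is a minimal sketch of the caching idea
described above: cumulative docFreqs are kept across queries and
discarded on a periodic refresh rather than after every index change.
The names DocFreqCache, cumulativeLookup and refresh are hypothetical
and not part of Lucene or of Wolf's patch; the lookup function stands
in for whatever sums docFreq over the remote nodes. It is written
against a current JDK, not the one in use when this was posted.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToIntFunction;

/** Hypothetical cross-query cache of cumulative docFreqs (not Lucene API). */
public class DocFreqCache {
    // Assumed callback: returns the docFreq of a term summed over all nodes.
    private final ToIntFunction<String> cumulativeLookup;

    // Replaced wholesale on refresh, so readers never see a half-cleared map.
    private volatile ConcurrentHashMap<String, Integer> cache =
            new ConcurrentHashMap<>();

    public DocFreqCache(ToIntFunction<String> cumulativeLookup) {
        this.cumulativeLookup = cumulativeLookup;
    }

    /** Returns the cached value, asking the nodes only on a cache miss. */
    public int docFreq(String term) {
        return cache.computeIfAbsent(term, t -> cumulativeLookup.applyAsInt(t));
    }

    /** Periodic synchronization point: drop all cached values at once. */
    public void refresh() {
        cache = new ConcurrentHashMap<>();
    }
}
```

Between calls to refresh() the cached values may be slightly stale,
which is the trade-off the paragraph above argues is acceptable for
relevance ranking on a large index.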
A more minor point is that allocations could be reduced in
MultiSearcher.prepareWeight (e.g., a simple one is to use a single
HashMap rather than a HashSet and a HashMap for computing the cumulative
docFreqs for all query terms). But again, I think Wolf did the right
thing in creating the easily-validated correct implementation as the
first step.
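
As a rough sketch of the single-HashMap suggestion above (this is not
the code in MultiSearcher.prepareWeight), one map can serve both to
de-duplicate the query terms and to hold their cumulative docFreqs.
CumulativeDocFreq and DocFreqSource are hypothetical names, with
DocFreqSource standing in for one sub-searcher, and terms are shown as
plain strings for brevity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CumulativeDocFreq {
    /** Hypothetical stand-in for one sub-searcher; not a Lucene interface. */
    public interface DocFreqSource {
        int docFreq(String term);
    }

    /**
     * Sums docFreq over all sources for each distinct query term.
     * The map's key set doubles as the "already seen" set, so no
     * separate HashSet is allocated.
     */
    public static Map<String, Integer> aggregate(Iterable<String> queryTerms,
                                                 List<DocFreqSource> sources) {
        Map<String, Integer> totals = new HashMap<>();
        for (String term : queryTerms) {
            if (totals.containsKey(term)) {
                continue; // duplicate term, already summed
            }
            int sum = 0;
            for (DocFreqSource source : sources) {
                sum += source.docFreq(term);
            }
            totals.put(term, sum);
        }
        return totals;
    }
}
```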
I'm sorry to have taken so long to review this. I hope to use it within
the next 3 weeks on a scalability benchmark and will report back the
results. Please do commit it as Wolf requests so that it gets synced up
with other activities. E.g., the changes to BooleanQuery will need to
be integrated with Paul's work assuming that gets committed as well.
Thanks,
Chuck