lucene-java-user mailing list archives

From Nicola Buso <nb...@ebi.ac.uk>
Subject Re: TermInSetQuery keep terms order in results
Date Mon, 02 Jul 2018 10:16:36 GMT
Hi Uwe,

as I said, the sorting is calculated elsewhere upfront, and the terms are
provided to Lucene in the calculated order (in any case as an unordered
Set, as far as the query API is concerned).

I would like an API that keeps the input order; otherwise I will end up
with the usual problem that I can't re-order afterwards, because accessing
the results in a paginated way makes this operation impossible.
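
To make the requirement concrete, here is a minimal plain-Java sketch of
the boost-by-position idea Uwe suggested earlier in the thread (boosted
TermQuery clauses). It uses no Lucene classes: the class
InputOrderRanking, its two methods and the docTerms map are hypothetical
stand-ins, and each document is assumed to carry exactly one term value.
Because the order is expressed as a score, each page can be cut from the
sorted hits without re-ordering afterwards.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch (no Lucene classes): derive a score from input position. */
class InputOrderRanking {

    /** Maps each term to a boost so that earlier terms get larger values. */
    static Map<String, Float> boostsByInputOrder(List<String> orderedTerms) {
        Map<String, Float> boosts = new HashMap<>();
        int n = orderedTerms.size();
        for (int i = 0; i < n; i++) {
            // First term gets n, last term gets 1; a repeated term keeps its first boost.
            boosts.putIfAbsent(orderedTerms.get(i), (float) (n - i));
        }
        return boosts;
    }

    /**
     * Returns one page of document ids ordered by descending boost.
     * docTerms is a hypothetical stand-in for the index: doc id -> its term value.
     */
    static List<Integer> page(Map<Integer, String> docTerms, Map<String, Float> boosts,
                              int offset, int size) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, String> e : docTerms.entrySet()) {
            // Keep only documents whose term is in the input set (the "IN" filter).
            if (boosts.containsKey(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        // Higher boost first; ties fall back to ascending doc id (index order).
        hits.sort((a, b) -> {
            int byBoost = Float.compare(boosts.get(docTerms.get(b)), boosts.get(docTerms.get(a)));
            return byBoost != 0 ? byBoost : Integer.compare(a, b);
        });
        int from = Math.min(offset, hits.size());
        int to = Math.min(offset + size, hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }
}
```

Calling page(docTerms, boosts, 0, 10) and then page(docTerms, boosts, 10, 10)
walks the hits in input-term order. The sketch does not model how a real
index computes scores, and with thousands of terms a BooleanQuery built
this way would also have to respect the max clause count.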


Nicola

On Mon, 2018-06-25 at 21:49 +0200, Uwe Schindler wrote:
> Hi Nicola,
> 
> if you sort it elsewhere, why do you care about the sort order then?
> What you see as a result is simple: as there is nothing available for
> scoring, a constant score query returns the results in index order.
> That is intended. There is no way to change this "default" order for a
> TermInSetQuery because it lacks the information to do so.
> 
> Uwe
> 
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> > -----Original Message-----
> > From: Nicola Buso <nbuso@ebi.ac.uk>
> > Sent: Monday, June 25, 2018 5:09 PM
> > To: Uwe Schindler <uwe@thetaphi.de>; java-user@lucene.apache.org
> > Subject: Re: TermInSetQuery keep terms order in results
> > 
> > Hi Uwe,
> > 
> > thanks for the reply. TermInSetQuery covers most of my use case:
> > - thousands of term values (even 100,000)
> > - no need for scoring, because it's calculated elsewhere
> > - intersect with normal full text query for further filtering
> > 
> > If I use TermQuery, do I risk hitting the
> > BooleanQuery.getMaxClauseCount() limit?
> > 
> > Cheers,
> > 
> > 
> > Nicola
> > 
> > 
> > 
> > On Mon, 2018-06-25 at 16:52 +0200, Uwe Schindler wrote:
> > > Hi,
> > > 
> > > > the TermInSetQuery is a so-called Constant Score Query. It is more
> > > > meant as a filter, so you would need some "real" fulltext query in
> > > > parallel. See the term-in-set query more like the SQL "IN" operator.
> > > > It can be used to pass lots of identifiers to filter results (e.g.
> > > > when you apply access rights or group policies for filtering users
> > > > to your main query as a filter).
> > > > 
> > > > As it is a "set", which is by default unordered, the order of terms
> > > > in the set is undefined. Internally TermInSetQuery reorders the
> > > > terms to improve processing speed.
> > > > 
> > > > If you need scoring, use TermQuery wrapped by a BooleanQuery. Then
> > > > you can apply some boosts to some terms to improve order (e.g. boost
> > > > term queries coming first) and apply on a field without norms.
> > > > 
> > > > TermInSetQuery is fast because it neglects scoring and is just good
> > > > at intersecting the terms dict with the given terms set.
> > > 
> > > Uwe
> > > 
> > > -----
> > > Uwe Schindler
> > > Achterdiek 19, D-28357 Bremen
> > > http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > > 
> > > > -----Original Message-----
> > > > From: Nicola Buso <nbuso@ebi.ac.uk>
> > > > Sent: Monday, June 25, 2018 1:23 PM
> > > > To: java-user@lucene.apache.org
> > > > Subject: TermInSetQuery keep terms order in results
> > > > 
> > > > Hi,
> > > > 
> > > > > I need to use the TermInSetQuery, but I would like to keep the
> > > > > results sorted according to the order of the term set provided.
> > > > > Currently it seems to return results in the order the documents
> > > > > were inserted into the index.
> > > > > 
> > > > > Is this already implemented somewhere, or do I need to implement a
> > > > > CustomScoreQuery to calculate this score?
> > > > 
> > > > Cheers,
> > > > 
> > > > 
> > > > Nicola
> > > > 
> > > > 
> > > > --
> > > > Nicola Buso <nbuso@ebi.ac.uk>
> > > > EMBL-EBI
> > > > 
> > > > -------------------------------------------------------------
> > > > ----
> > > > ----
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.o
> > > > rg
> > > 
> > > 
> > 
> 
> 
-- 
Nicola Buso <nbuso@ebi.ac.uk>
EMBL-EBI


