lucene-java-user mailing list archives

From Atri Sharma
Subject Re: Impact and WAND
Date Thu, 11 Jul 2019 07:04:26 GMT
Note that the other scoring modes (COMPLETE and COMPLETE_NO_SCORES)
must visit all hits, so there is no scope for skipping and hence
no point in using impacts.
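
The relationship between the three modes can be written down as a small table. The sketch below is a hypothetical mirror of the modes named in this thread, not Lucene's own `ScoreMode` enum; the `exhaustive` flag and the `canUseImpacts()` helper are names assumed here for illustration only.

```java
// Hypothetical mirror of the three scoring modes discussed in this thread.
// This is NOT Lucene's ScoreMode; field and method names are illustrative.
enum ScoreModeSketch {
    COMPLETE(true, true),            // scores needed, every hit must be visited
    COMPLETE_NO_SCORES(false, true), // no scores needed, every hit must be visited
    TOP_SCORES(true, false);         // scores needed, non-competitive hits may be skipped

    final boolean needsScores;
    final boolean exhaustive;

    ScoreModeSketch(boolean needsScores, boolean exhaustive) {
        this.needsScores = needsScores;
        this.exhaustive = exhaustive;
    }

    // Impacts only pay off when scores matter and skipping is permitted.
    boolean canUseImpacts() {
        return needsScores && !exhaustive;
    }
}
```

Under these assumptions only `TOP_SCORES` returns true from `canUseImpacts()`, which matches the point made above.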

On Thu, Jul 11, 2019 at 8:51 AM Wu,Yunfeng wrote:
> @Adrien Grand: thanks. Your explanation of `skip low-scoring matches` is great.
> I looked up some docs and inspected some related code.
> I noticed that `block-max WAND` mode only works when ScoreMode.TOP_SCORES is
> used, is that right? (A basic TermQuery would generate an ImpactsDISI when
> scoreMode is TOP_SCORES.)
> Lucene computes the max score per block and caches it in `MaxScoreCache`.
> This means we can skip low-scoring blocks (currently one block is 128 doc
> IDs), while in a competitive block we still need to score every doc ID we
> see. I am confused by `MaxScoreCache#getMaxScoreForLevel(int level)`: what
> does the level mean? The skip level? (Somewhere this method is invoked with
> an integer `upTo` param.)
> Thanks Lucene Team
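
The block-skipping behaviour described in this message can be illustrated with a toy model. This is a minimal sketch, not Lucene code: `BlockMaxSkipSketch` and `countScoredDocs` are hypothetical names, the postings list is reduced to an array of precomputed scores, and the per-block maximum is found by scanning, whereas a real index would derive it from data stored at index time.

```java
import java.util.PriorityQueue;

// Toy model of block-max skipping over a single postings list.
final class BlockMaxSkipSketch {
    static final int BLOCK_SIZE = 128; // block size mentioned in the thread

    /** Returns how many documents had to be scored to collect the top k. */
    static int countScoredDocs(float[] scores, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        // Min-heap: the head is the weakest score currently in the top k.
        PriorityQueue<Float> topK = new PriorityQueue<>();
        int scored = 0;
        for (int start = 0; start < scores.length; start += BLOCK_SIZE) {
            int end = Math.min(start + BLOCK_SIZE, scores.length);
            // Assumption: a real index knows this bound without reading the block.
            float blockMax = Float.NEGATIVE_INFINITY;
            for (int doc = start; doc < end; doc++) {
                blockMax = Math.max(blockMax, scores[doc]);
            }
            // Skip the whole block if nothing in it can enter the top k.
            if (topK.size() == k && blockMax <= topK.peek()) {
                continue;
            }
            // Competitive block: every document in it is still scored.
            for (int doc = start; doc < end; doc++) {
                scored++;
                if (topK.size() < k) {
                    topK.add(scores[doc]);
                } else if (scores[doc] > topK.peek()) {
                    topK.poll();
                    topK.add(scores[doc]);
                }
            }
        }
        return scored;
    }
}
```

Note that a block is only skipped, never the rest of the list: a later block with a higher maximum is still visited, so a competitive hit at the very end of the index is not lost.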
> On Jul 10, 2019, at 10:52 PM, Adrien Grand wrote:
> To clarify, the scoring process is not accelerated because we
> terminate early but because we can skip low-scoring matches (there
> might be competitive hits at the very end of the index).
> CompetitiveImpactAccumulator is indeed related to WAND. It helps store
> the maximum score impacts per block of documents in postings lists.
> Then this information is leveraged by block-max WAND in order to skip
> low-scoring blocks.
> This does indeed help avoid reading norms, but also document IDs and
> term frequencies.
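
One way to picture what "competitive" (freq, norm) pairs means is as a dominance filter. The sketch below is not Lucene's `CompetitiveImpactAccumulator`; `CompetitivePairsSketch`, `Pair` and `Scorer` are hypothetical names. It assumes that the score never decreases when freq grows and never increases when norm grows, and that norms are non-negative, so a pair is dropped whenever another pair has at least its freq and at most its norm.

```java
import java.util.ArrayList;
import java.util.List;

// Keeps only the (freq, norm) pairs that could yield the highest score.
final class CompetitivePairsSketch {
    /** Hypothetical stand-in for a stored impact. */
    static final class Pair {
        final int freq;
        final long norm;

        Pair(int freq, long norm) {
            this.freq = freq;
            this.norm = norm;
        }
    }

    /** Hypothetical scoring function; must be monotonic as assumed above. */
    interface Scorer {
        double score(int freq, long norm);
    }

    private final List<Pair> frontier = new ArrayList<>();

    // a dominates b if a scores at least as high as b under the assumption.
    private static boolean dominates(Pair a, Pair b) {
        return a.freq >= b.freq && a.norm <= b.norm;
    }

    void add(int freq, long norm) {
        Pair candidate = new Pair(freq, norm);
        for (Pair kept : frontier) {
            if (dominates(kept, candidate)) {
                return; // candidate can never produce the maximum score
            }
        }
        frontier.removeIf(kept -> dominates(candidate, kept));
        frontier.add(candidate);
    }

    List<Pair> competitivePairs() {
        return frontier;
    }

    /** Upper bound on the score of any document whose pair was added. */
    double maxScore(Scorer scorer) {
        double max = Double.NEGATIVE_INFINITY;
        for (Pair p : frontier) {
            max = Math.max(max, scorer.score(p.freq, p.norm));
        }
        return max;
    }
}
```

Adding (1, 10), (3, 10), (2, 5) and (2, 8) in that order leaves two pairs, (3, 10) and (2, 5). The block's maximum score can then be computed from those two pairs alone, without looking up the norm of any individual document, which is the saving discussed above.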
> On Wed, Jul 10, 2019 at 4:10 PM Wu,Yunfeng wrote:
> Hi,
> We discussed this topic elsewhere and, as Atri Sharma proposed, are raising
> it with the java dev list.
> Are the impact `frequency` and `norm` there just to accelerate the `score
> process`, which can then `terminate early`?
> In impact mode, `CompetitiveImpactAccumulator` records (freq, norm) pairs,
> which are stored at the index level. I also noted that
> `CompetitiveImpactAccumulator` is commented with `This class accumulates the
> (freq, norm) pairs that may produce competitive scores`; is it maybe related
> to WAND?
> The norm value is the one produced or consumed by `Lucene80NormsFormat`.
> Does this `impact way` let us avoid reading norms from
> `Lucene80NormsProducer`, which may generate extra IO (Lucene stores the norm
> value twice), and take full advantage of the WAND method?
> --
> Adrien
