lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Read DocValue twice
Date Mon, 19 Feb 2018 17:55:50 GMT
Yes, this is the problem. This doc ID (2147483647, i.e.
DocIdSetIterator.NO_MORE_DOCS) is a special sentinel value that means
the iterator is exhausted. I don't have enough context to know what the
exact problem is, but there is a bug in your custom query.
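
For readers of the archive: the value 2147483647 in the log line quoted
below equals Integer.MAX_VALUE, which is also the value of Lucene's
DocIdSetIterator.NO_MORE_DOCS constant. The following is a minimal,
self-contained sketch (no Lucene dependency) of one plausible way to end up
with that value: advancing a forward-only iterator that is already
exhausted. ToyDocIterator and SentinelDemo are hypothetical stand-ins
invented for illustration, not Lucene classes, and this is not necessarily
what happens in the custom query discussed here.

```java
// Hypothetical stand-in for a forward-only doc ID iterator (NOT a Lucene class).
final class ToyDocIterator {
    // Same numeric value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // assumed to be sorted in ascending order
    private int pos = -1;     // -1 means "not positioned yet"

    ToyDocIterator(int... sortedDocs) {
        this.docs = sortedDocs.clone();
    }

    /** Current doc ID, -1 before the first advance, NO_MORE_DOCS once exhausted. */
    int docID() {
        if (pos < 0) {
            return -1;
        }
        return pos >= docs.length ? NO_MORE_DOCS : docs[pos];
    }

    /** Moves forward to the first doc >= target; never moves backwards. */
    int advance(int target) {
        if (pos < 0) {
            pos = 0;
        }
        while (pos < docs.length && docs[pos] < target) {
            pos++;
        }
        return docID();
    }
}

final class SentinelDemo {
    /** A doc ID is only usable for lookups if it lies in [0, maxDoc). */
    static boolean isValidDoc(int doc, int maxDoc) {
        return doc >= 0 && doc < maxDoc;
    }

    public static void main(String[] args) {
        int maxDoc = 67649; // value taken from the log line in this thread
        ToyDocIterator it = new ToyDocIterator(3, 10, 42);

        it.advance(100);          // the "scoring" pass runs the iterator to the end
        int doc = it.advance(10); // a later "explain" pass reuses the same iterator

        System.out.println("doc=" + doc + ", valid=" + isValidDoc(doc, maxDoc));
    }
}
```

In real code the equivalent guard would compare the doc ID against
DocIdSetIterator.NO_MORE_DOCS and reader.maxDoc() before passing it to
advanceExact.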

On Mon, Feb 19, 2018 at 16:07, Vadim Gindin <vgindin@detectum.com> wrote:

> I have a scorer that is similar to DisjunctionScorer.java, with these fields:
>
> private final DisiPriorityQueue subScorers;
> private final DisjunctionDISIApproximation approximation;
>
> They are initialized in the constructor like this:
>
>    this.subScorers = new DisiPriorityQueue(subScorers.size());
>    for (Scorer scorer : subScorers) {
>        final DisiWrapper w = new DisiWrapper(scorer);
>        this.subScorers.add(w);
>    }
>    this.approximation = new DisjunctionDISIApproximation(this.subScorers);
>
>
>
> I use them in score() and in explain(). In explain() I do
>
>    this.approximation.advance(doc);
>
> followed by the same code as in score(). I've also added logging, and
> here is one line of the output:
>
> explain: doc=2147483647, field=params, maxDoc=67649
>
> The doc value doesn't look right.
>
>
> On Mon, Feb 19, 2018 at 7:32 PM, Adrien Grand <jpountz@gmail.com> wrote:
>
> > Can you add some debug logging to see what the values of topList.doc and
> > reader.maxDoc() are before you call advanceExact?
> >
> > What do you mean by "I reuse the same DisiPriorityQueue of scorers in
> > score() and explain()"? This shouldn't be possible.
> >
> > On Mon, Feb 19, 2018 at 15:23, Vadim Gindin <vgindin@detectum.com> wrote:
> >
> > > I use these calls in both cases. In score() and explain() I have the
> > > following code:
> > >
> > > SortedNumericDocValues numDocVal = DocValues.getSortedNumeric(reader,
> > > fieldName);
> > > if (numDocVal != null && numDocVal.advanceExact(topList.doc)) {
> > >     long val = numDocVal.nextValue();
> > >
> > >     ..
> > > }
> > >
> > > I reuse the same DisiPriorityQueue of scorers in score() and explain().
> > >
> > > On Mon, Feb 19, 2018 at 6:54 PM, Adrien Grand <jpountz@gmail.com> wrote:
> > >
> > > > If you want to read the values again, you need to call setDocument
> > > > (Lucene < 7.0) or advanceExact (Lucene >= 7.0) before calling nextValue().
> > > >
> > > > On Mon, Feb 19, 2018 at 14:41, Vadim Gindin <vgindin@detectum.com> wrote:
> > > >
> > > > > Hi all
> > > > >
> > > > > I use DocValues in a scoring function, i.e. I have a column of
> > > > > integers that are used in the scoring formula. My scorer computes
> > > > > the scoring function twice:
> > > > > - in score()
> > > > > - in explain()
> > > > >
> > > > > I got the following error in explain():
> > > > >
> > > > > Caused by: java.lang.IndexOutOfBoundsException
> > > > >         at java.nio.Buffer.checkIndex(Buffer.java:540) ~[?:1.8.0_161]
> > > > >         at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:253) ~[?:1.8.0_161]
> > > > >         at org.apache.lucene.store.ByteBufferGuard.getByte(ByteBufferGuard.java:118) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
> > > > >         at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readByte(ByteBufferIndexInput.java:385) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
> > > > >         at org.apache.lucene.util.packed.DirectReader$DirectPackedReader8.get(DirectReader.java:145) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
> > > > >         at org.apache.lucene.codecs.lucene70.Lucene70DocValuesProducer$3.longValue(Lucene70DocValuesProducer.java:481) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
> > > > >         at org.apache.lucene.index.SingletonSortedNumericDocValues.nextValue(SingletonSortedNumericDocValues.java:73) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
> > > > >
> > > > > I've found the following comment in the source code of
> > > > > SortedNumericDocValues.java:
> > > > >
> > > > > /**
> > > > >  * Iterates to the next value in the current document.  Do not call
> > > > >  * this more than {@link #docValueCount} times for the document.
> > > > >  */
> > > > > public abstract long nextValue() throws IOException;
> > > > >
> > > > >
> > > > > Questions:
> > > > > 1) Why can't I read the values twice?
> > > > > 2) How can I manage this situation?
> > > > > 3) Can it work for NumericDocValues?
> > > > >
> > > > > Regards,
> > > > > Vadim Gindin
> > > > >
> > > >
> > >
> >
>
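
As a footnote to the advice earlier in the thread (call advanceExact before
nextValue() whenever a document's values are read again), here is a
minimal sketch of that contract. ToySortedNumericValues and ReadTwiceDemo
are hypothetical stand-ins written for this note and only mimic the method
names advanceExact, docValueCount and nextValue mentioned above; they are
not the Lucene implementation. In particular, the toy throws
IllegalStateException on an extra nextValue() call, whereas real Lucene
does not define what happens (in this thread it surfaced as an
IndexOutOfBoundsException).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of per-document sorted numeric values (NOT a Lucene class).
final class ToySortedNumericValues {
    private final Map<Integer, long[]> valuesByDoc;
    private long[] current = new long[0]; // values of the doc we are positioned on
    private int next = 0;                 // index of the next value to hand out

    ToySortedNumericValues(Map<Integer, long[]> valuesByDoc) {
        this.valuesByDoc = new HashMap<>(valuesByDoc);
    }

    /** Positions on doc and resets the value cursor; returns whether doc has values. */
    boolean advanceExact(int doc) {
        long[] values = valuesByDoc.get(doc);
        current = values == null ? new long[0] : values;
        next = 0;
        return current.length > 0;
    }

    /** Number of values available for the current document. */
    int docValueCount() {
        return current.length;
    }

    /** Returns the next value; must not be called more than docValueCount() times. */
    long nextValue() {
        if (next >= current.length) {
            throw new IllegalStateException("nextValue() called more than docValueCount() times");
        }
        return current[next++];
    }
}

final class ReadTwiceDemo {
    /** Re-positions before every read, so the same doc can be read repeatedly. */
    static long firstValue(ToySortedNumericValues dv, int doc, long missing) {
        return dv.advanceExact(doc) ? dv.nextValue() : missing;
    }

    public static void main(String[] args) {
        Map<Integer, long[]> data = new HashMap<>();
        data.put(7, new long[] {5L});
        ToySortedNumericValues dv = new ToySortedNumericValues(data);

        long inScore = firstValue(dv, 7, -1L);   // first read, as in score()
        long inExplain = firstValue(dv, 7, -1L); // second read, as in explain()
        System.out.println(inScore + " " + inExplain);
    }
}
```

With real Lucene 7.x doc values, the target passed to advanceExact should
also be a valid doc ID (at least 0 and less than maxDoc), which ties back to
the sentinel value discussed at the top of this thread.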
