lucene-dev mailing list archives

From Ryan McKinley <>
Subject Re: trie* fields and sortMissingLast?
Date Thu, 16 Sep 2010 20:00:38 GMT
On Thu, Sep 16, 2010 at 11:28 AM, Yonik Seeley
<> wrote:
> On Thu, Sep 16, 2010 at 2:20 PM, Ryan McKinley <> wrote:
>> (i changed the subject to see if Uwe perks up)
>> Is it possible to change the FieldCache for Trie* fields so that it
>> knows what fields are missing?  or is there something about the Trie
>> structure that makes that impossible.
> Nope - it is trivial to record that while the entry is being built for
> all of the current FieldCache entry types - it's just not currently
> done.  After it is recorded (via a bitset most likely), it needs to be
> exposed via an API.

Looking at the FieldCache (first time ever), I don't see an obvious
place to augment the cache with a BitSet for the matching docs.

We could add a function to the FieldCache like:

  public BitSet getMatchingDocs(IndexReader reader, String field )

That would cache the matching docs for a field, but it means you
would have to traverse the terms twice.  The existing API for caching
values returns the raw arrays (short[], int[], etc.), not the Entry, so
a BitSet attached to the cached Entry would get lost.

It seems that this could be done, but would require some rejiggering
to the API.  The API could return an object like:

  class ByteValues {
    byte[] values;  // one slot per doc
    BitSet valid;   // bit set for each doc that actually has a value
  }

  public ByteValues getBytes(IndexReader reader, String field)
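
To make that concrete, here is a rough, self-contained sketch of how
the values and the BitSet could be filled in a single pass.  It does
not touch Lucene at all: the docIds/vals arrays are a made-up stand-in
for walking the terms of a field, and build() is a hypothetical helper,
not an existing FieldCache method.

```java
import java.util.BitSet;

// Hypothetical holder: the per-doc values plus a bitset of docs that had a value.
final class ByteValues {
    final byte[] values;
    final BitSet valid;

    ByteValues(byte[] values, BitSet valid) {
        this.values = values;
        this.valid = valid;
    }

    // Assumption: docIds[i] carries value vals[i]. A real implementation would
    // enumerate the field's terms and docs from the IndexReader instead.
    static ByteValues build(int maxDoc, int[] docIds, byte[] vals) {
        byte[] values = new byte[maxDoc];
        BitSet valid = new BitSet(maxDoc);
        for (int i = 0; i < docIds.length; i++) {
            values[docIds[i]] = vals[i];
            valid.set(docIds[i]); // recorded in the same pass as the value
        }
        return new ByteValues(values, valid);
    }

    public static void main(String[] args) {
        // Doc 2 really has the value 0; docs 1 and 3 have no value at all.
        ByteValues bv = build(4, new int[] {0, 2}, new byte[] {7, 0});
        for (int doc = 0; doc < 4; doc++) {
            String shown = bv.valid.get(doc) ? "value=" + bv.values[doc] : "missing";
            System.out.println(doc + ": " + shown);
        }
    }
}
```

The point is that doc 2 (a real 0) and doc 1 (no value, default 0) look
the same in the array and only the BitSet tells them apart.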

Another option (just brainstorming) would be to fill the arrays with a
special value that means 'missing', for example Integer.MIN_VALUE.
The downside is that we lose one valid value in the range.  For int,
double, float, this may be OK, but for byte and short this is a pretty
big tradeoff.
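
A quick sketch of the sentinel idea for int, again without Lucene.
The class name, MISSING constant, and build()/isMissing() helpers are
all made up for illustration, and the docIds/vals arrays stand in for
walking the terms.

```java
import java.util.Arrays;

final class SentinelSketch {
    // Assumption: Integer.MIN_VALUE is reserved to mean "no value for this doc".
    static final int MISSING = Integer.MIN_VALUE;

    // docIds[i] carries value vals[i]; every other doc keeps the sentinel.
    static int[] build(int maxDoc, int[] docIds, int[] vals) {
        int[] values = new int[maxDoc];
        Arrays.fill(values, MISSING);
        for (int i = 0; i < docIds.length; i++) {
            values[docIds[i]] = vals[i];
        }
        return values;
    }

    static boolean isMissing(int[] values, int doc) {
        return values[doc] == MISSING;
    }

    public static void main(String[] args) {
        // Doc 2 legitimately holds Integer.MIN_VALUE, which collides with the sentinel.
        int[] values = build(3, new int[] {0, 2}, new int[] {42, Integer.MIN_VALUE});
        for (int doc = 0; doc < values.length; doc++) {
            System.out.println(doc + ": missing=" + isMissing(values, doc));
        }
    }
}
```

Doc 2 gets reported as missing even though it was indexed with a value,
which is the lost value in the range.  With byte that is 1 out of only
256 possible values.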

Ideas for what may be a good path forward?
