lucene-dev mailing list archives

From David Smiley <>
Subject Re: DocValues, retrieval performance and policy
Date Mon, 24 Sep 2018 13:46:13 GMT
I don't think it matters whether some people think docValues should
never be used for value retrieval.  The performance drop caused by those
changes would surely have affected sorting & faceting as well as value
retrieval, some more than others perhaps.  I don't see any disagreement
about improving docValues in the ways you suggest.  If, hypothetically,
you had proposed a change that helped the value-retrieval use case but
hurt the non-controversial use cases, then I could see why you would want
to raise this issue more publicly, as you're doing here.

~ David

On Mon, Sep 24, 2018 at 8:40 AM Toke Eskildsen <> wrote:

> The Solr 7 switch to an iterative API for doc values
> meant a severe performance regression for Solr export and document
> retrieval with our web archive index, which is distinguished by having
> quite large segments (300M docs / 900GB) and using primarily doc values
> to hold field content.
> Technically there is a working patch
> but during discussion of performance measurements elsewhere
> it came up that doc values are not intended for document retrieval and
> as such that Lucene should not be optimized towards that.
> From my point of view, using doc values to build retrieval documents is
> quite natural: The data are there, so making a double representation by
> also making them stored seems a waste of space.
> If this is somehow a misuse of doc values, maybe someone could explain
> what the problem is or direct me towards more information?
> - Toke Eskildsen, Royal Danish Library
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
