lucene-dev mailing list archives

From David Smiley <david.w.smi...@gmail.com>
Subject Re: DocValues, retrieval performance and policy
Date Mon, 24 Sep 2018 13:46:13 GMT
I don't think it makes a difference if some people think docValues should
never be used for value-retrieval.  When that performance drop occurred due
to those changes, I'm sure it would have affected sorting & faceting as
well as value-retrieval.  Some more than others perhaps.  I don't see any
disagreement about improving docValues in the ways you suggest.  If,
hypothetically, you had proposed a change that helped the value-retrieval
use-case but hurt the non-controversial use-cases, then I could see why you
would want to raise the issue more publicly, as you're doing here.
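
For anyone following along who hasn't looked at the iterator-style access
pattern being discussed, here is a rough toy model of it. This is not Lucene
code: ForwardOnlyValues, its step counter and stepsToFetch are hypothetical
names made up for illustration, and the cost model (one step per document
skipped) is an assumption, not a measurement. It only shows that a
forward-only reader does less work when documents are visited in increasing
docID order than when they are fetched in some other order.

```java
class ForwardOnlyDocValuesSketch {

    /** Hypothetical stand-in for an iterator-style reader; not a Lucene class. */
    static final class ForwardOnlyValues {
        private final long[] values; // one value per document (dense, for simplicity)
        private int current = -1;    // document the reader is positioned on
        long steps = 0;              // assumed cost: one step per document moved past

        ForwardOnlyValues(long[] values) {
            this.values = values;
        }

        /** Moves forward to target; moving backward is not supported. */
        boolean advanceExact(int target) {
            if (target < current) {
                throw new IllegalArgumentException("cannot move backward to " + target);
            }
            steps += target - current;
            current = target;
            return target < values.length;
        }

        long longValue() {
            return values[current];
        }
    }

    /** Counts the steps needed to read the given docIDs in the given order. */
    static long stepsToFetch(long[] column, int[] docIds) {
        ForwardOnlyValues reader = new ForwardOnlyValues(column);
        long total = 0;
        int last = -1;
        for (int doc : docIds) {
            if (doc < last) {
                // Going backward: discard the reader and start over from the top.
                total += reader.steps;
                reader = new ForwardOnlyValues(column);
            }
            if (reader.advanceExact(doc)) {
                reader.longValue(); // the value a caller would use
            }
            last = doc;
        }
        return total + reader.steps;
    }

    public static void main(String[] args) {
        long[] column = new long[1000];
        int[] hitOrder = {900, 10, 500, 20};   // e.g. hits in score order
        int[] docIdOrder = {10, 20, 500, 900}; // same docs, increasing docID
        System.out.println(stepsToFetch(column, hitOrder));   // prints 1423
        System.out.println(stepsToFetch(column, docIdOrder)); // prints 901
    }
}
```

In this toy, both orders still pay for the distance skipped, so anything that
makes skipping cheaper would help both.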

~ David

On Mon, Sep 24, 2018 at 8:40 AM Toke Eskildsen <toes@kb.dk> wrote:

> The Solr 7 switch to an iterative API for Doc Values
> https://issues.apache.org/jira/browse/LUCENE-7407
> meant a severe performance regression for Solr export and document
> retrieval with our web archive index, which is distinguished by having
> quite large segments (300M docs / 900GB) and using primarily doc values
> to hold field content.
>
> Technically there is a working patch
> https://issues.apache.org/jira/browse/LUCENE-8374
> but during discussion of performance measurements elsewhere
> https://github.com/mikemccand/luceneutil/issues/23
> it came up that doc values are not intended for document retrieval and
> as such that Lucene should not be optimized towards that.
>
>
> From my point of view, using doc values to build retrieval documents is
> quite natural: The data are there, so making a double representation by
> also making them stored seems a waste of space.
>
> If this is somehow a misuse of Doc Values, maybe someone could explain
> what the problem is or direct me towards more information?
>
> - Toke Eskildsen, Royal Danish Library
>
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
