lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-12697) pure DocValues support for FieldValueFeature
Date Sat, 18 May 2019 14:26:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843175#comment-16843175 ]

Erick Erickson commented on SOLR-12697:
---------------------------------------

{quote}do you mean that retrieving just 1 field from stored fields is not optimal
{quote}
Exactly. Unlike the data for the inverted index, stored data is kept on a per-document
basis. All the stored=true fields for document Z are essentially concatenated together and
written out to an "_##.fdt" file in a compressed block.

Then to retrieve even one stored (not docValues) field:

1> The stored data for the document is read from disk in one or more 16K block(s).

2> The block(s) of stored data is/are decompressed.

3> The stored value(s) is/are read from the decompressed block(s).

At that point, the expensive work is already done, so it's more efficient to read the rest
of the values from the stored fields if possible rather than bounce around the docValues
fields. If all the docValues structures are in the OS's memory the difference is probably
hard to measure, but there's always the possibility that reading a DV value means reading
some data from disk because it hasn't been MMapped yet (or was swapped out, or... it's an
extra step anyway).
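
The three steps above, and why the other stored values are nearly free once the block is decompressed, can be sketched with a toy model. This is not Lucene's stored-fields format: the class StoredBlockSketch, its method names, the "name=value" line encoding and the use of java.util.zip are all assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy model only: NOT Lucene's on-disk format. All names here are hypothetical.
class StoredBlockSketch {

    // Concatenate one document's stored fields as "name=value" lines and
    // compress them together, so they can only be read back as a unit.
    // Assumes names contain no '=' and values contain no newline.
    static byte[] writeBlock(Map<String, String> storedFields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : storedFields.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        Deflater deflater = new Deflater();
        deflater.setInput(sb.toString().getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Step 2 and 3: decompress the whole block, then parse every field in it.
    static Map<String, String> readBlock(byte[] block) {
        Inflater inflater = new Inflater();
        inflater.setInput(block);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (true) {
                int n = inflater.inflate(buf);
                out.write(buf, 0, n);
                if (inflater.finished()) break;
                if (n == 0) throw new IllegalStateException("truncated or corrupt block");
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : new String(out.toByteArray(), StandardCharsets.UTF_8).split("\n")) {
            int eq = line.indexOf('=');
            if (eq >= 0) fields.put(line.substring(0, eq), line.substring(eq + 1));
        }
        return fields;
    }

    // Reading a single field still pays for decompressing the entire block.
    static String readOneField(byte[] block, String name) {
        return readBlock(block).get(name);
    }
}
```

In this model readOneField cannot avoid readBlock, which is the point of the comment: once the block has been decompressed for one field, every other stored field of that document is already in hand.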

So SolrDocumentFetcher tries to hide all that and "do the right thing". There are two cases:

1> All required fields can be returned from docValues fields. It does so without reading/decompressing
the stored data.

2> At least one required field must be fetched from the stored data. It reads/decompresses
the stored data and adds all the required fields it can from the stored data to the Solr document.
If any required field is left over, it's added from docValues.
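
The two cases above can be sketched as a small decision routine. This is a simplified illustration of the behaviour described, not the actual SolrDocumentFetcher code: FetchPlanSketch, Result, fetch and the plain maps standing in for stored data and docValues are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified illustration only: NOT the real SolrDocumentFetcher. All names are hypothetical.
class FetchPlanSketch {

    static final class Result {
        final Map<String, String> doc = new LinkedHashMap<>();
        boolean storedBlockRead = false; // true if the stored data had to be read/decompressed
    }

    // 'stored' and 'docValues' stand in for the values available from each source
    // for a single document.
    static Result fetch(Set<String> required,
                        Map<String, String> stored,
                        Map<String, String> docValues) {
        Result r = new Result();

        // Case 1: every required field is available from docValues,
        // so the stored data is never touched.
        if (docValues.keySet().containsAll(required)) {
            for (String f : required) {
                r.doc.put(f, docValues.get(f));
            }
            return r;
        }

        // Case 2: at least one field needs the stored data. Pay for the read once,
        // prefer stored for every field it has, and use docValues for what is left.
        r.storedBlockRead = true;
        for (String f : required) {
            if (stored.containsKey(f)) {
                r.doc.put(f, stored.get(f));
            } else if (docValues.containsKey(f)) {
                r.doc.put(f, docValues.get(f));
            }
            // A field present in neither source is simply omitted in this sketch.
        }
        return r;
    }
}
```

For example, requesting only fields that exist in docValues leaves storedBlockRead false, while adding a single stored-only field flips it to true and causes every field that is also stored to be taken from the stored data.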

 

Best,

Erick

 

> pure DocValues support for FieldValueFeature
> --------------------------------------------
>
>                 Key: SOLR-12697
>                 URL: https://issues.apache.org/jira/browse/SOLR-12697
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - LTR
>            Reporter: Stanislav Livotov
>            Priority: Major
>         Attachments: SOLR-12697.patch, SOLR-12697.patch, SOLR-12697.patch, SOLR-12697.patch,
>                      SOLR-12697.patch
>
>
> [~slivotov] wrote in SOLR-12688:
> bq. ... FieldValueFeature doesn't support pure DocValues fields (Stored false). Please
> also note that for fields which are both stored and DocValues it is working not optimal because
> it is extracting just one field from the stored document. DocValues are obviously faster for
> such usecases. ...
> (Please see SOLR-12688 description for overall context and analysis results.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

