lucene-solr-user mailing list archives

From Marcin Rzewucki <mrzewu...@gmail.com>
Subject Re: DocValues and field requirements
Date Mon, 25 Mar 2013 21:59:38 GMT
Hi Chris,

Thanks for your detailed explanation. Requiring a default value is a
difficult limitation, especially for financial figures. I may try a
workaround, such as using the lowest possible value of a TrieLongField as
the default, but it would be better to avoid that :)

Regards.

On 22 March 2013 20:39, Chris Hostetter <hossman_lucene@fucit.org> wrote:

>
> : Thank you for your response. Yes, that's strange. By enabling DocValues,
> : the information about missing fields is lost, which changes the way
> : sorting works as well. Adding a default value to the fields can change
> : the logic of an application dramatically (I can't set the default value
> : to 0 for all Trie*Fields, because it could impact the results displayed
> : to the end user, which is not good). It's a pity that using DocValues is
> : so limited.
>
> I'm not really up on docvalues, but i asked rmuir about this a bit on IRC.
>
> the crux of the issue is that there are two different docvalue impls, one
> that uses a fixed amount of space per doc (ie: exactly one value per doc)
> and one that allows an ordered set of values per doc (ie: multivalued).
>
> the multivalued docvals impl was wired into solr for multivalued fields,
> and the single valued docvals impl was wired in for the single valued case
> -- but since the single valued docvals impl *has* to have a value
> for every doc, the schema error you encountered was added if you try to
> use it on a field that isn't required or doesn't have a default value --
> to force you to be explicit about which "default" you want, instead of the
> low level lucene "0" default coming into play w/o you knowing about it.
> (as Shawn mentioned)
>
> the multivalued docvals impl could conceivably be used instead for these
> types of single valued fields (ie: to support 0 or 1 values) but there is
> no sorting support for multivalued docvals, so it would cause other
> problems.
>
> One possible workaround for people who want to take advantage of "sort
> missing first/last" type sorting on a docvals type field would be to manage
> the "missing" information yourself in a distinct field which you also
> leverage in any filtering or sorting on the docvals field.
>
> ie, have a docvalues field "myfield" which is single valued, with some
> configured default value, and then have a "myfield_exists" boolean field
> which is single valued and required.  when indexing docs, if "myfield"
> does/doesn't have a value, set "myfield_exists" accordingly (this would
> be fairly trivial in an update processor) and then instead of sorting
> just on "myfield desc" you would sort on "myfield_exists (asc|desc),
> myfield desc" (where you pick the asc or desc depending on whether you want
> docs w/o values first or last).  you would likewise need to filter on
> myfield_exists:true anytime you did queries against the myfield field.
>
>
> (perhaps someone could work on a patch to inject a synthetic field like this
> automatically for fields that are docValues="true" multiValued="false"
> required="false" w/o a defaultValue?)
>
>
> -Hoss
>
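
For what it's worth, here is a rough sketch of how the "myfield_exists" workaround quoted above behaves. It is plain Java that only simulates the indexing and sorting logic in memory; it does not use any Solr or Lucene API. The class and method names (MissingFlagSort, index, sortByMyfieldDesc, rangeQuery) and the default of 0 are my own assumptions for illustration, and it assumes false orders before true when sorting the boolean flag ascending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class MissingFlagSort {
    // Assumed stand-in for whatever default is configured on "myfield".
    static final long DEFAULT_VALUE = 0L;

    /** One indexed doc: the docvalues field plus its companion flag. */
    static final class Doc {
        final String id;
        final long myfield;
        final boolean myfieldExists;

        Doc(String id, long myfield, boolean myfieldExists) {
            this.id = id;
            this.myfield = myfield;
            this.myfieldExists = myfieldExists;
        }
    }

    /** Indexing step: a missing (null) value gets the default and exists=false. */
    static Doc index(String id, Long value) {
        return value == null
                ? new Doc(id, DEFAULT_VALUE, false)
                : new Doc(id, value, true);
    }

    /** Mimics sorting on "myfield_exists (asc|desc), myfield desc". */
    static List<String> sortByMyfieldDesc(List<Doc> docs, boolean missingLast) {
        // Ascending on the flag puts false (missing) first; reversing puts it last.
        Comparator<Doc> byExists = Comparator.comparing(d -> d.myfieldExists);
        if (missingLast) {
            byExists = byExists.reversed();
        }
        Comparator<Doc> byValueDesc =
                Comparator.comparingLong((Doc d) -> d.myfield).reversed();
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(byExists.thenComparing(byValueDesc));
        return sorted.stream().map(d -> d.id).collect(Collectors.toList());
    }

    /** Mimics a range query on myfield combined with a myfield_exists:true filter. */
    static List<String> rangeQuery(List<Doc> docs, long lo, long hi) {
        return docs.stream()
                .filter(d -> d.myfieldExists && d.myfield >= lo && d.myfield <= hi)
                .map(d -> d.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                index("a", 5L), index("b", null), index("c", -3L), index("d", 10L));
        System.out.println(sortByMyfieldDesc(docs, true));  // missing docs last
        System.out.println(sortByMyfieldDesc(docs, false)); // missing docs first
        System.out.println(rangeQuery(docs, -5, 5));        // default-valued doc excluded
    }
}
```

Without the flag, doc "b" would sort between "a" and "c" on "myfield desc" and would match a range that includes 0, which is exactly the problem with relying on the default alone.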
