lucene-solr-user mailing list archives

From Vasu Y <vya...@gmail.com>
Subject Re: Sorting on different language fields
Date Wed, 31 Aug 2016 08:25:50 GMT
Thank you Emir. That was helpful.
I understand that we can make sure all docs have values in all fields at
indexing time. But if that's not possible, or if we failed to provide values
for certain fields at index time, how can we make sure those fields have
values at query time?

Thank you,
Vasu
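
A minimal sketch of the index-time option discussed in this thread: before
sending documents to Solr, copy the English name into the French field
whenever the French name is missing, so every document has a value to sort
on. The field names objectName and objectName_fr come from the thread; the
helper with_fallback is hypothetical, and the sketch assumes that
objectName_collated_fr is populated from objectName_fr (for example via a
copyField), which the thread does not state.

```python
def with_fallback(doc, localized_field="objectName_fr", default_field="objectName"):
    """Return a copy of doc whose localized field falls back to the default field.

    Hypothetical helper: it only prepares the document dict and does not
    talk to Solr.
    """
    filled = dict(doc)  # copy so the caller's document is left untouched
    if not filled.get(localized_field):
        # Missing or empty French name: reuse the English one.
        filled[localized_field] = filled[default_field]
    return filled


docs = [
    {"id": "1", "objectName": "Table", "objectName_fr": "Tableau"},
    {"id": "2", "objectName": "Chair"},  # no French name supplied
]

# Documents ready to be indexed, each with a value in objectName_fr.
prepared = [with_fallback(d) for d in docs]
```

With this in place a single sort on the French collated field may be enough,
since documents without a French name sort by their English name instead of
being grouped at one end of the results.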

On Wed, Aug 31, 2016 at 12:46 PM, Emir Arnautovic <
emir.arnautovic@sematext.com> wrote:

> Hi Vasu,
>
> It is expected behavior, and you can control it with sortMissingLast and
> sortMissingFirst. Here is the comment from the schema:
>
> <!-- The optional sortMissingLast and sortMissingFirst attributes are
>      currently supported on types that are sorted internally as strings
>      and on numeric types. This includes "string", "boolean", and, as of
>      3.5 (and 4.x), int, float, long, date, double, including the "Trie"
>      variants.
>    - If sortMissingLast="true", then a sort on this field will cause
>      documents without the field to come after documents with the field,
>      regardless of the requested sort order (asc or desc).
>    - If sortMissingFirst="true", then a sort on this field will cause
>      documents without the field to come before documents with the field,
>      regardless of the requested sort order.
>    - If sortMissingLast="false" and sortMissingFirst="false" (the
>      default), then default lucene sorting will be used which places docs
>      without the field first in an ascending sort and last in a
>      descending sort.
> -->
>
> In any case, it does not seem right to me to have results come first just
> because they are declared as French - in some cases the French name will be
> the same as the English version and the ordering will look strange. You
> should probably make sure all docs have values in all fields, either at
> indexing or at query time.
>
> HTH,
> Emir
>
>
> On 31.08.2016 08:07, Vasu Y wrote:
>
>> Hi,
>>   We are indexing a set of objects with fields like objectName,
>> objectDescription etc. All the objects have objectName specified in
>> English; some objects also have their name specified in an additional
>> language like French, which is indexed to the objectName_fr field.
>>
>> When we query from SOLR, we want to display objects sorted by objectName.
>> If the logged-in user's locale is French, we want to display the French
>> object name for objects that have one and the English name for the rest
>> of the objects.
>>
>> For sorting in ascending order, I specified: "objectName_collated_fr asc,
>> objectName_lowercase asc" in SOLR query. Here "objectName_lowercase" is of
>> field type "lowercase" (with KeywordTokenizerFactory &
>> LowerCaseFilterFactory) and "objectName_collated_fr" is of field type
>> "ICUCollationField" with locale="fr".
>>
>> When sorting in ascending order, I noticed that all objects with
>> English-only names got displayed first and the objects with both English
>> & French names came at the end. I am assuming that the objects with
>> English-only names came first because they didn't have values for the
>> French field and were treated as having null/empty values for it.
>>
>> But when the sort order is changed to descending (objectName_collated_fr
>> desc, objectName_lowercase desc), objects with English & French names came
>> first and then the objects with English-only names; again the reason is
>> the same as the one mentioned above.
>>
>> Let me know if this behavior is correct.
>>
>> Thanks,
>> Vasu
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
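
The ordering rules quoted from the schema comment above can be illustrated
with a small plain-Python emulation. This is only a sketch of the documented
semantics, not Solr code: sort_docs and its missing parameter are
hypothetical names, the comparison is ordinary Python string ordering rather
than ICU collation, and the secondary sort on objectName_lowercase is left
out for brevity.

```python
def sort_docs(docs, field, descending=False, missing="default"):
    """Emulate how documents lacking `field` are placed in a sorted result.

    missing="default" -> missing docs first on asc, last on desc
    missing="last"    -> like sortMissingLast="true"
    missing="first"   -> like sortMissingFirst="true"
    """
    present = sorted(
        (d for d in docs if d.get(field) is not None),
        key=lambda d: d[field],
        reverse=descending,
    )
    absent = [d for d in docs if d.get(field) is None]
    if missing == "last":
        return present + absent
    if missing == "first":
        return absent + present
    # Default behavior described in the schema comment.
    return present + absent if descending else absent + present


docs = [
    {"id": 1, "objectName_fr": "b"},
    {"id": 2},  # English-only object, no French value
    {"id": 3, "objectName_fr": "a"},
]

asc_default = [d["id"] for d in sort_docs(docs, "objectName_fr")]
desc_default = [d["id"] for d in sort_docs(docs, "objectName_fr", descending=True)]
asc_missing_last = [d["id"] for d in sort_docs(docs, "objectName_fr", missing="last")]
```

In this emulation the English-only document leads the default ascending
result and trails the default descending one, matching what was observed in
the thread, while missing="last" keeps it at the end in both directions.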
