lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Faceting : what are the limitations of Taxonomy (Separate index and hierarchical facets) and SortedSetDocValuesFacetField ( flat facets and no sidecar index) ?
Date Wed, 16 Nov 2016 14:06:18 GMT
You store dimension + string (a single-value path, since it's not
hierarchical) into SortedSetDocValuesFacetField (SSDVFF) so that you can
compute facet counts, either ordinary drill-down counts or drill-sideways
counts.

You can see examples of drill sideways at
http://jirasearch.mikemccandless.com, e.g. drill down on any of those
fields on the left and you don't lose the previous facet counts for
that field.
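
To make the difference between the two kinds of counts concrete, here is a
minimal plain-Java sketch. It does not use Lucene at all: the class name
DrillSidewaysSketch, the "status"/"priority" dimensions and the sample
documents are made up for illustration, and each document is modeled as a
simple map from dimension to one flat label. Drill-down counts apply every
selected filter; drill-sideways counts for a dimension apply every filter
except the one on that dimension.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DrillSidewaysSketch {

    /** Builds a toy document: dimension -> single flat label (hypothetical data model). */
    public static Map<String, String> doc(String... dimLabelPairs) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i < dimLabelPairs.length; i += 2) {
            d.put(dimLabelPairs[i], dimLabelPairs[i + 1]);
        }
        return d;
    }

    /** Four made-up issues, loosely in the spirit of an issue tracker. */
    public static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("status", "Open", "priority", "Major"));
        docs.add(doc("status", "Open", "priority", "Minor"));
        docs.add(doc("status", "Closed", "priority", "Major"));
        docs.add(doc("status", "Closed", "priority", "Major"));
        return docs;
    }

    /** True if the doc satisfies every filter, skipping the filter on ignoredDim (may be null). */
    static boolean matches(Map<String, String> doc, Map<String, String> filters, String ignoredDim) {
        for (Map.Entry<String, String> f : filters.entrySet()) {
            if (f.getKey().equals(ignoredDim)) {
                continue; // drill sideways: do not apply this dimension's own filter
            }
            if (!f.getValue().equals(doc.get(f.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Counts labels of dim. With sideways=false all filters apply (drill down);
     * with sideways=true the filter on dim itself is ignored (drill sideways).
     */
    public static Map<String, Integer> counts(List<Map<String, String>> docs,
                                              Map<String, String> filters,
                                              String dim,
                                              boolean sideways) {
        Map<String, Integer> result = new TreeMap<>();
        String ignoredDim = sideways ? dim : null;
        for (Map<String, String> d : docs) {
            String label = d.get(dim);
            if (label != null && matches(d, filters, ignoredDim)) {
                result.merge(label, 1, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> filters = new HashMap<>();
        filters.put("status", "Open"); // the user drilled down on status=Open
        System.out.println("drill-down status:     " + counts(sampleDocs(), filters, "status", false));
        System.out.println("drill-sideways status: " + counts(sampleDocs(), filters, "status", true));
        System.out.println("priority:              " + counts(sampleDocs(), filters, "priority", false));
    }
}
```

With the sample data, drilling down on status=Open leaves only {Open=2} for
the status field, while the sideways counts still show {Closed=2, Open=2},
which is what lets a UI keep offering the other values of that field.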

Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 16, 2016 at 8:51 AM, Chitra R <chithu.r111@gmail.com> wrote:
> Hi,
>
> Regarding Lucene drill sideways (JIRA issue LUCENE-4748):
>
> Is this the reason for storing the path and dimension for a given SSDVFF
> field, i.e. that drill sideways makes a very nice faceted search UI because
> we don't "lose" the facet counts after drilling in? Or is there another
> reason?
>
> Regards,
> Chitra
>
>
> Hey, thank you so much for the fast response. I agree that NRT refresh is a
> somewhat costly operation, and that this is the major pitfall if we use doc
> values faceting.
>
> While indexing a SortedSetDocValuesFacetField, it stores the path and
> dimension of the given field internally. So can we achieve hierarchical
> facets using DrillDownQuery? I assume the purpose of storing the path and
> dimension is to achieve hierarchical facets. If so (i.e. we can achieve
> hierarchy in SSDVFF), why would we need to move over to taxonomy? Or have I
> missed something?
>
> What is the real purpose of storing the path and dimension in an SSDVFF
> field?
>
>
> Kindly post your suggestions.
>
> Regards,
> Chitra
>
>
>
> On Sat, Nov 12, 2016 at 4:03 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> On Fri, Nov 11, 2016 at 5:21 AM, Chitra R <chithu.r111@gmail.com> wrote:
>>
>> > i) I assume that when opening SortedSetDocValuesReaderState, we
>> > calculate ordinals (which will be used to calculate facet counts) for
>> > the doc values field, and that this is what makes the state instance
>> > somewhat costly. Am I right, or is there another reason behind that?
>>
>> That's correct.  It adds some latency to an NRT refresh, and some heap
>> used to hold the ordinal mappings.
>>
>> > ii) During indexing, we provide facet ordinals in each doc, and I
>> > think this is useful on the search side, to calculate facet counts
>> > only for matching docs. Does it carry any other benefits?
>>
>> Well, compared to the taxonomy facets, SSDV facets don't require a
>> separate index.
>>
>> But they add latency/heap usage, and they cannot do hierarchical
>> facets yet (though this could be fixed if someone just built it).
>>
>> > iii) Is SortedSetDocValuesReaderState thread-safe, i.e. can multiple
>> > threads use it concurrently?
>>
>> Yes.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>
>
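
The ordinal mappings mentioned in the quoted exchange (the source of the
extra refresh latency and heap) can be pictured with a small plain-Java
sketch. This is a simplified illustration, not Lucene's implementation: the
class name GlobalOrdinalSketch and its methods are hypothetical, and Lucene's
own mapping is stored far more compactly. The idea is that each index segment
numbers its facet labels independently, so a per-segment to global ordinal
map has to be built before counts from different segments can be summed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class GlobalOrdinalSketch {

    /** Merges the per-segment sorted label lists into one sorted, de-duplicated dictionary. */
    public static List<String> globalDictionary(List<List<String>> segments) {
        TreeSet<String> all = new TreeSet<>();
        for (List<String> seg : segments) {
            all.addAll(seg);
        }
        return new ArrayList<>(all);
    }

    /** For each segment, maps segment ordinal -> global ordinal. This is the part that costs heap. */
    public static int[][] segmentToGlobal(List<List<String>> segments, List<String> global) {
        int[][] maps = new int[segments.size()][];
        for (int s = 0; s < segments.size(); s++) {
            List<String> seg = segments.get(s);
            maps[s] = new int[seg.size()];
            for (int ord = 0; ord < seg.size(); ord++) {
                // Every segment label is present in the global dictionary, so the result is >= 0.
                maps[s][ord] = Collections.binarySearch(global, seg.get(ord));
            }
        }
        return maps;
    }

    /** Sums per-segment counts (indexed by segment ordinal) into counts indexed by global ordinal. */
    public static int[] mergeCounts(int[][] segmentCounts, int[][] maps, int globalSize) {
        int[] total = new int[globalSize];
        for (int s = 0; s < segmentCounts.length; s++) {
            for (int ord = 0; ord < segmentCounts[s].length; ord++) {
                total[maps[s][ord]] += segmentCounts[s][ord];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two made-up segments; each list is that segment's sorted label dictionary.
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("Bob", "Lisa"),
                Arrays.asList("Alice", "Lisa"));
        List<String> global = globalDictionary(segments);
        int[][] maps = segmentToGlobal(segments, global);
        int[][] perSegmentCounts = { {3, 1}, {2, 4} };
        System.out.println(global);
        System.out.println(Arrays.toString(mergeCounts(perSegmentCounts, maps, global.size())));
    }
}
```

Because the set of segments changes on every reopen, a mapping of this kind
has to be rebuilt after an NRT refresh, which is consistent with the latency
and heap costs described above.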
