lucene-java-user mailing list archives

From Chitra R <chithu.r...@gmail.com>
Subject Re: Faceting : what are the limitations of Taxonomy (Separate index and hierarchical facets) and SortedSetDocValuesFacetField ( flat facets and no sidecar index) ?
Date Thu, 17 Nov 2016 13:10:10 GMT
Okay. I agree with you: Taxonomy maintains and supports hierarchical facets
during indexing. By hierarchical I mean that we might index the field
Publish date: 2010/10/15 as Publish date: 2010, Publish date: 2010/10
and Publish date: 2010/10/15. Their facet ordinals are maintained in the
sidecar index and mapped to the main index.
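
The expansion described above can be sketched in plain Java with no Lucene
dependency. The class name FacetPathPrefixes and the method expand are
hypothetical, made up for illustration only. With the taxonomy module you
would instead pass the components to a FacetField, for example
new FacetField("Publish date", "2010", "10", "15"), and, as far as I
understand it, the taxonomy writer records the parent categories for you.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists every prefix of a
// hierarchical facet path, which is the set of categories a hierarchical
// facet implementation needs to know about for one document.
class FacetPathPrefixes {

    // Returns the prefixes of the path, shortest first.
    // For ("2010", "10", "15") this yields "2010", "2010/10", "2010/10/15".
    static List<String> expand(String... components) {
        List<String> prefixes = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String component : components) {
            if (current.length() > 0) {
                current.append('/');
            }
            current.append(component);
            prefixes.add(current.toString());
        }
        return prefixes;
    }

    public static void main(String[] args) {
        for (String prefix : expand("2010", "10", "15")) {
            System.out.println("Publish date: " + prefix);
        }
    }
}
```

Running main prints the three categories from the example, one per line.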

For example:

On search-lucene.com, I enter a term (say "facet"), and the top documents
and their categories are displayed after the search runs. Say I drill down
on Publish date/2010 to collect its child counts, and then on
Publish date/2010/10 to collect its child counts. For each drill-down, a
new search is performed to collect the top docs and categories.


*I can achieve this even with flat facets by changing the drill-down
query.*
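
To make the flat-facet idea concrete, here is a minimal sketch in plain
Java, again without any Lucene calls. It assumes each document carries its
full date path as a single flat string value, and that the application
itself filters by prefix and groups by the next path component. The class
FlatFacetHierarchy and the method childCounts are hypothetical names; this
is not how SortedSetDocValuesFacetField works internally, only an
illustration of the bookkeeping the application would take on.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: emulating one level of hierarchy on top of
// flat string values such as "2010/10/15".
class FlatFacetHierarchy {

    // Counts the direct children of parentPath among the given documents.
    // An empty parentPath means "count the top-level values".
    static Map<String, Integer> childCounts(List<String> docPaths, String parentPath) {
        String prefix = parentPath.isEmpty() ? "" : parentPath + "/";
        Map<String, Integer> counts = new TreeMap<>();
        for (String path : docPaths) {
            if (!path.startsWith(prefix)) {
                continue; // document is outside the drilled-down subtree
            }
            // Cut the path right after the component that follows the prefix.
            int end = path.indexOf('/', prefix.length());
            String child = end < 0 ? path : path.substring(0, end);
            counts.merge(child, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("2010/10/15", "2010/10/20", "2010/11/01", "2011/01/05");
        System.out.println(childCounts(docs, ""));     // top level
        System.out.println(childCounts(docs, "2010")); // after drilling into 2010
    }
}
```

The sketch scans every document on each call. A hierarchical facet
implementation can instead keep the parent/child relation in its own
structures, which is part of what the question below is asking about.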

Am I right, or have I missed something?

So what is the need for hierarchical facets? Could you please explain
them with a real-world use case?


Regards,
Chitra

On Wed, Nov 16, 2016 at 7:36 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> You store dimension + string (a single value path, since it's not
> hierarchical) into SSDVFF so that you can compute facet counts, either
> ordinary drill down counts or the drill sideways counts.
>
> You can see examples of drill sideways at
> http://jirasearch.mikemccandless.com, e.g. drill down on any of those
> fields on the left and you don't lose the previous facet counts for
> that field.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Nov 16, 2016 at 8:51 AM, Chitra R <chithu.r111@gmail.com> wrote:
> > Hi,
> >
> > Lucene-Drill sideways
> >
> > jira_issue:LUCENE-4748
> >
> > Is this the reason (i.e. drill sideways makes a very nice faceted
> > search UI because we don't "lose" the facet counts after drilling in)
> > behind storing path and dimension for the given SSDVF field? Or is
> > there another reason?
> >
> > Regards,
> > Chitra
> >
> >
> > Hey, thank you so much for the fast response. I agree that NRT refresh
> > is a somewhat costly operation, and this is the major pitfall if we
> > use doc value faceting.
> >
> >
> > While indexing a SortedSetDocValuesFacetField, it stores the path and
> > dimension of the given field internally. So can we achieve
> > hierarchical facets using DrillDownQuery? I assume the purpose of
> > storing path and dimension is to achieve hierarchical facets. If yes
> > (i.e. we can achieve hierarchy in SSDVFF), what is the need to move
> > over to taxonomy? Or have I missed something?
> >
> >
> > What is the real purpose of storing path and dimension in the SSDVF
> > field?
> >
> >
> > Kindly post your suggestions.
> >
> > Regards,
> > Chitra
> >
> >
> >
> > On Sat, Nov 12, 2016 at 4:03 AM, Michael McCandless
> > <lucene@mikemccandless.com> wrote:
> >>
> >> On Fri, Nov 11, 2016 at 5:21 AM, Chitra R <chithu.r111@gmail.com> wrote:
> >>
> >> > i) I assume that when opening SortedSetDocValuesReaderState, we
> >> > calculate ordinals (used to compute facet counts) for the doc
> >> > values field, and that this is what makes the state instance
> >> > somewhat costly. Am I right, or is there another reason?
> >>
> >> That's correct.  It adds some latency to an NRT refresh, and some heap
> >> used to hold the ordinal mappings.
> >>
> >> > ii) During indexing, we provide facet ordinals in each doc, and I
> >> > think this is useful on the search side, to calculate facet counts
> >> > only for matching docs. Does it carry any other benefits?
> >>
> >> Well, compared to the taxonomy facets, SSDV facets don't require a
> >> separate index.
> >>
> >> But they add latency/heap usage, and they cannot do hierarchical
> >> facets yet (though this could be fixed if someone just built it).
> >>
> >> > iii) Is SortedSetDocValuesReaderState thread-safe, i.e. can
> >> > multiple threads use it concurrently?
> >>
> >> Yes.
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >
> >
>
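
The difference between ordinary drill-down counts and drill-sideways
counts mentioned in the quoted thread can be illustrated with a small
in-memory sketch. It is plain Java under simplifying assumptions: each
document has at most one value per dimension, and matching is done by a
linear scan. The class DrillSidewaysSketch and its counts method are
hypothetical and are not Lucene's DrillSideways implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of drill-down versus drill-sideways counting.
class DrillSidewaysSketch {

    // Counts the values of countDim over documents matching the selections.
    // With sideways == true, the selection on countDim itself is ignored,
    // so the sibling values of that dimension keep their counts.
    static Map<String, Integer> counts(List<Map<String, String>> docs,
                                       Map<String, String> selections,
                                       String countDim,
                                       boolean sideways) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map<String, String> doc : docs) {
            boolean matches = true;
            for (Map.Entry<String, String> sel : selections.entrySet()) {
                if (sideways && sel.getKey().equals(countDim)) {
                    continue; // skip the filter on the dimension being counted
                }
                if (!sel.getValue().equals(doc.get(sel.getKey()))) {
                    matches = false;
                    break;
                }
            }
            String value = doc.get(countDim);
            if (matches && value != null) {
                result.merge(value, 1, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
            Map.of("status", "Open", "project", "Lucene"),
            Map.of("status", "Closed", "project", "Lucene"),
            Map.of("status", "Open", "project", "Solr"));
        Map<String, String> selections = Map.of("status", "Open");
        System.out.println(counts(docs, selections, "status", false)); // drill down
        System.out.println(counts(docs, selections, "status", true));  // drill sideways
    }
}
```

With status=Open selected, the drill-down call only reports Open, while
the sideways call still reports Closed as well. That is the "don't lose the
previous facet counts" behaviour described for the jirasearch UI.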
