lucene-solr-user mailing list archives

From Kaushik Chakraborty <kaych...@gmail.com>
Subject Re: Faceting on multivalued field
Date Mon, 04 Apr 2011 08:18:42 GMT
Are you suggesting that I change the DB query of the nested entity that
fetches the comments (the query is in my original post), or can something be
done at index time, e.g. using Transformers?
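
If you mean changing the DB query, I'm guessing it would be something along
these lines: one more nested entity inside "posts" that fetches only the
count. This is just a sketch on my side. The entity and field name
"comment_count" are my own assumption, and I would also have to add a matching
single-valued field to the schema.

```xml
<!-- Hypothetical extra child entity, placed inside the existing "posts" entity.
     "comment_count" is an assumed name, not something already in my config. -->
<entity name="comment_count" dataSource="jdbc"
        query="select count(*) as comment_count from comments where post_id = ${posts.post_id}">
    <field column="comment_count" />
</entity>
```

Each post document would then carry its own comment count, so I could read it
directly instead of faceting.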

Thanks,
Kaushik


On Mon, Apr 4, 2011 at 8:07 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> Why not count them on the way in and just store that number along
> with the original e-mail?
>
> Best
> Erick
>
> On Sun, Apr 3, 2011 at 10:10 PM, Kaushik Chakraborty <kaychaks@gmail.com> wrote:
>
> > OK. My expectation was that, since "comment_post_id" is a multiValued field,
> > it would appear multiple times (i.e. once for each comment), and that
> > faceting on that field would therefore count each of those occurrences.
> >
> > My requirement is to get the total for every document, i.e. the number of
> > comments per post across the whole corpus. To explain it more clearly, I'm
> > getting a result XML something like this:
> >
> > <str name="post_id">46</str>
> > <str name="post_text">Hello World</str>
> > <str name="person_id">20</str>
> > <arr name="comment_id">
> >    <str>9</str>
> >    <str>10</str>
> > </arr>
> > <arr name="comment_person_id">
> >   <str>19</str>
> >   <str>2</str>
> > </arr>
> > <arr name="comment_post_id">
> >  <str>46</str>
> >  <str>46</str>
> > </arr>
> > <arr name="comment_text">
> >   <str>Hello - from World</str>
> >   <str>Hi</str>
> > </arr>
> >
> > <lst name="facet_fields">
> >  <lst name="comment_post_id">
> >     *<int name="46">1</int>*
> >
> > I need the count to be 2, as post 46 has 2 comments.
> >
> > What other approach can I take?
> >
> > Thanks,
> > Kaushik
> >
> >
> > On Mon, Apr 4, 2011 at 4:29 AM, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > > Hmmm, I think you're misunderstanding faceting. It counts the
> > > number of documents that have a particular value. So if you're
> > > faceting on "comment_post_id", there is one and only one document
> > > with that value (assuming that the comment_post_ids are unique),
> > > which is what's being reported. This will be quite expensive on a
> > > large corpus, BTW.
> > >
> > > Is your task to show the totals for *every* document in your corpus or
> > > just the ones in a display page? Because if the latter, your app could
> > > just count up the number of elements in the XML returned for the
> > > multiValued comments field.
> > >
> > > If that's not relevant, could you explain a bit more why you need this
> > > count?
> > >
> > > Best
> > > Erick
> > >
> > > On Sun, Apr 3, 2011 at 2:31 PM, Kaushik Chakraborty <kaychaks@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > My index contains a root entity "Post" and a child entity "Comments".
> > > > Each post can have multiple comments. data-config.xml:
> > > >
> > > > <document>
> > > >     <entity name="posts" transformer="TemplateTransformer"
> > > >             dataSource="jdbc" query="">
> > > >
> > > >         <field column="post_id" />
> > > >         <field column="post_text" />
> > > >         <field column="person_id" />
> > > >         <entity name="comments" dataSource="jdbc"
> > > >                 query="select * from comments where post_id = ${posts.post_id}">
> > > >             <field column="comment_id" />
> > > >             <field column="comment_text" />
> > > >             <field column="comment_person_id" />
> > > >             <field column="comment_post_id" />
> > > >         </entity>
> > > >     </entity>
> > > > </document>
> > > >
> > > > The schema defines all columns of the "comments" entity as multiValued
> > > > fields, and all fields are indexed and stored. My requirement is to
> > > > count the number of comments for each post. The approach I'm taking is
> > > > to query on "*:*" and facet the result on "comment_post_id", so that it
> > > > gives the number of comments for each post.
> > > >
> > > > But I'm getting an incorrect result. For example, if a post has 2
> > > > comments, the multiValued fields are populated correctly, but the facet
> > > > count comes back as 1 (for that post_id). What else do I need to do?
> > > >
> > > >
> > > > Thanks,
> > > > Kaushik
> > > >
> > >
> >
>
