lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject RE: Faceting on multivalued field
Date Mon, 04 Apr 2011 13:45:19 GMT
Is there a function query that can count the number of values in a multi-valued
field on a given document? I don't know of one.
________________________________________
From: Erick Erickson [erickerickson@gmail.com]
Sent: Sunday, April 03, 2011 10:37 PM
To: solr-user@lucene.apache.org
Subject: Re: Faceting on multivalued field

Why not count them on the way in and just store that number along
with the original e-mail?

Best
Erick
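
A minimal sketch of the count-on-the-way-in idea, assuming the documents are assembled in application code before being sent to Solr. The field name "comment_count" and the helper with_comment_count are hypothetical, and the schema would need a matching single-valued field:

```python
def with_comment_count(doc):
    """Return a copy of doc with a 'comment_count' field added.

    'comment_count' is a hypothetical field name; it is not part of the
    schema described in this thread and would have to be added to it.
    """
    enriched = dict(doc)
    # The multiValued comment_id field holds one entry per comment,
    # so its length is the number of comments on the post.
    enriched["comment_count"] = len(doc.get("comment_id", []))
    return enriched


# Post 46 from the example in this thread, reduced to the relevant fields.
post = {"post_id": "46", "post_text": "Hello World", "comment_id": ["9", "10"]}
print(with_comment_count(post)["comment_count"])  # prints 2
```

With the count stored on each document, it comes back with the post like any other stored field, and no faceting is needed.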

On Sun, Apr 3, 2011 at 10:10 PM, Kaushik Chakraborty <kaychaks@gmail.com> wrote:

> OK. Since "comment_post_id" is a multiValued field, I expected it to
> appear multiple times (i.e. once for each comment), and so I expected that
> faceting on that field would count each of those occurrences.
>
> My requirement is a total for every document, i.e. the number of
> comments per post across the whole corpus. To explain it more clearly, I'm
> getting result XML something like this:
>
> <str name="post_id">46</str>
> <str name="post_text">Hello World</str>
> <str name="person_id">20</str>
> <arr name="comment_id">
>    <str>9</str>
>    <str>10</str>
> </arr>
> <arr name="comment_person_id">
>   <str>19</str>
>   <str>2</str>
> </arr>
> <arr name="comment_post_id">
>  <str>46</str>
>  <str>46</str>
> </arr>
> <arr name="comment_text">
>   <str>Hello - from World</str>
>   <str>Hi</str>
> </arr>
>
> <lst name="facet_fields">
>  <lst name="comment_post_id">
>     *<int name="46">1</int>*
>
> I need the count to be 2, as post 46 has 2 comments.
>
> How else can I approach this?
>
> Thanks,
> Kaushik
>
>
> On Mon, Apr 4, 2011 at 4:29 AM, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > Hmmm, I think you're misunderstanding faceting. It's counting the
> > number of documents that have a particular value. So if you're
> > faceting on "comment_post_id", there is one and only one document
> > with that value (assuming that the comment_post_ids are unique).
> > Which is what's being reported.... This will be quite expensive on a
> > large corpus, BTW.
> >
> > Is your task to show the totals for *every* document in your corpus or
> > just the ones in a display page? Because if the latter, your app could
> > just count up the number of elements in the XML returned for the
> > multiValued comments field.
> >
> > If that's not relevant, could you explain a bit more why you need this
> > count?
> >
> > Best
> > Erick
> >
> > On Sun, Apr 3, 2011 at 2:31 PM, Kaushik Chakraborty <kaychaks@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > My index contains a root entity "Post" and a child entity "Comments".
> > > Each post can have multiple comments. data-config.xml:
> > >
> > > <document>
> > >            <entity name="posts" transformer="TemplateTransformer"
> > > dataSource="jdbc" query="">
> > >
> > >                <field column="post_id" />
> > >                <field column="post_text"/>
> > >                <field column="person_id"/>
> > >                <entity name="comments" dataSource="jdbc"
> > >                        query="select * from comments where post_id = ${posts.post_id}" >
> > >                    <field column="comment_id" />
> > >                    <field column="comment_text" />
> > >                    <field column="comment_person_id" />
> > >                    <field column="comment_post_id" />
> > >               </entity>
> > >            </entity>
> > > </document>
> > >
> > > The schema has all columns of the "comment" entity as multiValued
> > > fields, and all fields are indexed & stored. My requirement is to count
> > > the number of comments for each post. The approach I'm taking is to
> > > query on "*:*" and facet the result on "comment_post_id" so that it
> > > gives the count of comments for that post.
> > >
> > > But I'm getting an incorrect result, e.g. if a post has 2 comments, the
> > > multiValued fields are populated all right, but the facet count comes
> > > back as 1 (for that post_id). What else do I need to do?
> > >
> > >
> > > Thanks,
> > > Kaushik
> > >
> >
>
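
The suggestion earlier in the thread, to have the application count the elements in the returned XML for a multiValued comments field, could be sketched roughly as follows. The response wrapper around the doc element and the helper name comments_per_post are assumptions; the field names come from the example result in the thread:

```python
import xml.etree.ElementTree as ET

# Sample response modelled on the result quoted in this thread. The
# <response>/<result> wrapper is an assumption about the overall shape.
SAMPLE_RESPONSE = """
<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="post_id">46</str>
      <str name="post_text">Hello World</str>
      <arr name="comment_id">
        <str>9</str>
        <str>10</str>
      </arr>
    </doc>
  </result>
</response>
"""


def comments_per_post(response_xml):
    """Map each post_id on the page to its number of comment_id values."""
    root = ET.fromstring(response_xml.strip())
    counts = {}
    for doc in root.iter("doc"):
        post_id = doc.find("str[@name='post_id']").text
        comment_ids = doc.find("arr[@name='comment_id']")
        # A post with no comments has no comment_id array at all.
        counts[post_id] = 0 if comment_ids is None else len(comment_ids)
    return counts


print(comments_per_post(SAMPLE_RESPONSE))  # prints {'46': 2}
```

This only covers the posts on the page being displayed; totals for the whole corpus would still need a count stored at index time.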
