lucene-solr-user mailing list archives

From Varun Gupta <varun.vgu...@gmail.com>
Subject Re: Results after using Field Collapsing are not matching the results without using Field Collapsing
Date Mon, 21 Dec 2009 08:33:13 GMT
Hi Martijn,

Yes, it is working after making these changes.

--
Thanks
Varun Gupta

On Sun, Dec 20, 2009 at 5:54 PM, Martijn v Groningen <
martijn.is.hier@gmail.com> wrote:

> Hi Varun,
>
> Yes, after going over the code I think you are right. If you change
> the following if block in SolrIndexSearcher.getDocSet(Query query,
> DocSet filter, DocSetAwareCollector collector):
> if (first==null) {
>        first = getDocSetNC(absQ, null);
>        filterCache.put(absQ,first);
> }
> with:
> if (first==null) {
>        first = getDocSetNC(absQ, null, collector);
>        filterCache.put(absQ,first);
> }
> It should work then. Let me know if this solves your problem.
>
> Martijn
>
>
> 2009/12/18 Varun Gupta <varun.vgupta@gmail.com>:
> > After a lot of debugging, I finally found why the order of the collapsed
> > results does not match the uncollapsed results. I can't say whether it is
> > a bug in the implementation of fieldcollapse or not.
> >
> > *Explanation:*
> > Actually, I am querying the fieldcollapse with some filters to restrict the
> > collapsing to some particular categories only by appending the parameter:
> > fq=ctype:(1+2+8+6+3).
> >
> > In: NonAdjacentDocumentCollapser.doQuery()
> > Line: DocSet filter = searcher.getDocSet(filterQueries);
> >
> > Here, the filter DocSet is fetched without any scores (since I have a
> > filter in my query, this line actually gets executed) and is also stored
> > in the filter cache. In the next line of the code, the actual uncollapsed
> > DocSet is fetched by passing the DocSetScoreCollector.
> >
> > Now, in: SolrIndexSearcher.getDocSet(Query query, DocSet filter,
> > DocSetAwareCollector collector)
> > Line: if (filterCache != null)
> > Because the filter cache is not null and there is no result for the query
> > in the cache, the line: first = getDocSetNC(absQ,null); gets executed.
> > Notice that the DocSetScoreCollector is not passed here. Hence, results
> > are collected without any scores.
> >
> > This leaves the uncollapsedDocSet without any scores, and hence the
> > sorting is not done based on score.
> >
> > @Martijn: Am I right, or should I use field collapsing in some other
> > way? Otherwise, what is the ideal fix for this problem? (I am not an
> > active developer, so I can't say whether a fix that I make will not break
> > anything.)
> >
> > --
> > Thanks,
> > Varun Gupta
> >
> >
> > On Mon, Dec 14, 2009 at 10:35 AM, Varun Gupta <varun.vgupta@gmail.com>
> > wrote:
> >
> >> When I used collapse.threshold=1, out of the 5 categories 4 had the same
> >> top result, but 1 category had a different result (it was the 3rd result
> >> coming for that category when I used threshold as 3).
> >>
> >> --
> >> Thanks,
> >> Varun Gupta
> >>
> >>
> >>
> >> On Mon, Dec 14, 2009 at 2:56 AM, Martijn v Groningen <
> >> martijn.is.hier@gmail.com> wrote:
> >>
> >>> I would not expect the Solr 1.4 build to be the cause of the problem.
> >>> Just out of curiosity, does the same happen when collapse.threshold=1?
> >>>
> >>> 2009/12/11 Varun Gupta <varun.vgupta@gmail.com>:
> >>> > Here is the field type configuration of ctype:
> >>> >    <field name="ctype" type="integer" indexed="true" stored="true"
> >>> > omitNorms="true" />
> >>> >
> >>> > In solrconfig.xml, this is how I am enabling field collapsing:
> >>> >    <searchComponent name="query"
> >>> > class="org.apache.solr.handler.component.CollapseComponent"/>
> >>> >
> >>> > Apart from this, I made no changes in solrconfig.xml for field collapse.
> >>> > I am currently not using the field collapse cache.
> >>> >
> >>> > I have applied the patch on the Solr 1.4 build. I am not using the latest
> >>> > solr nightly build. Can that cause any problem?
> >>> >
> >>> > --
> >>> > Thanks
> >>> > Varun Gupta
> >>> >
> >>> >
> >>> > On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen <
> >>> > martijn.is.hier@gmail.com> wrote:
> >>> >
> >>> >> I tried to reproduce a similar situation here, but I got the expected
> >>> >> and correct results. Those three documents that you saw in your first
> >>> >> search result should be the first in your second search result (unless
> >>> >> the index changes or the sort changes) when you fq on that specific
> >>> >> category. I'm not sure what is causing this problem. Can you give me
> >>> >> some more information, like the field type configuration for the ctype
> >>> >> field and how you have configured field collapsing?
> >>> >>
> >>> >> I did find another problem to do with field collapse caching. The
> >>> >> collapse.threshold or collapse.maxdocs parameters are not taken into
> >>> >> account when caching, which is of course wrong because they do matter
> >>> >> when collapsing. Based on the information you have given me, this
> >>> >> caching problem is not the cause of the situation you have. I will
> >>> >> update the patch to fix this problem shortly.
> >>> >>
> >>> >> Martijn
> >>> >>
> >>> >> 2009/12/10 Varun Gupta <varun.vgupta@gmail.com>:
> >>> >> > Hi Martijn,
> >>> >> >
> >>> >> > I am not sending the collapse parameters for the second query. Here are
> >>> >> > the queries I am using:
> >>> >> >
> >>> >> > *When using field collapsing (searching over all categories):*
> >>> >> >
> >>> >> > spellcheck=true&collapse.info.doc=true&facet=true&collapse.threshold=3&facet.mincount=1&spellcheck.q=weight+loss&collapse.facet=before&wt=xml&f.content.hl.snippets=2&hl=true&version=2.2&rows=20&collapse.field=ctype&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&collapse.info.count=false&facet.field=ctype&qt=contentsearch
> >>> >> >
> >>> >> > Categories are represented by the field "ctype" above.
> >>> >> >
> >>> >> > *Without using field collapsing:*
> >>> >> >
> >>> >> > spellcheck=true&facet=true&facet.mincount=1&spellcheck.q=weight+loss&wt=xml&hl=true&rows=10&version=2.2&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch
> >>> >> >
> >>> >> > I append "&fq=ctype:1" to the above queries when trying to get
> >>> >> > results for a particular category.
> >>> >> >
> >>> >> > --
> >>> >> > Thanks
> >>> >> > Varun Gupta
> >>> >> >
> >>> >> >
> >>> >> > On Thu, Dec 10, 2009 at 5:58 PM, Martijn v Groningen <
> >>> >> > martijn.is.hier@gmail.com> wrote:
> >>> >> >
> >>> >> >> Hi Varun,
> >>> >> >>
> >>> >> >> Can you send the whole requests (with params) that you send to Solr
> >>> >> >> for both queries?
> >>> >> >> In your situation the collapse parameters only have to be used for the
> >>> >> >> first query and not the second query.
> >>> >> >>
> >>> >> >> Martijn
> >>> >> >>
> >>> >> >> 2009/12/10 Varun Gupta <varun.vgupta@gmail.com>:
> >>> >> >> > Hi,
> >>> >> >> >
> >>> >> >> > I have documents under 6 different categories. While searching, I want to
> >>> >> >> > show 3 documents from each category along with a link to see all the
> >>> >> >> > documents under a single category. I decided to use field collapsing so that
> >>> >> >> > I don't have to make 6 queries (one for each category). Currently I am using
> >>> >> >> > the field collapsing patch uploaded on 29th Nov.
> >>> >> >> >
> >>> >> >> > Now, the results that come back when using field collapsing do not
> >>> >> >> > match the results for a single category. For example, for category C1, I
> >>> >> >> > am getting results R1, R2 and R3 using field collapsing, but when I look at
> >>> >> >> > results only from category C1 (without using field collapsing),
> >>> >> >> > these results are nowhere in the first 10 results.
> >>> >> >> >
> >>> >> >> > Am I doing something wrong or using field collapsing for the wrong
> >>> >> >> > feature?
> >>> >> >> >
> >>> >> >> > I am using the following field collapsing parameters while querying:
> >>> >> >> >   collapse.field=category
> >>> >> >> >   collapse.facet=before
> >>> >> >> >   collapse.threshold=3
> >>> >> >> >
> >>> >> >> > --
> >>> >> >> > Thanks
> >>> >> >> > Varun Gupta
> >>> >> >> >
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >> --
> >>> >> >> Met vriendelijke groet,
> >>> >> >>
> >>> >> >> Martijn van Groningen
> >>> >> >>
> >>> >> >
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Met vriendelijke groet,
> >>> >>
> >>> >> Martijn van Groningen
> >>> >>
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Met vriendelijke groet,
> >>>
> >>> Martijn van Groningen
> >>>
> >>
> >>
> >
>
>
>
> --
> Met vriendelijke groet,
>
> Martijn van Groningen
>
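
The cache-miss behaviour discussed in this thread can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: FilterCacheSketch, ScoreCollector, getDocSetBefore and getDocSetAfter are hypothetical names made up for illustration and are not Solr's real classes or signatures. Only the shape of the "if (first==null)" block follows the snippet quoted above. It shows that when the collector is not handed to the uncached lookup, no scores are collected, and that passing it through makes the scores available.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the cache-miss path; none of these types are Solr's real classes. */
class FilterCacheSketch {

    /** Hypothetical stand-in for a collector that records a score per document. */
    static class ScoreCollector {
        final Map<Integer, Float> scores = new HashMap<>();

        void collect(int doc, float score) {
            scores.put(doc, score);
        }
    }

    /** Hypothetical index: every document matches the query, with a fixed score. */
    private final Map<Integer, Float> index;

    /** Hypothetical filter cache keyed by the query string. */
    private final Map<String, List<Integer>> filterCache = new HashMap<>();

    FilterCacheSketch(Map<Integer, Float> index) {
        this.index = index;
    }

    /** Uncached lookup: returns matching docs and reports scores only if a collector is given. */
    private List<Integer> getDocSetNC(String query, ScoreCollector collector) {
        List<Integer> docs = new ArrayList<>();
        for (Map.Entry<Integer, Float> e : index.entrySet()) {
            docs.add(e.getKey());
            if (collector != null) {
                collector.collect(e.getKey(), e.getValue());
            }
        }
        return docs;
    }

    /** Shape of the block before the change: the collector is dropped on a cache miss. */
    List<Integer> getDocSetBefore(String query, ScoreCollector collector) {
        List<Integer> first = filterCache.get(query);
        if (first == null) {
            first = getDocSetNC(query, null);
            filterCache.put(query, first);
        }
        return first;
    }

    /**
     * Shape of the block after the change: the collector is passed through.
     * On a cache hit this toy model still reports no scores; the thread does
     * not cover how the real patch treats that case.
     */
    List<Integer> getDocSetAfter(String query, ScoreCollector collector) {
        List<Integer> first = filterCache.get(query);
        if (first == null) {
            first = getDocSetNC(query, collector);
            filterCache.put(query, first);
        }
        return first;
    }

    public static void main(String[] args) {
        Map<Integer, Float> index = new HashMap<>();
        index.put(1, 0.5f);
        index.put(2, 0.9f);

        ScoreCollector before = new ScoreCollector();
        new FilterCacheSketch(index).getDocSetBefore("ctype:1", before);
        System.out.println("before: " + before.scores.size() + " scores collected");

        ScoreCollector after = new ScoreCollector();
        new FilterCacheSketch(index).getDocSetAfter("ctype:1", after);
        System.out.println("after: " + after.scores.size() + " scores collected");
    }
}
```

In this sketch both variants return the same documents; the only difference is whether the collector ever sees a score, which is what a score-based sort would depend on.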
