lucene-solr-user mailing list archives

From Ravi Solr <ravis...@gmail.com>
Subject Re: Query ReRanking question
Date Fri, 16 Jan 2015 16:03:46 GMT
As per Erick's suggestion, I am reposting my response to the group. Joel and
Erick, thank you very much for helping me out with the ReRanking question a
while ago.

I have an alternative which seems to be working better for me than
ReRanking. Can you kindly let me know of any pitfalls you can think of with
this approach? Since we value relevancy and recency at the same time, even
though the two can pull in opposite directions, I thought maybe I could use
function queries to adjust the boost as follows:

boost=max(recip(ms(NOW/HOUR,publish_date),7.889e-10,1,1),scale(query($q),0,1))

What I intended here is this: if the query matches a more recent doc, recency
is taken into consideration; however, if the scaled relevancy is higher than
the date boost, we keep relevancy. What do you think?

Thanks,

Ravi Kiran Bhaskar


On Mon, Sep 8, 2014 at 12:35 PM, Ravi Solr <ravisolr@gmail.com> wrote:

> Joel and Erick,
>            Thank you very much for explaining how the ReRanking works. Now
> it's a bit clearer.
>
> Thanks,
>
> Ravi Kiran Bhaskar
>
> On Sun, Sep 7, 2014 at 4:45 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
>
>> Oops wrong usage pattern. It should be:
>>
>> 1) Main query is sorted by a field (scores tracked silently in the
>> background).
>> 2) Reranker is reRanking docs based on the score from the main query.
>>
>>
>>
>> Joel Bernstein
>> Search Engineer at Heliosearch
>>
>>
>> On Sun, Sep 7, 2014 at 4:43 PM, Joel Bernstein <joelsolr@gmail.com>
>> wrote:
>>
>> > Ok, just reviewed the code. The ReRankQParserPlugin always tracks the
>> > scores from the main query. So this explains things. Speaking of
>> > explaining things, the ReRankQParserPlugin also works with Lucene's
>> > explain. So if you use debugQuery=true we should see that the score
>> > from the initial query was combined with the score from the
>> > reRankQuery, which should be 1.
>> >
>> > You have stumbled on an interesting usage pattern which I never
>> > considered. But basically what's happening is:
>> >
>> > 1) Main query is sorted by score.
>> > 2) Reranker is reRanking docs based on the score from the main query.
>> >
>> > No worries, Erick, you've taught me a lot over the past couple of years!
>> >
>> > Joel Bernstein
>> > Search Engineer at Heliosearch
>> >
>> >
>> > On Sun, Sep 7, 2014 at 11:37 AM, Erick Erickson
>> > <erickerickson@gmail.com> wrote:
>> >
>> >> Joel:
>> >>
>> >> I find that whenever I say something totally wrong publicly, I
>> >> remember the correction really really well...
>> >>
>> >> Thanks for straightening that out!
>> >> Erick
>> >>
>> >> On Sat, Sep 6, 2014 at 12:58 PM, Joel Bernstein <joelsolr@gmail.com>
>> >> wrote:
>> >> > The following query:
>> >> >
>> >> > http://localhost:8080/solr/select?q=malaysian airline crash&rq={!rerank
>> >> > reRankQuery=$rqq reRankDocs=1000}&rqq=*:*&sort=publish_date
>> >> > desc&fl=headline,publish_date,score
>> >> >
>> >> > is doing the following:
>> >> >
>> >> > The main query is sorted by publish_date. Then the results are
>> >> > reranked by *:*, which in theory would have no effect at all.
>> >> >
>> >> > The reranker only uses the reRankQuery to re-rank the results. The
>> >> > sort param will always apply to the main query.
>> >> >
>> >> > Joel Bernstein
>> >> > Search Engineer at Heliosearch
>> >> >
>> >> >
>> >> > On Sat, Sep 6, 2014 at 2:33 PM, Ravi Solr <ravisolr@gmail.com> wrote:
>> >> >
>> >> >> Erick,
>> >> >>         Your idea about reversing Joel's suggestion seems to give the
>> >> >> best results of all the options I tried...but I can't seem to
>> >> >> understand why. I thought the query shown below should give
>> >> >> irrelevant results, as sorting by date would throw relevancy
>> >> >> off...but somehow it's getting relevant results with fair enough
>> >> >> reverse chronology. It is as if the sort is applied after the docs
>> >> >> are collected and reranked (which is what I wanted). One more thing
>> >> >> that baffled me was, if I change reRankDocs from 1000 to 100 the
>> >> >> results become irrelevant, which doesn't make sense.
>> >> >>
>> >> >> So can you kindly explain what's going on in the following query?
>> >> >>
>> >> >> http://localhost:8080/solr/select?q=malaysian airline crash&rq={!rerank
>> >> >> reRankQuery=$rqq reRankDocs=1000}&rqq=*:*&sort=publish_date
>> >> >> desc&fl=headline,publish_date,score
>> >> >>
>> >> >> I love the Solr community, so much to learn from so many
>> >> >> knowledgeable people.
>> >> >>
>> >> >> Thanks
>> >> >>
>> >> >> Ravi Kiran Bhaskar
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Fri, Sep 5, 2014 at 1:23 PM, Erick Erickson
>> >> >> <erickerickson@gmail.com> wrote:
>> >> >>
>> >> >> > OK, why can't you switch the clauses from Joel's suggestion?
>> >> >> >
>> >> >> > Something like:
>> >> >> > q=Malaysia plane crash&rq={!rerank reRankDocs=1000
>> >> >> > reRankQuery=$myquery}&myquery=*:*&sort=date+desc
>> >> >> >
>> >> >> > (haven't tried this yet, but you get the idea....).
>> >> >> >
>> >> >> > Best,
>> >> >> > Erick
>> >> >> >
>> >> >> > On Fri, Sep 5, 2014 at 9:33 AM, Markus Jelsma
>> >> >> > <markus.jelsma@openindex.io> wrote:
>> >> >> > > Hi - You can already achieve this by boosting on the document's
>> >> >> > > recency. The result set won't be exactly ordered by date but you
>> >> >> > > will get the most relevant and recent documents on top.
>> >> >> > >
>> >> >> > > Markus
>> >> >> > >
>> >> >> > > -----Original message-----
>> >> >> > >> From: Ravi Solr <ravisolr@gmail.com>
>> >> >> > >> Sent: Friday 5th September 2014 18:06
>> >> >> > >> To: solr-user@lucene.apache.org
>> >> >> > >> Subject: Re: Query ReRanking question
>> >> >> > >>
>> >> >> > >> Thank you very much for responding. I want to do exactly the
>> >> >> > >> opposite of what you said. I want to sort the relevant docs in
>> >> >> > >> reverse chronology. If you sort by date beforehand then the
>> >> >> > >> relevancy is lost. So I want to get the Top N relevant results and
>> >> >> > >> then rerank those Top N to achieve relevant reverse chronological
>> >> >> > >> results.
>> >> >> > >>
>> >> >> > >> If you ask why I would want to do that:
>> >> >> > >>
>> >> >> > >> Let's take an example about the Malaysian airline crash. Several
>> >> >> > >> articles might have been published over a period of time. When I
>> >> >> > >> search for - malaysia airline crash blackbox - I would want to see
>> >> >> > >> "relevant" results but would also like to see the recent
>> >> >> > >> developments on the top, i.e. effectively a reverse chronological
>> >> >> > >> order within the relevant results, like telling a story over a
>> >> >> > >> period of time.
>> >> >> > >>
>> >> >> > >> Hope I am clear. Thanks for your help.
>> >> >> > >>
>> >> >> > >> Thanks
>> >> >> > >>
>> >> >> > >> Ravi Kiran Bhaskar
>> >> >> > >>
>> >> >> > >>
>> >> >> > >> On Thu, Sep 4, 2014 at 5:08 PM, Joel Bernstein
>> >> >> > >> <joelsolr@gmail.com> wrote:
>> >> >> > >>
>> >> >> > >> > If you want the main query to be sorted by date then the top N
>> >> >> > >> > docs reranked by a query, that should work. Try something like
>> >> >> > >> > this:
>> >> >> > >> >
>> >> >> > >> > q=foo&sort=date+desc&rq={!rerank reRankDocs=1000
>> >> >> > >> > reRankQuery=$myquery}&myquery=blah
>> >> >> > >> >
>> >> >> > >> >
>> >> >> > >> > Joel Bernstein
>> >> >> > >> > Search Engineer at Heliosearch
>> >> >> > >> >
>> >> >> > >> >
>> >> >> > >> > On Thu, Sep 4, 2014 at 4:25 PM, Ravi Solr
>> >> >> > >> > <ravisolr@gmail.com> wrote:
>> >> >> > >> >
>> >> >> > >> > > Can the ReRanking API be used to sort within docs retrieved by
>> >> >> > >> > > a date field? Can somebody help me understand how to write
>> >> >> > >> > > such a query?
>> >> >> > >> > >
>> >> >> > >> > > Thanks
>> >> >> > >> > >
>> >> >> > >> > > Ravi Kiran Bhaskar
>> >> >> > >> > >
>> >> >> > >> >
>> >> >> > >>
>> >> >> > >
>> >> >> >
>> >> >>
>> >>
>> >
>> >
>>
>
>
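
The behaviour Joel describes in the quoted thread (main query sorted by a field, scores tracked in the background, then the top reRankDocs re-ordered by that tracked score) can be modelled with a toy Python sketch. This is a simplified illustration, not Solr code: the Doc class and rerank_by_tracked_score function are hypothetical, and the sketch assumes that a reRankQuery of *:* adds the same constant to every document, so the tracked main-query score alone decides the new order.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    headline: str
    publish_date: str  # ISO date string, so string order equals date order
    score: float       # relevancy score tracked from the main query


def rerank_by_tracked_score(docs, rerank_docs):
    """Toy model: sort by date desc, then re-order only the top rerank_docs by score."""
    # Step 1: the sort param applies to the main query.
    by_date = sorted(docs, key=lambda d: d.publish_date, reverse=True)
    # Step 2: only the first rerank_docs documents are re-ordered by score.
    head = sorted(by_date[:rerank_docs], key=lambda d: d.score, reverse=True)
    # Documents outside the rerank window keep their date order.
    return head + by_date[rerank_docs:]


docs = [
    Doc("a", "2014-09-01", 0.90),
    Doc("b", "2014-09-05", 0.20),
    Doc("c", "2014-09-03", 0.50),
    Doc("d", "2014-08-01", 0.95),
]

wide = [d.headline for d in rerank_by_tracked_score(docs, 4)]
narrow = [d.headline for d in rerank_by_tracked_score(docs, 2)]
```

Under this model a small rerank window only re-orders the most recent documents, so an older but highly relevant document (d here) never reaches the top. That would be consistent with the observation that dropping reRankDocs from 1000 to 100 made the results look irrelevant.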
