lucene-solr-user mailing list archives

From Uri Boness <ubon...@gmail.com>
Subject Re: Field Collapsing (was Re: Schema for group/child entity setup)
Date Sat, 05 Sep 2009 19:58:21 GMT
You can check out http://www.ilocal.nl. If you search for a bank in 
Amsterdam then you'll see that a lot of the results are collapsed. For 
this we used an older version of this patch (which works on 1.3) but a 
lot has changed since then. We're currently using this patch on another 
project, but it's not live yet.

Uri

R. Tan wrote:
> Thanks Uri. Your personal suggestion is appreciated and I think I'll follow
> your advice. We're still early in development and 1.4 would be a good
> choice. I hope I can get field collapsing to work with my requirements. Do
> you know any live site using field collapsing already?
>
> On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness <uboness@gmail.com> wrote:
>
>   
>> Work is under way on the patch that will let you request specific field
>> values of the collapsed documents using a dedicated request parameter. This
>> work is not yet included in the latest patch, but will be very soon. There is
>> of course a drawback: the set of collapsed documents can be very large
>> (depending on your data), in which case the response, which includes the
>> field values, can also be rather large, and that will hurt performance. This
>> is why the feature will be enabled only if you specify the extra parameter;
>> by default no field values will be returned.
>>
>> AFAIK, the latest patch should work fine with the latest build. Martijn
>> (who is the main maintainer of this patch) tries to keep it up to date
>> with the latest builds. But I guess the safest way is to work with the
>> nightly build of the same date as the latest patch (though I would give it a
>> try first with the latest build).
>>
>> BTW, it's not an official suggestion from the Solr development team, but if
>> you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would
>> go for the latter. 1.4 is supposed to be released in the upcoming week or two
>> and it brings loads of bug fixes, enhancements and extra functionality. But
>> again, this is my personal suggestion.
>>
>>
>> cheers,
>> Uri
>>
>> R. Tan wrote:
>>
>>     
>>> Okay. Thanks for giving an insight on how it works in general. Without
>>> trying it myself, are the field values for the collapsed ones also part of
>>> the results data?
>>> What is the latest build that is safe to use on a production environment?
>>> I'd probably go for that and use field collapsing.
>>>
>>> Thank you very much.
>>>
>>>
>>> On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness <uboness@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>>> The collapsed documents are represented by one "master" document, which
>>>> can be part of the normal search result (the doc list), so pagination just
>>>> works as expected, meaning it takes only the returned documents into
>>>> account (ignoring the collapsed ones). As for the scoring, the "master"
>>>> document is actually the document with the highest score in the
>>>> collapsed group.
>>>>
>>>> As for Solr 1.3 compatibility... well... it's very hard to tell. All the
>>>> latest patches are certainly *not* 1.3 compatible (I think they also
>>>> depend on some changes in Lucene which are not available for Solr 1.3).
>>>> I guess you'll have to try some of the old patches, but I'm not sure
>>>> about their stability.
>>>>
>>>> cheers,
>>>> Uri
>>>>
>>>>
>>>> R. Tan wrote:
>>>>
>>>>
>>>>
>>>>         
>>>>> Thanks Uri. How do paging and scoring work when using field collapsing?
>>>>> What patch works with 1.3? Is it production ready?
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>> On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness <uboness@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> Development on this patch is quite active. It works well for a single
>>>>>> Solr instance, but distributed search (i.e. shards) is not yet supported.
>>>>>> Using this patch you can group search results based on a specific field.
>>>>>> There are two flavors of field collapsing: adjacent and non-adjacent.
>>>>>> The former collapses only documents that happen to be located next to
>>>>>> each other in the otherwise-non-collapsed result set. The latter
>>>>>> (non-adjacent) collapses all documents with the same field value,
>>>>>> regardless of their position in the otherwise-non-collapsed result set.
>>>>>> Note that non-adjacent performs better than adjacent. There's currently
>>>>>> a discussion about extending this support so that, in addition to
>>>>>> collapsing the documents, extra information will be returned for the
>>>>>> collapsed documents (see the discussion on the issue page).
>>>>>>
>>>>>> Uri
>>>>>>
>>>>>>
>>>>>> R. Tan wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> I think this is what I'm looking for. What is the status of this patch?
>>>>>>>
>>>>>>> On Thu, Sep 3, 2009 at 12:00 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>>>> Hi Solrers,
>>>>>>>> I would like to get your opinion on how best to approach a search
>>>>>>>> requirement that I have. The scenario is that I have a set of business
>>>>>>>> listings that may be grouped into one parent business (such as 7-eleven
>>>>>>>> having several locations). On the results page, I only want 7-eleven to
>>>>>>>> show up once but also show how many locations matched the query (facet
>>>>>>>> filtered by state, for example) and maybe a preview of some of the
>>>>>>>> locations.
>>>>>>>>
>>>>>>>> Searching for the business name is straightforward but the locations
>>>>>>>> within a result are quite tricky. I can do the opposite, searching for
>>>>>>>> the locations and faceting on business names, but it will still
>>>>>>>> basically be the same thing and repeat results with the same business
>>>>>>>> name.
>>>>>>>>
>>>>>>>> Any advice?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> R
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>
>>>>>>>               
>>>       
>
>   
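
For readers following the thread, here is a rough sketch of the two collapsing flavors and of the "master" document idea described in the quoted messages. It is only an illustration of the concept in plain Python, not the patch's actual implementation: the function names, the dict-based documents, and the "business" and "score" keys are all made up for the example.

```python
from itertools import groupby


def collapse_non_adjacent(docs, field):
    """Collapse every doc sharing a field value, wherever it ranks.

    The master of each group is its highest-scoring document.
    """
    groups = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)
    results = []
    for members in groups.values():
        master = max(members, key=lambda d: d["score"])
        results.append({"master": master, "collapsed": len(members) - 1})
    # Order the collapsed result list by the master's score.
    results.sort(key=lambda r: r["master"]["score"], reverse=True)
    return results


def collapse_adjacent(docs, field):
    """Collapse only runs of docs that sit next to each other in the ranking."""
    ranked = sorted(docs, key=lambda d: d["score"], reverse=True)
    results = []
    for _, run in groupby(ranked, key=lambda d: d[field]):
        members = list(run)
        # The first doc of a run is its highest-scoring one.
        results.append({"master": members[0], "collapsed": len(members) - 1})
    return results


def page(results, start, rows):
    """Paginate over masters only; collapsed docs do not take up slots."""
    return results[start:start + rows]


# Hypothetical sample data: three 7-eleven locations and one other business.
docs = [
    {"id": 1, "business": "7-eleven", "score": 0.9},
    {"id": 2, "business": "7-eleven", "score": 0.8},
    {"id": 3, "business": "acme bank", "score": 0.7},
    {"id": 4, "business": "7-eleven", "score": 0.6},
]

for row in page(collapse_non_adjacent(docs, "business"), 0, 10):
    print(row["master"]["id"], row["master"]["business"], row["collapsed"])
```

With this sample data, the non-adjacent version returns one 7-eleven entry with two collapsed locations, while the adjacent version would return 7-eleven twice, because document 4 is separated from the other two by the bank in the ranking.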
