lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: Engage custom hit collector for special search processing
Date Tue, 13 Jan 2015 21:57:06 GMT
You may also want to take a look at how AnalyticsQueries can be plugged in.
This won't show you how to do the implementation, but it will show you how
to plug in a custom collector.

http://heliosearch.org/solrs-new-analyticsquery-api/
http://heliosearch.org/solrs-mergestrategy/
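
For the rollup logic itself, here is a minimal sketch in plain Java, separate
from the Solr plumbing the links above cover. It only illustrates grouping by
a per-request field list and keeping a count, as described in the original
question. The names (RollupSketch, Group, rollup) are made up for this
example, and each document is modeled as a simple field-to-value map standing
in for whatever FieldCache or DocValues lookup a real collector would use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Solr or Lucene.
public class RollupSketch {

    /** One collapsed group: the shared field values and how many docs had them. */
    public static final class Group {
        public final List<String> values;
        public int count;

        Group(List<String> values) {
            this.values = values;
        }
    }

    /**
     * Collapses docs that agree on every requested field.
     * Each doc is a field-name -> value map (an assumption of this sketch;
     * a real collector would read values per document ID instead).
     */
    public static List<Group> rollup(List<Map<String, String>> docs, List<String> fields) {
        // LinkedHashMap keeps groups in first-seen order.
        Map<List<String>, Group> groups = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            // The composite key is the ordered list of requested field values.
            List<String> key = new ArrayList<>(fields.size());
            for (String field : fields) {
                key.add(doc.get(field)); // null when the doc has no value
            }
            Group group = groups.get(key);
            if (group == null) {
                group = new Group(key);
                groups.put(key, group);
            }
            group.count++;
        }
        return new ArrayList<>(groups.values());
    }
}
```

Calling rollup(docs, Arrays.asList("A", "B", "C")) returns one Group per
distinct (A, B, C) combination, in first-seen order. The sketch holds every
distinct key in memory, which may not be acceptable for 1M+ matches with many
unique combinations.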

Joel Bernstein
Search Engineer at Heliosearch

On Tue, Jan 13, 2015 at 4:45 PM, Alexandre Rafalovitch <arafalov@gmail.com>
wrote:

> Sounds like:
>
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
>
> http://heliosearch.org/the-collapsingqparserplugin-solrs-new-high-performance-field-collapsing-postfilter/
>
> The main issue is your multi-field criteria, so you may need to
> extend or override the comparison method. You'd also need to keep the
> counts, which you should know since you are doing the filtering.
>
> Is this the right direction for what you need?
>
> Regards,
>    Alex.
> ----
> Sign up for my Solr resources newsletter at http://www.solr-start.com/
>
>
> On 13 January 2015 at 16:29, tedsolr <tsmith@sciquest.com> wrote:
> > I have a complicated problem to solve, and I don't know enough about
> > Lucene/Solr to phrase the question properly. This is kind of a shot in
> > the dark. My requirement is to return search results always in completely
> > "collapsed" form, rolling up duplicates with a count. Duplicates are
> > defined by whatever fields are requested. If the search requests fields
> > A, B, C, then all matched documents that have identical values for those
> > 3 fields are "dupes". The field list may change with every new search
> > request. What I do know at index time is the superset of all fields that
> > may be part of the field list.
> >
> > I know this can't be done with configuration alone. It doesn't seem
> > performant to retrieve all 1M+ docs and post process in Java. A very
> > smart person told me that a custom hit collector should be able to do the
> > filtering for me. So, maybe I create a custom search handler that somehow
> > exposes this custom hit collector that can use FieldCache or DocValues to
> > examine all the matches and filter the results in the way I've described
> > above.
> >
> > So assuming this is a viable solution path, can anyone suggest some
> > helpful posts, code fragments, books for me to review? I admit to being
> > out of my depth, but this requirement isn't going away. I'm grasping at
> > straws right now.
> >
> > thanks
> > (using Solr 4.9)
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Engage-custom-hit-collector-for-special-search-processing-tp4179348.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
