lucene-solr-user mailing list archives

From tedsolr
Subject Engage custom hit collector for special search processing
Date Tue, 13 Jan 2015 21:29:03 GMT
I have a complicated problem to solve, and I don't know enough about
Lucene/Solr to phrase the question properly, so this is something of a shot in
the dark. My requirement is to always return search results in fully
"collapsed" form, rolling up duplicates with a count. Duplicates are defined
by whichever fields are requested: if the search requests fields A, B, and C,
then all matched documents that have identical values for those three fields
are "dupes". The field list may change with every new search request. What I
do know at index time is the superset of all fields that may appear in the
field list.
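
A minimal sketch of the rollup described above, in plain Java with no Lucene or
Solr classes. The class name, the `collapse` and `doc` helpers, and the sample
field values are all hypothetical and only illustrate the intended semantics:
documents are grouped by their values for the requested fields, and each group
is returned once with a count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollapseSketch {

    // Rolls up documents that have identical values for every requested field.
    // Returns one entry per distinct value combination, mapped to its count.
    public static Map<List<String>, Integer> collapse(List<Map<String, String>> docs,
                                                      List<String> fields) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            List<String> key = new ArrayList<>();
            for (String field : fields) {
                key.add(doc.get(field)); // null when the document lacks the field
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical helper: builds a document from alternating name/value pairs.
    public static Map<String, String> doc(String... nameValuePairs) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            d.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("A", "1", "B", "x", "C", "p"));
        docs.add(doc("A", "1", "B", "x", "C", "q"));
        docs.add(doc("A", "2", "B", "x", "C", "p"));

        // Requesting A and B: the first two documents are dupes.
        System.out.println(collapse(docs, Arrays.asList("A", "B")));
        // Requesting A, B and C: no two documents match on all three fields.
        System.out.println(collapse(docs, Arrays.asList("A", "B", "C")));
    }
}
```

The same three documents collapse differently depending on the field list, which
is why the grouping key cannot be fixed at index time.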

I know this can't be done with configuration alone, and it doesn't seem
performant to retrieve all 1M+ docs and post-process them in Java. A very smart
person told me that a custom hit collector should be able to do the
filtering for me. So maybe I create a custom search handler that somehow
exposes this custom hit collector, which can use FieldCache or DocValues to
examine all the matches and filter the results in the way I've described.
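
The shape of such a collector can be sketched without any Lucene dependency.
This is an assumption-laden stand-in, not the Lucene or Solr API: the class and
its `setNextSegment`, `collect`, and `results` methods are hypothetical names
that only mimic the general pattern of a per-segment callback followed by one
call per matching document, and the `String[][]` arrays stand in for per-document
field values that a real collector would read from FieldCache or DocValues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollapsingCollectorSketch {

    // One entry per distinct combination of requested field values.
    private final Map<List<String>, Integer> counts = new LinkedHashMap<>();

    // Stand-in for per-segment values: current[doc][i] is the value of the
    // i-th requested field for the segment-relative document id `doc`.
    private String[][] current;

    // Hypothetical per-segment callback: switch to the next segment's values.
    public void setNextSegment(String[][] valuesByDoc) {
        this.current = valuesByDoc;
    }

    // Hypothetical per-hit callback: called once for each matching document.
    // Only the counts are kept, so no matched document is materialized.
    public void collect(int doc) {
        List<String> key = new ArrayList<>(Arrays.asList(current[doc]));
        counts.merge(key, 1, Integer::sum);
    }

    public Map<List<String>, Integer> results() {
        return counts;
    }

    public static void main(String[] args) {
        CollapsingCollectorSketch collector = new CollapsingCollectorSketch();

        // First segment: three matching documents, two of them identical.
        collector.setNextSegment(new String[][] {{"1", "x"}, {"1", "x"}, {"2", "y"}});
        collector.collect(0);
        collector.collect(1);
        collector.collect(2);

        // Second segment: one more match that duplicates an earlier group.
        collector.setNextSegment(new String[][] {{"1", "x"}});
        collector.collect(0);

        System.out.println(collector.results()); // prints {[1, x]=3, [2, y]=1}
    }
}
```

Memory in this sketch grows with the number of distinct groups rather than the
number of matches, which is the main appeal over fetching every document and
post-processing it.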

So, assuming this is a viable solution path, can anyone suggest some helpful
posts, code fragments, or books for me to review? I admit to being out of my
depth, but this requirement isn't going away. I'm grasping at straws right now.

(using Solr 4.9)
