lucene-dev mailing list archives

From "Yuriy Akopov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-236) Field collapsing
Date Thu, 24 Mar 2011 21:30:07 GMT

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010942#comment-13010942 ]

Yuriy Akopov commented on SOLR-236:
-----------------------------------

Hi,

First of all, thank you guys for working on this! However, I have run into a problem with
this patch, which is hopefully caused by my own mistakes, so please correct me if I have done
something wrong.

So, I have applied the SOLR-236 patch to release-1.4.1 and gained support for collapse.*, which
works. However, two issues discussed earlier in this thread are still there:

a) When collapsing is requested, only grouped results are returned. So, if a document has
a unique value in the collapsed field (i.e. it has no other docs to group with), it is
excluded from the results. Instead of the expected "unique documents plus non-unique ones
collapsed to the most relevant one", only the grouped ones are returned.

b) The number of results matching the query ("numFound") is always equal to the "rows"
parameter provided, or 10 if it is not supplied (i.e. it represents the number of results
returned on the page, not the total number of matched documents).
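
To make the expected behaviour concrete, here is a minimal sketch in plain Python. It is not
Solr code: the collapse() function, the "site" field and the sample documents are made up for
illustration, and it assumes the total should be counted over the whole collapsed result rather
than over the returned page.

```python
def collapse(docs, field, rows=10):
    """Keep the highest-scoring document for each distinct value of `field`.

    A document whose value is unique simply forms a group of one and is kept.
    """
    best = {}
    for doc in docs:
        key = doc[field]
        if key not in best or doc["score"] > best[key]["score"]:
            best[key] = doc
    collapsed = sorted(best.values(), key=lambda d: d["score"], reverse=True)
    # Assumption: the total covers all collapsed entries, independent of `rows`.
    return {"numFound": len(collapsed), "docs": collapsed[:rows]}


docs = [
    {"id": 1, "site": "a.com", "score": 3.0},
    {"id": 2, "site": "a.com", "score": 2.0},  # collapsed into id 1
    {"id": 3, "site": "b.com", "score": 1.5},  # unique value, still expected in the results
    {"id": 4, "site": "c.com", "score": 1.0},  # unique value, still expected in the results
]
result = collapse(docs, "site", rows=2)
```

With rows=2 the page holds two documents, while the total still reflects all three collapsed
entries, which is the behaviour I would expect from "numFound".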

There is a way around the latter "numFound" issue: faceting on the collapsed field, as was
suggested before. But the number retrieved from that facet is also useless, as it includes
unique (non-grouped) documents as well, and those are not returned.
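
The mismatch can be shown with made-up facet counts. This is only a sketch: the dictionary and
the count_groups() helper are hypothetical, not actual Solr output or API.

```python
def count_groups(facet_counts):
    """Return (distinct values, values shared by two or more documents)."""
    distinct = sum(1 for count in facet_counts.values() if count > 0)
    multi_doc = sum(1 for count in facet_counts.values() if count >= 2)
    return distinct, multi_doc


# Hypothetical facet counts for the collapsed field.
facet_counts = {"a.com": 2, "b.com": 1, "c.com": 1}
distinct_values, multi_doc_groups = count_groups(facet_counts)
```

Here the facet reports three distinct values, but only one of them has more than one document,
so only one collapsed entry comes back and the two numbers do not match.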

So far, I'm stuck with that. Is there any chance of resolving it? What about the SOLR-1682
patch: if it fixes that, should it be applied to the original release-1.4.1 or to a release-1.4.1
already patched with SOLR-236?

Thanks in advance.

P.S. As I understand it, grouping is planned for Solr 4.0. Does anybody know whether
it is safe to use its nightly builds? I went through its pending critical issues and they don't
look fatal, but I'm still wary of possible implications.


> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>            Assignee: Shalin Shekhar Mangar
>             Fix For: Next
>
>         Attachments: DocSetScoreCollector.java, NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java,
SOLR-236-1_4_1-NPEfix.patch, SOLR-236-1_4_1-paging-totals-working.patch, SOLR-236-1_4_1.patch,
SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
SOLR-236-branch_3x.patch, SOLR-236-distinctFacet.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch,
SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236.patch, SOLR-236.patch,
SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch,
SOLR-236_collapsing.patch, SOLR-236_collapsing.patch, collapsing-patch-to-1.3.0-dieter.patch,
collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch,
field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch,
field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch,
solr-236.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to
a single entry in the result set. Site collapsing is a special case of this, where all results
for a given web site is collapsed into one or two entries in the result set, typically with
an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)
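
The three collapse.* parameters listed in the description above could be passed on a select
request roughly as follows. This is a sketch only: the host, port, path, query and the "site"
field name are placeholders, and the parameter values are just examples.

```python
from urllib.parse import urlencode

# Placeholder values; only the collapse.* parameter names come from the issue description.
params = {
    "q": "*:*",
    "collapse.field": "site",   # field used to group results
    "collapse.type": "normal",  # "normal" (default) or "adjacent"
    "collapse.max": 1,          # continuous results allowed before collapsing
    "rows": 10,
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
```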

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

