lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Avoid duplicates in MoreLikeThis using field collapsing
Date Tue, 02 Jun 2009 17:47:21 GMT

But why does MLT return duplicates in the first place?  That seems strange to me.  If there
are no duplicates in your index, how does MLT manage to return dupes?

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Marc Sturlese <marc.sturlese@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Friday, May 29, 2009 7:05:15 AM
> Subject: Avoid duplicates in MoreLikeThis using field collapsing
> 
> 
> Hey there, 
> I am testing the MoreLikeThis feature (with both the MoreLikeThis component
> and the MoreLikeThis handler) and many of the similar documents returned are
> duplicates. To avoid that I have tried to use the field collapsing patch, but
> it is not taking effect.
> 
> In the case of the MoreLikeThis handler I think that is expected, as I have
> seen that it extends RequestHandlerBase.java directly and not
> SearchHandler.java, which is the class whose handleRequestBody method
> deals with components:
> 
>       for( SearchComponent c : components ) {
>         rb.setTimer( subt.sub( c.getName() ) );
>         c.prepare(rb);
>         rb.getTimer().stop();
>       }
> 
> To sort it out I have embedded the collapseFilter in the getMoreLikeThis
> method of MoreLikeThisHandler.java.
> This is working all right, but I would like to know whether there is a cleaner
> way to make MoreLikeThisHandler able to deal with components, I mean via
> solrconfig.xml or by plugging something in instead of hacking it.
> 
> Thanks in advance
> 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Avoid-duplicates-in-MoreLikeThis-using-field-collapsing-tp23778054p23778054.html
> Sent from the Solr - User mailing list archive at Nabble.com.
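
The difference the quoted message describes, a handler that loops over its
configured components versus one that does not, can be illustrated with a
minimal self-contained Java sketch. Every name in it (ComponentChainSketch,
Doc, Component, CollapseComponent, chainHandler, standaloneHandler) is a
hypothetical stand-in, not Solr's API and not the field collapsing patch. It
only shows why a collapse step configured as a component has no effect when
the handler never runs a component loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT Solr classes.
class ComponentChainSketch {

    /** A result document with an id and the field value used for collapsing. */
    static final class Doc {
        final String id;
        final String collapseKey;
        Doc(String id, String collapseKey) {
            this.id = id;
            this.collapseKey = collapseKey;
        }
    }

    /** A pluggable step that post-processes a result list. */
    interface Component {
        List<Doc> process(List<Doc> docs);
    }

    /** Keeps only the first document seen for each collapse key. */
    static final class CollapseComponent implements Component {
        @Override
        public List<Doc> process(List<Doc> docs) {
            Map<String, Doc> firstPerKey = new LinkedHashMap<>();
            for (Doc d : docs) {
                firstPerKey.putIfAbsent(d.collapseKey, d);
            }
            return new ArrayList<>(firstPerKey.values());
        }
    }

    /** A handler that runs every configured component in order. */
    static List<Doc> chainHandler(List<Doc> similar, List<Component> components) {
        List<Doc> out = similar;
        for (Component c : components) {
            out = c.process(out);
        }
        return out;
    }

    /** A handler with no component loop: configured components are never called. */
    static List<Doc> standaloneHandler(List<Doc> similar, List<Component> components) {
        return similar;
    }

    /** Extracts the ids so results are easy to print and compare. */
    static List<String> ids(List<Doc> docs) {
        List<String> out = new ArrayList<>();
        for (Doc d : docs) {
            out.add(d.id);
        }
        return out;
    }

    public static void main(String[] args) {
        // Docs 1 and 2 share a collapse key, so they count as duplicates.
        List<Doc> similar = Arrays.asList(
                new Doc("1", "a"), new Doc("2", "a"), new Doc("3", "b"));
        List<Component> components =
                Collections.<Component>singletonList(new CollapseComponent());

        System.out.println(ids(chainHandler(similar, components)));      // [1, 3]
        System.out.println(ids(standaloneHandler(similar, components))); // [1, 2, 3]
    }
}
```

In this sketch the only way to get collapsed results out of standaloneHandler
is to call the collapse step from inside it, which mirrors the workaround of
embedding the filter in getMoreLikeThis.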

