lucene-solr-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Can I exclude certain terms from MoreLikeThis query?
Date Tue, 25 Aug 2009 10:38:06 GMT
Hi Paras,

 > As I understand from StopFilter,
 > it is a static method to exclude terms such as stop words.

Correct. As far as I know, to control which words the MLT component
chooses when generating its BooleanQuery, you can specify the
following parameters:

mlt.mintf
Minimum Term Frequency - the frequency below which terms will be ignored 
in the source doc.

mlt.mindf
Minimum Document Frequency - words that do not occur in at least
this many docs will be ignored.

mlt.minwl
Minimum word length - words shorter than this will be ignored.

mlt.maxwl
Maximum word length - words longer than this will be ignored.

mlt.maxqt
Maximum number of query terms that will be included in any generated query.
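
For illustration, a request that sets these parameters could be
assembled roughly as in the sketch below. The host, the /select path,
the document query id:123, the field name "body" and all numeric
values are made-up assumptions, not recommendations; only the mlt.*
parameter names are Solr's.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_query, fields,
                  mintf=2, mindf=5, minwl=4, maxwl=20, maxqt=25):
    """Build a Solr request URL with the MoreLikeThis component enabled.

    base_url, doc_query, fields and the default numbers are hypothetical
    placeholders chosen for this example only.
    """
    params = {
        "q": doc_query,               # selects the source document
        "mlt": "true",                # turn on the MLT component
        "mlt.fl": ",".join(fields),   # fields to take terms from
        "mlt.mintf": mintf,           # ignore terms rarer than this in the source doc
        "mlt.mindf": mindf,           # ignore words found in fewer docs than this
        "mlt.minwl": minwl,           # ignore words shorter than this
        "mlt.maxwl": maxwl,           # ignore words longer than this
        "mlt.maxqt": maxqt,           # cap on terms in the generated query
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and document id, for illustration only.
url = build_mlt_url("http://localhost:8983/solr/select", "id:123", ["body"])
print(url)
```

Raising mlt.mindf or mlt.minwl only filters by frequency and length,
so it narrows the candidate terms without naming specific words.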

If these parameters do not fit your case, I don't think
you can exclude certain terms out of the box.

Koji


Paras Chopra wrote:
> Hi Koji,
> Thank you for your reply. Actually, the terms I would like to exclude would
> be based on the document I use for MoreLikeThis Query. As I understand
> from StopFilter,
> it is a static method to exclude terms such as stop words.
>
> My problem is that I want to return theme/area specific results for
> MoreLikeThis. Usually, MoreLikeThis picks up nouns such as names and returns
> results based on them because it finds them interesting. I would rather have
> it return results based on the general theme of a text. So, I was wondering
> if I can have MoreLikeThis exclude results containing a list of
> detected nouns that I provide per document.
>
> Thanks,
> Paras Chopra
>
> On Tue, Aug 25, 2009 at 11:30 AM, Koji Sekiguchi <koji@r.email.ne.jp> wrote:
>
>   
>> Paras Chopra wrote:
>>
>>     
>>> Hi All,
>>> I am tinkering with MoreLikeThis component of Solr and had a particular
>>> use case where I would like to exclude certain terms from consideration
>>> while MoreLikeThis makes a query vector out of a document. Is it possible
>>> with Solr? I searched for this in the documentation but wasn't able to
>>> find a method.
>>>
>>> Thanks in advance,
>>> Paras Chopra
>>>
>>>
>>>
>>>       
>> How about using StopFilter that excludes certain terms on a field,
>> then specify the field for mlt.fl parameter when executing MLT?
>>
>>
>> Koji