lucene-java-user mailing list archives

From markharw00d
Subject Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project
Date Mon, 09 Jul 2007 19:12:46 GMT
 >>the case matters only for those words that should be included.

Jong, just want to check we're on the same page: you do know
MoreLikeThis has a kind of automatic stop-wording built in, yes?
MoreLikeThis looks at the document frequency of all terms in the "this"
text you provide and selects only a shortlist (up to maxQueryTerms) of
the rarer words. As such, users (admin or otherwise) surrender precise
control over which terms are used. Hence my earlier point: does case
really matter in this "inexact" scenario? And can you use the
lower-case version of the field you said you already create?
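
To make the "automatic stop-wording" point concrete, here is a rough
sketch of the selection idea in plain Java. It is not the MoreLikeThis
source: the class RareTermShortlist, the method shortlist and the
docFreq map are made up for illustration, the tokenizing is naive, and
ranking purely by document frequency is a simplification (the real class
also weighs term frequency in the text). Lower-casing up front stands in
for querying a lower-cased field.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class RareTermShortlist {

    // Returns up to maxQueryTerms distinct terms from `text`, rarest first,
    // where rarity is the document frequency found in `docFreq`.
    static List<String> shortlist(String text, Map<String, Integer> docFreq, int maxQueryTerms) {
        Set<String> distinct = new LinkedHashSet<>();
        // Assumption: terms were indexed lower-cased, so normalize the input too.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            // Skip empty tokens and terms this toy index has never seen.
            if (!token.isEmpty() && docFreq.containsKey(token)) {
                distinct.add(token);
            }
        }
        List<String> terms = new ArrayList<>(distinct);
        // Lowest document frequency first; the sort is stable, so ties keep
        // their order of first appearance in the text.
        terms.sort((a, b) -> Integer.compare(docFreq.get(a), docFreq.get(b)));
        // Common words fall off the end of the list without any stop-word set.
        return new ArrayList<>(terms.subList(0, Math.min(maxQueryTerms, terms.size())));
    }
}
```

With a small maxQueryTerms, very common words such as "the" never make
the shortlist however they are cased, which is why an explicit,
case-sensitive stop-word list matters less here than it might seem.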

