lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject Re: Re: tf and very short text fields
Date Tue, 01 Apr 2014 19:30:54 GMT
Also, if I remember correctly, setting k1 to zero for BM25 automatically drops norms from the calculation,
so that's easy to experiment with without reindexing.
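
A minimal sketch of why that happens, assuming the textbook BM25 term-frequency formula. The function name bm25_tf and the document lengths are made up for illustration, and this is not Lucene's actual code. With k1 = 0 the length-normalization term is multiplied by zero, so both the term count and the field length drop out:

```python
def bm25_tf(freq, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency part of textbook BM25 (IDF left out).

    Hypothetical helper for illustration; not Lucene's implementation.
    """
    if freq <= 0:
        return 0.0
    # Length normalization: the only place the field length (norm) enters.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return freq * (k1 + 1.0) / (freq + k1 * norm)


# Term occurs once in a 2-token field vs. twice in a 4-token field.
once_default = bm25_tf(1, doc_len=2, avg_doc_len=4)
twice_default = bm25_tf(2, doc_len=4, avg_doc_len=4)

# With k1 = 0 the formula reduces to freq / freq, whatever the length.
once_k1_zero = bm25_tf(1, doc_len=2, avg_doc_len=4, k1=0.0)
twice_k1_zero = bm25_tf(2, doc_len=4, avg_doc_len=4, k1=0.0)
```

With the default parameters the two values differ. With k1 = 0 both are exactly 1.0, which amounts to a binary term frequency with no length normalization.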


Markus Jelsma <markus.jelsma@openindex.io> wrote:
Yes, override TFIDFSimilarity and return 1f from tf(). You can also use BM25 with k1 set to zero in your schema.
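
To illustrate the effect of a constant tf(), here is a rough sketch that compares a square-root term frequency (what Lucene's classic TF-IDF similarity uses by default) with a binary one. It ignores IDF and norms entirely. The helper names and the second title are invented for the example, and the real change would be a Java subclass of the similarity, not this code:

```python
import math
import re


def classic_tf(freq):
    # Square root of the raw count, as in Lucene's default TF-IDF tf().
    return math.sqrt(freq)


def binary_tf(freq):
    # Constant 1.0 for any matching term, mirroring "return 1f from tf()".
    return 1.0 if freq > 0 else 0.0


def score(title, query, tf):
    """Sum tf(count) over the query terms; IDF and norms are ignored."""
    tokens = re.findall(r"\w+", title.lower())
    return sum(tf(tokens.count(term)) for term in query.lower().split())


repeated = "New York, New York"
single = "New York Stories"  # illustrative second title

classic_repeated = score(repeated, "new york", classic_tf)
classic_single = score(single, "new york", classic_tf)
binary_repeated = score(repeated, "new york", binary_tf)
binary_single = score(single, "new york", binary_tf)
```

Under the square-root tf the repeated title scores higher. Under the binary tf both titles score the same for the query "new york".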


Walter Underwood <wunder@wunderwood.org> wrote:
And here is another peculiarity of short text fields.

The movie "New York, New York" should not be twice as relevant for the query "new york". Is
there a way to use a binary term frequency rather than a count?

wunder
--
Walter Underwood
wunder@wunderwood.org


