lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Effect of multiple white space at WhiteSpaceTokenizer
Date Tue, 08 Oct 2013 14:29:34 GMT
The result is the same, and the performance difference should be negligible unless you're
uploading megabytes of white space. Consecutive white space should be collapsed outside of
Solr/Lucene anyway, because otherwise it ends up in your stored field. The index will be
slightly bigger, but not by much, thanks to compression.
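
A minimal sketch of collapsing white space on the client side before the document is sent to
Solr. The function name collapse_whitespace is hypothetical, and Python's str.split() is used
here only as a stand-in to illustrate whitespace splitting; it is not Lucene's tokenizer.

```python
import re

# Matches any run of one or more whitespace characters.
_WS_RUN = re.compile(r"\s+")


def collapse_whitespace(text: str) -> str:
    """Replace each whitespace run with one space and trim both ends.

    Hypothetical helper, meant to run on field values before indexing.
    """
    return _WS_RUN.sub(" ", text).strip()


first = "This is a sentence."
second = "This       is         a                          sentence."

# Splitting on whitespace gives the same tokens for both inputs,
# so only the stored (raw) value differs between them.
same_tokens = first.split() == second.split()

# After collapsing, the raw values are identical as well.
same_stored = collapse_whitespace(second) == first

print(same_tokens, same_stored)
```

Doing this before indexing keeps the stored value compact, whatever the tokenizer does with
the extra spaces.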
 
-----Original message-----
> From:Furkan KAMACI <furkankamaci@gmail.com>
> Sent: Tuesday 8th October 2013 16:21
> To: solr-user@lucene.apache.org
> Subject: Effect of multiple white space at WhiteSpaceTokenizer
> 
> I use Solr 4.5 and I have a WhiteSpaceTokenizer in my schema. What is the
> difference (index size and performance) between these two sentences:
> 
> First one: This is a sentence.
> Second one: This       is         a                          sentence.
> 
