lucene-solr-user mailing list archives

From Alireza Salimi <>
Subject Re: Removing whitespace
Date Mon, 12 Dec 2011 22:07:59 GMT
That sounds like a strange requirement, but I think you can use CharFilters
instead of implementing your own Tokenizer.
Take a look at this section; maybe it helps.
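
To make the idea concrete, here is a rough sketch in plain Java (no Lucene or Solr classes) of what the analysis chain would need to do: strip whitespace and special characters first, then cut the result into N-grams. The class name, the method names, and the gram sizes 3 to 4 are all made up for illustration. In Solr, I would expect a pattern-replacing CharFilter in front of the tokenizer to do the job of normalize(), and an N-gram filter to do the job of nGrams(), but treat that mapping as an assumption and check it against your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene/Solr: it only illustrates the
// "strip characters first, then build N-grams" idea from this thread.
public class WhitespaceFreeNGrams {

    // Plays the role a character filter would play before tokenizing:
    // drop everything that is not a letter or digit, then lowercase.
    public static String normalize(String input) {
        return input.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
    }

    // Returns all substrings whose length is between minGram and maxGram,
    // shortest grams first, left to right within each size.
    public static List<String> nGrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Example names taken from the original question.
        for (String name : new String[] {"Bob Dole", "Bo B. Dole", "Bobdo"}) {
            String normalized = normalize(name);
            System.out.println(name + " -> " + normalized
                    + " -> " + nGrams(normalized, 3, 4));
        }
    }
}
```

With this normalization, "Bob Dole" and "Bo B. Dole" both collapse to "bobdole", so a query for "bobdole" that goes through the same steps shares every gram with them, and it still shares some grams (such as "bob") with "Bobdo".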


On Mon, Dec 12, 2011 at 4:51 PM, Devon Baumgarten <> wrote:

> Hello,
> I am having trouble finding how to remove/ignore whitespace when indexing.
> The only answer I have found suggested that it is necessary to write my own
> tokenizer. Is this true? I want to remove whitespace and special characters
> from the phrase and create N-grams from the result.
> Ultimately, the effect I am after is that searching "bobdole" would match
> "Bob Dole", "Bo B. Dole", and maybe "Bobdo". Maybe there is a better way...
> can anyone lend some assistance?
> Thanks!
> Dev B

Alireza Salimi
Java EE Developer
