lucene-solr-user mailing list archives

From Alireza Salimi <alireza.sal...@gmail.com>
Subject Re: Removing whitespace
Date Mon, 12 Dec 2011 22:07:59 GMT
That sounds like a strange requirement, but I think you can use CharFilters
instead of implementing your own Tokenizer.
Take a look at this section; it may help.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#CharFilterFactories
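
To make the idea concrete, here is a rough sketch in plain Python (not Solr
configuration) of what such a chain might do: a char-filter-like step strips
whitespace and special characters before tokenizing, and the whole cleaned
string is then split into N-grams. The function names and the gram sizes
(2 to 15) are assumptions made for illustration only; they are not Solr APIs
or defaults.

```python
import re


def normalize(text):
    """Mimic a char filter: drop everything except ASCII letters and digits, then lowercase."""
    return re.sub(r"[^A-Za-z0-9]", "", text).lower()


def ngrams(term, min_size=2, max_size=15):
    """Return every substring of `term` with a length between min_size and max_size."""
    return {
        term[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(term) - n + 1)
    }


def index_terms(text):
    """Index-time analysis (hypothetical): normalize, then emit N-grams of the result."""
    return ngrams(normalize(text))


def matches(query, indexed_text):
    """Query-time analysis (hypothetical): normalize only, then look for that exact term."""
    return normalize(query) in index_terms(indexed_text)


for name in ("Bob Dole", "Bo B. Dole", "Bobdo"):
    print(name, matches("bobdole", name))
```

With these assumptions, "bobdole" matches "Bob Dole" and "Bo B. Dole" because
both collapse to the same string. It does not match "Bobdo", since the query
is longer than anything indexed for that value; catching that case would
probably require N-gramming the query as well and accepting partial overlap.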




On Mon, Dec 12, 2011 at 4:51 PM, Devon Baumgarten <
dbaumgarten@nationalcorp.com> wrote:

> Hello,
>
> I am having trouble finding how to remove/ignore whitespace when indexing.
> The only answer I have found suggested that it is necessary to write my own
> tokenizer. Is this true? I want to remove whitespace and special characters
> from the phrase and create N-grams from the result.
>
> Ultimately, the effect I am after is that searching "bobdole" would match
> "Bob Dole", "Bo B. Dole", and maybe "Bobdo". Maybe there is a better way...
> can anyone lend some assistance?
>
> Thanks!
>
> Dev B
>
>


-- 
Alireza Salimi
Java EE Developer
