lucene-solr-user mailing list archives

From "d.kumar@technisat.de" <d.ku...@technisat.de>
Subject Re: AW: plus sign in request / looking for + in title
Date Fri, 04 Aug 2017 18:01:57 GMT
Hey,

that is a good point. What is the best way to do the filtering? As for the plus sign in the
request, we URL-encode the whole request.
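
For illustration, a minimal Python sketch of the two layers involved when a literal plus sign has to survive the trip: the query parser (where "+" is an operator and can be backslash-escaped) and URL encoding (where a bare "+" decodes to a space). The field name "title", the sample term and the helper escape_plus are assumptions made for the example, not taken from the actual setup, and only "+" is escaped here although the query parser has other special characters as well.

```python
from urllib.parse import quote, unquote_plus


def escape_plus(term: str) -> str:
    """Backslash-escape '+' so a Lucene-style query parser reads it literally.

    Hypothetical helper: only '+' is handled, other special characters are not.
    """
    return term.replace("+", "\\+")


raw_term = "C+"                          # hypothetical title to search for
query = "title:" + escape_plus(raw_term)  # field name "title" is an assumption

# Percent-encode the whole parameter value; safe="" leaves nothing unencoded,
# so '+' becomes %2B and the backslash becomes %5C.
encoded = quote(query, safe="")

# What a form-style decoder on the server side would recover.
decoded = unquote_plus(encoded)

# For comparison: a '+' sent without encoding is decoded as a space.
unencoded_result = unquote_plus("title:C+")

if __name__ == "__main__":
    print(encoded)
    print(decoded)
    print(repr(unencoded_result))
```

The order matters in this sketch: escape for the parser first, then URL-encode the result, so that the backslash and the plus sign both arrive intact.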



Thanks
David


 

> On 04.08.2017 at 17:34, Erick Erickson <erickerickson@gmail.com> wrote:
> 
> Glad to hear it. Two things:
> 
> 1> you might have to do some additional filtering when using
> WhitespaceTokenizer. It, well, splits on whitespace so things like
> punctuation will come through as part of the token. So "My dog has
> fleas." (note the period after fleas) would have the period included
> in the token "fleas.".
> 
> 2> getting the plus sign through URL encoding and the parser may be
> fun, you may have to escape it to keep it from being interpreted as an
> operator....
> 
> Best,
> Erick
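
To make point 1 above concrete, here is a small Python sketch that only imitates the behaviour described: splitting on whitespace keeps the period attached to "fleas.", and one possible extra filtering step strips punctuation from the edges of each token while leaving "+" alone. It is a simulation, not Solr code; the function names are hypothetical, and in a real analysis chain this step would be a filter placed after the tokenizer (a pattern-replace filter is one candidate).

```python
import re


def whitespace_tokenize(text: str) -> list[str]:
    """Split on runs of whitespace only, so punctuation stays in the token."""
    return text.split()


# Leading or trailing characters that are neither word characters nor '+'.
_EDGE_PUNCT = re.compile(r"^[^\w+]+|[^\w+]+$")


def strip_edge_punctuation(token: str) -> str:
    """Hypothetical filter step: trim punctuation at token edges, keep '+'."""
    return _EDGE_PUNCT.sub("", token)


def analyze(text: str) -> list[str]:
    """Tokenize on whitespace, filter each token, drop tokens left empty."""
    filtered = (strip_edge_punctuation(t) for t in whitespace_tokenize(text))
    return [t for t in filtered if t]


if __name__ == "__main__":
    print(whitespace_tokenize("My dog has fleas."))
    print(analyze("My dog has fleas."))
    print(analyze("A+ rating!"))
```

Whatever filtering is chosen would normally be applied at both index and query time, so that the tokens produced on each side still match.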
> 
> On Fri, Aug 4, 2017 at 5:55 AM, d.kumar@technisat.de
> <d.kumar@technisat.de> wrote:
>> Hey, thanks.
>> 
>> Yeah, I found a way.
>> I used my own fieldtype for these fields. In it I'm using the WhitespaceTokenizerFactory for query and index, and now everything works as it should.
>> 
>> :-)
>> 
>> Thanks
>> 
>> David
>> 
>> -----Original Message-----
>> From: Shawn Heisey [mailto:apache@elyograg.org]
>> Sent: Friday, 4 August 2017 14:53
>> To: solr-user@lucene.apache.org
>> Subject: Re: AW: plus sign in request / looking for + in title
>> 
>>> On 8/4/2017 2:15 AM, d.kumar@technisat.de wrote:
>>> So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus sign? Any suggestions?
>> 
>> You can't.  The standard tokenizer really isn't configurable at all.
>> 
>> You'd need to change your analysis chain (tokenizer and filters) to produce the results you want.
>> 
>> Thanks,
>> Shawn
>> 
