lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: Stemming and other tokenizers
Date Mon, 12 Sep 2011 08:53:29 GMT
Hi

Everybody else uses a dedicated field per language, so why can't you?
Please explain your use case, and perhaps we can better understand what you're trying
to do and help.
Do you always know the query language in advance?
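
For illustration, here is a minimal sketch of the field-per-language idea done on the client side, before documents reach Solr. It is only an example under stated assumptions, not part of SOLR-1979: the field names (text_en, text_fr, text_de), the set of supported languages, the fallback to "en" for untagged or unknown input, and all helper names are hypothetical. Only the #en# tag format and the "language" field come from this thread.

```python
import re

# Hypothetical: languages for which the schema defines a text_<lang> field,
# each with its own fieldType (and therefore its own stemmer).
SUPPORTED_LANGS = {"en", "fr", "de"}

# Matches a leading tag such as "#en#" (tag format taken from the original question).
TAG_RE = re.compile(r"^#([a-z]{2})#\s*")


def split_language_tag(raw, default_lang="en"):
    """Return (lang, text) with the leading language tag removed.

    Untagged input and unsupported languages fall back to default_lang
    (an assumption made for this sketch).
    """
    match = TAG_RE.match(raw)
    if not match:
        return default_lang, raw
    lang = match.group(1)
    text = raw[match.end():]
    if lang not in SUPPORTED_LANGS:
        lang = default_lang
    return lang, text


def build_doc(doc_id, raw):
    """Build a document dict that routes the text to a per-language field.

    The tag never reaches Solr, so it is neither indexed nor stored.
    """
    lang, text = split_language_tag(raw)
    return {"id": doc_id, "language": lang, "text_" + lang: text}


def query_fields(lang=None):
    """Fields to search: one field if the query language is known, else all."""
    if lang in SUPPORTED_LANGS:
        return ["text_" + lang]
    return sorted("text_" + code for code in SUPPORTED_LANGS)
```

With this, build_doc("1", "#fr# bonjour") yields a document whose text sits in text_fr, and query_fields() shows why the query language matters: when it is unknown, every language field has to be searched.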

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 12. sep. 2011, at 08:28, Patrick Sauts wrote:

> I can't create one field per language; that is the problem. But I'll dig into
> it following your suggestions.
> I'll let you know what I come up with.
> 
> Patrick.
> 
> 2011/9/11 Jan Høydahl <jan.asf@cominvent.com>
> 
>> Hi,
>> 
>> You'll not be able to detect language and change stemmer on the same field
>> in one go. You need to create one fieldType in your schema per language you
>> want to use, and then use LanguageIdentification (SOLR-1979) to do the magic
>> of detecting language and renaming the field. If you set
>> langid.override=false, langid.map=true and populate your "language" field
>> with the known language, you will probably get the desired effect.
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> Solr Training - www.solrtraining.com
>> 
>> On 10. sep. 2011, at 03:24, Patrick Sauts wrote:
>> 
>>> Hello,
>>> 
>>> 
>>> 
>>> I want to implement some kind of auto-stemming that will detect the language
>>> of a field based on a tag at the start of the field, like #en#. My field is
>>> stored on disk, but I don't want this tag to be stored. Is there a way to
>>> avoid storing it?
>>> 
>>> As I understand it, filters and tokenizers only act on the indexed
>>> value of a field, not on the stored one.
>>> 
>>> Am I wrong?
>>> 
>>> Is it possible to write such a filter?
>>> 
>>> 
>>> 
>>> Patrick.
>>> 
>> 
>> 

