lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: Stemming and other tokenizers
Date Sun, 11 Sep 2011 22:09:05 GMT
Hi,

You won't be able to detect the language and switch stemmers on the same field in one go. You
need to create one fieldType in your schema per language you want to support, and then use
LanguageIdentification (SOLR-1979) to do the magic of detecting the language and renaming the
field. If you set langid.overwrite=false and langid.map=true, and populate your "language" field
with the known language, you will probably get the desired effect.
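
To illustrate what "populate the language field and rename the field" amounts to, here is a rough
client-side sketch in Python, run before the document is sent to Solr. It is not the SOLR-1979
code. The #en# tag format comes from the question below; the function name route_by_language, the
field name "text" and the text_<lang> naming scheme are assumptions made up for this example.

```python
import re

# Assumed tag format from the question: two lowercase letters between '#', e.g. "#en# ..."
TAG = re.compile(r"^#([a-z]{2})#\s*")


def route_by_language(doc, field="text", lang_field="language"):
    """Return a copy of doc with the leading tag stripped from `field`,
    the language code stored in `lang_field`, and the remaining text moved
    to a per-language field such as text_en (hypothetical naming scheme)."""
    out = dict(doc)
    value = out.get(field, "")
    match = TAG.match(value)
    if match is None:
        return out  # no tag found: leave the document unchanged
    lang = match.group(1)
    out.pop(field)
    out[lang_field] = lang
    out[f"{field}_{lang}"] = value[match.end():]  # text without the tag
    return out


doc = route_by_language({"id": "1", "text": "#en# my field"})
```

Since the tag is removed before indexing, it never reaches the stored value, and each text_<lang>
field can be bound to the fieldType with the matching stemmer.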

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 10. sep. 2011, at 03:24, Patrick Sauts wrote:

> Hello,
> 
> 
> 
> I want to implement some kind of AutoStemming that will detect the language
> of a field based on a tag at the start of the field, like #en#. My field is
> stored on disk, but I don't want this tag to be stored. Is there a way to
> keep it from being stored?
> 
> As far as I understand, all the filters and tokenizers act only on the
> indexed field and not on the stored one.
> 
> Am I wrong?
> 
> Is it possible to write such a filter?
> 
> 
> 
> Patrick.
> 

