lucene-solr-user mailing list archives

From Clemens Wyss DEV <clemens...@mysign.ch>
Subject RE: indexing two words, searching single word
Date Fri, 03 Aug 2018 11:19:28 GMT
Hi Markus,
Thanks for the quick answer.

"sound stage" was just an example. We are looking for a generic solution ...

Is it "ok" to apply an NGramFilter at query-analysis time?
<analyzer type="query">
	<tokenizer class="solr.WhitespaceTokenizerFactory" />
	<filter class="solr.LowerCaseFilterFactory" />
	<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15" />
</analyzer>

I guess that, besides the performance impact, this reduces the accuracy of the search results?
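
For comparison, an alternative that is sometimes used for this kind of problem is to index concatenated word pairs rather than n-gramming the query. This is only a sketch and was not proposed in this thread. It assumes the same tokenizer and lowercase filter as above and uses solr.ShingleFilterFactory with an empty tokenSeparator:

```xml
<!-- Sketch only: index-time shingles joined without a separator -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory" />
  <filter class="solr.LowerCaseFilterFactory" />
  <!-- "sound stage" yields sound, stage and the joined pair soundstage -->
  <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="2"
          outputUnigrams="true" tokenSeparator="" />
</analyzer>
```

Because only adjacent word pairs are joined, a query for "soundstage" matches whole tokens rather than arbitrary character fragments, at the cost of a larger index.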

-Clemens

-----Original Message-----
From: Markus Jelsma <markus.jelsma@openindex.io>
Sent: Friday, 3 August 2018 12:43
To: solr-user@lucene.apache.org
Subject: RE: indexing two words, searching single word

Hello,

If your content is English, you could use synonyms to work around the problem, since the
language has only a few compound words. However, if you are dealing with a Germanic compounding
language, the HyphenationCompoundWordTokenFilter [1] or DictionaryCompoundWordTokenFilter is
a better choice. The former is much more flexible but has its drawbacks.
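
A minimal sketch of the synonym approach for the English case could look as follows. The field type name text_compound, the file name synonyms.txt and its contents are illustrative assumptions and are not taken from this thread:

```xml
<!-- Hypothetical field type; the name and file name are assumptions -->
<fieldType name="text_compound" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- synonyms.txt would contain a line such as: soundstage, sound stage -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true" />
  </analyzer>
</fieldType>
```

With such a rule, a query for "soundstage" would also be expanded to "sound stage". Multi-word synonyms at query time generally require that the query parser does not split the query on whitespace before analysis (the sow parameter), so that setting is worth checking.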

Regards,
Markus

[1] https://lucene.apache.org/core/7_4_0/analyzers-common/org/apache/lucene/analysis/compound/HyphenationCompoundWordTokenFilterFactory.html

 
 
-----Original message-----
> From: Clemens Wyss DEV <clemensdev@mysign.ch>
> Sent: Friday 3rd August 2018 12:22
> To: solr-user@lucene.apache.org
> Subject: indexing two words, searching single word
> 
> Sounds like a rather simple issue:
> if I index "sound stage" and search for "soundstage" I get no hits
> 
> What am I doing wrong 
> a) when indexing
> b) when searching
> ?
> 
> Thx in advance
> - Clemens
> 