Simply put, multi-word synonyms are recommended for use at index time rather than at query time.
As explained here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
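
As a rough sketch of what that could look like for your "schools" type (not tested against your setup): move the SynonymFilterFactory from the query analyzer to the index analyzer, and add "ifd" as an extra token next to each school name. The explicit mappings in synonyms.txt below are my assumption about what you want, and the documents have to be reindexed after the change.

```xml
<fieldType name="schools" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s|,|-" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- synonyms are now applied while indexing, after lowercasing -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s|,|-" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no synonym filter at query time -->
  </analyzer>
</fieldType>
```

synonyms.txt (assumed entries, one explicit mapping per school, each keeping the original words and adding "ifd"):

```text
high school lissabon => high school lissabon, ifd
high school barcelona => high school barcelona, ifd
university of applied science => university of applied science, ifd
```

With this in place, a query for IFD is lowercased to "ifd" and should match the extra token indexed with each of the three schools, so no multi-word expansion is needed at query time.
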
--- On Wed, 9/7/11, roySolr <royrutten1989@gmail.com> wrote:
> From: roySolr <royrutten1989@gmail.com>
> Subject: Synonyms problem
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 7, 2011, 1:46 PM
> Hello,
>
> I have some problems with synonyms. I will show some examples to describe
> the problem:
>
> Data:
>
> High school Lissabon
> High school Barcelona
> University of applied science
>
> When a user searches for IFD, I want all of these results back, so I want
> to use these synonyms at query time:
>
> IFD => high school lissabon, high school barcelona, University of applied science
>
>
> The data is stored in the field "schools".
>
> Schools type looks like this:
>
> <fieldType name="schools" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="\s|,|-" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="\s|,|-" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>   </analyzer>
> </fieldType>
>
>
> As you can see, I use a pattern tokenizer that splits on whitespace. When
> I use the synonyms at query time, the analysis page shows me this:
>
> high       | school | lissabon  | science
> high       | school | barcelona |
> university | of     | applied   |
>
> When I search for IFD, I get no results. I found this in debugQuery:
>
> schools:"(high high university) (school school of) (lissaban barcelona applied) (science)"
>
> With this I can see the problem: Solr tries a lot of combinations, but not
> the right one. I thought I could escape the whitespace in the synonyms
> (High\ school\ Lissabon). Then the analysis page shows me better results:
>
> High school Lissabon
> High school Barcelona
> University of applied science
>
> Then Solr searches for "high school Lissabon" as a single token, but in my
> index the text is tokenized on whitespace, so there are still no results.
>
>
> I'm stuck. Can someone help me?
>
> Thanks
> R
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Synonyms-problem-tp3316287p3316287.html
> Sent from the Solr - User mailing list archive at
> Nabble.com.
>