lucene-solr-user mailing list archives

From Rajani Maski <rajinima...@gmail.com>
Subject Re: Solr Synonyms, Escape space in case of multi words
Date Thu, 16 Oct 2014 06:26:20 GMT
Hi David,

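  I think you should specify the tokenizer on the synonym filter itself,
using the tokenizerFactory attribute, as shown below. As far as I know, when
that attribute is missing, the entries in synonyms.txt are parsed with a
whitespace tokenizer by default, which is why "ride care" ends up as two
tokens.

  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"
          tokenizerFactory="solr.KeywordTokenizerFactory"/>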



So your field type should be as shown below:

<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
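
To illustrate why the tokenizer on the filter matters, here is a rough toy model in Python. It is not Solr's actual code; the function names (split_unescaped, unescape, parse_line) are made up for this sketch, and it assumes only that each synonyms.txt line is split on commas, backslash escapes are dropped, and each entry is then run through whichever tokenizer the filter is configured with.

```python
# Toy model of synonym-line parsing (an illustration, not Solr's implementation).

def split_unescaped(line, sep=","):
    """Split a line on sep, treating a backslash plus the next char as one unit."""
    parts, buf, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            buf.extend([ch, line[i + 1]])  # keep the escape pair together
            i += 2
            continue
        if ch == sep:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    parts.append("".join(buf))
    return [p.strip() for p in parts]


def unescape(entry):
    """Drop each backslash and keep the character that follows it."""
    out, i = [], 0
    while i < len(entry):
        if entry[i] == "\\" and i + 1 < len(entry):
            out.append(entry[i + 1])
            i += 2
        else:
            out.append(entry[i])
            i += 1
    return "".join(out)


def whitespace_tokenize(text):
    """Stand-in for a whitespace tokenizer: one token per space-separated word."""
    return text.split()


def keyword_tokenize(text):
    """Stand-in for a keyword tokenizer: the whole entry is a single token."""
    return [text]


def parse_line(line, tokenize):
    """Return the token list produced for each entry on one synonyms line."""
    return [tokenize(unescape(entry)) for entry in split_unescaped(line)]


LINE = r"ridemakers, ride makers, ridemakerz, ride makerz, ride\mark, ride\ care"

if __name__ == "__main__":
    print(parse_line(LINE, whitespace_tokenize))  # last entry: ['ride', 'care']
    print(parse_line(LINE, keyword_tokenize))     # last entry: ['ride care']
```

In this model the escaped space survives unescaping as a plain space, so a whitespace tokenizer still splits "ride care" in two, while a keyword tokenizer keeps it as one token. That matches what the analysis page showed.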


On Wed, Oct 15, 2014 at 7:25 PM, David Philip <davidphilipsheron@gmail.com>
wrote:

> Sorry, the analysis page clip is getting trimmed off and hence the
> indentation is lost.
>
> Here it is :
>
> ridemakers | ride | ridemakerz | ride | ridemark | ride | makers | makerz | care
>
> expected:
>
> ridemakers | ride | ridemakerz | ride | ridemark | ride | makers | makerz | *ride care*
>
>
>
> On Wed, Oct 15, 2014 at 7:21 PM, David Philip <davidphilipsheron@gmail.com>
> wrote:
>
> > contd..
> >
> > The expectation was that "ride care" should not have been split into two
> > tokens.
> >
> > It should have been as below. Please correct me or point out where I am
> > wrong.
> >
> >
> > Input: ridemakers, ride makers, ridemakerz, ride makerz, ride\mark, ride\ care
> >
> > o/p
> >
> > ridemakers | ride | ridemakerz | ride | ridemark | ride | makers | makerz | *ride care*
> >
> >
> >
> >
> > On Wed, Oct 15, 2014 at 7:16 PM, David Philip
> > <davidphilipsheron@gmail.com> wrote:
> >
> >> Hi All,
> >>
> >>    I remember using multi-word synonyms in Solr 3.x. For multi-word
> >> entries, I escaped the space with a backslash [\] and it worked as
> >> intended. Ex: ride\ makers, riders, rider\ guards. Each one mapped to
> >> the others, so when I searched for ride makers, I got the search
> >> results for all of them. The field type was the same as below. I have
> >> the same setup in Solr 4.10, but now the escaped space in multi-word
> >> entries is being ignored. It is tokenizing on spaces.
> >>
> >>  synonyms.txt
> >>     ridemakers, ride makers, ridemakerz, ride makerz, ride\mark, ride\ care
> >>
> >>
> >> Analysis page:
> >>
> >> ridemakers | ride | ridemakerz | ride | ridemark | ride | makers | makerz | care
> >>
> >> Field Type
> >>
> >>     <fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
> >>       <analyzer>
> >>         <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >>                 ignoreCase="true" expand="true"/>
> >>       </analyzer>
> >>     </fieldType>
> >>
> >>
> >>
> >> Could you please tell me what could be the issue? How do I handle
> >> multi-word cases?
> >>
> >>
> >> Thanks - David
> >>
> >>
> >>
> >
> >
>
