lucene-solr-user mailing list archives

From mohanmca01 <mohanmc...@gmail.com>
Subject RE: Arabic words search in solr
Date Wed, 02 Aug 2017 12:57:46 GMT
Hi Phil Scadden,

Thank you for your reply.

We tried your suggestion of removing the hyphen at index time, but the
results are still wrong. I searched for "شرطة ازكي" and got the result I am
looking for, plus irrelevant results that contain only the first or only the
second word of the query.

First word: شرطة 
Second Word: ازكي

Results that we are getting:

{
  "responseHeader": {
    "status": 0,
    "QTime": 3,
    "params": {
      "indent": "true",
      "q": "bizNameAr:(شرطة ازكي)",
      "_": "1501678260335",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 444,
    "start": 0,
    "docs": [
      {
        "id": "28107",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية  -  - مركز شرطة إزكي",
        "_version_": 1574621132849414100
      },
      {
        "id": "13937",
        "bizNameAr": "مؤسسةا الازكي للتجارة والمقاولات",
        "_version_": 1574621132197200000
      },
      {
        "id": "15914",
        "bizNameAr": "العلوي والازكي المتحدة ش.م.م",
        "_version_": 1574621132344000500
      },
      {
        "id": "20639",
        "bizNameAr": "سحائب ازكي للتجارة",
        "_version_": 1574621132574687200
      },
      {
        "id": "25108",
        "bizNameAr": "المستشفيات -  - مستشفى إزكي",
        "_version_": 1574621132737216500
      },
      {
        "id": "27629",
        "bizNameAr": "وزارة الداخلية -  -  - والي إزكي -",
        "_version_": 1574621132833685500
      },
      {
        "id": "36351",
        "bizNameAr": "طوارئ الكهرباء - إزكي",
        "_version_": 1574621133183910000
      },
      {
        "id": "61235",
        "bizNameAr": "اضواء ازكي للتجارة",
        "_version_": 1574621133785792500
      },
      {
        "id": "66821",
        "bizNameAr": "أطلال إزكي للتجارة",
        "_version_": 1574621133915816000
      },
      {
        "id": "67011",
        "bizNameAr": "بنك ظفار - فرع ازكي",
        "_version_": 1574621133920010200
      }
    ]
  }
}

We actually expect only the result below, since it contains both of the
words that we typed in the search:

      {
        "id": "28107",
        "bizNameAr": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية  -  - مركز شرطة إزكي",
        "_version_": 1574621132849414100
      },
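
To make the expectation concrete, here is a minimal Python sketch of the difference between "any word matches" (what we are getting) and "all words must match" (what we want). It is only an illustration and does not reproduce Solr's analysis chain: the `tokens` and `search` helpers are hypothetical, the alef folding and the stripping of a leading "ال" are crude stand-ins for the normalization and stemming filters, and `DOCS` holds just four of the names from the response above.

```python
# Illustrative only: a crude stand-in for the text_ar analysis chain.
# Fold the hamza/madda forms of alef to a bare alef.
ALEF_VARIANTS = str.maketrans({"أ": "ا", "إ": "ا", "آ": "ا"})


def tokens(text):
    """Split on whitespace, fold alef variants, drop a leading 'ال'."""
    out = set()
    for raw in text.split():
        tok = raw.translate(ALEF_VARIANTS)
        if tok.startswith("ال") and len(tok) > 3:
            tok = tok[2:]
        out.add(tok)
    return out


# A small subset of the bizNameAr values from the response above.
DOCS = {
    "28107": "شرطة عمان السلطانية - قيادة شرطة محافظة الداخلية - - مركز شرطة إزكي",
    "13937": "مؤسسةا الازكي للتجارة والمقاولات",
    "20639": "سحائب ازكي للتجارة",
    "67011": "بنك ظفار - فرع ازكي",
}


def search(query, require_all):
    """Return ids whose name has all query terms (AND) or any of them (OR)."""
    terms = tokens(query)
    match = all if require_all else any
    return sorted(
        doc_id
        for doc_id, name in DOCS.items()
        if match(term in tokens(name) for term in terms)
    )


any_word = search("شرطة ازكي", require_all=False)   # behaviour we see
all_words = search("شرطة ازكي", require_all=True)   # behaviour we expect
print(any_word)
print(all_words)
```

Under these assumptions, the "any word" search returns all four sample documents, while the "all words" search returns only document 28107, which is the behaviour we are after.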


Configuration:

In schema.xml we configured the field and its type as follows:

    <field name="bizNameAr" type="text_ar" indexed="true" stored="true"/>


    <fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
        <filter class="solr.ArabicNormalizationFilterFactory"/>
        <filter class="solr.ArabicStemFilterFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
        <filter class="solr.HyphenatedWordsFilterFactory"/>
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="ى" replacement="ئ"/>
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="ء" replacement=""/>
      </analyzer>
    </fieldType>


Thanks,




