lucene-dev mailing list archives

From Mirko Mancin <mirko.man...@t-frutta.it>
Subject Problem with NGram
Date Wed, 01 Apr 2015 12:37:27 GMT
Hi,

I have a problem with n-grams. I am trying to find the word "PRINTER".

I have these fields:


<field name="bestExternalDescriptionStandard" type="text_general" indexed="true" stored="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/>

<field name="bestExternalDescriptionGram" type="text_ngram" indexed="true" stored="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/>




<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldType>


<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldType>



Searching for "PRINTER" correctly finds:

"BROTHER PRINTER", "SAMSUNG PRINTER", etc.

But if I search for "PRIN3R" (with an error inside the string), Solr does not return anything!
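
To make the expectation concrete, here is a minimal sketch in Python (not Solr code) of the grams the two strings should share. It assumes the tokenizer emits plain character n-grams of length 2 to 4 and that lowercasing is the only other change; the Snowball filter is ignored, and the helper name `char_ngrams` is just for illustration.

```python
def char_ngrams(text, min_gram=2, max_gram=4):
    """Return the set of lowercased character n-grams of text.

    Illustrative stand-in for minGramSize=2 / maxGramSize=4;
    it is not the Solr implementation.
    """
    text = text.lower()
    grams = set()
    for n in range(min_gram, max_gram + 1):
        for i in range(len(text) - n + 1):
            grams.add(text[i:i + n])
    return grams


indexed = char_ngrams("PRINTER")   # grams of the indexed word
query = char_ngrams("PRIN3R")      # grams of the misspelled query

shared = sorted(indexed & query)        # grams present in both
only_in_query = sorted(query - indexed) # grams introduced by the typo

print(shared)
print(only_in_query)
```

Under these assumptions the two strings share six grams (`in`, `pr`, `pri`, `prin`, `ri`, `rin`), while the grams containing `3` exist only in the query. So some overlap is there, and whether it produces a hit presumably depends on how the query-side grams are combined, which is the part I am unsure about.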

How can I do this? How should I set up my schema.xml to find documents with a certain similarity?

Thanks


Mirko Mancin

Software Developer

Ubiq srl
stradello Conrad Marca-Relli, 9
43122 Parma (PR)
t. +39 0521 781601
cell. +39 346 4137577
Follow us on LinkedIn: https://www.linkedin.com/company/ubiq-srl

