lucene-solr-user mailing list archives

From Tulsi Das <tulsi.das1...@gmail.com>
Subject Re: Why do I get different results for the same query with two Solr versions?
Date Thu, 24 Dec 2020 19:27:55 GMT
Hi,
Try adding debug=true or debug=query to the request URL and look at the
parsed query near the end of the response.
Comparing the parsed query from both versions should show why the results
differ.
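
A minimal sketch of building such a request URL, to run against both Solr versions. The host, port, and core name ("mycore") are placeholders and not from this thread, and the q syntax text1:KI_7 is an assumption about how the query is actually sent:

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query, debug="query"):
    """Return a /select URL that asks Solr to include query debug output."""
    params = {
        "q": query,      # the user query, e.g. a field:value clause
        "debug": debug,  # "query" limits the debug section to query parsing
        "wt": "json",    # JSON response format
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and core name; replace with your own.
url = build_debug_url("http://localhost:8983/solr/mycore", "text1:KI_7")
print(url)
```

In the response, the "parsedquery" entry of the debug section is the one to compare between the two versions.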


On Thu, 24 Dec, 2020, 8:05 pm nettadalet, <nsteinberg@dalet.com> wrote:

> Hello,
>
> I have the same field type defined in Solr 4.6 and Solr 7.5. When I
> search with both versions, I get different results, and I don't know why.
>
> I have the following *field type definition in Solr 4.6*:
> <fieldType name="text_type1" class="solr.TextField" positionIncrementGap="1000">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
>
> I have the following *field type definition in Solr 7.5*:
> <fieldType name="text_type1" class="solr.TextField" positionIncrementGap="1000">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.FlattenGraphFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> * I tried to use solr.WordDelimiterFilterFactory with Solr 7.5 instead of
> solr.WordDelimiterGraphFilterFactory so the field types will be more alike,
> but the result was the same.
>
> I have the following *6 values set for field text1 of type text_type1 for 6
> different documents* (the type(s) from above):
> KI_d5e7b43a
> KI_b7c490bd
> KI_7df2f026
> KI_fa7d129d
> KI_5867aec7
> KI_7c3c0b93
>
>
> My query is *text1=KI_7*.
> Using Solr 4.6, I get 2 results: KI_7df2f026 and KI_7c3c0b93.
> Using Solr 7.5, I get all 6 results.
>
> Questions:
> 1. How come I get different results with the same data, when my field
> definitions are the same (as far as I can tell)?
>
> 2. What are the expected results?
> I think that the results Solr 7.5 returns are the correct ones, since at
> the end of the analysis I get *KI* as a term and *7* as a term, both
> during the indexing analysis and the query analysis, so, to my
> understanding, all 6 results should be found.
> Is this correct? If not, what am I missing? What don't I understand
> correctly?
>
> I would very much appreciate a full/partial answer, but even a link that
> could explain at least the expected results part would be great.
>
> Thanks in advance, I know this might be a tough one to answer [Hope not
> :)]
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
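
The term-level claim in the quoted message can be checked with a rough sketch. The splitter below is a simplified approximation written for illustration, not the Lucene word-delimiter implementation: it splits on non-alphanumeric characters and letter/digit boundaries, lowercases, and ignores stop words, synonyms, and catenated tokens. The helper names are hypothetical.

```python
import re


def split_word_delimiter(token):
    """Approximate word-delimiter splitting followed by lowercasing."""
    return [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", token)]


def contains_all(terms, tokens):
    """True if every query term occurs somewhere in the token list."""
    return all(term in tokens for term in terms)


def contains_adjacent(terms, tokens):
    """True if the query terms occur consecutively, in order (phrase-like)."""
    n = len(terms)
    return any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))


values = [
    "KI_d5e7b43a", "KI_b7c490bd", "KI_7df2f026",
    "KI_fa7d129d", "KI_5867aec7", "KI_7c3c0b93",
]
query_terms = split_word_delimiter("KI_7")  # ['ki', '7']

all_match = [v for v in values
             if contains_all(query_terms, split_word_delimiter(v))]
phrase_match = [v for v in values
                if contains_adjacent(query_terms, split_word_delimiter(v))]

print(len(all_match))  # 6
print(phrase_match)    # ['KI_7df2f026', 'KI_7c3c0b93']
```

Under these assumptions, requiring both terms anywhere matches all 6 values, while requiring them to be adjacent matches only the 2 values reported for Solr 4.6. That is consistent with, but does not prove, the two versions building a different kind of query from the same analyzed terms; the parsed query in the debug output is what would confirm it.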
