lucene-solr-user mailing list archives

From meghana <meghana.rav...@amultek.com>
Subject Re: Issue with fuzzy search in Distributed Search
Date Wed, 01 May 2013 06:05:59 GMT
To make sure all the records exist on a single node, I restricted the query
to a specific duration, so the results for the shards core and the simple
core should be the same.

As you suggested, I analyzed the debugQuery output for one specific search,
*text:worde~1*. The record that the shards core returns has highlights such
as *word*, *words*, and *word!n*. But in the debugQuery output, only
*word!n* is processed; the other highlighted terms (*words*, *word*) are
not, even though they appear in the highlighting for that record. As a
result, the shards core does not return other records that contain *word*
or *words* but not *word!n*.

The simple core, on the other hand, processes all of *word*, *words*, and
*word!n* and returns the proper results. This seems like very weird
behavior. Any suggestions?
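
For anyone who wants to repeat the per-core check described above, here is a minimal sketch in Python that builds one non-distributed debugQuery URL per core. The core URLs are the ones from the shards parameter in this thread. The function names `debug_url` and `fetch_debug` are hypothetical, and the sketch assumes that the cores accept `wt=json` and return the debug information under a top-level `debug` key, so treat it as a starting point rather than a definitive tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Core URLs taken from the shards parameter quoted in this thread.
CORES = [
    "http://192.168.1.91:8080/core0",
    "http://192.168.1.91:8080/core1",
    "http://192.168.1.91:8080/core2",
]


def debug_url(core, query):
    """Build a select URL for one core with debugQuery enabled.

    No shards parameter is added, so each core is queried on its own.
    """
    params = urlencode({"q": query, "debugQuery": "true", "wt": "json"})
    return f"{core}/select?{params}"


def fetch_debug(core, query):
    """Fetch one core's response and return its debug section.

    Needs a reachable Solr instance. The "debug" key is an assumption
    about the JSON response layout; an empty dict is returned if absent.
    """
    with urlopen(debug_url(core, query)) as resp:
        return json.load(resp).get("debug", {})


if __name__ == "__main__":
    # Print the URLs only; call fetch_debug() against a live server
    # to compare how each core handles the fuzzy query.
    for core in CORES:
        print(debug_url(core, "text:worde~1"))
```

Comparing the debug section returned by each core with the one from the simple core should show which terms the fuzzy query is matched against on each side.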



Jack Krupansky-2 wrote
> A fuzzy query itself does not know about distributed search - Lucene
> simply scores the query results based on the local index. Then, Solr is
> merging the query results from different nodes.
> 
> Try the query locally for each node and set debugQuery=true and see how
> each document gets scored.
> 
> I'm actually not sure what the specific "problem" (symptom) is that you
> are seeing. I mean, maybe there is only 1 result on that node - how do you
> know otherwise?? Or maybe one node has more exact matches.
> 
> -- Jack Krupansky
> 
> -----Original Message----- 
> From: meghana
> Sent: Tuesday, April 30, 2013 7:51 AM
> To: solr-user@.apache
> Subject: Issue with fuzzy search in Distributed Search
> 
> I have created 2 versions of a Solr core on different servers. One is a
> simple core with all records in one core. The other is a shards core,
> distributed over 3 cores on one server.
> 
> Simple core :
> 
> http://localhost:8080/sorl/core0/select?q=text:hoers~1
> 
> Distributed core :
> 
> http://192.168.1.91:8080/core0/select?shards=http://192.168.1.91:8080/core0,http://192.168.1.91:8080/core1,http://192.168.1.91:8080/core2&q=text:hoers~1
> 
> The data, schema, and other configuration are similar in both cores.
> 
> But for a fuzzy search like hoers~1, one core returns many records
> (about 450), while the other core returns only 1 record.
> 
> This issue does not seem to be related to distributed search, though:
> even when I do not use distributed search, it still does not return more
> rows, as with:
> 
> http://192.168.1.91:8080/core0/select?q=text:hoers~1
> 
> below is schema definition for my field.
> <fieldType name="text_en_splitting" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="false"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords_en.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0"
>             splitOnCaseChange="1"
>             protected="protwords.txt" types="wdfftypes.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords_extra_query.txt"
>             enablePositionIncrements="false"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords_en.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0" catenateAll="0"
>             splitOnCaseChange="1"
>             protected="protwords.txt" types="wdfftypes.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
> I am not sure what is wrong with this. Can anybody help me with this?
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Issue-with-fuzzy-search-in-Distributed-Search-tp4060022.html
> Sent from the Solr - User mailing list archive at Nabble.com.





--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Results-differ-in-2-solr-cores-same-configuration-for-fuzzy-search-tp4060022p4060201.html
Sent from the Solr - User mailing list archive at Nabble.com.
