lucene-solr-user mailing list archives

From Walter Underwood <wunderw...@netflix.com>
Subject Re: When searching for !@#$%^&*() all documents are matched incorrectly
Date Mon, 01 Jun 2009 03:35:33 GMT
Use the [analysis] link on the Solr admin UI to get more info on
how this is being interpreted.
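
As a rough illustration of what the analysis page would probably show for this
query, here is a minimal sketch of the quoted text_title chain (whitespace
tokenizer, word delimiter filter, lowercase filter). It is a simplified
stand-in, not Solr's code: the function names are hypothetical, and it assumes
the word delimiter step keeps only runs of letters or digits, ignoring the
catenate* and splitOnCaseChange options and the synonym filter.

```python
import re


def word_delimiter_parts(token):
    """Simplified stand-in for WordDelimiterFilter (an assumption, not Solr's
    implementation): treat every non-alphanumeric character as a delimiter
    and keep only the runs of letters or digits."""
    return re.findall(r"[A-Za-z]+|[0-9]+", token)


def analyze(text):
    """Emulate whitespace tokenizer -> word delimiter -> lowercase."""
    tokens = []
    for raw in text.split():                  # WhitespaceTokenizerFactory
        for part in word_delimiter_parts(raw):
            tokens.append(part.lower())       # LowerCaseFilterFactory
    return tokens


# The title clause after the query parser has removed the backslash escapes.
print(analyze("!@#$%^&*()"))   # no letters or digits, so no tokens survive
print(analyze("Bathing"))      # the indexed title from the example document
```

If the title clause really does analyze to zero tokens, one plausible reading
is that the query parser drops the empty clause, leaving only
activity_type:NAME. That would be consistent with the explain output quoted
below, which lists no contribution from the title field.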

However, I am curious about why this is important. Do users enter
this query often? If not, maybe it is not something to spend time on.

wunder

On 5/31/09 2:56 PM, "Sam Michaels" <masu69@yahoo.com> wrote:

> 
> Here is the output from the debug query when I'm trying to match the String @
> against Bathing (should not match)
> 
> <str name="GLOM-1">
> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>   0.99999994 = queryWeight(activity_type:NAME), product of:
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     0.30591258 = queryNorm
>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>     1.0 = tf(termFreq(activity_type:NAME)=1)
>     3.2689075 = idf(docFreq=153, numDocs=1489)
>     1.0 = fieldNorm(field=activity_type, doc=0)
> </str>
> 
> Looks like the AND clause in the search string is ignored...
> 
> SM.
> 
> 
> ryantxu wrote:
>> 
>> two key things to try (for anyone ever wondering why a query matches
>> documents)
>> 
>> 1.  add &debugQuery=true and look at the explain text below --
>> anything that contributed to the score is listed there
>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>> break text up into tokens.
>> 
>> Not sure offhand, but I'm guessing the WordDelimiterFilterFactory has
>> something to do with it...
>> 
>> 
>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels <masu69@yahoo.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I'm running Solr 1.3/Java 1.6.
>>> 
>>> When I run a query like  - (activity_type:NAME) AND
>>> title:(\!@#$%\^&\*\(\))
>>> all the documents are returned even though there is not a single match.
>>> There is no title that matches the string (which has been escaped).
>>> 
>>> My document structure is as follows
>>> 
>>> <doc>
>>> <str name="activity_type">NAME</str>
>>> <str name="title">Bathing</str>
>>> ....
>>> </doc>
>>> 
>>> 
>>> The title field is of type text_title which is described below.
>>> 
>>> <fieldType name="text_title" class="solr.TextField"
>>> positionIncrementGap="100">
>>>      <analyzer type="index">
>>>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>        <!-- in this example, we will only use synonyms at query time
>>>        <filter class="solr.SynonymFilterFactory"
>>> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>>>        -->
>>>        <filter class="solr.WordDelimiterFilterFactory"
>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        <filter class="solr.LowerCaseFilterFactory"/>
>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>      </analyzer>
>>>      <analyzer type="query">
>>>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>> ignoreCase="true" expand="true"/>
>>>        <filter class="solr.WordDelimiterFilterFactory"
>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        <filter class="solr.LowerCaseFilterFactory"/>
>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>> 
>>>      </analyzer>
>>>    </fieldType>
>>> 
>>> When I run the query against Luke, no results are returned. Any
>>> suggestions are appreciated.
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>>> 
>> 
>> 

