lucene-solr-user mailing list archives

From Peter S <pete...@hotmail.com>
Subject RE: Non-leading wildcard search
Date Mon, 04 Jan 2010 23:29:04 GMT

Hi Yonik,

Thanks for your quick reply.

No, the queries themselves aren't in quotes.

Since I sent the initial email, I have managed to get non-leading wildcard queries to work
with this, but by unexpected means (for me at least :-).

If I add a LowerCaseFilterFactory to the fieldType, queries like s* (or S*) work as expected.

So the fieldType schema element now looks like:

    <fieldType name="text_verbatim" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      </analyzer>
    </fieldType>

I wasn't expecting this, as I would have thought this would change only the case behaviour,
not the wildcard behaviour (or at least not just the non-leading wildcard behaviour). Perhaps
I'm just not understanding how the terms (a single term in this case, as the field is not
tokenized) are indexed and subsequently matched.

What I've noticed is that with the LowerCaseFilterFactory in place, document queries return
results with case intact, but facet queries show the results in lower case
(e.g. document->appname=Something, facet.field.appname=something). I kind of expected
the document->appname field to be lower case as well.

Does this sound like correct behaviour to you?

If it's correct, that's ok, I'll manage to work 'round it (maybe there's a way to map the
facet field back to the document field?), but if it sounds wrong, perhaps it warrants further
investigation.
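
For reference, stored field values come back exactly as submitted, while facet values are read
from the indexed terms, which here have already passed through LowerCaseFilterFactory. One common
way to get original-case facet values is to facet on a separate, unanalyzed copy of the field.
This is only a minimal sketch: the field name appname_facet is hypothetical, and it assumes a
"string" fieldType (solr.StrField), as in the example schema, is defined.

```xml
<!-- Sketch only: appname_facet is a hypothetical name, and the "string"
     type (solr.StrField) is assumed to be declared elsewhere in schema.xml. -->

<!-- Searchable field: lower-cased at index time, original value stored -->
<field name="appname" type="text_verbatim" indexed="true" stored="true"/>

<!-- Facet-only field: indexed verbatim, so facet values keep their case -->
<field name="appname_facet" type="string" indexed="true" stored="false"/>

<!-- copyField copies the raw input value, before any analysis is applied -->
<copyField source="appname" dest="appname_facet"/>
```

Faceting would then use facet.field=appname_facet, while searches still go against appname.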

 

Many thanks,

Peter

> Date: Mon, 4 Jan 2010 17:42:30 -0500
> Subject: Re: Non-leading wildcard search
> From: yonik@lucidimagination.com
> To: solr-user@lucene.apache.org
> 
> On Mon, Jan 4, 2010 at 5:38 PM, Peter S <peter4u@hotmail.com> wrote:
> > When I query:  "Something" or "Something Else" or "*thing"  or "*omething*", I get
> > back the expected results.
> > If, however, I query: "Some*" or "S*" or "s*" etc, I get no results (although this
> > type of non-leading wildcard works fine with other fieldType schema elements that don't use
> > KeywordTokenizer).
> 
> Is your query string actually in quotes? Wildcards aren't currently
> supported in quotes.
> So text_verbatim:Some* should work.
> 
> -Yonik
> http://www.lucidimagination.com
 		 	   		  