lucene-solr-user mailing list archives

From Joe Zhang <smartag...@gmail.com>
Subject Re: search behavior on a case-sensitive field
Date Tue, 04 Dec 2012 04:31:32 GMT
haha, makes perfect sense! Thanks a lot!

On Mon, Dec 3, 2012 at 9:25 PM, Jack Krupansky <jack@basetechnology.com> wrote:

> "CoSt" was split into two terms and the query parser generated an OR of
> them. Adding the autoGeneratePhraseQueries="true" attribute to your
> field type should fix the problem.
>
> You can also change splitOnCaseChange="1" to splitOnCaseChange="0" to
> avoid the term splitting issue.
>
> Be sure to completely reindex in either case.
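>
> As a rough sketch of those two options, assuming the rest of the text_cs
> field type stays exactly as in the original post quoted below, the changed
> lines might look like this. Either change alone should be enough; the XML
> comments are illustrative and not part of the original schema.
>
>        <!-- Option 1: treat the split parts of one query word as a phrase -->
>        <fieldType name="text_cs" class="solr.TextField"
>            positionIncrementGap="100"
>            autoGeneratePhraseQueries="true">
>            <!-- analyzer unchanged -->
>        </fieldType>
>
>        <!-- Option 2: inside the analyzer, stop splitting on case changes -->
>                <filter class="solr.WordDelimiterFilterFactory"
>                    generateWordParts="1" generateNumberParts="1"
>                    catenateWords="1" catenateNumbers="1" catenateAll="0"
>                    splitOnCaseChange="0"/>
>
> With splitOnCaseChange="1", a query such as "CoSt" is broken at each
> lower-to-upper transition, so it can match documents containing any one of
> the resulting parts, which would explain the larger result count.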
>
> -- Jack Krupansky
>
> -----Original Message----- From: Joe Zhang
> Sent: Monday, December 03, 2012 11:10 PM
> To: solr-user@lucene.apache.org
> Subject: search behavior on a case-sensitive field
>
>
> I have a field type like this:
>
>        <fieldType name="text_cs" class="solr.TextField"
>            positionIncrementGap="100">
>            <analyzer>
>                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>                <filter class="solr.StopFilterFactory"
>                    ignoreCase="true" words="stopwords.txt"/>
>                <filter class="solr.WordDelimiterFilterFactory"
>                    generateWordParts="1" generateNumberParts="1"
>                    catenateWords="1" catenateNumbers="1" catenateAll="0"
>                    splitOnCaseChange="1"/>
>                <!-- <filter class="solr.LowerCaseFilterFactory"/> -->
>                <filter class="solr.EnglishPorterFilterFactory"
>                    protected="protwords.txt"/>
>                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>            </analyzer>
>        </fieldType>
>
> When I query "COST", it gives reasonable results (n1);
> When I query "CoSt", however, it gives me n2 (>n1) results, and I can't
> locate any actual occurrence of "CoSt" in the docs at all. Can anybody advise?
>
