lucene-solr-user mailing list archives

From Yetkin Ozkucur <Yetkin.Ozku...@asg.com>
Subject Searching for tokens does not return any results
Date Thu, 01 May 2014 14:04:06 GMT
Hello everyone,

I am new to Solr and this is my first post to this list.
I have been working on this problem for a couple of days. I have tried everything I could find
on Google, but it looks like I am missing something.

Here is my problem:
I have a field called: DBASE_LOCAT_NM_TEXT
It contains values like: CRD_PROD
The goal is to be able to search this field either by the exact string "CRD_PROD"
or by part of it (tokenized on "_"), like "CRD" or "PROD".

Currently:
This query returns results: q=DBASE_LOCAT_NM_TEXT:CRD_PROD
But this one does not: q=DBASE_LOCAT_NM_TEXT:CRD
I want to understand why the second query does not return any results.

Here is how I configured the field:
<field name="DBASE_LOCAT_NM_TEXT" type="text_general" indexed="true" stored="true"
       required="false" multiValued="false"/>

And here is how I configured the field type:
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateWordParts="1"
                generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateWordParts="1"
                generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"
                splitOnCaseChange="1"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

I am also using the analysis panel in the Solr admin console. For the input CRD_PROD it shows this:
WT	CRD_PROD

WDF	CRD_PROD
	CRD
	PROD
	CRDPROD

SF	CRD_PROD
	CRD
	PROD
	CRDPROD

LCF	crd_prod
	crd
	prod
	crdprod

SKMF	crd_prod
	crd
	prod
	crdprod

RDTF	crd_prod
	crd
	prod
	crdprod
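
To make the expected index-time tokens concrete, here is a rough Python sketch that imitates the chain shown in the analysis panel for simple inputs such as CRD_PROD. It is only an approximation and not Solr's implementation: the function name index_tokens is made up for illustration, only "_" is treated as a delimiter, the stop-word and keyword-marker steps are skipped, and duplicates are removed globally rather than per position.

```python
def index_tokens(text):
    """Approximate the index-time analysis chain for inputs like 'CRD_PROD'.

    Hypothetical helper for illustration only; it is not part of Solr or Lucene.
    """
    tokens = []
    for word in text.split():                        # WT: split on whitespace
        parts = [p for p in word.split("_") if p]    # WDF: split on "_" only (simplification)
        candidates = [word]                          # preserveOriginal="1"
        candidates += parts                          # generateWordParts="1"
        if len(parts) > 1:
            candidates.append("".join(parts))        # catenateWords="1"
        for tok in candidates:
            tok = tok.lower()                        # LCF: lowercase
            if tok not in tokens:                    # RDTF-like: drop duplicates (simplified)
                tokens.append(tok)
    return tokens


print(index_tokens("CRD_PROD"))
# ['crd_prod', 'crd', 'prod', 'crdprod']
```

If the field were really indexed with this chain, I would expect the token "crd" to be present in the index, which is why the empty result for q=DBASE_LOCAT_NM_TEXT:CRD confuses me.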


I am not sure whether it is related, but this index was created by a Java program using
the Lucene API. It used StandardAnalyzer for writing, and the field was configured as tokenized,
indexed and stored. Does this affect the Solr configuration?

Can you please help me understand what I am missing and how I can debug it?

Thanks,
Yetkin
