lucene-solr-user mailing list archives

From Mohamed Parvez <par...@gmail.com>
Subject Re: Wild card search does not return any result
Date Wed, 05 Aug 2009 15:53:19 GMT
Thanks Otis and Avlesh,

Below is the configuration I have

1] solrconfig.xml
....
.....
  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="spellcheck.onlyMorePopular">false</str>
      <str name="spellcheck.extendedResults">false</str>
      <str name="spellcheck.count">1</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
.....
.....
  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-import.xml</str>
    </lst>
  </requestHandler>
......
......
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpell</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">SPELL</str>
      <str name="spellcheckIndexDir">./spellcheckerIndex</str>
      <str name="buildOnCommit">true</str>
      <str name="buildOnOptimize">true</str>
    </lst>
  </searchComponent>

2] data-import.xml

.....
......
    <document name="doc">
        <entity name="user" pk="ID"
                query="select * from user">
            <field column="ROLE" name="ROLE" />
            <field column="ID" name="ID" />
            <field column="BUS" name="BUS" />
.....
.....

3] schema.xml
......
......
<field name="ID" type="float" indexed="true" stored="true" />
<field name="BUS" type="text" indexed="true" stored="true"/>
<field name="ROLE" type="text" indexed="true" stored="true" />
<field name="SPELL" type="textSpell" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="BUS" dest="SPELL" />
......
......
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
....
....


To keep it simple, I have only one record in the table:
ID=1
BUS=ICS
ROLE=SSE


As I said before:
I don't get any match if I search for q=ics*
I get the match, which is the correct result, if I search for q=sse*

I have not done any query rewriting; I am just using the default
configuration that comes with Solr.

Otis, let me know if you need any more information.

Avlesh, the above setup is just a stripped-down version to figure out what
the issue is. In my real application, I have around 100 columns in the table
that I use for building the search index. I don't think it's a good option to
copy over all the fields and create another 100-odd fields with just a
lower-case filter applied.
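
One possible way to avoid a lower-cased twin for every column is a single catch-all field that all text columns are copied into. The fragment below is only a sketch under stated assumptions: the names "text_lc" and "ALL_LC" are hypothetical and not part of the schema above, and it assumes the underlying cause is that the stemming filter in the "text" type indexes ICS as a shorter term (probably "ic"), which the unanalyzed wildcard query ics* can no longer match.

```xml
<!-- Hypothetical names: "text_lc" and "ALL_LC" do not exist in the original schema. -->
<fieldType name="text_lc" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lower-case only, no stemming, so indexed terms keep their full spelling. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- A single multi-valued destination instead of one extra field per column. -->
<field name="ALL_LC" type="text_lc" indexed="true" stored="false" multiValued="true"/>

<!-- One copyField line per text column; only BUS and ROLE are shown here. -->
<copyField source="BUS" dest="ALL_LC"/>
<copyField source="ROLE" dest="ALL_LC"/>
```

With something like this in place, a wildcard query such as q=ALL_LC:ics* would be expected to match the record, provided the client lower-cases the wildcard term itself, since wildcard queries are not analyzed. The trade-off is that a match in ALL_LC no longer says which column it came from.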

----
Parvez


From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
Date: Tue, Aug 4, 2009 at 8:25 PM
Subject: Re: Wild card search does not return any result
To: solr-user@lucene.apache.org


Hi,

I doubt it's a bug.  It's probably working correctly based on the config,
etc., I just don't have enough details about the configuration, your request
handler, query rewriting, the data in your index, etc. to tell you what
exactly is happening.

 Otis


On Tue, Aug 4, 2009 at 11:13 PM, Avlesh Singh <avlesh@gmail.com> wrote:

> You read it incorrectly, Parvez.
> The "bug" that Bill seems to have found is in the analysis tool and NOT in
> the search handler itself. The results in your case are as expected: wildcard
> queries are not analyzed, hence the inconsistency.
> A workaround is suggested on the same thread, here:
>
> http://markmail.org/message/ts65a6jok3ii6nva#query:+page:1+mid:i5zxdbnvspgek2bp+state:results
>
> Cheers
> Avlesh
>
> On Wed, Aug 5, 2009 at 12:52 AM, Mohamed Parvez <parvez@gmail.com> wrote:
>
> > Thanks Otis. The thread suggests that this is a bug:
> >
> >
> >
> http://markmail.org/message/ts65a6jok3ii6nva#query:+page:1+mid:qinymqdn6mkocv4k
> >
> > Both SSE and ICS are three-letter words, and neither is part of the English
> > language.
> > SSE* works fine and ICS* does not work; this is surely a bug.
> >
> > Any idea when this bug will be fixed, or whether there is a workaround?
> >
> > ----
> > Thanks/Regards,
> > Parvez
> > GV : 786-693-2228
> >
> >
> > On Tue, Aug 4, 2009 at 11:48 AM, Otis Gospodnetic <
> > otis_gospodnetic@yahoo.com> wrote:
> >
> > > Could it be the same reason as described here:
> > >
> > > http://markmail.org/message/ts65a6jok3ii6nva
> > >
> > > Otis
> > > --
> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> > >
> > >
> > >
> > > ----- Original Message ----
> > > > From: Mohamed Parvez <parvez@gmail.com>
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Tuesday, August 4, 2009 11:26:45 AM
> > > > Subject: Wild card search does not return any result
> > > >
> > > > Hello All,
> > > >
> > > >        I have two fields.
> > > >
> > > >
> > > >
> > > >
> > > > > I have a document (which has been indexed) that has the value "ICS" for
> > > > > the BUS field and "SSE" for the ROLE field.
> > > > >
> > > > > When I search for q=BUS:ics I get the result, but if I search for
> > > > > q=BUS:ics* I don't get any match (or result).
> > > > >
> > > > > When I search for q=ROLE:sse or q=ROLE:sse*, I get the result both
> > > > > times.
> > > > >
> > > > > Why does BUS:ics* not return any result?
> > > > >
> > > > >
> > > > > I have the default configuration for the text field; see below.
> > > >
> > > >
> > > > positionIncrementGap="100">
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >                 ignoreCase="true"
> > > >                 words="stopwords.txt"
> > > >                 enablePositionIncrements="true"
> > > >                 />
> > > >
> > > > generateWordParts="1" generateNumberParts="1" catenateWords="1"
> > > > catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> > > >
> > > >
> > > > protected="protwords.txt"/>
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ignoreCase="true" expand="true"/>
> > > >
> > > > words="stopwords.txt"/>
> > > >
> > > > generateWordParts="1" generateNumberParts="1" catenateWords="0"
> > > > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> > > >
> > > >
> > > > protected="protwords.txt"/>
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ----
> > > > Thanks/Regards,
> > > > Parvez
> > > >
> > > > > Note: This is a re-post. It looks like something went wrong the first
> > > > > time around.
> > >
> > >
> >
>
