lucene-solr-user mailing list archives

From Avlesh Singh <avl...@gmail.com>
Subject Re: Is wildcard search not correctly analyzed at query?
Date Thu, 20 Aug 2009 16:53:26 GMT
Wildcard queries are not analyzed by Lucene, which explains the behavior
you are seeing. A similar thread from earlier:
http://www.lucidimagination.com/search/document/a6b9144ecab9d0ff/search_phrase_wildcard
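
Since the wildcard term reaches the index untouched, one workaround is to normalize it on the client before sending the request. The following is a minimal sketch, not a Solr feature: the helper names `normalize_wildcard_term` and `build_select_url` are hypothetical, and the accent stripping via Unicode decomposition only approximates what `mapping-ISOLatin1Accent.txt` does (it does not read that file).

```python
import unicodedata
from urllib.parse import urlencode


def normalize_wildcard_term(term: str) -> str:
    """Lowercase a wildcard term and strip combining accents.

    Assumption: this roughly mirrors the index-time chain from the
    schema (accent mapping + lower-casing). It is an approximation,
    not a reimplementation of the Solr filters.
    """
    # Decompose characters such as "ü" into "u" + combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", term)
    # Drop the combining marks; wildcard characters (* and ?) pass through.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_select_url(base_url: str, field: str, term: str) -> str:
    """Build a select URL with the normalized wildcard term (hypothetical helper)."""
    query = f"{field}:{normalize_wildcard_term(term)}"
    return f"{base_url}?{urlencode({'q': query, 'debugQuery': 'true'})}"


if __name__ == "__main__":
    print(normalize_wildcard_term("Farb*"))
    print(build_select_url("http://localhost:8983/solr/select/",
                           "PhysicalDescription", "nü*"))
```

With this, `Farb*` is sent as `farb*` and `nü*` as `nu*`, matching the lower-cased, accent-folded terms stored in the index.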

Cheers
Avlesh

On Thu, Aug 20, 2009 at 7:03 PM, Alexander Herzog <herzoga@ait.co.at> wrote:

>
> It seems like the analyzer/filter isn't affected at all, since the query
>
> http://localhost:8983/solr/select/?q=PhysicalDescription:nü*&debugQuery=true
>
> does not return a
> <str name="parsedquery">PhysicalDescription:nu*</str>
> as I would expect.
>
> So can I just get a "you're right, wildcard search is passed to Lucene
> directly without any analysis"?
>
> If it is like this, I'm happy with that as well.
>
> best,
> Alexander
>
>
> Alexander Herzog schrieb:
> > Hi all
> >
> > sorry for the long post
> >
> > We are switching from Index Data's Zebra to Solr for a new book
> > archival/preservation project with multiple languages, so expect more
> > questions soon (sorry for that).
> > The features of Solr are pretty cool and more or less overwhelming!
> >
> > But there is one thing I found after a little test with wildcards.
> >
> > I'm using the latest svn build and didn't change anything except the
> > schema.xml
> > Solr Specification Version: 1.3.0.2009.08.20.07.53.52
> > Solr Implementation Version: 1.4-dev 806060 - ait015 - 2009-08-20 07:53:52
> > Lucene Specification Version: 2.9-dev
> > Lucene Implementation Version: 2.9-dev 804692 - 2009-08-16 09:33:41
> >
> > I have a text_ws field with this schema config:
> >
> > <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
> >    <analyzer>
> >       <charFilter class="solr.MappingCharFilterFactory"
> > mapping="mapping-ISOLatin1Accent.txt"/>
> >       <filter class="solr.LowerCaseFilterFactory"/>
> >       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >    </analyzer>
> > </fieldType>
> > ...
> > and I added a dynamic field for everything since I'm not sure what field
> > we will use...
> >
> > <dynamicField name="*"  type="text_ws"    indexed="true"  stored="true"
> > multiValued="true"/>
> > ...
> >
> >
> > So I <add>ed this content:
> > ...
> > <field name="PhysicalDescription">
> >    X, 143, XIV S.:
> >    124 feine Farbendrucktafeln mit über 600 Abbildungen;
> >    24,5 cm.
> > </field>
> > ...
> >
> > Since it's German, and I couldn't find a tokenizer for German compound
> > words (any help appreciated), I wanted to search for 'Farb*'.
> >
> > The final row of the query analyzer in the admin section told me:
> > farb*
> > for the content:
> > x,    143,    xiv     s.:     124     feine   farbendrucktafeln       mit     uber    600     abbildungen;    24,5    cm.
> >
> > so everything seems to be ok, everything in lower case
> >
> > Now, for the rest service:
> >
> > http://localhost:8983/solr/select/?q=PhysicalDescription:Farb*&debugQuery=true
> > <str name="rawquerystring">PhysicalDescription:Farb*</str>
> > <str name="querystring">PhysicalDescription:Farb*</str>
> > <str name="parsedquery">PhysicalDescription:Farb*</str>
> > <str name="parsedquery_toString">PhysicalDescription:Farb*</str>
> >
> > Since Farb* has a capital letter, nothing is found.
> > When using farb* as query, I get the result.
> >
> > Where can I add/change a query analyzer that "lower cases" wildcard
> > searches?
> >
> > thanks, best wishes,
> > Alexander
> >
>
