lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Tokenising on Each Letter
Date Sun, 22 Aug 2010 20:08:22 GMT
I suspect (though I can't say for sure, since you didn't include
your schema definition, both the field type and the actual field
definition) that your problem stems from the
WordDelimiterFilterFactory options. The default schema usually
has catenateAll="0", in which case you get the tokens "ads" and
"12" but not "ads12", so a search for "ads1*" has nothing to
match. You could try varying your WordDelimiterFilterFactory
parameters (your specific example works for me), but that may
have side effects on your other searches.
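
As a rough sketch of the kind of change I mean (the field type
name "model_text" is made up, and I'm assuming a whitespace
tokenizer plus lowercasing, which may not match your chain),
turning on catenateAll keeps the joined token in the index
alongside the split parts:

```xml
<!-- Hypothetical field type: the name and the surrounding chain are assumptions -->
<fieldType name="model_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Still split on letter/digit boundaries, but also emit the
         catenated token and the original token -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With something like this, "ADS12P2" should also be indexed as
"ads12p2", which gives "ads1*" a term to match. You would need
to reindex after changing the index-time analysis.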

You could also use a different analysis chain for model number
that didn't even try to split it up. Or you could use one of the
n-gram type filters on your model numbers to give you lots of
flexibility....
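
Here is a sketch of the n-gram idea, again with made-up names
("model_prefix", "model_number") and gram sizes you would want
to tune. It keeps the whole model number as a single token and
indexes every leading prefix of it, so no wildcard is needed at
query time:

```xml
<!-- Goes in the types section of schema.xml; the name is hypothetical -->
<fieldType name="model_prefix" class="solr.TextField">
  <analyzer type="index">
    <!-- Keep the model number as one token, lowercase it, then
         index its leading prefixes: a, ad, ads, ads1, ads12, ... -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-grams at query time: the typed text is matched as-is
         against the indexed prefixes -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Goes in the fields section; the field name is hypothetical -->
<field name="model_number" type="model_prefix" indexed="true" stored="true"/>
```

With this setup a plain query for ADS1 should match ADS12P2,
because "ads1" is one of the indexed prefixes. The cost is a
larger index, since each model number produces one term per
prefix length.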

And if none of this is germane, can you explain more about what
you're trying to accomplish?

Best
Erick

On Fri, Aug 20, 2010 at 8:19 AM, Scottie <scottie@live.com> wrote:

>
> Just getting ready to launch Solr on one of our websites.
>
> Unfortunately, we can't work out one little issue; how do I configure Solr
> such that it can search our model numbers easily? For example:
>
> ADS12P2
>
> If somebody searched for ADS it would match, because currently it's split
> into tokens when it sees letters and numbers; if somebody searched for
> ADS12 it would also work, etc.
>
> But if somebody searches for ADS1, currently there are no results.
>
> Does anybody know how I should configure Solr such that it will split a
> certain field over each letter or wildcard etc?
>
> Kind Regards
>
> Scott
