lucene-solr-user mailing list archives

From "Mike Richmond" <>
Subject Custom E-mail Tokenizer
Date Wed, 21 Jun 2006 18:50:28 GMT
I have created a custom e-mail tokenizer and am trying to make e-mail
addresses more searchable inside Solr (without having to rely on
wildcard/prefix queries), but am running into a couple of problems using

I created a tokenizer that, when given the e-mail address
"", produces the following tokens (this
was discussed on the java lucene users group and can be found here:
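
To illustrate the kind of splitting described above, here is a minimal
sketch in plain Java with no Lucene dependency. It is not the tokenizer
from this post: the class name EmailTokens, the method tokenize, and the
exact token set (full address, local part, domain, and each domain label,
all lowercased) are assumptions made for illustration, and
"user@example.com" is a placeholder address.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: splits an e-mail address into searchable tokens.
// The token set chosen here is an assumption, not the one from the post.
class EmailTokens {

    static List<String> tokenize(String address) {
        // LinkedHashSet keeps insertion order and drops duplicate tokens.
        Set<String> tokens = new LinkedHashSet<>();
        String lower = address.trim().toLowerCase(Locale.ROOT);
        if (lower.isEmpty()) {
            return new ArrayList<>(tokens);
        }

        // Always keep the whole address so exact matches still work.
        tokens.add(lower);

        int at = lower.lastIndexOf('@');
        if (at < 0) {
            // Not an address; return the lowercased input as a single token.
            return new ArrayList<>(tokens);
        }

        String local = lower.substring(0, at);
        String domain = lower.substring(at + 1);

        if (!local.isEmpty()) {
            tokens.add(local);
        }
        if (!domain.isEmpty()) {
            tokens.add(domain);
            // Each dot-separated label of the domain becomes its own token.
            for (String label : domain.split("\\.")) {
                if (!label.isEmpty()) {
                    tokens.add(label);
                }
            }
        }
        return new ArrayList<>(tokens);
    }

    public static void main(String[] args) {
        // prints [user@example.com, user, example.com, example, com]
        System.out.println(tokenize("user@example.com"));
    }
}
```

A real Lucene tokenizer would emit these as terms in a token stream rather
than returning a list; the list form is used here only to keep the sketch
self-contained.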

I then added the following to my schema configuration:
    <fieldtype name="email" class="solr.StrField">
        <analyzer type="index">
            <filter class="solr.LowerCaseFilterFactory"/>

If I then fire up Solr and use the analysis tool from the admin page,
it seems to work exactly as I would expect (i.e. e-mail addresses that I
type in do get broken up into the correct tokens).  However, when I
add data to this index and then attempt to perform a search using the
search interface, I cannot get any matches.  For example, when I add
"" to a field that has type "email" (see schema
configuration above), I cannot get the terms "richmondmike", or
"gmail" or "" to match any of the results.

Do I need to use a custom fieldtype class as well instead of using
"solr.StrField"?  Any help would be greatly appreciated.

Thanks in advance,

