lucene-solr-user mailing list archives

From "daniel rosher" <daniel.ros...@hotonline.com>
Subject Re: Auto complete
Date Tue, 08 Jul 2008 10:30:31 GMT
Hi,

This is how we implement our autocomplete feature. The steps are listed
first, followed by an excerpt from our schema.xml.

-First accept the input as is, without alteration (KeywordTokenizerFactory
keeps the whole input as a single token).
-Lowercase the input and remove every character outside a-z0-9 to
normalize it.
-At index time, split the result into multiple tokens with
EdgeNGramFilterFactory, up to a max of 100 chars, all starting from the
beginning of the input, e.g. hello becomes h, he, hel, hell, hello.
-At query time, apply the same normalization but no n-grams, and keep only
the first 20 chars.

Hope this helps.


<fieldType name="autocomplete" class="solr.TextField">
    <analyzer type="index">
        <!-- Keep the whole input as a single token -->
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- Strip every character outside a-z0-9 -->
        <filter class="solr.PatternReplaceFilterFactory"
                pattern="([^a-z0-9])" replacement="" replace="all"/>
        <!-- Emit prefixes of the token: h, he, hel, ... up to 100 chars -->
        <filter class="solr.EdgeNGramFilterFactory"
                minGramSize="1" maxGramSize="100"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PatternReplaceFilterFactory"
                pattern="([^a-z0-9])" replacement="" replace="all"/>
        <!-- Keep only the first 20 chars of the normalized query -->
        <filter class="solr.PatternReplaceFilterFactory"
                pattern="^(.{20})(.*)?" replacement="$1" replace="all"/>
    </analyzer>
</fieldType>
...
<field name="ac" type="autocomplete" indexed="true" stored="true"
required="false" />
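
To relate this to the schema in the original question, here is a minimal sketch of how the field might be populated. It assumes that "ac" is filled from institution.name with a copyField; the excerpt above does not show how "ac" is actually populated, so that wiring is an assumption, and the institution name in the trace is a made-up sample value.

```xml
<!-- Assumption: fill "ac" from the one field that is indexed directly. -->
<copyField source="institution.name" dest="ac"/>

<!--
  Illustrative trace for the hypothetical value "St. Mary's College":
    index: keyword token      -> "St. Mary's College"
           lowercase          -> "st. mary's college"
           strip non a-z0-9   -> "stmaryscollege"
           edge n-grams 1-100 -> "s", "st", "stm", ..., "stmaryscollege"
    query: "St. Mar" is normalized to "stmar", which equals one of the
           indexed grams, so the document matches without a wildcard.
-->
```

Because the prefixes are already in the index, the query needs no trailing *. As far as I know, the standard query parser splits on whitespace before the analyzer runs, so multi-word input would probably have to be sent as a quoted phrase or with the spaces escaped.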

Regards,
Dan




On Mon, 2008-07-07 at 17:12 +0000, sundar shankar wrote:
> Hi All,
>            I have been using Solr for some time and am having trouble
> with an autocomplete feature that I have been trying to incorporate. I
> index from a database, with each column mapped to a Solr field. I have
> tried various configs mentioned in the solr-user community suggestions,
> and a few options of my own too. Each of them either fails to return the
> exact data I want or returns excess data.
> 
> I have tried:
> text_ws,
> text,
> string,
> EdgeNGramTokenizerFactory,
> the subword example,
> textTight,
> and juggling around some of the filters and analyzers together.
> 
> I couldn't get dismax to work, as it somehow wasn't able to connect the
> field defined in my schema to the qf param that I was passing in the
> request.
> 
> textTight gave the best results, but the problem there was that it
> searched for whole words and not partial words.
> For example:
> 
> If my query string was field1:Word1 word2* I got results back, but if my
> query string was field1: Word1 wor* I didn't get a result back.
> 
> I am a little perplexed about how to implement this. I don't know what
> has to be done.
> 
> The schema
> 
> 
>    <field name="institution.name" type="text_ws" indexed="true" stored="true" termVectors="true"/>
>    <!--Sundar changed city to subword so that spaces are ignored-->
> 
>    <field name="instAlphaSort" type="alphaOnlySort" indexed="true" stored="false" multiValued="true"/>
>    <!-- Tight text cos we want results to be much the same for this-->
>    <field name="instText" type="text" indexed="true" stored="true" termVectors="true" multiValued="true"/>
>    <field name="instString" type="autosuggest" indexed="true" stored="true" termVectors="true" multiValued="true"/>
> 
>    <field name="instSubword" type="subword" indexed="true" stored="true" multiValued="true" termVectors="true"/>
>    <field name="instTight" type="textTight" indexed="true" stored="true" multiValued="true" termVectors="true"/>
> 
> 
> 
> I index institution.name only; the rest are copy fields of the same.
> 
> 
> Any help is appreciated.
> 
> Thanks
> Sundar
> 
Daniel Rosher
Developer
www.thehotonlinenetwork.com
