lucene-solr-user mailing list archives

From "Rantjil Bould" <iblogee.h...@iblogee.com>
Subject Your valuable suggestion on autocomplete
Date Tue, 06 May 2008 06:57:06 GMT
Hi Group,
I have already received some valuable suggestions from the group. Based
on them, I have come up with the following process to implement an
autocomplete-like feature in my system:
1- Index all the documents
2- Extract all terms using IndexReader's terms() method

I am getting terms like vl, vla, vlan, vlana, vlanan, vlanand, but I would
like to get only the complete term, i.e. vlanand. The field definition in
Solr is:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"></filter>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"></filter>
        <filter class="solr.LowerCaseFilterFactory"></filter>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"></filter>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"></filter>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"></filter>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"></filter>
        <filter class="solr.LowerCaseFilterFactory"></filter>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"></filter>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
      </analyzer>
    </fieldType>

I would appreciate your input on how to get only the complete terms.

3- For each term, find the documents containing that term using the
termDocs() method
4- Create a second index with the fields term, frequency and docNo. This
index would be used for the autocomplete feature.
5- For each letter the user types in the search field, use an Ajax script
(e.g. script.aculo.us or jQuery) to fetch all matching terms with a prefix
query.
6- Based on the search term the user selects, keep track of the document
numbers in which that term occurs.
7- For the next search term, use those document numbers to select all terms
in those documents, excluding the currently selected term.
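
Steps 3 to 7 could be sketched roughly as follows. This is only an illustration using plain in-memory Java collections as a stand-in for the second index; it does not call Lucene or Solr. The class and method names (AutocompleteSketch, add, frequency, docsFor, suggest) are hypothetical, and "frequency" is assumed here to mean the number of documents containing the term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the second (autocomplete) index.
class AutocompleteSketch {
    // term -> document numbers containing it. A TreeMap keeps the terms
    // sorted, so all terms sharing a prefix sit next to each other.
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Steps 3-4: record one (term, docNo) pair.
    void add(String term, int docNo) {
        postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docNo);
    }

    // Step 4: frequency, assumed to be the number of documents with the term.
    int frequency(String term) {
        TreeSet<Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.size();
    }

    // Step 6: document numbers in which the selected term occurs.
    Set<Integer> docsFor(String term) {
        TreeSet<Integer> docs = postings.get(term);
        return docs == null
                ? Collections.<Integer>emptySet()
                : Collections.unmodifiableSet(docs);
    }

    // Steps 5 and 7: terms starting with the prefix, skipping terms already
    // selected. If allowedDocs is non-null, only terms occurring in at least
    // one of those documents are returned.
    List<String> suggest(String prefix, Set<Integer> allowedDocs, Set<String> selected) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, TreeSet<Integer>> e : postings.tailMap(prefix, true).entrySet()) {
            String term = e.getKey();
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            if (selected.contains(term)) {
                continue;
            }
            if (allowedDocs == null || !Collections.disjoint(e.getValue(), allowedDocs)) {
                result.add(term);
            }
        }
        return result;
    }
}
```

For the first search term, suggest() would be called with allowedDocs set to null; for the next one, with docsFor() of the term just selected and that term in the selected set.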

This somehow works. As I am new to Solr and also to Lucene, I would like to
know whether it can be improved.

- RB
