lucene-solr-user mailing list archives

From gwk <g...@eyefi.nl>
Subject Re: Autosuggest on PART of cityname
Date Mon, 23 Aug 2010 09:10:36 GMT
  On 8/20/2010 7:04 PM, PeterKerk wrote:
> @Markus: thanks, will try to work with that.
>
> @Gijs: I've looked at the site and the search function on your homepage is
> EXACTLY what I need! Do you have some Solr code samples for me to study
> perhaps? (I just need the relevant fields in the schema.xml and the query
> url) It would help me a lot! :)
>
> Thanks to you both!
The fields in our schema are:
<field name="id" type="string" indexed="true" stored="true" required="true" />
         - Just an id based on type, depth and a number, not important
<field name="type" type="string" indexed="true" stored="true" required="true" />
         - This is either "buy" or "rent" as our sections have separate
           autocompleters
<field name="depth" type="string" indexed="true" stored="true" />
         - Since you can search by country, region or city, this stores
           which of those levels this document represents (since we use
           geonames.org geographical data, we actually have 4 levels of
           regions)
<field name="name" type="text" indexed="true" stored="true" />
         - The canonical name of the country/region/city
<dynamicField name="name_*" type="text" indexed="true" stored="true" />
         - The name of the country/region/city in various languages
<field name="parent" type="text" indexed="true" stored="true" />
         - The name of the country/region/city followed by all of its
           parents, comma separated. This is used for phrase searches, so
           if you enter "Amsterdam, Netherlands" the Dutch Amsterdam will
           match before any of the Amsterdams in other countries.
<dynamicField name="parent_*" type="text" indexed="true" stored="true" />
         - The same as parent but in different languages
<field name="data" type="string" indexed="false" stored="true" />
         - This is some internal data used to create the correct filters
           when this particular suggestion is selected
<dynamicField name="data_*" type="text" indexed="true" stored="true" />
         - The same as data but in different languages, as our filters
           are on the actual names of countries/regions/cities
<field name="count" type="tint" indexed="true" stored="true" />
         - The number of documents, i.e. the number on the right of the
           suggestions
<field name="names" type="text" indexed="true" multiValued="true" />
         - Multivalued field which is copyfield-ed from name and name_*
<field name="parents" type="text" indexed="true" multiValued="true" />
         - Multivalued field which is copyfield-ed from parent and parent_*
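
The copying into names and parents is not spelled out above. A minimal sketch of what the corresponding schema.xml declarations might look like, assuming plain wildcard copyField rules (the exact rules used on the site are not shown in this message):

```xml
<!-- Sketch only: these rules are an assumption, not quoted from the real schema. -->
<!-- Collect the canonical name and every per-language name into "names". -->
<copyField source="name" dest="names"/>
<copyField source="name_*" dest="names"/>
<!-- Collect the comma separated parent chains into "parents". -->
<copyField source="parent" dest="parents"/>
<copyField source="parent_*" dest="parents"/>
```

Both destination fields are multiValued, which they need to be since several source fields are copied into each of them.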

Where text is
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="30"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


Our autocompletion requests are dismax requests, where the most important
parameters are:
- q=the text the user has entered into the searchbox so far
- fq=type:sale (or rent)
- qf=name_<lang>^4 name^4 names (Where <lang> is the currently selected 
language on the website)
- pf=name_<lang>^4 name^4 names parents

Honestly, I basically just tweaked those parameters, without quite
understanding their meaning, until I got something that worked
adequately. Hope this helps.
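
For anyone who would rather keep these parameters out of the query URL, here is a rough sketch of how they could be set as defaults in solrconfig.xml. The handler name /autocomplete and the language suffix en are made up for illustration; neither comes from the setup described above.

```xml
<!-- Sketch only: handler name and "en" language suffix are hypothetical. -->
<requestHandler name="/autocomplete" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the dismax query parser for these requests. -->
    <str name="defType">dismax</str>
    <!-- Fields searched for each term, boosting the name fields. -->
    <str name="qf">name_en^4 name^4 names</str>
    <!-- Fields used for phrase matching of the whole input. -->
    <str name="pf">name_en^4 name^4 names parents</str>
  </lst>
</requestHandler>
```

With something like this in place, a request would only need to pass q and fq, and could still override qf and pf per request to switch to another language.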

Regards,

gwk
