lucene-solr-user mailing list archives

From "Sophie M." <sop...@beezik.com>
Subject Alphabetic range
Date Wed, 23 Jun 2010 12:56:39 GMT

Hello all,

I have been trying for several days to build an alphabetical range. I will
explain all the steps (I have the Solr 1.4 Enterprise Search Server book by
Smiley and Pugh).

I want to get all artists whose names begin with two given letters. If I
request "mi", I want the response to contain "michael jackson" and all other
artist names beginning with "mi".

I defined a field type similar to Smiley and Pugh's example on p. 148:

<fieldType name="bucketFirstTwoLetters" class="solr.TextField"
sortMissingLast="true" omitNorms="true">
		<analyser type="index">
			<tokenizer class="solr.PatternTokenizerFactory"
pattern="^([a-zA-Z])([a-zA-Z]).*" group="2"/> <!-- the first two letters -->
		</analyser>
		<analyser type="query">
			<tokenizer class="solr.KeywordTokenizerFactory"/>
		</analyser>
	</fieldType>
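
As a sanity check outside Solr, here is a minimal sketch of what each capture group of the index-time pattern contains for a few names from this post. It assumes Python's `re` module behaves like Java's regex engine for a pattern this simple, and the helper name `captured_group` is hypothetical. It only illustrates the regex itself, not how Solr applies the analysis chain.

```python
import re

# Pattern copied verbatim from the index-time tokenizer definition above.
INDEX_PATTERN = re.compile(r"^([a-zA-Z])([a-zA-Z]).*")


def captured_group(name, group):
    """Return the text of the given capture group, or None if the name does not match."""
    match = INDEX_PATTERN.match(name)
    return match.group(group) if match else None


if __name__ == "__main__":
    for name in ["michael jackson", "An An Yu", "ReYu"]:
        # Group 1 is the first letter, group 2 is the second letter.
        print(name, "->", captured_group(name, 1), captured_group(name, 2))
```

For "michael jackson", group 1 is "m" and group 2 is "i", so neither group alone holds both letters. A single group such as `^([a-zA-Z]{2}).*` would capture "mi" in group 1.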
	
I defined the field ArtistSort like this:

<field name="ArtistSort" type="bucketFirstTwoLetters" stored="true"
multivalued="false"/>

For the request:

http://localhost:8983/solr/music/select?indent=on&q=yu&qt=standard&wt=standard&facet=on&facet.field=ArtistSort&facetsort=lex&facet.missing=on&facet.method=enum&fl=ArtistSort

I get:

http://lucene.472066.n3.nabble.com/file/n916716/select.xml select.xml 

I don't understand why the pattern doesn't match exactly. For example, "An An Yu"
matches, but I only want artists whose names begin with "yu". And I know that an
artist named ReYu would match, because ReYu would be interpreted as Re Yu (as
two words).

I also tried another type of query:

http://localhost:8983/solr/music/select?indent=on&version=2.2&q=ArtistSort:mi*&fq=&start=0&rows=10&fl=ArtistSort&qt=standard&wt=standard&explainOther=&hl.fl=

I get exactly what I want. I made several tries, and I get only artist names
that begin with the right first two letters.

But I get very few results, see here:

<result name="response" numFound="6" start="0">
<doc>
<str name="ArtistSort">mike manne and tiger blues</str>
</doc>
<doc>
<str name="ArtistSort">mimika</str>
</doc>
<doc>
<str name="ArtistSort">miduno</str>
</doc>
<doc>
<str name="ArtistSort">milue macïro</str>
</doc>
<doc>
<str name="ArtistSort">mister pringle</str>
</doc>
<doc>
<str name="ArtistSort">mimmai</str>
</doc>
</result>


There are more than 80,000 artists in my index... I really don't understand
why I can't get more results. I have been thinking about the problem for days
and days, and now my brain is frozen.

Thank you in advance.

Sophie
