lucene-solr-user mailing list archives

From "Jaeger, Jay - DOT" <Jay.Jae...@dot.wi.gov>
Subject RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3
Date Mon, 12 Sep 2011 14:05:26 GMT
Looking at the Wiki (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters), it looks
like solr.StandardTokenizerFactory changed with Solr 3.1.

We use solr.KeywordTokenizerFactory for our middle names (and then also add solr.LowerCaseFilterFactory
to normalize to lower case).  It treats the entire field value as a single token, and in general
doesn't "futz" with what came in.
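
A field type along those lines might look roughly like the sketch below. The type name
"middlename_exact" is made up for illustration and is not taken from an actual schema; only
the two factory classes are the ones mentioned above.

```xml
<!-- Sketch only: "middlename_exact" is a hypothetical type name. -->
<fieldType name="middlename_exact" class="solr.TextField">
  <analyzer>
    <!-- Emits the whole field value as one token; no splitting. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercases that token so "A" and "a" index identically. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="middlename" type="middlename_exact" indexed="true" stored="true" required="false" />
```

Because the indexed token is lowercased, a lowercase prefix query such as middlename:a* should
line up with what is in the index, including a value that is just "A". Changing the field type
requires reindexing.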

You might try the analysis page in the admin web interface to see exactly what happens to the
value at index time and at query time.

JRJ

-----Original Message-----
From: Marc Des Garets [mailto:marc.desgarets@192.com] 
Sent: Friday, September 09, 2011 5:21 AM
To: solr-user@lucene.apache.org
Subject: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

Hi,

I have a simple field defined like this:
    <fieldtype name="text" class="solr.TextField">
      <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
    </fieldtype>

Which I use here:
   <field name="middlename" type="text" indexed="true" stored="true" required="false" />

In solr 1.4, I could do:
?q=(middlename:a*)

And I got all documents where middlename = A or where middlename starts with the letter
A.

In solr 3.3, I only get results where middlename starts with the letter A, but not those where
middlename is equal to A.

This happens only with the letter A. With other letters it is fine: I get both the values
starting with the letter and the values equal to the letter. My guess is that it treats
A as the English article, but I do not specify any stopword filter, so why does the behaviour
for the letter A differ from the other letters? Is this a bug? How can I change my
field so that the letter A works the same way as the other letters?


Thanks,
Marc