lucene-general mailing list archives

From "Steven A Rowe" <>
Subject RE: Searching chomps my terms..
Date Tue, 11 Mar 2008 15:10:31 GMT
On 03/11/2008 at 8:46 AM, André Warnier wrote:
> João Rodrigues wrote:
> > @André:
> > 
> > Even if I use Simple Analyzer, which I think should leave the term
> > "alone", the number gets "eaten".
> I'm no expert, so I was just launching that answer to see if it elicited
> more qualified responses. But I found this on Google :
> (seems to
> say also that SimpleAnalyzer does not retain numbers, and that you
> should try StandardAnalyzer instead).
> (But I must say that precise documentation seems hard to find).

The API docs are at: <>.  Find the class name
you're interested in and follow it where it goes :) .

SimpleAnalyzer is "[a]n Analyzer that filters LetterTokenizer with LowerCaseFilter":


LetterTokenizer's docs say:

   A LetterTokenizer is a tokenizer that divides text at non-letters.
   That's to say, it defines tokens as maximal strings of adjacent
   letters, as defined by java.lang.Character.isLetter() predicate.


LowerCaseFilter "[n]ormalizes token text to lower case":
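
To see why the number gets "eaten", here is a rough plain-Java sketch of the behavior those two descriptions add up to. It does not use Lucene at all: the class name SimpleAnalyzerSketch, the tokenize() helper and the sample sentence are made up for illustration, and the only assumption is that tokens are maximal runs of characters for which Character.isLetter() is true, lower-cased afterwards. Real Lucene classes may differ in details (for example in how they handle supplementary characters).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for "LetterTokenizer + LowerCaseFilter"; not Lucene code.
public class SimpleAnalyzerSketch {

    // Collects maximal runs of letters and lower-cases them.
    // Anything that is not a letter (digits, spaces, punctuation) ends the
    // current token and is itself discarded.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush the last token if the text ended on a letter.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The digits never satisfy Character.isLetter(), so "66" yields no token.
        System.out.println(tokenize("Order 66 shipped to Lisbon"));
        // prints [order, shipped, to, lisbon]
    }
}
```

Under that assumption a purely numeric term produces no token at all, and a mixed term such as "abc123def" is split into "abc" and "def".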


Exercise for the reader: find the docs for StandardAnalyzer :) .

