lucene-java-user mailing list archives

From Bernhard Messer
Subject Re: English and French documents together / analysis, indexing, searching
Date Thu, 20 Jan 2005 18:05:12 GMT
I think the easiest way is to use Lucene's StandardAnalyzer. If you 
want to use the Snowball stemmers, you have to add a language guesser to 
get the language for the particular document before creating the analyzer.
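
As a rough illustration of the language-guesser idea, here is a minimal sketch that counts common stop words and picks the language with more hits. The class name LanguageGuesser, the guess method, and the tiny word lists are all hypothetical and only meant to show the shape of the approach; a real guesser would use much larger lists or character n-gram profiles. It falls back to English on ties, on the assumption that English documents dominate the collection.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Naive stop-word based language guesser (hypothetical, for illustration only). */
final class LanguageGuesser {

    // Small hand-picked stop-word lists; these are assumptions, not a tuned resource.
    private static final Set<String> ENGLISH = new HashSet<>(Arrays.asList(
            "the", "and", "of", "to", "is", "in", "that", "it", "with", "for"));
    private static final Set<String> FRENCH = new HashSet<>(Arrays.asList(
            "le", "la", "les", "et", "de", "des", "est", "un", "une", "dans", "que", "pour"));

    private LanguageGuesser() {
    }

    /** Returns "French" if French stop words outnumber English ones, otherwise "English". */
    static String guess(String text) {
        int english = 0;
        int french = 0;
        // Split on anything that is not a Unicode letter, so accented words stay intact.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (ENGLISH.contains(token)) {
                english++;
            }
            if (FRENCH.contains(token)) {
                french++;
            }
        }
        // Ties (including empty input) default to English, the predominant language here.
        return french > english ? "French" : "English";
    }
}
```

The returned name is intended to be handed to whatever creates the per-document analyzer, for example as the language name argument of the Snowball analyzer. The exact constructor depends on your Lucene version, so please check it against the API you are using. The same choice also has to be made for the query text at search time.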

Bernhard wrote:

> Greetings everyone,
> I wonder if there is a solution for analyzing both English and French 
> documents using the same analyzer.
> The reason is that we have predominantly English documents but there 
> are some French ones, yet it all has to go into the same index
> and be searchable from the same location during any particular search. 
> Is there a way to analyze both types of documents with
> the same analyzer (and which one)?
> I've looked around and I see there's a Snowball analyzer, but you have 
> to specify the language of analysis, and I do not know that
> ahead of time during indexing, nor do I know it most of the time during 
> searching (users would like to search in both document types).
> There's also the issue of letter accents in French words and searching 
> for the same (how are they indexed in the first place, even)?
> Has anyone dealt with this before, and how did you solve the problem?
> thanks
> -pedja