lucene-java-user mailing list archives

From "Alex Murzaku" <>
Subject RE: Support for russian morphology in Lucene
Date Thu, 07 Mar 2002 13:50:00 GMT
Real morphology (finding the root for all the forms of a word) might not
be that easy in Russian, since a word is inflected by both prefixes
(aspect) and suffixes (case, number, conjugation).
But there are already efforts to write stemmers (suffix strippers) for
Russian following Porter's model. SNOWBALL (named after SNOBOL) is a formal
language whose main use is writing stemmers for different
languages. So far there are rule sets for Danish, Dutch, English,
French, German, Italian, Norwegian, Portuguese, Russian, Spanish and

Some time ago, somebody posted a French stemmer built from SNOWBALL. It
seems straightforward to convert all these stemmers to Lucene and maybe
include them in the package.
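
To make "suffix stripping" concrete, here is a minimal sketch in Java. It is
NOT the SNOWBALL Russian algorithm: the class name ToySuffixStripper, the
short ending list and the minimum stem length are all assumptions chosen for
illustration. It only shows the general idea of removing the longest matching
ending from a word.

```java
import java.util.Locale;

// Toy suffix stripper (hypothetical, for illustration only).
// A real stemmer such as the SNOWBALL-generated one uses far richer rules.
class ToySuffixStripper {
    // Assumed, incomplete list of Russian endings, longer ones first
    // so that the longest matching ending wins.
    private static final String[] SUFFIXES = {
        "ами", "ями", "ов", "ев", "ах", "ях", "ой", "ый", "ий", "ая", "ое",
        "а", "я", "ы", "и", "у", "ю", "е", "о"
    };

    // Assumed lower bound so that very short words are not emptied.
    private static final int MIN_STEM_LENGTH = 2;

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return w.substring(0, stemLength);
            }
        }
        return w; // no known ending found: leave the word unchanged
    }

    public static void main(String[] args) {
        // Several case forms of the same noun reduce to one stem.
        for (String form : new String[] {"книга", "книги", "книгу", "книгами"}) {
            System.out.println(form + " -> " + stem(form));
        }
    }
}
```

Note that this handles suffixes only. Prefixes (aspect) are not touched,
which is the difference between a stemmer and real morphology.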

The site for SNOWBALL is The latest version of their
compiler outputs Java code. I am attaching the Russian SNOWBALL file and
its corresponding Java output. This is just the stemmer though and does
not include the needed code for interfacing with Lucene.
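
As a rough idea of what the interfacing code might look like, here is a
sketch of a filter that wraps a token source and stems every token passing
through it. It deliberately does NOT use Lucene's classes: the name
StemmingTokenIterator and the use of Iterator<String> and
UnaryOperator<String> are stand-ins I made up so the example runs on its
own. A real integration would apply the same wrapping pattern to Lucene's
token stream types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical adapter (not Lucene's API): stems each token of an upstream source.
class StemmingTokenIterator implements Iterator<String> {
    private final Iterator<String> input;        // upstream tokens, e.g. from a tokenizer
    private final UnaryOperator<String> stemmer; // any stemmer, e.g. generated stemmer code

    StemmingTokenIterator(Iterator<String> input, UnaryOperator<String> stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    @Override
    public String next() {
        // Pull the next raw token and hand back its stemmed form.
        return stemmer.apply(input.next());
    }

    // Convenience helper: run a whole token list through the filter.
    static List<String> stemAll(List<String> tokens, UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new StemmingTokenIterator(tokens.iterator(), stemmer);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder stemmer that drops a trailing "s"; it stands in for a real one.
        UnaryOperator<String> placeholder =
            w -> (w.endsWith("s") && w.length() > 3) ? w.substring(0, w.length() - 1) : w;
        System.out.println(stemAll(Arrays.asList("stemmers", "index", "words"), placeholder));
    }
}
```

The point of the design is that the stemmer stays a plain string-to-string
function, so the same wrapper works for any language's stemmer.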



-----Original Message-----
From: Philipp Chudinov [] 
Sent: Thursday, March 07, 2002 1:21 AM
To: Lucene Users List
Subject: Re: Support for russian morphology in Lucene

It's me :) I have no ideas about morphology but a great wish to use
Lucene in Russian. Nice to see you here. Maybe we should try to do
things together.

----- Original Message -----
From: "Vadim Solonovich" <>
To: "Lucene Developers List" <>
Cc: "Lucene Users List" <>
Sent: Thursday, March 07, 2002 6:40 AM
Subject: Support for russian morphology in Lucene

> Hi all!
> Is there anybody who has any ideas about implementing Russian
> morphology in Lucene?
> Please, let me know.
> Thanks in advance.
> Vadim Solonovich,
