lucene-dev mailing list archives

From Steven Rowe <sar...@syr.edu>
Subject Re: new spanish analyzer
Date Tue, 10 Jan 2006 16:42:29 GMT
Hola José,

Did you know that Java Lucene already has a contributed Snowball-based 
stemmer/analyzer, very similar to yours?

http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/snowball/

It looks to me as though your Spanish stopword list is the only 
significant difference.  Would you agree that this is true?

Also, your stoplist loader (SpanishAnalyzer.loadStopWords()) is not 
respecting the '|' comment-to-end-of-line character in your stoplist 
(stopwords-spanish.txt).
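
A minimal sketch of a loader that honors the '|' convention might look
like the following. This is not the code from SpanishAnalyzer; the class
and method names are hypothetical, and it assumes each line holds zero or
more whitespace-separated words, with everything from '|' to the end of
the line treated as a comment.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene or of the contributed analyzer.
class StopWordLoaderSketch {

    // Drops everything from the first '|' to the end of the line.
    static String stripComment(String line) {
        int bar = line.indexOf('|');
        return bar >= 0 ? line.substring(0, bar) : line;
    }

    // Reads stopwords from the given reader, skipping comments and blank lines.
    static Set<String> loadStopWords(Reader in) {
        Set<String> words = new LinkedHashSet<String>();
        BufferedReader reader = new BufferedReader(in);
        reader.lines().forEach(line -> {
            String content = stripComment(line).trim();
            if (!content.isEmpty()) {
                // Assumption: a line may carry several whitespace-separated words.
                for (String word : content.split("\\s+")) {
                    words.add(word);
                }
            }
        });
        return words;
    }
}
```

With a loader along these lines, a stoplist entry such as
"de | from, of" contributes only the word "de" rather than the comment
text as well.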

Steve

José Ramón Pérez Agüera wrote:
> I have developed a Spanish analyzer with a Spanish stemmer based on the Porter algorithm. It is
> under the GNU license and free to use. I hope it will be useful for Spanish Lucene users. You
> can download the stemmer here:
> 
> http://multidoc.rediris.es/joseramon/index.php?option=com_docman&task=view_category&Itemid=25&subcat=1&catid=11&limitstart=0&limit=5
> 
> If anybody has any suggestions, I will be happy to improve my implementation.
> 
> Sorry for my English :-)
> 
> jose
> 
> José Ramón Pérez Agüera
> Despacho 411 tlf. 913947599
> Dept. de Sistemas Informáticos y Programación
> Facultad de Informática
> Universidad Complutense de Madrid

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org