lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-6138) ItalianLightStemmer doesn't apply on words shorter then 6 chars in length
Date Sun, 28 Dec 2014 17:52:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259684#comment-14259684
] 

Erick Erickson commented on LUCENE-6138:
----------------------------------------

Right, the important part of the discussion (I should have pointed it out) was that the stemmers
are not part of the Solr code base. They belong to another project, and that project would be
the place to raise possible bugs or submit patches:

bq: Can you propose your changes to http://members.unine.ch/jacques.savoy/clef/index.html?

Sorry for the confusion.



> ItalianLightStemmer doesn't apply on words shorter then 6 chars in length
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-6138
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6138
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 4.10.2
>            Reporter: Massimo Pasquini
>            Priority: Minor
>
> I expect a stemmer to transform nouns in their singular and plural forms into a shorter
> common form. The implementation of the ItalianLightStemmer doesn't apply any stemming to words
> shorter than 6 characters in length. This leads to some annoying results:
> singular form | plural form
> 4|5 chars in length (no stemming)
> alga -> alga | alghe -> alghe
> fuga -> fuga | fughe -> fughe
> lega -> lega | leghe -> leghe
> 5|6 chars in length (stemming only on plural form)
> vanga -> vanga | vanghe -> vang
> verga -> verga | verghe -> verg
> I suppose that such a limitation on word length is meant to avoid side effects on other
> short words not in the set above, but I think the code should be reviewed for better
> results.
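
The behavior reported above can be reproduced with a small standalone sketch. This is not
the Lucene source: the class name LengthGuardStemSketch, the MIN_LENGTH constant, and the
simplified suffix rules are assumptions chosen only to match the examples in the report
(a length guard at 6 characters, followed by removal of a final vowel).

```java
public class LengthGuardStemSketch {
    // Assumption: the minimum length described in the report (words shorter than 6 are skipped).
    static final int MIN_LENGTH = 6;

    // Hypothetical, simplified light stemmer: returns the word unchanged when it is
    // too short, otherwise strips one or two trailing characters.
    static String stem(String word) {
        int len = word.length();
        if (len < MIN_LENGTH) {
            return word; // short words pass through untouched
        }
        char last = word.charAt(len - 1);
        char prev = word.charAt(len - 2);
        switch (last) {
            case 'e':
            case 'i':
                // Drop two characters for "he", "hi", "ie", "ii"; otherwise drop one.
                return word.substring(0, (prev == 'h' || prev == 'i') ? len - 2 : len - 1);
            case 'a':
            case 'o':
                // Drop two characters for "ia", "io"; otherwise drop one.
                return word.substring(0, prev == 'i' ? len - 2 : len - 1);
            default:
                return word;
        }
    }

    public static void main(String[] args) {
        String[] words = {"alga", "alghe", "vanga", "vanghe", "verga", "verghe"};
        for (String w : words) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

In this sketch the singular "vanga" (5 characters) is returned unchanged while the plural
"vanghe" (6 characters) is reduced to "vang", so the two forms never meet at a common stem.
That mismatch comes from the length guard rather than from the suffix rules themselves.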



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

