lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8028) Arabic Stemmer improvement for Better Search Accuracy
Date Tue, 31 Oct 2017 13:09:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226750#comment-16226750 ]

Robert Muir commented on LUCENE-8028:
-------------------------------------

Hi, we should add it as an option! It is OK to have multiple stemmers (choices).

I think we should be conservative about changing the default: at least for the second paper
(which isn't paywalled, so I could quickly look), this appears to incorporate a dictionary-based
approach (domain-dependent, and typically performing worse on average than rule-based approaches
because of out-of-vocabulary (OOV) words), and I don't yet see any standard IR experiments
confirming the improvement.
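
To make the OOV concern concrete, here is a minimal toy sketch. It is not Lucene's Arabic stemmer and not the algorithm from either paper; the affix lists, the lexicon, and the function names are all hypothetical and chosen only for illustration. It contrasts a rule-based light stemmer with a pure dictionary lookup:

```python
# Toy illustration only: neither function is Lucene's Arabic stemmer or the
# algorithm from either paper. Affix lists and lexicon are hypothetical.

PREFIXES = ("ال", "و")              # definite article "al-", conjunction "wa-"
SUFFIXES = ("ات", "ون", "ين", "ة")  # a few common plural/feminine endings

# Hypothetical lexicon for the dictionary-based variant: surface form -> stem.
LEXICON = {
    "الكتاب": "كتاب",    # "the book" -> "book"
    "المعلمون": "معلم",  # "the teachers" -> "teacher"
}


def rule_based_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one listed prefix and one listed suffix,
    always leaving at least min_len characters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def dictionary_stem(word: str) -> str:
    """Look the word up; out-of-vocabulary words come back unchanged."""
    return LEXICON.get(word, word)
```

For words in the toy lexicon the two functions agree. For a word outside it, such as "الحاسوب" ("the computer"), the rule-based version still strips the article, while the dictionary version returns the word untouched, so a query for the bare noun would not match it. A real dictionary-based stemmer may well have a fallback for this case; the sketch only shows why coverage depends on the domain of the lexicon.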

> Arabic Stemmer improvement for Better Search Accuracy
> -----------------------------------------------------
>
>                 Key: LUCENE-8028
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8028
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ayah Shamandi
>              Labels: Arabic, Stemmer, improvement
>
> Hi, this is Ayah, a bidi developer on the Globalization Team at IBM Egypt. We are responsible
> for supporting Arabic in IBM products and services, and since we use Lucene in many of our
> services, we found that its Arabic stemmer needs major improvement. We implemented the following
> two papers, https://dl.acm.org/citation.cfm?id=1921657 and http://waset.org/publications/10005688/arabic-light-stemmer-for-better-search-accuracy,
> to improve Lucene's Arabic stemmer, and would like to open a pull request so that you can
> integrate it as part of Lucene.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
