lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Stemming nouns ending in 'y'
Date Fri, 20 May 2016 01:17:26 GMT
Mark:

Just a sanity check: was the Porter stemmer already defined in the index
analyzer when you indexed your _first_ document? The admin/analysis page
will tell you what the term is stemmed to at both query and index time.

I'm puzzled by this statement:

bq:  As example, the term 'osteopathy' stemmed with the Porter Stemmer
Filter stems to 'osteopathi', which will match 'osteopath' and
'osteopathic'

Why do you think this will match? The stemmer wouldn't stem
'osteopath' to the term in the index, namely 'osteopathi', and thus it
wouldn't match. Or at least it shouldn't... So I'm probably missing
something here.
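
To make the mismatch concrete, here is a minimal Python sketch of the one
Porter rule that produces the trailing 'i' (step 1c: a final 'y' becomes 'i'
when the rest of the word contains a vowel). It is not the full stemmer and
not Lucene's code. The function names are made up for illustration, and the
'logi' -> 'log' rule is my understanding of the later revision of the
algorithm, simplified by leaving out its measure condition, so treat that
part as an assumption. It also assumes the remaining Porter steps leave
these particular words alone.

```python
def is_consonant(word, i):
    """Porter's definition: a, e, i, o, u are vowels; 'y' is a vowel
    only when it follows a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # 'y' is a consonant at the start of a word or after a vowel.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def contains_vowel(stem):
    return any(not is_consonant(stem, i) for i in range(len(stem)))


def step_1c(word):
    """Step 1c: replace a final 'y' with 'i' if the stem contains a vowel."""
    if word.endswith("y") and contains_vowel(word[:-1]):
        return word[:-1] + "i"
    return word


def sketch_stem(word):
    """Hypothetical helper: step 1c followed by a simplified
    'logi' -> 'log' rule (assumption, measure condition omitted)."""
    stemmed = step_1c(word)
    if stemmed.endswith("logi"):
        stemmed = stemmed[:-1]
    return stemmed


if __name__ == "__main__":
    for term in ("osteopathy", "osteopath", "terminology"):
        print(term, "->", sketch_stem(term))
    # osteopathy -> osteopathi
    # osteopath -> osteopath
    # terminology -> terminolog
```

Under these assumptions a query for 'osteopath' produces the term
'osteopath', which is not the indexed term 'osteopathi', while a query for
'osteopathy' is stemmed the same way as the indexed text and does match.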

Best,
Erick

On Thu, May 19, 2016 at 12:31 PM, Markus Jelsma
<markus.jelsma@openindex.io> wrote:
> Hello - try the KStem filter. It is better suited for English and doesn't show this behaviour.
> Markus
>
>
>
> -----Original message-----
>> From:Mark Vega <vegamf@uci.edu>
>> Sent: Thursday 19th May 2016 19:55
>> To: solr-user@lucene.apache.org
>> Subject: Stemming nouns ending in 'y'
>>
>> I am using Apache Nutch v1.10 and SOLR v.5.2.1 to index and search a medical website
>> and am trying to find out why every stemmer I've tried on certain nouns in medical terminology
>> ending in 'y' merely replaces the ending 'y' with an 'i'.  As an example, the term 'osteopathy'
>> stemmed with the Porter Stemmer Filter stems to 'osteopathi', which will match 'osteopath'
>> and 'osteopathic', but will not match the original term 'osteopathy' itself.  I've seen this
>> with quite a few medical and science nouns ending in 'y' (though, oddly enough, the word
>> 'terminology' itself stems to 'terminolog' just as I would expect it to) and am wondering
>> whether there is a different stemmer I should be using, or if I am just using this one incorrectly.
>> I am currently applying the PorterStemFilterFactory to a field of type 'text' in both the
>> indexing and querying analyzers.  Any comments, suggestions or explanations would be much
>> appreciated.
>>
>> --
>> Mark F. Vega
>> Programmer/Analyst
>> UC Irvine Libraries - Web Services
>> vegamf@uci.edu<mailto:vegamf@uci.edu>
>> 949.824.9872
>> --
>>
>>
