lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-2341) explore morfologik integration
Date Tue, 21 Jun 2011 10:00:47 GMT


Robert Muir commented on LUCENE-2341:

Eventually it would probably be sensible to limit the automaton used in Lucene to storing
surface forms and lemmas only (no POS tags) and to merge both dictionaries into a single automaton...
but this can be a future improvement.

Alternatively, you could expose the POS tags for each stem to Lucene. The easiest way would
be to put them into TypeAttribute (a string), but you could define your own strongly typed
attribute if that's a better fit.
This could be useful for downstream processing.
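
As a rough illustration of the idea, here is a minimal plain-Java sketch that does not use
Lucene or morfologik classes. The Analysis and Token classes, the stem method, and the
dictionary entry with its tag string are all hypothetical stand-ins: Token.type plays the role
that TypeAttribute's string would play, and a real integration would set it from a TokenFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PosTagSketch {
    /** Hypothetical dictionary entry: one lemma plus its POS tag for a surface form. */
    static final class Analysis {
        final String lemma;
        final String posTag;

        Analysis(String lemma, String posTag) {
            this.lemma = lemma;
            this.posTag = posTag;
        }
    }

    /** Hypothetical stand-in for a token: term text plus a string "type". */
    static final class Token {
        final String term;
        final String type; // would be carried by TypeAttribute in a real filter

        Token(String term, String type) {
            this.term = term;
            this.type = type;
        }
    }

    /**
     * Emits one token per dictionary analysis, with the POS tag as the token type.
     * Words missing from the dictionary pass through unchanged with a default type.
     */
    static List<Token> stem(String surface, Map<String, List<Analysis>> dict) {
        List<Token> out = new ArrayList<>();
        List<Analysis> analyses = dict.get(surface);
        if (analyses == null) {
            out.add(new Token(surface, "word"));
            return out;
        }
        for (Analysis a : analyses) {
            out.add(new Token(a.lemma, a.posTag));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Analysis>> dict = new HashMap<>();
        // Made-up entry; the tag format is a placeholder, not the real dictionary format.
        dict.put("domy", Arrays.asList(new Analysis("dom", "subst:pl:nom")));

        for (Token t : stem("domy", dict)) {
            System.out.println(t.term + " [" + t.type + "]");
        }
    }
}
```

A downstream consumer could then branch on the type string, for example to keep only nouns.
A strongly typed attribute would replace the raw string with a parsed value, which avoids
re-parsing the tag in every consumer.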

> explore morfologik integration
> ------------------------------
>                 Key: LUCENE-2341
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Dawid Weiss
>         Attachments: LUCENE-2341.diff, morfologik-stemming-1.5.0.jar
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer available:
> This works differently than LUCENE-2298, and ideally would be another option for users.

