lucene-dev mailing list archives

From "Tommaso Teofili (JIRA)" <>
Subject [jira] [Assigned] (LUCENE-5548) Improve flexibility and testability of the classification module
Date Mon, 03 Nov 2014 08:06:34 GMT


Tommaso Teofili reassigned LUCENE-5548:

    Assignee: Tommaso Teofili

> Improve flexibility and testability of the classification module
> ----------------------------------------------------------------
>                 Key: LUCENE-5548
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/classification
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>              Labels: gsoc2014, mentor
> The flexibility and capabilities of the Lucene classification module may be improved
> with the following:
> - make it possible to use the classifiers "online" (or provide an online version of
>   them), so that if the underlying index (reader) is updated, the classifier doesn't
>   need to be trained again to take newly added docs into account
> - optionally pass a different Analyzer together with the text to be classified (or
>   directly a TokenStream) to specify custom tokenization/filtering
> - normalize score calculations of existing classifiers
> - provide accuracy and speed tests based on publicly available datasets
> - add more Lucene-based classification algorithms
> Specific subtasks should be created for each of the above topics so that each can be
> discussed in depth.
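
As an illustration of the "normalize score calculations" item above, here is a minimal sketch in plain Java. It is not taken from the issue or from the Lucene classification API: the class name ScoreNormalizer, the method normalize, and the choice of normalization (dividing each non-negative raw score by the sum, so the results add up to 1) are all assumptions made for the example. The actual subtask may settle on a different scheme.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
public class ScoreNormalizer {

    // Rescales non-negative raw class scores so that they sum to 1.0.
    // Assumption: the raw scores of one classifier are comparable with
    // each other and are all >= 0.
    public static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double sum = 0.0;
        for (double score : rawScores.values()) {
            sum += score;
        }
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
            // If no class received any score, fall back to a uniform distribution.
            double value = sum > 0.0
                    ? entry.getValue() / sum
                    : 1.0 / rawScores.size();
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }

    public static void main(String[] args) {
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("sports", 3.0);
        raw.put("politics", 1.0);
        System.out.println(normalize(raw));
    }
}
```

With scores on a common [0, 1] scale, results from different classifiers become easier to compare, which would also help the accuracy tests mentioned in the list.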
