flink-issues mailing list archives

From "hoa nguyen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1729) Assess performance of classification algorithms
Date Fri, 22 Apr 2016 01:00:23 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253113#comment-15253113 ]

hoa nguyen commented on FLINK-1729:

Hi [~till.rohrmann], is there an update on this? To confirm: this would provide an example
implementation of, say, SVMs on publicly available datasets to validate the algorithm. Would
it be possible to assign this issue to me? Many thanks,

> Assess performance of classification algorithms
> -----------------------------------------------
>                 Key: FLINK-1729
>                 URL: https://issues.apache.org/jira/browse/FLINK-1729
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>              Labels: ML
> To validate Flink's classification algorithms (in terms of performance and accuracy),
we should run them on publicly available classification data sets. This will not only serve
as evidence of the correctness of the implementations but will also show how easily the machine
learning library can be used.
> Bottou [1] published some results for the RCV1 dataset using SVMs for classification.
Those SVMs are trained using stochastic gradient descent, so they would be a good baseline
for comparison with the CoCoA-trained SVMs.
> Some more benchmark results and publicly available data sets can be found here [2].
> Resources:
> [1] [http://leon.bottou.org/projects/sgd]
> [2] [https://github.com/BIDData/BIDMach/wiki/Benchmarks]
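
To make the comparison described in the issue concrete, here is a minimal sketch of the kind of baseline it refers to: a linear SVM trained with stochastic gradient descent on the hinge loss, then scored by accuracy on held-out data. Everything in it is an assumption for illustration: the function names (`make_blobs`, `train_svm_sgd`, `accuracy`), the Pegasos-style step size, the absence of a bias term, and the synthetic two-cluster data, which stands in for a real data set such as RCV1. It is neither Flink's CoCoA implementation nor Bottou's code.

```python
import random


def make_blobs(n_per_class, seed):
    """Hypothetical stand-in for a public data set: two Gaussian clusters
    centred at (2, 2) and (-2, -2), labelled +1 and -1."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            x = [rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)]
            data.append((x, label))
    return data


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_svm_sgd(data, dim, lam=0.01, epochs=20, seed=0):
    """SGD on the L2-regularised hinge loss (Pegasos-style step size
    eta_t = 1 / (lam * t)). No bias term, so the separating hyperplane
    passes through the origin."""
    rng = random.Random(seed)
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * dot(w, x)  # margin under the current weights
            # Gradient step for the regulariser: shrink the weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Gradient step for the hinge loss, only if the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def accuracy(w, data):
    """Fraction of examples whose predicted sign matches the label."""
    correct = sum(1 for x, y in data if (1 if dot(w, x) >= 0 else -1) == y)
    return correct / len(data)


if __name__ == "__main__":
    train = make_blobs(200, seed=1)
    test = make_blobs(200, seed=2)
    w = train_svm_sgd(train, dim=2)
    print("held-out accuracy:", accuracy(w, test))
```

For an actual benchmark, the same held-out accuracy would be measured for both the SGD baseline and the CoCoA-trained SVM on the same train/test split of a public data set, alongside training time.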

This message was sent by Atlassian JIRA