spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-9941) Try ML pipeline API on Kaggle competitions
Date Thu, 13 Aug 2015 18:00:45 GMT

     [ https://issues.apache.org/jira/browse/SPARK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-9941:
---------------------------------
    Description: 
This is an umbrella JIRA to track some fun tasks :)

We have built many features under the ML pipeline API, and we want to see how it works on
real-world datasets, e.g., Kaggle competition datasets. We want to invite community members
to help test. The goal is NOT to win the competitions but to provide code examples and to
identify missing features and other issues, which will help shape the roadmap.

For people who are interested, please do the following:

1. Create a subtask (or leave a comment if you cannot create a subtask) to claim a Kaggle
dataset.
2. Use the ML pipeline API to build and tune an ML pipeline that works for the Kaggle dataset.
3. Post the code as a gist (https://gist.github.com/) and provide the link.
4. Report missing features, issues, running times, and accuracy.

  was:
This is an umbrella JIRA to track some fun tasks:)

We have built many features under the ML pipeline API, and we want to see how it works on
real-world datasets, e.g., Kaggle competition datasets. We want to invite community members
to help test. The goal is NOT to win the competitions but to provide code examples and to
find out missing features and other issues to help shape the roadmap.

For people who are interested, please do the following:

1. Create a subtask (or leave a comment if you cannot create a subtask) to claim a Kaggle
dataset.
2. Use the ML pipeline API to build and tune an ML pipeline that works for the Kaggle dataset.
3. Paste the code to gist (https://gist.github.com/) and provide the link.
4. Report missing features, issues, running times, and accuracy.


> Try ML pipeline API on Kaggle competitions
> ------------------------------------------
>
>                 Key: SPARK-9941
>                 URL: https://issues.apache.org/jira/browse/SPARK-9941
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng




