spark-dev mailing list archives

From Nick Pentreath <nick.pentre...@gmail.com>
Subject PySpark / scikit-learn integration sprint at Cloudera - Strata Conference Friday 14th Feb 2014
Date Mon, 02 Dec 2013 17:09:59 GMT
Hi Spark Devs

An idea developed recently out of a scikit-learn mailing list discussion (
http://sourceforge.net/mailarchive/forum.php?thread_name=CAFvE7K5HGKYH9Myp7imrJ-nU%3DpJgeGqcCn3JC0m4MmGWZi35Hw%40mail.gmail.com&forum_name=scikit-learn-general)
to have a coding sprint around Strata in Feb, focused on integration
between scikit-learn and PySpark for large-scale machine learning tasks.

Cloudera has kindly agreed to host the sprint, most likely in San
Francisco. Ideally it would be focused and capped at around 10 people. It
is not meant to be a teaching workshop for newcomers but rather a
prototyping session, so it would be great to have developers and users
with deep knowledge of PySpark (Josh especially :) and/or scikit-learn
attend.

Hopefully we can get some people from the Spark community involved, and
Olivier will drum up support from the scikit-learn community.

All the best and hope to see you there (though likely I will only be able
to join remotely).
Nick
