spark-dev mailing list archives

From Joseph Bradley <jos...@databricks.com>
Subject Re: Spark.ml roadmap 2.3.0 and beyond
Date Thu, 07 Dec 2017 23:36:33 GMT
Hi Stephen,

I used to post those roadmap JIRAs to share instructions for contributing
to MLlib and to try to coordinate amongst committers.  My feeling was that
the coordination aspect was of mixed success, so I did not post one for
2.3.  I'm glad you pinged about this; if those were useful, then I can plan
on posting one for the release after 2.3.  As far as identifying
committers' plans, the best option right now is to look for Shepherds in
JIRA as well as the few mailing list threads about directions.

For myself, I'm mainly focusing on fixing some issues with persistence for
custom algorithms in PySpark (done), adding the image schema (done), and
using ML Pipelines in Structured Streaming (WIP).

Joseph

On Wed, Nov 29, 2017 at 6:52 AM, Stephen Boesch <javadba@gmail.com> wrote:

> There are several JIRAs and/or PRs that contain logic the Data Science
> teams that I work with use in their local models. We are trying to
> determine if/when these features may gain traction again.  In at least one
> case all of the work was done but the shepherd said that getting it
> committed was of lower priority than other tasks - one specifically
> mentioned was the mllib/ml parity that has been ongoing for nearly three
> years.
>
> In order to prioritize work that the ML platform would do, it would be
> helpful to know at least which, if any, of those tasks are going to be moved
> ahead by the community, since we could then focus on other ones instead of
> duplicating the effort.
>
> In addition there are some engineering code jam sessions that happen
> periodically: knowing which features are actively on the roadmap would
> *certainly* influence our selection of work.  The roadmaps from 2.2.0 and earlier
> were a very good starting point to understand not just the specific work in
> progress - but also the current mindset/thinking of the committers in terms
> of general priorities.
>
> So if the same format of document is not available, then what content *is*
> available that gives a picture of where spark.ml is headed?
>
> 2017-11-29 6:39 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>
>> Any further information/ thoughts?
>>
>>
>>
>> 2017-11-22 15:07 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>>
>>> The roadmaps for prior releases, e.g. 1.6, 2.0, 2.1, 2.2, were available:
>>>
>>> 2.2.0 https://issues.apache.org/jira/browse/SPARK-18813
>>>
>>> 2.1.0 https://issues.apache.org/jira/browse/SPARK-15581
>>> ..
>>>
>>> It seems those roadmaps are not available per se for 2.3.0 and later?
>>> Is there a different mechanism for that info?
>>>
>>> stephenb
>>>
>>
>>
>


-- 

Joseph Bradley

Software Engineer - Machine Learning

Databricks, Inc.

