spark-dev mailing list archives

From Stephen Boesch <java...@gmail.com>
Subject Re: Spark.ml roadmap 2.3.0 and beyond
Date Thu, 07 Dec 2017 23:55:22 GMT
Thanks Joseph.  We can wait for post 2.3.0.

2017-12-07 15:36 GMT-08:00 Joseph Bradley <joseph@databricks.com>:

> Hi Stephen,
>
> I used to post those roadmap JIRAs to share instructions for contributing
> to MLlib and to try to coordinate amongst committers.  My feeling was that
> the coordination aspect was of mixed success, so I did not post one for
> 2.3.  I'm glad you pinged about this; if those were useful, then I can plan
> on posting one for the release after 2.3.  As far as identifying
> committers' plans, the best option right now is to look for Shepherds in
> JIRA as well as the few mailing list threads about directions.
>
> For myself, I'm mainly focusing on fixing some issues with persistence for
> custom algorithms in PySpark (done), adding the image schema (done), and
> using ML Pipelines in Structured Streaming (WIP).
>
> Joseph
>
> On Wed, Nov 29, 2017 at 6:52 AM, Stephen Boesch <javadba@gmail.com> wrote:
>
>> There are several JIRAs and/or PRs that contain logic the Data Science
>> teams that I work with use in their local models. We are trying to
>> determine if/when these features may gain traction again.  In at least one
>> case all of the work was done, but the shepherd said that getting it
>> committed was of lower priority than other tasks - one specifically
>> mentioned was the mllib/ml parity that has been ongoing for nearly three
>> years.
>>
>> In order to prioritize the work that the ML platform would do, it would be
>> helpful to know at least which, if any, of those tasks are going to be moved
>> ahead by the community, since we could then focus on other ones instead of
>> duplicating the effort.
>>
>> In addition, there are some engineering code jam sessions that happen
>> periodically: knowing which features are actively on the roadmap would
>> *certainly* influence our selection of work.  The roadmaps from 2.2.0 and
>> earlier were a very good starting point to understand not just the specific
>> work in progress, but also the current mindset/thinking of the committers in
>> terms of general priorities.
>>
>> So if the same format of document is not available, then what content *is*
>> available that gives a picture of where spark.ml is headed?
>>
>> 2017-11-29 6:39 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>>
>>> Any further information/thoughts?
>>>
>>>
>>>
>>> 2017-11-22 15:07 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>>>
>>>> The roadmaps for prior releases, e.g. 1.6, 2.0, 2.1, 2.2, were available:
>>>>
>>>> 2.2.0 https://issues.apache.org/jira/browse/SPARK-18813
>>>>
>>>> 2.1.0 https://issues.apache.org/jira/browse/SPARK-15581
>>>> ..
>>>>
>>>> It seems those roadmaps are not available per se for 2.3.0 and later?
>>>> Is there a different mechanism for that info?
>>>>
>>>> stephenb
>>>>
>>>
>>>
>>
>
>
> --
>
> Joseph Bradley
>
> Software Engineer - Machine Learning
>
> Databricks, Inc.
>
> [image: http://databricks.com] <http://databricks.com/>
>
