spark-dev mailing list archives

From Burak Yavuz <bya...@stanford.edu>
Subject Re: 15 new MLlib algorithms
Date Wed, 09 Jul 2014 19:31:29 GMT
Hi,

The MLlib roadmap for the 1.1 release includes algorithms such as:

Non-negative matrix factorization, Sparse SVD, Multiclass decision tree,
Random Forests (?)

and optimizers such as:
ADMM, Accelerated gradient methods

as well as a statistical toolbox that includes:
descriptive statistics, sampling, hypothesis testing

and hopefully Parallel model training for autotuning.

Source:
https://databricks-training.s3.amazonaws.com/slides/Spark_Summit_MLlib_070214_v2.pdf

Best,
Burak



----- Original Message -----
From: "Michael Malak" <michaelmalak@yahoo.com.INVALID>
To: dev@spark.apache.org
Sent: Wednesday, July 9, 2014 11:43:26 AM
Subject: 15 new MLlib algorithms

At Spark Summit, Patrick Wendell indicated the number of MLlib algorithms would "roughly double"
in 1.1 from the current approx. 15.
http://spark-summit.org/wp-content/uploads/2014/07/Future-of-Spark-Patrick-Wendell.pdf

What are the planned additional algorithms?

In Jira, I only see two when filtering on version 1.1, component MLlib: one on multi-label
and another on high dimensionality.

https://issues.apache.org/jira/browse/SPARK-2329?jql=issuetype%20in%20(Brainstorming%2C%20Epic%2C%20%22New%20Feature%22%2C%20Story)%20AND%20fixVersion%20%3D%201.1.0%20AND%20component%20%3D%20MLlib

http://tinyurl.com/ku7sehu

