spark-user mailing list archives

From Lijie Xu
Subject Re: ML Algos
Date Fri, 16 Aug 2013 02:13:01 GMT
Quite interesting. I have some questions about this amazing project:
1) In "Logistic Regression - Weak Scaling", MLlib and VW both run slower as
data and machines increase together, even though the problem size per
processor is fixed. Could you explain which component causes this
performance degradation: synchronization, network traffic, data
partitioning, or something else?

2) What's the relationship between MLBase and GraphX?

3) MLBase may require Spark to provide some new features in order to
implement some specific algorithms. Are there any? Or have you added some
new fundamental features that are not supported in Spark-0.7?

On Fri, Aug 16, 2013 at 4:01 AM, Ameet Talwalkar wrote:

> The following slides describe
> the ML algorithms to be included in MLlib (slide 49) and MLI (slide 107) in
> the near future.  We plan to include additional
> classification/regression/CF/clustering/optimization primitives over time
> with the help of the open-source community, and based on feedback from
> users about desired functionality.  Moreover, we ultimately aim to add
> advanced ML functionality, as briefly described in slide 140.
> -Ameet
> On Thu, Aug 15, 2013 at 12:32 PM, Gowtham N wrote:
>> Hi,
>> Can someone give details about the future work on ML algorithms (inside
>> the mllib folder)?
>> Currently there are some basic algorithms implemented. Is there any
>> roadmap regarding what ML algorithms are required?
