spark-dev mailing list archives

From "Evan R. Sparks" <>
Subject Re: Contributing to MLlib
Date Wed, 02 Jul 2014 18:10:24 GMT
Hi there,

Generally we try to avoid duplicating logic if possible, particularly for
algorithms that share a great deal of algorithmic similarity. See, for
example, the way we implement Logistic regression vs. Linear regression vs.
Linear SVM with different gradient functions all on top of SGD or L-BFGS.
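
As a rough illustration of that pattern (a sketch only, not MLlib's actual code), the three models can differ in nothing but the gradient function handed to one shared optimizer. Everything named here is hypothetical, and plain batch gradient descent stands in for SGD or L-BFGS to keep the example short:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Each gradient function takes (weights, features, label) and returns
# the gradient of that model's loss at a single example.

def least_squares_gradient(w, x, y):
    # Linear regression: loss = 0.5 * (w.x - y)^2
    err = dot(w, x) - y
    return [err * xi for xi in x]

def logistic_gradient(w, x, y):
    # Logistic regression, labels assumed to be 0 or 1
    p = 1.0 / (1.0 + math.exp(-dot(w, x)))
    return [(p - y) * xi for xi in x]

def hinge_gradient(w, x, y):
    # Linear SVM, labels assumed to be -1 or +1: loss = max(0, 1 - y * w.x)
    if y * dot(w, x) < 1.0:
        return [-y * xi for xi in x]
    return [0.0] * len(x)

def gradient_descent(data, gradient, dim, step=0.1, iterations=200):
    # Shared optimizer (assumption: full-batch gradient descent with a
    # fixed step size). It knows nothing about which model it is fitting.
    w = [0.0] * dim
    n = len(data)
    for _ in range(iterations):
        total = [0.0] * dim
        for x, y in data:
            g = gradient(w, x, y)
            total = [t + gi for t, gi in zip(total, g)]
        w = [wi - step * ti / n for wi, ti in zip(w, total)]
    return w

# Same optimizer, three different models:
regression_data = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
w_linear = gradient_descent(regression_data, least_squares_gradient, dim=1)
w_logistic = gradient_descent([([1.0], 1.0), ([-1.0], 0.0)], logistic_gradient, dim=1)
w_svm = gradient_descent([([1.0], 1.0), ([-1.0], -1.0)], hinge_gradient, dim=1)
```

The point of the design is that adding a new model means writing one small gradient function rather than another copy of the optimization loop.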

Based on my (brief) look at the FCM algorithm, it appears that the main
difference is the ability to assign a weight vector associating the degree
of relationship of a given point to some centroid. My guess is that you can
figure out a way to inherit much of the K-Means logic in an algorithm for
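
To make that difference concrete, here is a minimal sketch (hypothetical helper names, not taken from any existing implementation) that puts the K-Means hard assignment next to the standard fuzzy c-means membership formula, u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)). The fuzzifier parameter m is part of the textbook FCM formulation and is an assumption here, since it is not discussed in this thread:

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hard_assignment(point, centers):
    # K-Means style: the point belongs entirely to its nearest centroid.
    dists = [distance(point, c) for c in centers]
    return dists.index(min(dists))

def fuzzy_memberships(point, centers, m=2.0):
    # FCM style: one membership weight per centroid, summing to 1.
    # m > 1 is the fuzzifier; larger values give softer memberships.
    dists = [distance(point, c) for c in centers]
    zero = [j for j, d in enumerate(dists) if d == 0.0]
    if zero:
        # The point sits exactly on one or more centroids:
        # split full membership among them and avoid dividing by zero.
        return [1.0 / len(zero) if j in zero else 0.0
                for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]

centers = [[0.0, 0.0], [3.0, 0.0]]
point = [1.0, 0.0]
nearest = hard_assignment(point, centers)
weights = fuzzy_memberships(point, centers)
```

Both functions start from the same point-to-centroid distances, which is the part a shared implementation could plausibly reuse.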

Regardless, if you'd like to add an algorithm, please create a JIRA ticket
for it and then send a pull request which references that JIRA where we can
discuss the specifics of that implementation and whether it is of broad
enough interest to warrant inclusion in the library.

- Evan

On Wed, Jul 2, 2014 at 11:02 AM, salexln <> wrote:

> guys??? anyone???
