spark-issues mailing list archives

From "Alexander Ulanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5256) Improving MLlib optimization APIs
Date Thu, 15 Jan 2015 00:38:35 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277986#comment-14277986 ]

Alexander Ulanov commented on SPARK-5256:
-----------------------------------------

I would like to improve the `Gradient` interface so that it can process something more general
than `Label` (which is relevant only to classifiers, not to other machine learning methods)
and can also support batch processing. The simplest way I see to do this is to add another
function to the `Gradient` interface:

def compute(data: Vector, output: Vector, weights: Vector, cumGradient: Vector): Double

In the `Gradient` trait, it should call `compute` with `label`. Of course, one needs to make some
adjustments to the LBFGS and GradientDescent optimizers, replacing `label: Double` with `output: Vector`.
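
A minimal, self-contained sketch of one reading of this proposal: the new vector-valued `compute` gets a default implementation in the trait that forwards to the existing label-based `compute`, so current subclasses keep working. Everything here is an assumption made for illustration: `Vec` is a plain `Array[Double]` stand-in for MLlib's `Vector`, `GradientSketch` and `LeastSquares` are hypothetical names, and treating `output(0)` as the label is only one possible default.

```scala
object GradientSketch {
  // Stand-in for MLlib's Vector so the sketch has no dependencies (assumption).
  type Vec = Array[Double]

  trait Gradient {
    // Existing style: scalar label; adds the gradient into cumGradient and
    // returns the loss for this data point.
    def compute(data: Vec, label: Double, weights: Vec, cumGradient: Vec): Double

    // Proposed addition: vector-valued output. This default forwards to the
    // label-based method, treating output(0) as the label (assumed default).
    def compute(data: Vec, output: Vec, weights: Vec, cumGradient: Vec): Double = {
      require(output.length == 1, "default implementation expects a single output value")
      compute(data, output(0), weights, cumGradient)
    }
  }

  // Illustrative squared-error gradient that implements only the label-based
  // method; it inherits the vector-valued overload from the trait.
  object LeastSquares extends Gradient {
    def compute(data: Vec, label: Double, weights: Vec, cumGradient: Vec): Double = {
      // Prediction error for a linear model: (data . weights) - label.
      var dot = 0.0
      var i = 0
      while (i < data.length) { dot += data(i) * weights(i); i += 1 }
      val diff = dot - label
      // Accumulate the gradient of 0.5 * diff^2 with respect to the weights.
      i = 0
      while (i < data.length) { cumGradient(i) += diff * data(i); i += 1 }
      0.5 * diff * diff
    }
  }
}
```

A gradient for a model with genuinely multi-dimensional output would override the vector-valued method instead of relying on this default.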


For batch processing, one can stack the data and output points into one long vector (this is
how Breeze stores matrices) and pass them through the proposed interface.
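
The stacking idea could look roughly like the sketch below. It is an illustration under assumptions, not part of the proposal: `BatchStacking`, `stack`, and `unstack` are hypothetical helper names, plain `Array[Double]` stands in for MLlib's `Vector`, and the gradient implementation is assumed to know the per-point dimension so that it can split the long vector again. Each point is laid out contiguously, which matches reading the long vector as a column-major matrix with one point per column.

```scala
object BatchStacking {
  // Concatenates equally sized points into one long array. Read as a
  // (dim x n) column-major matrix, column j holds the j-th point.
  def stack(points: Seq[Array[Double]]): Array[Double] = {
    require(points.nonEmpty, "need at least one point")
    require(points.forall(_.length == points.head.length), "points must share one dimension")
    Array.concat(points: _*)
  }

  // Splits a stacked array back into points of length dim.
  def unstack(stacked: Array[Double], dim: Int): Seq[Array[Double]] = {
    require(dim > 0 && stacked.length % dim == 0, "length must be a multiple of dim")
    stacked.grouped(dim).toVector
  }
}
```

The same helpers would apply to both `data` and `output`, each with its own dimension.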

> Improving MLlib optimization APIs
> ---------------------------------
>
>                 Key: SPARK-5256
>                 URL: https://issues.apache.org/jira/browse/SPARK-5256
>             Project: Spark
>          Issue Type: Umbrella
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Joseph K. Bradley
>
> *Goal*: Improve APIs for optimization
> *Motivation*: There have been several disjoint mentions of improving the optimization
> APIs to make them more pluggable, extensible, etc.  This JIRA is a place to discuss what API
> changes are necessary for the long term, and to provide links to other relevant JIRAs.
> Eventually, I hope this leads to a design doc outlining:
> * current issues
> * requirements such as supporting many types of objective functions, optimization algorithms,
>   and parameters to those algorithms
> * ideal API
> * breakdown of smaller JIRAs needed to achieve that API
> I will soon create an initial design doc, and I will try to watch this JIRA and include
> ideas from JIRA comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


