spark-issues mailing list archives

From "Vincent (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-18023) Adam optimizer
Date Thu, 20 Oct 2016 06:19:58 GMT
Vincent created SPARK-18023:
-------------------------------

             Summary: Adam optimizer
                 Key: SPARK-18023
                 URL: https://issues.apache.org/jira/browse/SPARK-18023
             Project: Spark
          Issue Type: New Feature
          Components: ML, MLlib
            Reporter: Vincent
            Priority: Minor


SGD can converge very slowly, or even diverge, if its learning rate alpha is set inappropriately.
Many alternative methods have been proposed to achieve good convergence with less dependence
on hyperparameter settings and to help escape local optima, e.g. Momentum, NAG (Nesterov's
Accelerated Gradient), Adagrad, RMSProp, etc.
Among these, Adam is one of the most popular algorithms. It performs first-order gradient-based
optimization of stochastic objective functions, has been shown to be well suited to problems
with large data and/or many parameters as well as to problems with noisy and/or sparse gradients,
and is computationally efficient. Refer to this paper for details: https://arxiv.org/pdf/1412.6980v8.pdf
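
For reference, a minimal sketch of the Adam update rule from the paper, written in plain Scala
(the class and method names here are illustrative only, not an existing Spark/MLlib API):

// Illustrative sketch of one Adam step; hyperparameter defaults follow the paper.
class AdamState(dim: Int, alpha: Double = 0.001, beta1: Double = 0.9,
                beta2: Double = 0.999, eps: Double = 1e-8) {
  private val m = Array.fill(dim)(0.0)   // first-moment (mean) estimate
  private val v = Array.fill(dim)(0.0)   // second-moment (uncentered variance) estimate
  private var t = 0                      // time step, used for bias correction

  /** Apply one Adam step to `weights` in place, given the gradient `grad`. */
  def update(weights: Array[Double], grad: Array[Double]): Unit = {
    t += 1
    var i = 0
    while (i < dim) {
      m(i) = beta1 * m(i) + (1 - beta1) * grad(i)
      v(i) = beta2 * v(i) + (1 - beta2) * grad(i) * grad(i)
      val mHat = m(i) / (1 - math.pow(beta1, t))   // bias-corrected first moment
      val vHat = v(i) / (1 - math.pow(beta2, t))   // bias-corrected second moment
      weights(i) -= alpha * mHat / (math.sqrt(vHat) + eps)
      i += 1
    }
  }
}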

In fact, TensorFlow has implemented most of the adaptive optimization methods mentioned above,
and we have seen Adam outperform plain SGD in certain cases, such as training an FM model on a
very sparse dataset.

It would be nice for Spark to have these adaptive optimization methods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


