flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1807) Stochastic gradient descent optimizer for ML library
Date Wed, 29 Apr 2015 09:53:07 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519054#comment-14519054 ]

ASF GitHub Bot commented on FLINK-1807:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/613#discussion_r29322598
  
    --- Diff: docs/libs/ml/optimization.md ---
    @@ -0,0 +1,218 @@
    +---
    +mathjax: include
    +title: "ML - Optimization"
    +displayTitle: <a href="index.md">ML</a> - Optimization
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* Table of contents
    +{:toc}
    +
    +$$
    +\newcommand{\R}{\mathbb{R}}
    +\newcommand{\E}{\mathbb{E}} 
    +\newcommand{\x}{\mathbf{x}}
    +\newcommand{\y}{\mathbf{y}}
    +\newcommand{\wv}{\mathbf{w}}
    +\newcommand{\av}{\mathbf{\alpha}}
    +\newcommand{\bv}{\mathbf{b}}
    +\newcommand{\N}{\mathbb{N}}
    +\newcommand{\id}{\mathbf{I}}
    +\newcommand{\ind}{\mathbf{1}} 
    +\newcommand{\0}{\mathbf{0}} 
    +\newcommand{\unit}{\mathbf{e}} 
    +\newcommand{\one}{\mathbf{1}} 
    +\newcommand{\zero}{\mathbf{0}}
    +$$
    +
    +## Mathematical Formulation
    +
    +The optimization framework in Flink is a developer-oriented package that can be used to solve
    +[optimization](https://en.wikipedia.org/wiki/Mathematical_optimization)
    +problems common in Machine Learning (ML) tasks. In the supervised learning context, this usually
    +involves finding a model, as defined by a set of parameters $\wv$, that minimizes a function $f(\wv)$
    +given a set of $(\x, y)$ examples,
    +where $\x$ is a feature vector and $y$ is a real number, which can represent either a real value in
    +the regression case, or a class label in the classification case. In supervised learning, the
    +function to be minimized is usually of the form:
    +
    +$$
    +\begin{equation}
    +    f(\wv) := 
    +    \frac1n \sum_{i=1}^n L(\wv;\x_i,y_i) +
    +    \lambda\, R(\wv)
    +    \label{eq:objectiveFunc}
    +    \ .
    +\end{equation}
    +$$
    +
    +where $L$ is the loss function and $R(\wv)$ is the regularization penalty. We use $L$ to measure how
    +well the model fits the observed data, and we use $R$ to impose a complexity cost on the
    +model, with $\lambda > 0$ being the regularization parameter.
    +
    +### Loss Functions
    +
    +In supervised learning, we use loss functions to measure the model fit, by
    +penalizing errors in the predictions $p$ made by the model compared to the true $y$ for each
    +example. Different loss functions can be used for regression (e.g. Squared Loss) and classification
    +(e.g. Hinge Loss).
    +
    +Some common loss functions are:
    + 
    +* Squared Loss: $ \frac{1}{2} (\wv^T \x - y)^2, \quad y \in \R $ 
    +* Hinge Loss: $ \max (0, 1-y \wv^T \x), \quad y \in \{-1, +1\} $
    --- End diff --
    
    maybe we can add a small spacing between `y` and `\wv^T\x`
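
    For illustration, one way to add that spacing would be LaTeX's thin-space command `\,`
    (a sketch of the suggestion only, not wording from the PR):

        % hinge loss with a thin space between y and \wv^T \x
        \max (0, 1 - y \, \wv^T \x), \quad y \in \{-1, +1\}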


> Stochastic gradient descent optimizer for ML library
> ----------------------------------------------------
>
>                 Key: FLINK-1807
>                 URL: https://issues.apache.org/jira/browse/FLINK-1807
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Theodore Vasiloudis
>              Labels: ML
>
> Stochastic gradient descent (SGD) is a widely used optimization technique in different
> ML algorithms. Thus, it would be helpful to provide a generalized SGD implementation which
> can be instantiated with the respective gradient computation. Such a building block would
> make the development of future algorithms easier.
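
For illustration, such a building block might look like the following plain-Scala sketch: a generic
SGD pass parameterized by a user-supplied gradient, here instantiated with the gradient of the
squared loss from the documentation above. The names (`SgdSketch`, `sgdEpoch`,
`squaredLossGradient`) and the toy data are hypothetical; this is not the Flink ML API.

    // Illustrative sketch only: a generic SGD loop parameterized by the gradient of the
    // loss, in the spirit of the "building block" described above. Plain Scala, no Flink APIs.
    object SgdSketch {

      // Gradient of the per-example loss: (weights, features, label) => gradient vector.
      type Gradient = (Vector[Double], Vector[Double], Double) => Vector[Double]

      // Gradient of the squared loss L(w; x, y) = 1/2 (w^T x - y)^2, which is (w^T x - y) * x.
      val squaredLossGradient: Gradient = (w, x, y) => {
        val residual = w.zip(x).map { case (wi, xi) => wi * xi }.sum - y
        x.map(_ * residual)
      }

      // One pass over the data: w <- w - stepSize * gradient(w; x_i, y_i) for every example.
      def sgdEpoch(w0: Vector[Double],
                   data: Seq[(Vector[Double], Double)],
                   gradient: Gradient,
                   stepSize: Double): Vector[Double] =
        data.foldLeft(w0) { case (w, (x, y)) =>
          val g = gradient(w, x, y)
          w.zip(g).map { case (wi, gi) => wi - stepSize * gi }
        }

      def main(args: Array[String]): Unit = {
        // Toy one-dimensional regression problem: y = 2 * x.
        val data = Seq((Vector(1.0), 2.0), (Vector(2.0), 4.0), (Vector(3.0), 6.0))
        var w = Vector(0.0)
        for (_ <- 1 to 100) w = sgdEpoch(w, data, squaredLossGradient, stepSize = 0.05)
        println(s"learned weight: ${w.head}")  // approaches 2.0
      }
    }

Swapping in a different `Gradient` (e.g. for the hinge loss) would reuse the same loop, which is
the generalization the issue asks for.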



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
