spark-dev mailing list archives

From: Yanbo Liang <yblia...@gmail.com>
Subject: Re: Hinge Gradient
Date: Sat, 16 Dec 2017 22:52:53 GMT
Hello Deb,

Optimizing a non-smooth function with L-BFGS really should be considered carefully.
Is there any literature showing that replacing max with soft-max behaves well?
I’m more than happy to see some benchmarks if you have any.

+ Yuhao, who made a similar effort in this PR: https://github.com/apache/spark/pull/17862

Regards
Yanbo   

> On Dec 13, 2017, at 12:20 AM, Debasish Das <debasish.das83@gmail.com> wrote:
> 
> Hi,
> 
> I looked into the LinearSVC flow and found the gradient for the hinge loss as follows:
> 
> Our loss function with {0, 1} labels is max(0, 1 - (2y - 1) * f_w(x)).
> Therefore the (sub)gradient is -(2y - 1)*x when the loss is positive, and 0 otherwise.
> 
> max is a non-smooth function.
> 
> Did we try using a ReLU/soft-max function to smooth the hinge loss?
> 
> The loss function will change to SoftMax(0, 1 - (2y - 1) * f_w(x)).
> 
> Since this function is smooth, the gradient will be well defined and L-BFGS/OWLQN should behave well.
> 
> Please let me know if this has been tried already. If not, I can run some benchmarks.
> 
> We have soft-max in multinomial regression, and it can be reused for the LinearSVC flow.
> 
> Thanks.
> Deb
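
For concreteness, here is a minimal Scala sketch of the hinge loss and subgradient quoted above, with f_w(x) = w . x. The object and method names are illustrative assumptions only; this is not the actual LinearSVC code.

    object HingeSketch {
      // f_w(x) = w . x (intercept omitted for brevity).
      def dot(w: Array[Double], x: Array[Double]): Double =
        w.zip(x).map { case (wi, xi) => wi * xi }.sum

      // Loss: max(0, 1 - (2y - 1) * f_w(x)) for labels y in {0, 1}.
      def hingeLoss(w: Array[Double], x: Array[Double], y: Double): Double =
        math.max(0.0, 1.0 - (2 * y - 1) * dot(w, x))

      // Subgradient: -(2y - 1) * x when the margin is violated, 0 otherwise.
      // At the kink (margin exactly 1) the loss is non-differentiable;
      // returning 0 there picks one valid subgradient.
      def hingeGradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
        val margin = (2 * y - 1) * dot(w, x)
        if (margin < 1.0) x.map(xi => -(2 * y - 1) * xi)
        else Array.fill(x.length)(0.0)
      }
    }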
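
And a sketch of the smoothing proposed above, interpreting SoftMax(0, z) as log(exp(0) + exp(z)) = log(1 + exp(z)), i.e. the softplus function. Again, the names are assumptions for illustration, not code from the PR.

    object SmoothedHingeSketch {
      def dot(w: Array[Double], x: Array[Double]): Double =
        w.zip(x).map { case (wi, xi) => wi * xi }.sum

      // Numerically stable softplus: log(1 + exp(z)).
      def softplus(z: Double): Double =
        if (z > 0) z + math.log1p(math.exp(-z)) else math.log1p(math.exp(z))

      def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

      // Smoothed loss: log(1 + exp(1 - (2y - 1) * f_w(x))).
      def smoothedLoss(w: Array[Double], x: Array[Double], y: Double): Double =
        softplus(1.0 - (2 * y - 1) * dot(w, x))

      // Gradient, now defined everywhere:
      //   sigmoid(1 - (2y - 1) * f_w(x)) * -(2y - 1) * x
      def smoothedGradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
        val z = 1.0 - (2 * y - 1) * dot(w, x)
        val scale = sigmoid(z) * -(2 * y - 1)
        x.map(xi => scale * xi)
      }
    }

Note that softplus(z) strictly upper-bounds max(0, z) and converges to it away from the kink, which is why the smoothed loss is a reasonable surrogate for the hinge loss.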

