spark-user mailing list archives

From "Evan R. Sparks" <evan.spa...@gmail.com>
Subject Re: what paper is the L2 regularization based on?
Date Thu, 09 Jan 2014 19:10:00 GMT
Hi,

The L2 update rule is derived from the derivative of the loss function with
respect to the model weights: an L2-regularized loss function contains an
additional additive term proportional to the squared norm of the weights, so
its gradient picks up an extra term proportional to the weights themselves.
This paper provides some useful mathematical background:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.7377
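For reference, here is a minimal sketch of the derivation, in notation of my
own choosing (the lambda/2 convention is an assumption, not necessarily the
one the MLlib code uses):

    % regularized objective: data loss plus an L2 penalty on the weights
    L(w) = \frac{1}{n} \sum_i \ell(w; x_i, y_i) + \frac{\lambda}{2} \|w\|_2^2

    % the gradient picks up the extra additive term \lambda w
    \nabla L(w) = \frac{1}{n} \sum_i \nabla \ell(w; x_i, y_i) + \lambda w

    % so each step shrinks the weights multiplicatively before applying
    % the data gradient g_t (the familiar "weight decay" form)
    w_{t+1} = w_t - \eta_t (g_t + \lambda w_t) = (1 - \eta_t \lambda) w_t - \eta_t g_t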

The code that computes the new L2 weight is here:
https://github.com/apache/incubator-spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/Updater.scala#L90

The compute function calculates the new weights from the current weights
and the gradient computed at each step. Contrast it with the code in the
SimpleUpdater class (which applies no regularization) to get a sense for how
the regularization parameter is incorporated; it's fairly simple.
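
To make that concrete, here is a small, self-contained Scala sketch of a
squared-L2 update with the same shape as Updater.compute. It is an
illustration under my own conventions (plain arrays rather than the matrix
type the real code uses, and a (lambda/2)*||w||^2 penalty), not the actual
Spark source:

object L2UpdaterSketch {

  // One SGD step with an L2 penalty. Hypothetical signature modeled on
  // Updater.compute; the names here are illustrative, not Spark's.
  //   weightsOld: current weights w_t
  //   gradient:   gradient of the unregularized data loss at w_t
  //   stepSize:   base learning rate eta, decayed as eta / sqrt(iter)
  //   regParam:   L2 regularization strength lambda
  // Returns the new weights and the value of the regularization term,
  // (lambda / 2) * ||w_{t+1}||^2.
  def compute(weightsOld: Array[Double], gradient: Array[Double],
              stepSize: Double, iter: Int, regParam: Double): (Array[Double], Double) = {
    val eta = stepSize / math.sqrt(iter)
    // w_{t+1} = (1 - eta * lambda) * w_t - eta * g_t
    val newWeights = weightsOld.zip(gradient).map { case (w, g) =>
      (1.0 - eta * regParam) * w - eta * g
    }
    val sqNorm = newWeights.map(w => w * w).sum
    (newWeights, 0.5 * regParam * sqNorm)
  }

  def main(args: Array[String]): Unit = {
    val (w, reg) = compute(Array(1.0, -2.0, 0.5), Array(0.3, -0.1, 0.2),
                           stepSize = 1.0, iter = 1, regParam = 0.1)
    println(s"new weights: ${w.mkString(", ")}; reg term: $reg")
  }
}

Running main performs a single step and prints the result; note how the
(1 - eta * lambda) factor shrinks every weight toward zero before the data
gradient is applied.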

In general, though, I agree it makes sense to include a discussion of the
algorithm and a reference to the specific version we implement in the
scaladoc.

- Evan


On Thu, Jan 9, 2014 at 10:49 AM, Walrus theCat <walrusthecat@gmail.com> wrote:

> No -- I'm not, and I appreciate the comment.  What I'm looking for is a
> specific mathematical formula that I can map to the source code.
>
> Specifically, I'd like to see how the loss function gets embedded into
> the weights w (via the gradient), in both the regularized and
> unregularized cases.
>
> Looking through the source, the "loss history" makes sense to me, but I
> can't see how that translates into the effect on the gradient.
>
>
> On Thu, Jan 9, 2014 at 10:39 AM, Sean Owen <srowen@gmail.com> wrote:
>
>> L2 regularization just means "regularizing by penalizing parameters
>> whose L2 norm is large", and the L2 norm is just Euclidean length (the
>> penalty is typically its square). It's not something you would write an
>> ML paper on, any more than you would on the vector dot product. Are you
>> asking something else?
>>
>> On Thu, Jan 9, 2014 at 6:19 PM, Walrus theCat <walrusthecat@gmail.com>
>> wrote:
>> > Thanks Christopher,
>> >
>> > I wanted to know if there was a specific paper this particular
>> > codebase was based on.  For instance, Weka cites papers in its
>> > documentation.
>> >
>> >
>> > On Wed, Jan 8, 2014 at 7:10 PM, Christopher Nguyen <ctn@adatao.com>
>> > wrote:
>> >>
>> >> Walrus, given the question, this may be a good place for you to start.
>> >> There's some good discussion there as well as links to papers.
>> >>
>> >> http://www.quora.com/Machine-Learning/What-is-the-difference-between-L1-and-L2-regularization
>> >>
>> >> Sent while mobile. Pls excuse typos etc.
>> >>
>> >> On Jan 8, 2014 2:24 PM, "Walrus theCat" <walrusthecat@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Can someone point me to the paper that algorithm is based on?
>> >>>
>> >>> Thanks
