spark-dev mailing list archives

From martinjaggi <>
Subject [GitHub] incubator-spark pull request: new MLlib documentation for optimiza...
Date Sun, 09 Feb 2014 00:54:49 GMT
GitHub user martinjaggi reopened a pull request:

    new MLlib documentation for optimization, regression and classification

    new documentation with TeX formulas, hopefully improving usability and reproducibility
    of the offered MLlib methods.
    also made some minor changes in the code for consistency. Scala tests pass.
    for easier merging, we could maybe rebase these changes (only > feb 7 is relevant)
    is merged?

You can merge this pull request into a Git repository by running:

    $ git pull polishing-opt-MLlib

Alternatively you can review and apply these changes as the patch at:

commit d73948db0d9bc36296054e79fec5b1a657b4eab4
Author: Martin Jaggi <>
Date:   2014-02-06T15:57:23Z

    minor update on how to compile the documentation

commit d1c5212b93c67436543c2d8ddbbf610fdf0a26eb
Author: Martin Jaggi <>
Date:   2014-02-06T15:59:43Z

    enable mathjax formula in the .md documentation files
    code by @shivaram

commit bbafafd2b497a5acaa03a140bb9de1fbb7d67ffa
Author: Martin Jaggi <>
Date:   2014-02-06T16:31:29Z

    split MLlib documentation by techniques
    and linked from the main site

commit dcd2142c164b2f602bf472bb152ad55bae82d31a
Author: Martin Jaggi <>
Date:   2014-02-06T17:04:26Z

    enabling inline latex formulas with $.$
    same mathjax configuration as used in
    sample usage in the linear algebra (SVD) documentation

commit 0364bfabbfc347f917216057a20c39b631842481
Author: Martin Jaggi <>
Date:   2014-02-07T02:19:38Z

    minor polishing, as suggested by @pwendell

commit 93d74988c33a9e4ef0d15e39c8b8fc9e6c36bb28
Author: Martin Jaggi <>
Date:   2014-02-07T16:33:24Z

    renaming LeastSquaresGradient
    so as not to confuse it with a squared regularizer or a squared gradient.
    added some more comments on what the loss functions are good for

commit e4cbe99bbcf7f53ebb8f1a0d2e0b869a4922bca4
Author: Martin Jaggi <>
Date:   2014-02-07T16:34:45Z

    use d for the number of features
    try to be consistent: n is the number of data examples in the RDD,
    and each of them has d entries (also in the documentation)

commit 79768fd3429df5c6d56f05ac93bdd8cf4355d946
Author: Martin Jaggi <>
Date:   2014-02-07T17:13:17Z

    correct scaling for MSE loss
    to be consistent with the documentation

commit 1e228062b01ac806c4bd032eb0975a8b92431fd9
Author: Martin Jaggi <>
Date:   2014-02-07T17:15:44Z

    new classification and regression documentation
    with complete mathematical formulations. trying to be general for
    adding future ML methods as well. table of all subgradients used for
    this change also required a small addition to the mathjax
    configuration, to allow equation numbers.

commit 89e472f4121debb175b625ab0c138e24c4e60de8
Author: Martin Jaggi <>
Date:   2014-02-07T17:16:51Z

    new optimization documentation
    explaining GD and SGD and the distributed versions that MLlib

commit a33be78a47bad1745a03a6e0ee1a4ea1a7893805
Author: Martin Jaggi <>
Date:   2014-02-07T17:38:57Z

    better comments in SGD code for regression

commit 73f5e71e3d9a253ff378907fca202b8d6aae1268
Author: Martin Jaggi <>
Date:   2014-02-07T22:41:42Z

    lambda R() in documentation

commit eec58c9c860def9b3b7604c990ec1697812bcbbf
Author: Martin Jaggi <>
Date:   2014-02-08T17:31:05Z

    telling what updater actually does
    also use proper scaling for the L2 regularization (using 1/2 as in the

commit 2c1cf8d35145081a61865f55f4e48fcfbafddbbe
Author: Martin Jaggi <>
Date:   2014-02-08T17:56:01Z

    remove broken url

commit ecbac73a7450fc90ef1509d9a410c9b627617130
Author: Martin Jaggi <>
Date:   2014-02-08T17:57:12Z

    better description of GradientDescent

commit eae3dce25a4b68bf32ece1ca7783f9b2ffd56dff
Author: Martin Jaggi <>
Date:   2014-02-08T20:30:35Z

    line wrap at 100 chars
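
The commits above mention several conventions: n data examples with d entries each, a 1/2 factor on the L2 regularizer, and a rescaled squared loss. The sketch below illustrates how such conventions might fit together in one full-batch gradient descent step. It is not the MLlib code from this pull request. The objective f(w) = lam * (1/2) * ||w||^2 + (1/n) * sum_i (1/2) * (w . x_i - y_i)^2 is an assumption, and the function name `gradient_step` and its parameters are hypothetical.

```python
def gradient_step(w, xs, ys, lam, step):
    """One full-batch gradient descent step (illustrative sketch only).

    Assumed objective (not taken from the pull request):
        f(w) = lam * 0.5 * ||w||^2 + (1/n) * sum_i 0.5 * (w . x_i - y_i)^2
    w    : list of d weights
    xs   : list of n examples, each a list of d features
    ys   : list of n labels
    lam  : regularization parameter
    step : step size
    """
    n = len(xs)
    d = len(w)
    # Gradient of the L2 regularizer lam * 0.5 * ||w||^2 is lam * w.
    grad = [lam * wj for wj in w]
    for x, y in zip(xs, ys):
        # Gradient of 0.5 * (w . x - y)^2 is (w . x - y) * x.
        residual = sum(wj * xj for wj, xj in zip(w, x)) - y
        for j in range(d):
            grad[j] += residual * x[j] / n
    return [wj - step * gj for wj, gj in zip(w, grad)]
```

With the 1/2 factors in place, neither gradient carries a stray factor of 2, which is one common reason for choosing this scaling.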

