spark-dev mailing list archives

From DB Tsai <dbt...@dbtsai.com>
Subject Re: Regularization in MLlib
Date Tue, 07 Apr 2015 22:28:04 GMT
1) Norm(weights, N) returns (|w_1|^N + |w_2|^N + ...)^(1/N), so norm * norm
is required to recover the squared L2 penalty.
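A minimal numeric sketch of this point (plain Python, not the MLlib API): Norm(weights, 2) already takes the square root, so the reported L2 loss squares it again to get the sum of squared weights.

```python
import math

def l2_penalty(weights, reg_param):
    # Norm(weights, 2) returns (w_1^2 + w_2^2 + ...)^(1/2),
    # so it must be multiplied by itself to recover the sum of squares.
    norm = math.sqrt(sum(w * w for w in weights))
    return 0.5 * reg_param * norm * norm

# norm([3, 4]) = 5, so the penalty is 0.5 * 0.1 * 25 = 1.25
print(l2_penalty([3.0, 4.0], 0.1))
```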

2) This is a bug, as you said. I intended to fix it using weighted
regularization, where the intercept term would be regularized with
weight zero: https://github.com/apache/spark/pull/1518 But I never had
time to finish it. In the meantime, I'm fixing this without that
framework in the new ML pipeline framework.
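A rough sketch of what per-coordinate weighted regularization looks like (illustrative Python, not the actual patch): each weight gets its own regularization weight, and giving the intercept a weight of zero excludes it from both the penalty and its gradient.

```python
def weighted_l2(weights, reg_weights, reg_param):
    # Per-coordinate L2: reg_weights[i] scales how strongly weights[i]
    # is penalized; a value of 0.0 leaves that coordinate unregularized.
    penalty = 0.5 * reg_param * sum(rw * w * w
                                    for w, rw in zip(weights, reg_weights))
    gradient = [reg_param * rw * w for w, rw in zip(weights, reg_weights)]
    return penalty, gradient

# weights = [intercept, w1, w2]; the intercept gets reg weight zero,
# so it contributes nothing to the penalty or the gradient.
weights = [10.0, 3.0, 4.0]
reg_weights = [0.0, 1.0, 1.0]
penalty, grad = weighted_l2(weights, reg_weights, 0.1)
```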

3) I think that in the long term we need a weighted regularizer instead
of the updater, which couples regularization with the adaptive
step-size update for GD; that coupling is not needed by other
optimization packages.
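A hypothetical sketch of the decoupled interface (the class and method names here are illustrative, not Spark API): the regularizer reports only its loss and gradient contributions, and the optimizer owns the step-size policy.

```python
class SquaredL2Regularizer:
    """Computes the L2 penalty and its gradient; it knows nothing
    about step sizes, which stay in the optimizer."""

    def __init__(self, reg_param):
        self.reg_param = reg_param

    def loss(self, weights):
        # 0.5 * regParam * ||w||^2
        return 0.5 * self.reg_param * sum(w * w for w in weights)

    def gradient(self, weights):
        # d/dw of the loss above: regParam * w
        return [self.reg_param * w for w in weights]
```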

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Tue, Apr 7, 2015 at 3:03 PM, Ulanov, Alexander
<alexander.ulanov@hp.com> wrote:
> Hi,
>
> Could anyone elaborate on the regularization in Spark? I've found that L1 and L2 are
> implemented with Updaters (L1Updater, SquaredL2Updater).
> 1) Why is the loss reported by L2 equal to (0.5 * regParam * norm * norm), where norm is
> Norm(weights, 2.0)? It should be 0.5 * regParam * norm (the 0.5 disappears after
> differentiation). It seems that it is mixed up with mean squared error.
> 2) Why are all weights regularized? I think we should leave the bias weights (aka free
> or intercept) untouched if we don't assume that the data is centered.
> 3) Are there any short-term plans to move regularization from the updater to a more
> convenient place?
>
> Best regards, Alexander
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

