mahout-user mailing list archives

From Chris Harrington <>
Subject Re: Vector distance within a cluster
Date Mon, 04 Mar 2013 17:00:05 GMT
So, if I'm understanding you correctly, you are saying, simply put, that I should investigate
using L_1 as my distance measure when measuring vector distance within a cluster?
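
To make the question concrete, here is a minimal sketch of what measuring L_1 (Manhattan) distance within a cluster could look like. It is plain Python rather than Mahout code; the names `manhattan_distance` and `distances_to_centroid` and the sample points are hypothetical and chosen only for illustration.

```python
def manhattan_distance(a, b):
    """L_1 (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distances_to_centroid(points, centroid):
    """L_1 distance from each point in a cluster to that cluster's centroid."""
    return [manhattan_distance(p, centroid) for p in points]


# Hypothetical cluster of three 2-dimensional vectors.
cluster = [[1.0, 2.0], [2.0, 4.0], [3.0, 3.0]]

# Centroid taken as the coordinate-wise mean of the cluster's points.
centroid = [sum(col) / len(cluster) for col in zip(*cluster)]

dists = distances_to_centroid(cluster, centroid)
```

The mean is used as the centroid here only because that is what k-means produces; whether the mean is the right center to pair with an L_1 distance is a separate question.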

On 1 Mar 2013, at 16:24, Ted Dunning wrote:

> What Sean says is just right, except that I was (telegraphically) getting
> at a slightly different point with L_1:
> On Wed, Feb 27, 2013 at 7:23 AM, Chris Harrington <> wrote:
>> Is L_1 regularization the same as manhattan distance?
> L_1 metric is manhattan distance, yes.
> L_1 regularization of k-means refers to something a little bit different.
> The idea with regularization is that you add some sort of penalty to the
> function you are optimizing.  This penalty pushes the optimization toward a
> solution that you would prefer on some other grounds than just the
> optimization alone.  Regularization often helps in solving underdetermined
> systems where there are an infinite number of solutions and we have to pick
> a preferred solution.
> There isn't anything that says that you have to be optimizing the same kind
> of function as the regularization.  Thus k-means, which is inherently
> optimizing squared error, can quite reasonably be regularized with L_1 (sum
> of the absolute values of the centroids' coefficients).
> I haven't tried this at all seriously yet.  L_1 regularization tends to
> help drive toward sparsity, but it is normally used in convex problems
> where we can guarantee a findable global optimum.  The k-means problem,
> however, is not convex so adding the regularization may screw things up in
> practice.  For text-like data, I have a strong intuition that the idealized
> effect of L_1 should be very good, but the pragmatic effect may not be so
> useful.
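
One way to read the L_1-regularized k-means idea described above is as a change to the centroid update step. The sketch below is an assumption, not something from the thread and not a Mahout API: it supposes the objective is the usual squared error plus `lam` times the sum of the absolute values of the centroid coefficients. With cluster assignments held fixed, that objective is minimized coordinate by coordinate by soft-thresholding the cluster mean with a threshold of `lam / (2 * n)`, where `n` is the number of points in the cluster. The names `soft_threshold`, `l1_regularized_centroid`, and `lam` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def l1_regularized_centroid(points, lam):
    """Centroid minimizing sum ||x - c||^2 + lam * ||c||_1 for one fixed cluster.

    Assumed formulation: each coordinate of the plain mean is soft-thresholded
    by lam / (2 * n), where n is the number of points in the cluster.
    """
    n = len(points)
    means = [sum(col) / n for col in zip(*points)]
    return [soft_threshold(m, lam / (2.0 * n)) for m in means]


# Hypothetical cluster: one strong coordinate and two near-zero ones.
cluster_points = [[4.0, 0.1, 0.0], [6.0, -0.1, 0.2]]

# Small coordinates of the mean are driven to exactly zero.
sparse_centroid = l1_regularized_centroid(cluster_points, lam=2.0)
```

This shows only the sparsity effect on a single centroid update. Because the overall k-means problem is not convex, nothing here says how the penalty behaves once it is alternated with the assignment step.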
