mahout-user mailing list archives

From Dan Brickley <>
Subject Re: Help for graph processing implementation
Date Sun, 23 Oct 2011 10:00:31 GMT
On 23 October 2011 07:16, Bae, Jae Hyeon <> wrote:

> I am implementing graph clustering algorithm based on hadoop and mahout.
> This is my term project of data mining course.

Cool! Fun topic...

> Spectral method of graph clustering needs calculation of eigenvectors,
> which is not practically efficient with the large scale graph. Thus, there
> exists multi-level graph clustering method without eigenvectors. This
> contains graph coarsening, base clustering, refining. Refining stage can be
> done with weighted kernel k-means clustering which is not so difficult to
> be implemented in MapReduce way, but the problem is graph coarsening.
> Pseudocode is on this paper
> Like any graph
> processing algorithm, this algorithm does not look easy to
> be intuitively implemented in MapReduce way. So, I need a help from experts
> more proficient at converting single thread graph algorithm to MapReduce
> way. If this work is done smoothly, I will contribute this graph clustering
> algorithm to Mahout if I am allowed to do so.

Do take a look at the Spectral Clustering code (and Eigencuts) already
in Mahout.

orig proposal:
blog notes:
javadoc: (maybe not freshest)

The Wiki page in particular might be a good start.

Code is in public svn:

Take a look at the run() method in

...this takes in a textual representation of an affinity matrix,
constructs the Laplacian representation of it, and uses the existing
DistributedLanczosSolver/SVD (see ) and
K-Means components from elsewhere in Mahout. The code seems fairly
well commented.
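
As a rough, single-machine illustration of the first step in that pipeline (this is not Mahout's code), the sketch below parses a textual affinity matrix and scales it by the node degrees. It assumes an input of "row,col,value" lines and the normalization D^{-1/2} A D^{-1/2} that is common in spectral clustering; both are assumptions, as are the function names, and the Mahout implementation may differ on each point. The eigenvector and K-Means steps are left out.

```python
import math
from collections import defaultdict


def parse_affinity(lines):
    """Parse 'row,col,value' text lines into a sparse dict-of-dicts matrix.

    The line format is an assumption made for this sketch.
    """
    affinity = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row, col, value = line.split(",")
        affinity[int(row)][int(col)] = float(value)
    return affinity


def normalized_affinity(affinity):
    """Return D^{-1/2} A D^{-1/2}, where D holds the row sums of A."""
    degree = {i: sum(row.values()) for i, row in affinity.items()}
    result = defaultdict(dict)
    for i, row in affinity.items():
        for j, a_ij in row.items():
            d_i = degree.get(i, 0.0)
            d_j = degree.get(j, 0.0)
            if d_i > 0 and d_j > 0:  # skip isolated nodes
                result[i][j] = a_ij / math.sqrt(d_i * d_j)
    return result


# Tiny symmetric example: a path graph 0 - 1 - 2 with weights 1 and 4.
sample = ["0,1,1.0", "1,0,1.0", "1,2,4.0", "2,1,4.0"]
scaled = normalized_affinity(parse_affinity(sample))
```

In a MapReduce setting the degree computation is a per-row sum and the scaling is a per-entry operation once the degrees are available, which is why this stage distributes more easily than graph coarsening does.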

Having said that, this code doesn't seem to be in good shape currently.
It seems to be difficult to get running in current Mahout trunk; I
haven't managed to get it running yet.

If you can find a way in your term project to build on this work, that
would be fantastic...

Hope this helps,


