mahout-user mailing list archives

From "ahmed.nagy" <>
Subject Distributed Matrix Multiplication and operations
Date Mon, 10 Jan 2011 12:16:19 GMT

I am implementing a matrix factorisation technique for matrices that do not
fit in the memory of a single node. I have checked the documentation and the
book Mahout in Action for distributed matrix operations, in particular
DistributedRowMatrix. I need to carry out some distributed matrix operations,
and I have designed the algorithm as follows:
There are three matrices: A, B and C.
Divide the matrix A into chunks.
Divide the matrix C into chunks.
Map the chunks of A and C together with the matrix B.
Compute the updates.
Reduce the matrix C, then compute the matrix B.
Repeat the above set of operations for Maxiterations iterations.
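To make the chunking idea concrete, here is a minimal single-process sketch in plain Java. It is only an illustration and does not use Mahout or Hadoop. Since the update rule is not spelled out above, it assumes the simplest stand-in, the plain product C = A * B: A is split into row chunks, each chunk is multiplied by the whole of B (the "map" step), and the partial results are copied back into C (the "reduce" step). The class name ChunkedMultiplySketch, the method names and the chunkRows parameter are all hypothetical.

```java
import java.util.Arrays;

/** Single-process illustration of row-chunked multiplication; not Mahout API. */
public class ChunkedMultiplySketch {

    /** "Map" step: multiply one chunk of rows of A by the full matrix B. */
    static double[][] multiplyChunk(double[][] aChunk, double[][] b) {
        int inner = b.length;
        int cols = b[0].length;
        double[][] out = new double[aChunk.length][cols];
        for (int i = 0; i < aChunk.length; i++) {
            for (int k = 0; k < inner; k++) {
                double aik = aChunk[i][k];
                for (int j = 0; j < cols; j++) {
                    out[i][j] += aik * b[k][j];
                }
            }
        }
        return out;
    }

    /** Split A into chunks of chunkRows rows and place each partial result into C ("reduce"). */
    static double[][] multiplyByRowChunks(double[][] a, double[][] b, int chunkRows) {
        double[][] c = new double[a.length][];
        for (int start = 0; start < a.length; start += chunkRows) {
            int end = Math.min(start + chunkRows, a.length);
            double[][] partial = multiplyChunk(Arrays.copyOfRange(a, start, end), b);
            System.arraycopy(partial, 0, c, start, partial.length);
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}, {5, 6}};
        double[][] b = {{1, 0, 2}, {0, 1, 3}};
        // Chunk size 2 gives one chunk of two rows and one chunk of one row.
        System.out.println(Arrays.deepToString(multiplyByRowChunks(a, b, 2)));
    }
}
```

Each row chunk only needs its own rows of A plus all of B, which is why the chunks can be processed independently; in a real job the chunk size would be one of the things that determines how many map tasks run.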
1-Do I need to distribute the matrices on the cluster if I am carrying out
2-How can I control the amount of parallelism, for example the number of mappers?
3-When I used the constructor of the DistributedRowMatrix
DistributedRowMatrix m = new
DistributedRowMatrix("path/to/vector/sequenceFile", "tmp/path", 10000000,
from the example found on

it gives The constructor DistributedRowMatrix(String, String, int, int) is
I dug a bit and found that the example passes two Strings as the first two
parameters, whereas the constructor expects them to be of type Path. I tried
to define and initialise them like this:
Path in = new Path("path/to/vector/sequenceFile");
Path out = new Path("/tmp/path");
Then I passed in and out as parameters:
DistributedRowMatrix m = new DistributedRowMatrix(in, out, 10000000, 250000);
4-Another point is that m.configure(new JobConf()); produces a warning that
JobConf is deprecated.
5-Is there any side effect from using the deprecated JobConf?
6-Could anybody point me to how to package this job and run it on a
7-However, I am not sure how to pass the sequence file when it resides on
HDFS.
Sorry if some of the questions look naive.
I appreciate any insights.
Ahmed Nagy
