mahout-user mailing list archives

From Vincent Xue <xue....@gmail.com>
Subject Re: Transposing a matrix is limited by how large a node is.
Date Fri, 06 May 2011 14:36:52 GMT
Hi Jake,
As requested, the stats from the job are listed below:

Counter                      Map              Reduce   Total
Job Counters
  Launched reduce tasks      0                0        2
  Rack-local map tasks       0                0        69
  Launched map tasks         0                0        194
  Data-local map tasks       0                0        125
FileSystemCounters
  FILE_BYTES_READ            66,655,795,630   0        66,655,795,630
  HDFS_BYTES_READ            12,871,657,393   0        12,871,657,393
  FILE_BYTES_WRITTEN         103,841,910,638  0        103,841,910,638
Map-Reduce Framework
  Combine output records     0                0        0
  Map input records          54,675           0        54,675
  Spilled Records            4,720,084,588    0        4,720,084,588
  Map output bytes           33,805,552,500   0        33,805,552,500
  Map input bytes            12,804,666,825   0        12,804,666,825
  Map output records         1,690,277,625    0        1,690,277,625
  Combine input records      0                0        0

In response to your suggestion: I do have a server with lots of RAM; however,
I would like to stick to keeping the files on HDFS. As I am running a PCA
analysis, I would have to re-import the data into HDFS anyway to run the SVD.
(We tried to run similar computations on a machine with >64 GB of RAM, and the
previous R implementation crashed after a few days...)

Because I am limited by my resources, I coded up a slower but effective
implementation of the transpose job that I could share. It avoids loading
all the data onto one node by transposing the matrix in pieces. The slowest
part of this is combining the pieces back into one matrix. :(
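For anyone curious, here is a rough local sketch of the same piecewise idea
(not my actual MapReduce job): it streams over the input once per column
block and only ever holds blockSize x numRows doubles in memory. It assumes
the matrix is a single SequenceFile (not a directory of part files) with
0-based IntWritable row keys; the class name and the block size argument are
just placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

/**
 * Transposes a SequenceFile<IntWritable, VectorWritable> matrix in column
 * blocks, so only (blockSize x numRows) doubles live in memory at a time.
 * One full pass over the input is made per block.
 */
public class BlockedTranspose {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path input = new Path(args[0]);            // row-major input matrix
    Path output = new Path(args[1]);           // transposed output
    int numRows = Integer.parseInt(args[2]);   // e.g. 55000
    int numCols = Integer.parseInt(args[3]);   // e.g. 31000
    int blockSize = Integer.parseInt(args[4]); // columns per pass, e.g. 2000

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, output, IntWritable.class, VectorWritable.class);

    IntWritable rowKey = new IntWritable();
    VectorWritable rowValue = new VectorWritable();
    IntWritable outKey = new IntWritable();
    VectorWritable outValue = new VectorWritable();

    for (int start = 0; start < numCols; start += blockSize) {
      int end = Math.min(start + blockSize, numCols);

      // Transposed rows for source columns [start, end).
      Vector[] block = new Vector[end - start];
      for (int j = 0; j < block.length; j++) {
        block[j] = new DenseVector(numRows);
      }

      // Stream over every source row, copying only this block's columns.
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, input, conf);
      while (reader.next(rowKey, rowValue)) {
        Vector row = rowValue.get();
        for (int j = start; j < end; j++) {
          block[j - start].setQuick(rowKey.get(), row.getQuick(j));
        }
      }
      reader.close();

      // Source columns start..end-1 become rows start..end-1 of the transpose.
      for (int j = 0; j < block.length; j++) {
        outKey.set(start + j);
        outValue.set(block[j]);
        writer.append(outKey, outValue);
      }
    }
    writer.close();
  }
}

With a block size of around 2000 columns, each pass holds roughly
2000 x 55000 doubles (about 880 MB), and nothing spills to the 40 GB local
disks; the trade-off is one read of the whole input per block.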

-Vincent


On Fri, May 6, 2011 at 2:54 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>
> On Fri, May 6, 2011 at 6:01 AM, Vincent Xue <xue.vin@gmail.com> wrote:
>
> > Dear Mahout Users,
> >
> > I am using Mahout-0.5-SNAPSHOT to transpose a dense matrix of 55000 x
> > 31000. My matrix is stored on HDFS as a
> > SequenceFile<IntWritable,VectorWritable>, consuming just about 13 GB.
> > When I run the transpose function on my matrix, the function falls over
> > during the reduce phase. On closer inspection, I noticed that I was
> > receiving the following error:
> >
> > FSError: java.io.IOException: No space left on device
> >
> > I thought this was not possible, considering that I was only using 15% of
> > the 2.5 TB in the cluster, but when I closely monitored the disk space, it
> > was true that the 40 GB hard drive on the node was running out of space.
> > Unfortunately, all of my nodes are limited to 40 GB, and I have not been
> > successful in transposing my matrix.
> >
>
> Running HDFS with nodes with only 40GB of hard disk each is a recipe
> for disaster, IMO.  There are lots of temporary files created by map/reduce
> jobs, and working on an input file of size 13GB you're bound to run into
> this.
>
> Can you show us what your job tracker reports for the amount of
> HDFS_BYTES_WRITTEN (and other similar numbers) during your job?
>
>
> > From this observation, I would like to know if there is any alternative
> > method for transposing my matrix, or if there is something I am missing.
>
>
> Do you have a server with 26GB of RAM lying around somewhere?
> You could do it on one machine without hitting disk. :)
>
>  -jake
