mahout-user mailing list archives

From Suneel Marthi <suneel.mar...@gmail.com>
Subject Re: How to change /tmp directory for mahout usage of map-reduce?
Date Wed, 01 Apr 2015 07:27:25 GMT
You need to set the temp path in your Configuration and pass that
Configuration object to the subsequent calls.

IIRC, Spectral KMeans internally calls other MapReduce jobs like
MatrixDiagonalizeJob, VectorMatrixMultiplicationJob, and SSVD.
So ensure that you are passing common parameters like tempDir, outputDir,
etc. via the Configuration across the jobs.
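
As a rough sketch of that idea (not taken from the Mahout source): the
/tmp/hadoop-vikas/mapred/local path in the log matches the Hadoop 1.x default
for mapred.local.dir, which is ${hadoop.tmp.dir}/mapred/local, with
hadoop.tmp.dir defaulting to /tmp/hadoop-${user.name}. Assuming those two
properties are the ones that matter in this setup, the overrides can be
collected once and applied to the single Configuration that every job shares.
The class TempDirOverrides, its method names, and the subdirectory names
below are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Mahout or Hadoop): keeps the temp-dir
// overrides in one place so every job receives the same values.
class TempDirOverrides {

    // Assumption: these two Hadoop 1.x properties control the local scratch
    // space. The "hadoop-tmp" and "mapred/local" subdirectories are arbitrary.
    static Map<String, String> forScratchRoot(String scratchRoot) {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("hadoop.tmp.dir", scratchRoot + "/hadoop-tmp");
        overrides.put("mapred.local.dir", scratchRoot + "/mapred/local");
        return overrides;
    }

    // Renders the overrides as -Dkey=value strings, the form accepted as
    // generic options by drivers that are launched through ToolRunner.
    static List<String> asGenericOptions(Map<String, String> overrides) {
        List<String> options = new ArrayList<>();
        for (Map.Entry<String, String> entry : overrides.entrySet()) {
            options.add("-D" + entry.getKey() + "=" + entry.getValue());
        }
        return options;
    }
}
```

With Hadoop on the classpath, something like
TempDirOverrides.forScratchRoot("<<CHOSEN DIRECTORY>>").forEach(conf::set)
would copy the values into the Configuration before it is handed to the
first job. Whether this is enough to move the distributed cache directory
may depend on the Hadoop version and on whether the job runs in local mode.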

Shannon may be able to help more here.

On Wed, Apr 1, 2015 at 3:21 AM, Vikas Kumar <kumar093@umn.edu> wrote:

> Sorry, that didn't solve the problem.
>
> What it changed was the *tmp* directory for the following (taken from the
> log attached above):
> 15/04/01 01:18:13 INFO mapred.MapTask: Processing split:
> file:/export/scratch/vikas/<<<<PRIVATE DIRECTORIES>>>>>
> /tmp/calculations/seqfile/part-r-00000:0+86000
>
> However, the *tmp* directory for TrackerDistributedCacheManager is still
> the same:
>
> 15/04/01 01:18:13 INFO filecache.TrackerDistributedCacheManager: Creating
> vector in */tmp/hadoop-vikas/mapred/local*/archive/-623590149816891030_-
> 1428839080_1939951392/file/export/scratch/vikas/<<<<PRIVATE
> DIRECTORIES>>>>>/tmp/calculations-work--3390146237769593830 with rwxr-xr-x
>
> It seems I just need to set the right resource (Path or String) in
> the Configuration object passed as the parameter to the Spectral
> Clustering job, but I am not able to figure out which one.
>
> Thanks
>
> On Wed, Apr 1, 2015 at 1:43 AM, Vikas Kumar <kumar093@umn.edu> wrote:
>
> > That was helpful to figure out what was required.
> > I had to set the right path for variable *tmp* in the function from :
> >
> > Path tmp = new Path("tmp")
> >
> > to
> >
> > Path tmp = new Path("<<CHOSEN DIRECTORY>>");
> >
> > Silly mistake. Thanks for the clue :)
> >
> > -Vikas
> >
> > On Wed, Apr 1, 2015 at 1:34 AM, Suneel Marthi <suneel.marthi@gmail.com>
> > wrote:
> >
> >> If you are running Spectral KMeans via the command line, you should be able
> >> to set the parameter -tempDir to point to a different path.
> >>
> >> On Wed, Apr 1, 2015 at 1:55 AM, Andrew Musselman <
> >> andrew.musselman@gmail.com
> >> > wrote:
> >>
> >> > Can you let us know which code/scripts you're using?
> >> >
> >> > On Tuesday, March 31, 2015, Vikas Kumar <kumar093@umn.edu> wrote:
> >> >
> >> > > Hello,
> >> > >
> >> > > I am using the Mahout Spectral clustering example, which internally
> >> > > calls a MapReduce job. Right now, it uses the
> >> > > */tmp/hadoop-<username>/mapred/..* directory by default for its
> >> > > operations.
> >> > >
> >> > > Can someone please let me know how to make Mahout use a different path?
> >> > >
> >> > > Thanks
> >> > > Vikas
> >> > >
> >> >
> >>
> >
> >
>
