mahout-user mailing list archives

From "Philippe Lamarche" <philippe.lamar...@gmail.com>
Subject Re: Problems with KMeans clustering
Date Sun, 02 Nov 2008 15:29:57 GMT
Hi there,
It also works on 0.19.0-dev, that is on hadoop/branches/branch-0.19.

In the next few days I intend to find out exactly what the problem is,
to make sure that it won't come back in a few revisions.

Thanks!

On Thu, Oct 30, 2008 at 9:20 AM, Grant Ingersoll <gsingers@apache.org> wrote:

> Hmm, I believe that patch has been applied in 18.2 (whatever that is), but
> it also looks like it has been applied to the 0.17.3 branch as well. So, it
> might be something else that "fixed" it.
>
> At any rate, glad to hear it works on trunk.
>
>
> On Oct 29, 2008, at 6:38 PM, Philippe Lamarche wrote:
>
>> I am not sure I understand the hadoop svn structure; however, I was able to
>> make it work with hadoop trunk, or 0.20.0-dev.
>> It didn't work with hadoop/branch-0.18, with or without patch 4277.
>>
>> Here is a copy-paste of the steps, once Hadoop is built and installed. I am
>> using the same exact "apache-mahout-examples-0.1-dev.job", not rebuilt with
>> the 0.20.0-dev jars.
>>
>> It works!
>>
>> That would mean that the bug/feature is not related to HADOOP-4277
>> <http://issues.apache.org/jira/browse/HADOOP-4277>,
>> and was reintroduced (or never taken away) in hadoop/trunk.
>>
>>
>> hadoop@phil:/usr/local/hadoop$ bin/hadoop namenode -format
>> 08/10/29 18:27:59 INFO namenode.NameNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting NameNode
>> STARTUP_MSG:   host = phil/127.0.1.1
>> STARTUP_MSG:   args = [-format]
>> STARTUP_MSG:   version = 0.20.0-dev
>> STARTUP_MSG:   build =  -r ; compiled by 'philippe' on Wed Oct 29 18:25:08 EDT 2008
>> ************************************************************/
>> 08/10/29 18:28:00 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
>> 08/10/29 18:28:00 INFO namenode.FSNamesystem: supergroup=supergroup
>> 08/10/29 18:28:00 INFO namenode.FSNamesystem: isPermissionEnabled=true
>> 08/10/29 18:28:00 INFO common.Storage: Image file of size 96 saved in 0 seconds.
>> 08/10/29 18:28:00 INFO common.Storage: Storage directory /usr/local/hadoop-datastore/hadoop-hadoop/dfs/name has been successfully formatted.
>> 08/10/29 18:28:00 INFO namenode.NameNode: SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down NameNode at phil/127.0.1.1
>> ************************************************************/
>>
>> hadoop@phil:/usr/local/hadoop$ bin/hadoop dfs -put /home/philippe/synthetic_control.data testdata
>>
>> hadoop@phil:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/build/apache-mahout-examples-0.1-dev.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>> 08/10/29 18:28:45 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:28:46 INFO mapred.FileInputFormat: Total input paths to process : 1
>> 08/10/29 18:28:47 INFO mapred.JobClient: Running job: job_200810291828_0002
>> 08/10/29 18:28:48 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:28:54 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:28:55 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:28:56 INFO mapred.JobClient: Job complete: job_200810291828_0002
>> 08/10/29 18:28:56 INFO mapred.JobClient: Counters: 7
>> 08/10/29 18:28:56 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:28:56 INFO mapred.JobClient:     HDFS bytes read=291644
>> 08/10/29 18:28:56 INFO mapred.JobClient:     HDFS bytes written=323660
>> 08/10/29 18:28:56 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:28:56 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:28:56 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:28:56 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:28:56 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:28:56 INFO mapred.JobClient:     Map input bytes=288374
>> 08/10/29 18:28:56 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:28:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:28:56 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:28:56 INFO mapred.JobClient: Running job: job_200810291828_0003
>> 08/10/29 18:28:57 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:29:03 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:29:05 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:29:10 INFO mapred.JobClient:  map 100% reduce 100%
>> 08/10/29 18:29:11 INFO mapred.JobClient: Job complete: job_200810291828_0003
>> 08/10/29 18:29:11 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:29:11 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:29:11 INFO mapred.JobClient:     HDFS bytes read=323660
>> 08/10/29 18:29:11 INFO mapred.JobClient:     HDFS bytes written=9657
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Local bytes read=36119
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Local bytes written=72300
>> 08/10/29 18:29:11 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:29:11 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Reduce input groups=1
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Combine output records=28
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Map output bytes=943020
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Combine input records=1732
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Map output records=1732
>> 08/10/29 18:29:11 INFO mapred.JobClient:     Reduce input records=28
>> 08/10/29 18:29:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:29:11 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:29:12 INFO mapred.JobClient: Running job: job_200810291828_0004
>> 08/10/29 18:29:13 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:29:20 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:29:22 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:29:27 INFO mapred.JobClient:  map 100% reduce 100%
>> 08/10/29 18:29:28 INFO mapred.JobClient: Job complete: job_200810291828_0004
>> 08/10/29 18:29:28 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:29:28 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:29:28 INFO mapred.JobClient:     HDFS bytes read=342974
>> 08/10/29 18:29:28 INFO mapred.JobClient:     HDFS bytes written=3002539
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Local bytes read=3018455
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Local bytes written=6036972
>> 08/10/29 18:29:28 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:29:28 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Combine output records=0
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Reduce output records=1591
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Map output bytes=3008903
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Combine input records=0
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Map output records=1591
>> 08/10/29 18:29:28 INFO mapred.JobClient:     Reduce input records=1591
>> 08/10/29 18:29:28 INFO kmeans.KMeansDriver: Iteration 0
>> 08/10/29 18:29:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:29:28 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:29:28 INFO mapred.JobClient: Running job: job_200810291828_0005
>> 08/10/29 18:29:29 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:29:35 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:29:37 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:29:41 INFO mapred.JobClient: Job complete: job_200810291828_0005
>> 08/10/29 18:29:41 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:29:41 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:29:41 INFO mapred.JobClient:     HDFS bytes read=342974
>> 08/10/29 18:29:41 INFO mapred.JobClient:     HDFS bytes written=8205
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Local bytes read=23227
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Local bytes written=46516
>> 08/10/29 18:29:41 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:29:41 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Combine output records=10
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Map output bytes=1136504
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Combine input records=600
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:29:41 INFO mapred.JobClient:     Reduce input records=10
>> 08/10/29 18:29:41 INFO kmeans.KMeansDriver: Iteration 1
>> 08/10/29 18:29:41 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:29:41 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:29:42 INFO mapred.JobClient: Running job: job_200810291828_0006
>> 08/10/29 18:29:43 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:29:50 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:29:51 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:29:55 INFO mapred.JobClient:  map 100% reduce 100%
>> 08/10/29 18:29:56 INFO mapred.JobClient: Job complete: job_200810291828_0006
>> 08/10/29 18:29:56 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:29:56 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:29:56 INFO mapred.JobClient:     HDFS bytes read=340070
>> 08/10/29 18:29:56 INFO mapred.JobClient:     HDFS bytes written=8242
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Local bytes read=21265
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Local bytes written=42592
>> 08/10/29 18:29:56 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:29:56 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Combine output records=10
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Map output bytes=1023966
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Combine input records=600
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:29:56 INFO mapred.JobClient:     Reduce input records=10
>> 08/10/29 18:29:56 INFO kmeans.KMeansDriver: Iteration 2
>> 08/10/29 18:29:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:29:56 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:29:56 INFO mapred.JobClient: Running job: job_200810291828_0007
>> 08/10/29 18:29:57 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:30:03 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:30:05 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:30:09 INFO mapred.JobClient: Job complete: job_200810291828_0007
>> 08/10/29 18:30:09 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:30:09 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:30:09 INFO mapred.JobClient:     HDFS bytes read=340144
>> 08/10/29 18:30:09 INFO mapred.JobClient:     HDFS bytes written=8280
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Local bytes read=21085
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Local bytes written=42232
>> 08/10/29 18:30:09 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:30:09 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Combine output records=10
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Map output bytes=1023681
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Combine input records=600
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:30:09 INFO mapred.JobClient:     Reduce input records=10
>> 08/10/29 18:30:09 INFO kmeans.KMeansDriver: Iteration 3
>> 08/10/29 18:30:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:30:09 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:30:09 INFO mapred.JobClient: Running job: job_200810291828_0008
>> 08/10/29 18:30:10 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:30:17 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:30:18 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:30:22 INFO mapred.JobClient:  map 100% reduce 100%
>> 08/10/29 18:30:23 INFO mapred.JobClient: Job complete: job_200810291828_0008
>> 08/10/29 18:30:23 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:30:23 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:30:23 INFO mapred.JobClient:     HDFS bytes read=340220
>> 08/10/29 18:30:23 INFO mapred.JobClient:     HDFS bytes written=8250
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Local bytes read=21339
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Local bytes written=42740
>> 08/10/29 18:30:23 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:30:23 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Combine output records=10
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Map output bytes=1028419
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Combine input records=600
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:30:23 INFO mapred.JobClient:     Reduce input records=10
>> 08/10/29 18:30:23 INFO kmeans.KMeansDriver: Iteration 4
>> 08/10/29 18:30:23 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:30:23 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:30:24 INFO mapred.JobClient: Running job: job_200810291828_0009
>> 08/10/29 18:30:25 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:30:31 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:30:33 INFO mapred.JobClient:  map 100% reduce 0%
>> 08/10/29 18:30:37 INFO mapred.JobClient:  map 100% reduce 100%
>> 08/10/29 18:30:38 INFO mapred.JobClient: Job complete: job_200810291828_0009
>> 08/10/29 18:30:38 INFO mapred.JobClient: Counters: 16
>> 08/10/29 18:30:38 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:30:38 INFO mapred.JobClient:     HDFS bytes read=340160
>> 08/10/29 18:30:38 INFO mapred.JobClient:     HDFS bytes written=8200
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Local bytes read=21219
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Local bytes written=42500
>> 08/10/29 18:30:38 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Launched reduce tasks=1
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:30:38 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Reduce input groups=7
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Combine output records=10
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Reduce output records=7
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Map output bytes=1024899
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Combine input records=600
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:30:38 INFO mapred.JobClient:     Reduce input records=10
>> 08/10/29 18:30:38 INFO kmeans.KMeansDriver: Clustering
>> 08/10/29 18:30:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:30:38 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:30:38 INFO mapred.JobClient: Running job: job_200810291828_0010
>> 08/10/29 18:30:39 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:30:45 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:30:47 INFO mapred.JobClient: Job complete: job_200810291828_0010
>> 08/10/29 18:30:47 INFO mapred.JobClient: Counters: 7
>> 08/10/29 18:30:47 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:30:47 INFO mapred.JobClient:     HDFS bytes read=340060
>> 08/10/29 18:30:47 INFO mapred.JobClient:     HDFS bytes written=1020535
>> 08/10/29 18:30:47 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:30:47 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:30:47 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:30:47 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:30:47 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:30:47 INFO mapred.JobClient:     Map input bytes=323660
>> 08/10/29 18:30:47 INFO mapred.JobClient:     Map output records=600
>> 08/10/29 18:30:47 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 08/10/29 18:30:47 INFO mapred.FileInputFormat: Total input paths to process : 2
>> 08/10/29 18:30:48 INFO mapred.JobClient: Running job: job_200810291828_0011
>> 08/10/29 18:30:49 INFO mapred.JobClient:  map 0% reduce 0%
>> 08/10/29 18:30:56 INFO mapred.JobClient:  map 50% reduce 0%
>> 08/10/29 18:30:57 INFO mapred.JobClient: Job complete: job_200810291828_0011
>> 08/10/29 18:30:57 INFO mapred.JobClient: Counters: 7
>> 08/10/29 18:30:57 INFO mapred.JobClient:   File Systems
>> 08/10/29 18:30:57 INFO mapred.JobClient:     HDFS bytes read=1020535
>> 08/10/29 18:30:57 INFO mapred.JobClient:     HDFS bytes written=325460
>> 08/10/29 18:30:57 INFO mapred.JobClient:   Job Counters
>> 08/10/29 18:30:57 INFO mapred.JobClient:     Launched map tasks=2
>> 08/10/29 18:30:57 INFO mapred.JobClient:     Data-local map tasks=2
>> 08/10/29 18:30:57 INFO mapred.JobClient:   Map-Reduce Framework
>> 08/10/29 18:30:57 INFO mapred.JobClient:     Map input records=600
>> 08/10/29 18:30:57 INFO mapred.JobClient:     Map input bytes=1020535
>> 08/10/29 18:30:57 INFO mapred.JobClient:     Map output records=600
>>
>>
>>
>>
>>
>> On Wed, Oct 29, 2008 at 11:10 AM, Philippe Lamarche <philippe.lamarche@gmail.com> wrote:
>>
>>> I will!
>>>
>>> On 10/29/08, Grant Ingersoll <gsingers@apache.org> wrote:
>>>
>>>> Philippe, can you try the patch suggested by Arun Murthy on
>>>> core-user@hadoop.a.o?  See
>>>> http://issues.apache.org/jira/browse/HADOOP-4277
>>>>
>>>> I'm pretty swamped at the moment w/ ApacheCon coming up next week, but if
>>>> it does fix the issue, then maybe we should move forward to the 18.2
>>>> candidate (I don't think it has been released yet, those guys have a pretty
>>>> sophisticated build process going)
>>>>
>>>> -Grant
>>>>
>>>> On Oct 28, 2008, at 7:19 AM, Philippe Lamarche wrote:
>>>>
>>>>> Ubuntu linux 2.6.24 <http://2.6.24.21>, with java-6-sun-1.6.0.07.
>>>>>
>>>>> On Tue, Oct 28, 2008 at 7:03 AM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>
>>>>>> Just a single machine.  I didn't think we were using features either. Are
>>>>>> you saying you can run the example using 0.18.1?
>>>>>>
>>>>>> BTW, Philippe, what JVM, O/S, etc. are you using?
>>>>>>
>>>>>> -Grant
>>>>>>
>>>>>>
>>>>>> On Oct 27, 2008, at 11:55 PM, Jeff Eastman wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Are you guys running on real Hadoop arrays? I can run the synthetic
>>>>>>> control example just fine on a single machine. That code is just trying
>>>>>>> to read a vector from a string. I'd be surprised if we were using any
>>>>>>> "features" but will watch the threads.
>>>>>>>
>>>>>>> Jeff
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Grant Ingersoll wrote:
>>>>>>>
>>>>>>>> I started a thread on core-user@hadoop.a.o:
>>>>>>>> http://hadoop.markmail.org/message/cczunzfhpcqz6pis
>>>>>>>>
>>>>>>>> On Oct 27, 2008, at 9:49 PM, Grant Ingersoll wrote:
>>>>>>>>
>>>>>>>>> OK, I can confirm that the exact same code works with 0.17.2 and not
>>>>>>>>> w/ 0.18.1.  So, it sounds like a bug in Hadoop, or we are relying on
>>>>>>>>> incorrect behavior in Hadoop.
>>>>>>>>>
>>>>>>>>>
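One way to picture "relying on incorrect behavior" is a combiner that cannot consume its own output. Hadoop's documentation describes the combiner as an optimization that the framework may apply zero, one, or several times, and the trace quoted in this thread shows KMeansCombiner.reduce being called from ReduceTask$ReduceCopier.combineAndSpill, that is, on the reduce side. The thread does not establish that this is the cause here, so the following is only a sketch under that assumption. CombinerSketch and its "count,sum" encoding are hypothetical and are not Mahout's classes or formats.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not Mahout or Hadoop code. */
class CombinerSketch {

    /**
     * Combines partial results encoded as "count,sum".
     * A raw observation x is encoded by the caller as "1,x", so the
     * combiner's output has the same shape as its input and can be
     * fed back into combine() any number of times.
     */
    static String combine(List<String> values) {
        long count = 0;
        double sum = 0.0;
        for (String v : values) {
            String[] parts = v.split(",");
            count += Long.parseLong(parts[0]);
            sum += Double.parseDouble(parts[1]);
        }
        return count + "," + sum;
    }

    /** Final reduce step: turns a "count,sum" partial into a mean. */
    static double mean(String partial) {
        String[] parts = partial.split(",");
        return Double.parseDouble(parts[1]) / Long.parseLong(parts[0]);
    }

    public static void main(String[] args) {
        // First combine pass, e.g. on the map side.
        String a = combine(Arrays.asList("1,2.0", "1,4.0"));
        String b = combine(Arrays.asList("1,6.0"));
        // Second combine pass over already-combined values, e.g. on the reduce side.
        String merged = combine(Arrays.asList(a, b));
        System.out.println(merged + " -> mean " + mean(merged));
    }
}
```

If the combiner instead emitted a differently shaped value than it reads, the second pass would fail while parsing, which is the kind of symptom a version change in how often the combiner runs could expose.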
>>>>>>>>> On Oct 27, 2008, at 9:33 PM, Grant Ingersoll wrote:
>>>>>>>>>
>>>>>>>>>> On Oct 26, 2008, at 10:46 AM, Philippe Lamarche wrote:
>>>>>>>>>>
>>>>>>>>>>> Unfortunately, I went straight from 0.17.2 to 0.18.1.  It was
>>>>>>>>>>> working on 0.17.2.
>>>>>>>>>>
>>>>>>>>>> BTW, are you saying the same exact code was working on 0.17.2 or are
>>>>>>>>>> you referring to some older Mahout code that worked on 17.2?
>>>>>>>>>>
>>>>>>>>>>> On Sun, Oct 26, 2008 at 9:48 AM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Did this work with 0.18.0 or other prior versions for you?
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 25, 2008, at 7:23 PM, Philippe Lamarche wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I just updated to hadoop 0.18.1 and got a clean version of mahout
>>>>>>>>>>>>> from svn.
>>>>>>>>>>>>> However, I am having problems with KMeans, that can be traced down
>>>>>>>>>>>>> to :
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
>>>>>>>>>>>>> 2008-10-25 19:10:16,987 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
>>>>>>>>>>>>> 2008-10-25 19:10:16,999 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810251826_0013_r_000000_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
>>>>>>>>>>>>> Caused by: java.lang.NumberFormatException: For input string: "["
>>>>>>>>>>>>> at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
>>>>>>>>>>>>> at java.lang.Double.parseDouble(Double.java:510)
>>>>>>>>>>>>> at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
>>>>>>>>>>>>> at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
>>>>>>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
>>>>>>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
>>>>>>>>>>>>> ... 1 more
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2008-10-25 19:10:16,999 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 0 files left.
>>>>>>>>>>>>> 2008-10-25 19:10:17,000 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>>>>>>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>>>>>>
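The innermost cause in the trace above is Double.parseDouble rejecting the token "[". As a minimal sketch of how that exact message arises, the hypothetical decoder below splits a formatted vector string on commas and parses every token as a number, so a leading bracket token fails. VectorParseSketch, decodeNaive and the bracketed sample string are assumptions made for illustration; they are not Mahout's DenseVector.decodeFormat or its actual string format, which this thread does not show.

```java
/** Hypothetical illustration; not Mahout's DenseVector. */
class VectorParseSketch {

    /** Naive decoder: splits on commas and parses every token as a double. */
    static double[] decodeNaive(String formatted) {
        String[] tokens = formatted.split(",");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // Throws NumberFormatException on any non-numeric token such as "[".
            values[i] = Double.parseDouble(tokens[i].trim());
        }
        return values;
    }

    /** Reports the outcome as text instead of letting the exception escape. */
    static String tryDecode(String formatted) {
        try {
            return "parsed " + decodeNaive(formatted).length + " values";
        } catch (NumberFormatException e) {
            return "NumberFormatException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Plain comma-separated numbers decode without error.
        System.out.println(tryDecode("1.0, 2.5, 3.0"));
        // A string that arrives with bracket tokens fails on the first token.
        System.out.println(tryDecode("[, 1.0, 2.5, 3.0, ]"));
    }
}
```

The point of the sketch is only that the decoder received text in a shape it did not expect; which component produced that text is what the rest of the thread tries to pin down.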
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is while running the synthetic_control.data example, but I have
>>>>>>>>>>>>> the same problems with any other input data.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am able to run other map-reduce jobs without problems.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is the output of the jar task:
>>>>>>>>>>>>>
>>>>>>>>>>>>> hadoop@philippe-vaio:/usr/local/hadoop$ bin/hadoop jar
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> /home/philippe/workspace/MahoutJava/examples/dist/apache-mahout-examples-0.1-dev.jar
>>>>>>>>>>>>> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>>>>>>>>>>> 08/10/25 19:09:27 WARN mapred.JobClient: Use
>>>>>>>>>>>>> GenericOptionsParser
>>>>>>>>>>>>> for
>>>>>>>>>>>>> parsing the arguments. Applications should implement Tool for
>>>>>>>>>>>>> the
>>>>>>>>>>>>> same.
>>>>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input
>>>>>>>>>>>>> paths
>>>>>>>>>>>>> to
>>>>>>>>>>>>> process
>>>>>>>>>>>>> : 1
>>>>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.FileInputFormat: Total input
>>>>>>>>>>>>> paths
>>>>>>>>>>>>> to
>>>>>>>>>>>>> process
>>>>>>>>>>>>> : 1
>>>>>>>>>>>>> 08/10/25 19:09:28 INFO mapred.JobClient: Running job:
>>>>>>>>>>>>> job_200810251826_0010
>>>>>>>>>>>>> 08/10/25 19:09:29 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:31 INFO mapred.JobClient:  map 50% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Job complete:
>>>>>>>>>>>>> job_200810251826_0010
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Counters: 7
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:   File Systems
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     HDFS bytes
>>>>>>>>>>>>> read=291644
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     HDFS bytes
>>>>>>>>>>>>> written=323660
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:   Job Counters
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     Launched map
>>>>>>>>>>>>> tasks=2
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     Data-local map
>>>>>>>>>>>>> tasks=2
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:   Map-Reduce Framework
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     Map input
>>>>>>>>>>>>> records=600
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     Map input
>>>>>>>>>>>>> bytes=288374
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient:     Map output
>>>>>>>>>>>>> records=600
>>>>>>>>>>>>> 08/10/25 19:09:32 WARN mapred.JobClient: Use
>>>>>>>>>>>>> GenericOptionsParser
>>>>>>>>>>>>> for
>>>>>>>>>>>>> parsing the arguments. Applications should implement Tool for
>>>>>>>>>>>>> the
>>>>>>>>>>>>> same.
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input
>>>>>>>>>>>>> paths
>>>>>>>>>>>>> to
>>>>>>>>>>>>> process
>>>>>>>>>>>>> : 2
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.FileInputFormat: Total input
>>>>>>>>>>>>> paths
>>>>>>>>>>>>> to
>>>>>>>>>>>>> process
>>>>>>>>>>>>> : 2
>>>>>>>>>>>>> 08/10/25 19:09:32 INFO mapred.JobClient: Running job: job_200810251826_0011
>>>>>>>>>>>>> 08/10/25 19:09:33 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:37 INFO mapred.JobClient:  map 50% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:39 INFO mapred.JobClient:  map 100% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:44 INFO mapred.JobClient:  map 100% reduce 16%
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Job complete: job_200810251826_0011
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient: Counters: 16
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:   File Systems
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     HDFS bytes read=323660
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     HDFS bytes written=1447
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Local bytes read=1389
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Local bytes written=37878
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:   Job Counters
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Launched reduce tasks=1
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Launched map tasks=2
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Data-local map tasks=2
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:   Map-Reduce Framework
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Reduce input groups=1
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Combine output records=29
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Map input records=600
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Reduce output records=1
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Map output bytes=943020
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Map input bytes=323660
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Combine input records=1760
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Map output records=1732
>>>>>>>>>>>>> 08/10/25 19:09:52 INFO mapred.JobClient:     Reduce input records=1
>>>>>>>>>>>>> 08/10/25 19:09:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>>>>> 08/10/25 19:09:53 INFO mapred.JobClient: Running job: job_200810251826_0012
>>>>>>>>>>>>> 08/10/25 19:09:54 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:56 INFO mapred.JobClient:  map 50% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:09:58 INFO mapred.JobClient:  map 100% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Job complete: job_200810251826_0012
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient: Counters: 16
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:   File Systems
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     HDFS bytes read=326554
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     HDFS bytes written=1137260
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Local bytes read=1147358
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Local bytes written=2304490
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:   Job Counters
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Launched reduce tasks=1
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Launched map tasks=2
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Data-local map tasks=2
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:   Map-Reduce Framework
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Reduce input groups=1
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Combine output records=0
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Map input records=600
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Reduce output records=600
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Map output bytes=1139660
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Map input bytes=323660
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Combine input records=0
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Map output records=600
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.JobClient:     Reduce input records=600
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO kmeans.KMeansDriver: Iteration 0
>>>>>>>>>>>>> 08/10/25 19:10:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>>>>> 08/10/25 19:10:02 INFO mapred.FileInputFormat: Total input paths to process : 2
>>>>>>>>>>>>> 08/10/25 19:10:03 INFO mapred.JobClient: Running job: job_200810251826_0013
>>>>>>>>>>>>> 08/10/25 19:10:04 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:10:08 INFO mapred.JobClient:  map 50% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:10:09 INFO mapred.JobClient:  map 100% reduce 0%
>>>>>>>>>>>>> 08/10/25 19:10:21 INFO mapred.JobClient: Task Id : attempt_200810251826_0013_r_000000_0, Status : FAILED
>>>>>>>>>>>>> java.io.IOException: attempt_200810251826_0013_r_000000_0The reduce copier failed
>>>>>>>>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>>>>>>>>>>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not sure if I am doing something wrong here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Philippe.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>> Grant Ingersoll
>>>>>>>>>>>> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New
>>>>>>>>>>>> Orleans.
>>>>>>>>>>>> http://www.lucenebootcamp.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Lucene Helpful Hints:
>>>>>>>>>>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>>>>>>>>>>> http://wiki.apache.org/lucene-java/LuceneFAQ
