mahout-user mailing list archives

From Paritosh Ranjan <pran...@xebia.com>
Subject Re: Checksum error on K-means
Date Fri, 11 May 2012 19:19:38 GMT
Moving to user list.

The error is happening while interacting with the disk; that much is certain. 
However, I cannot pin down the cause. Note that even in local mode Hadoop 
reads through the checksummed LocalFileSystem: it keeps a hidden .<name>.crc 
sidecar next to every file and verifies it on each read, which is exactly 
where this exception is thrown.

I can only see one escape route for now. Try Mahout 0.7-SNAPSHOT. It has 
a completely new implementation of K-Means, so you might not encounter 
this error.
All the best.
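
If you want to localize the corruption before upgrading, a scan along these 
lines should do it. This is only a minimal sketch, assuming the vectors file 
is an ordinary Hadoop SequenceFile; the path is copied from your stack trace, 
and the class name CorruptionLocator is just illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class CorruptionLocator {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Local mode still goes through the checksummed LocalFileSystem.
    LocalFileSystem fs = FileSystem.getLocal(conf);
    Path vectors =
        new Path("/lms/apps/data/mahout/mahout_rus_938K_en_410K/mahout_vectors");

    // The hidden sidecar whose mismatch triggers the ChecksumException.
    System.out.println("crc sidecar: " + fs.getChecksumFile(vectors));

    // Scan the SequenceFile record by record to find the bad spot.
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, vectors, conf);
    Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
    long records = 0;
    try {
      while (reader.next(key, value)) {
        records++;
      }
      System.out.println("no checksum error; read " + records + " records");
    } catch (ChecksumException e) {
      System.err.println("corruption after " + records
          + " records, near byte " + reader.getPosition());
    } finally {
      reader.close();
    }
  }
}

If the scan stops at the same offset on every run, the data itself is damaged 
and the vectors need to be regenerated. If instead the whole file reads fine 
after fs.setVerifyChecksum(false), only the .crc sidecar is stale, and 
deleting it should clear the error.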

On 11-05-2012 18:12, Michael Kazekin wrote:
> Paritosh, I use Mahout in local mode, so I don't think there is an 
> HDFS partition :( Am I wrong?
>
> But thanks anyway!
>
> On 05/10/2012 09:36 PM, Paritosh Ranjan wrote:
>> I just googled this exception, and it looks like an HDFS issue.
>> Can you try formatting your HDFS and then rerunning the K-Means clustering?
>>
>> On 10-05-2012 21:50, Michael Kazekin wrote:
>>> Hello!
>>>
>>> I have about 1.3M vectors from the lucene.vector utility that I later 
>>> try to cluster into 550 clusters. Everything seems fine and the 
>>> clustering starts, but after an hour I get:
>>>
>>>
>>> 12/05/10 18:26:50 INFO fs.FSInputChecker: Found checksum error: 
>>> b[196, 
>>> 708]=6b8e184f4f7c8900d812ade6a7269429bc3520746f5df7387257558a3f02c4af3f6e4bf05ef2676a88b54c86409399375df28bb28abe47df012c891e771ff57264883d88f7ec0db1d1fb3581b00ab0438df7de297763f4b9005cdef9eda32b3715e0ed015bf2609ad5e8c18f5f3500e921f5fd856cebdc96173080e6cbeac5f4957eb3b9d0a72d31bf8d9ca8c0c4d7204092fd8269aad260d5b007a0f9d4d59a7ebb1291588a00346187d1a72b23b4d26804a0f7587d8cb32f4aeda0224086528c9ac617b7ce850888c3ef2fa24e61f5cb45ce26e9c6057b57fa53e950266946e5b1ca5135e1a79b804e3bd2d5b57f0d321b5e535dd76e3a754c40c66b00066bcd9991778af3add0314e476bc96e959aa80ea831e1a295c024e578dbdb4a0448538b0e5138482541c718e65bf967a5542a338b218617b6588db0ff0a66e443f1bcbfc8667e3b90f10e809da4bc33da59a34a1452ca2a85dd1edc17d57c6834f325e97b4a23b7b06abb18db4fdd7b01e5dd9ce265654b544423b473cf2efcd52ac905ac07603b19b653e952c3c2ab20baee4b5b82bb7ef4c86b085d14f284c3d106529c25e0a80f69b12368a52405c0ee3ecd7be8bd1dbf148410ab4e9c32068926f9755ac919f5344df12dba241601888fd565afef29088e4c458044251ee5db4bd7b2613b4049ed95d10fd5ceabf2856eebd476f5ea595564062340ead4fe6f
>>> org.apache.hadoop.fs.ChecksumException: Checksum error: 
>>> file:/lms/apps/data/mahout/mahout_rus_938K_en_410K/mahout_vectors at 
>>> 677131776
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
>>>     at java.io.DataInputStream.readFully(Unknown Source)
>>>     at 
>>> org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>>>     at 
>>> org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>>>     at 
>>> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1930)
>>>     at 
>>> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2062)
>>>     at 
>>> org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:68)
>>>     at 
>>> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>>>     at 
>>> org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>     at 
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212) 
>>>
>>> 12/05/10 18:26:50 WARN mapred.LocalJobRunner: job_local_0001
>>> org.apache.hadoop.fs.ChecksumException: Checksum error: 
>>> file:/lms/apps/data/mahout/mahout_rus_938K_en_410K/mahout_vectors at 
>>> 677131776
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
>>>     at 
>>> org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
>>>     at java.io.DataInputStream.readFully(Unknown Source)
>>>     at 
>>> org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>>>     at 
>>> org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>>>     at 
>>> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1930)
>>>     at 
>>> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2062)
>>>     at 
>>> org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:68)
>>>     at 
>>> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>>>     at 
>>> org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>     at 
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212) 
>>>
>>> 12/05/10 18:26:50 INFO mapred.JobClient: Job complete: job_local_0001
>>> 12/05/10 18:26:50 INFO mapred.JobClient: Counters: 11
>>> 12/05/10 18:26:50 INFO mapred.JobClient:   FileSystemCounters
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     
>>> FILE_BYTES_READ=78797420248
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     
>>> FILE_BYTES_WRITTEN=1141852988
>>> 12/05/10 18:26:50 INFO mapred.JobClient:   File Input Format Counters
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Bytes Read=682074112
>>> 12/05/10 18:26:50 INFO mapred.JobClient:   Map-Reduce Framework
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Map output materialized 
>>> bytes=100785641
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Combine output 
>>> records=2182
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Map input records=236549
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Spilled Records=2182
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Map output 
>>> bytes=1444355648
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Combine input 
>>> records=234975
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     Map output records=236548
>>> 12/05/10 18:26:50 INFO mapred.JobClient:     SPLIT_RAW_BYTES=2730
>>> Exception in thread "main" java.lang.InterruptedException: K-Means 
>>> Iteration failed processing 
>>> /lms/apps/data/mahout/mahout_rus_938K_en_410K/centroid-rndm-seeds/part-randomSeed
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:371)
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:316)
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:239)
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:154)
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:112)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at 
>>> org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:61)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>>     at java.lang.reflect.Method.invoke(Unknown Source)
>>>     at 
>>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>     at 
>>> org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>     at 
>>> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>>>
>>>
>>>
>>> I tried Mahout 0.5 and 0.6. I didn't encounter such problems on 
>>> smaller collections (~400K vectors, Mahout 0.5).
>>>
>>> Do you have any insight into what's going on, and what are the 
>>> possible ways to solve the problem?


