spark-user mailing list archives

From alvarobrandon <alvarobran...@gmail.com>
Subject Task Output size in Spark WEB UI not the same as in HDFS
Date Fri, 26 Feb 2016 13:59:49 GMT
I'm puzzled by the following results I got from executing an application that
just generates data and writes it to HDFS. The 16 tasks that ran for that
app look like this:

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n26345/Screen_Shot_2016-02-26_at_14.png>


So 4 tasks wrote 2.9GB to the output. But when I check HDFS, I actually have
16 files of 2.9GB each. How come?
abrandon@granduc-13:/opt/hadoop/bin$ ./hdfs dfs -ls -h kMeans
16/02/26 13:44:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 17 items
-rw-r--r--   3 abrandon supergroup          0 2016-02-26 13:40 kMeans/_SUCCESS
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00000
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:38 kMeans/part-00001
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:38 kMeans/part-00002
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:38 kMeans/part-00003
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00004
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00005
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00006
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00007
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00008
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00009
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00010
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00011
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00012
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00013
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00014
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:40 kMeans/part-00015
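
For comparing against the totals in the web UI, the sizes in a listing like the one above can be added up with a short script. This is only a rough sketch: the names `total_bytes`, `ENTRY`, `UNITS` and `sample` are made up for illustration, it assumes the `K`/`M`/`G`/`T` suffixes printed by `-ls -h` are binary multiples (1 G = 2^30 bytes), and it works from the rounded sizes shown, so the result is approximate. By the same arithmetic, 16 files of 2.9 G each come to roughly 46.4 G.

```python
import re

# Assumed binary multipliers for the unit letters printed by `hdfs dfs -ls -h`.
# An empty unit means the size is given in plain bytes (e.g. the 0-byte _SUCCESS file).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Fields on an entry line: permissions, replication, owner, group,
# size, optional unit letter, then the modification date.
ENTRY = re.compile(
    r"^\S+[ \t]+\S+[ \t]+\S+[ \t]+\S+[ \t]+"
    r"(\d+(?:\.\d+)?)(?:[ \t]+([KMGT]))?[ \t]+\d{4}-\d{2}-\d{2}",
    re.MULTILINE,
)


def total_bytes(listing: str) -> float:
    """Sum the (rounded) sizes shown in an `hdfs dfs -ls -h` listing."""
    return sum(float(size) * UNITS[unit] for size, unit in ENTRY.findall(listing))


# A shortened copy of the listing, used only as example input.
sample = """\
Found 3 items
-rw-r--r--   3 abrandon supergroup          0 2016-02-26 13:40 kMeans/_SUCCESS
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:39 kMeans/part-00000
-rw-r--r--   3 abrandon supergroup      2.9 G 2016-02-26 13:38 kMeans/part-00001
"""

if __name__ == "__main__":
    # Print the approximate total in units of 2^30 bytes.
    print(total_bytes(sample) / 1024**3)
```

The pattern only looks at the fields before the file name, so it should also work when the mail client has wrapped each file name onto the following line.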





