spark-user mailing list archives

From Abhijeet Kumar <abhijeet.ku...@sentienz.com>
Subject Spark / Scala code not recognising the path?
Date Sat, 09 Jun 2018 06:07:52 GMT
I'm modifying a CSV file that lives in HDFS and then writing it back to HDFS from Spark. After the write, I check whether the output path exists:
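import org.apache.hadoop.fs.{FileSystem, Path}
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// Write the modified DataFrame back to HDFS as a single CSV part file
csv_file.coalesce(1).write
  .format("csv")
  .mode("overwrite")
  .save("hdfs://localhost:8020/data/temp_insight")
// Pause so the output directory can be inspected from the shell
Thread.sleep(15000)
// Check the output path (note: no scheme or authority is given here)
println(fs.exists(new Path("/data/temp_insight")))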
Output:

false
While the thread was sleeping for those 15 seconds, I checked HDFS with this command:

hdfs dfs -ls /data/temp_insight
Output:

18/06/08 17:48:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   3 abhijeet supergroup          0 2018-06-08 17:48 /data/temp_insight/_SUCCESS
-rw-r--r--   3 abhijeet supergroup        201 2018-06-08 17:48 /data/temp_insight/part-00000-7bffb826-f18d-4022-b089-da85565525b7-c000.csv
To cross-check whether the code is actually looking at HDFS, I added one more println
statement that calls fs.exists on a path I know already exists in HDFS. It prints true
in that case.

So what could be the reason that the newly written path is reported as missing?

Thanks,

Abhijeet Kumar