spark-user mailing list archives

From Abhijeet Kumar <abhijeet.ku...@sentienz.com>
Subject Re: Spark / Scala code not recognising the path?
Date Sat, 09 Jun 2018 06:34:02 GMT
Could you please tell me the estimated time, so that my program can wait for that period?
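
If a wait is needed at all, a bounded poll may be more robust than a fixed sleep. Below is a minimal sketch, assuming a hypothetical helper `waitUntil` (my own name, not part of Spark or Hadoop) that retries a check until it succeeds or a timeout elapses. The timeout and interval values are illustrative only.

```scala
import scala.annotation.tailrec

object WaitUntil {
  /** Hypothetical helper: polls `condition` every `intervalMs` milliseconds
    * until it returns true or `timeoutMs` milliseconds have elapsed.
    * Returns true if the condition held before the deadline, false otherwise.
    */
  def waitUntil(timeoutMs: Long, intervalMs: Long)(condition: () => Boolean): Boolean = {
    // Monotonic deadline, so wall-clock adjustments do not affect the wait.
    val deadline = System.nanoTime() + timeoutMs * 1000000L

    @tailrec
    def loop(): Boolean =
      if (condition()) true                          // check first, so a ready path returns at once
      else if (System.nanoTime() >= deadline) false  // give up once the deadline has passed
      else {
        Thread.sleep(intervalMs)
        loop()
      }

    loop()
  }
}

// Possible use with the `fs` handle from the quoted code (assumed to be in scope):
//   WaitUntil.waitUntil(15000, 500)(() => fs.exists(new Path("/data/temp_insight")))
```

A poll like this returns as soon as the check passes, so no particular delay has to be guessed in advance. If it still times out, the cause is probably something other than timing.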

Thanks,
Abhijeet Kumar
> On 09-Jun-2018, at 12:01 PM, Jörn Franke <jornfranke@gmail.com> wrote:
> 
> You need some time until the information of the file creation is propagated.
> 
> On 9. Jun 2018, at 08:07, Abhijeet Kumar <abhijeet.kumar@sentienz.com> wrote:
> 
>> I'm modifying a CSV file that is stored in HDFS and then writing it back to HDFS with Spark.
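>> import org.apache.hadoop.fs.{FileSystem, Path}
>>
>> // File system handle built from Spark's Hadoop configuration
>> val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
>> // Write the DataFrame as a single CSV part file, replacing any existing output
>> csv_file.coalesce(1).write
>>   .format("csv")
>>   .mode("overwrite")
>>   .save("hdfs://localhost:8020/data/temp_insight")
>> Thread.sleep(15000)
>> // Check whether the output directory is visible through `fs`
>> println(fs.exists(new Path("/data/temp_insight")))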
>> Output:
>> 
>> false
>> While the thread was sleeping for 15 seconds, I checked HDFS with the command:
>> 
>> hdfs dfs -ls /data/temp_insight
>> Output:
>> 
>> 18/06/08 17:48:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> -rw-r--r--   3 abhijeet supergroup          0 2018-06-08 17:48 /data/temp_insight/_SUCCESS
>> -rw-r--r--   3 abhijeet supergroup        201 2018-06-08 17:48 /data/temp_insight/part-00000-7bffb826-f18d-4022-b089-da85565525b7-c000.csv
>> To cross-check whether the code is resolving paths against HDFS, I added one more println statement with a path that already exists in HDFS. It prints true in that case.
>> 
>> So, what could be the reason?
>> 
>> Thanks,
>> 
>> Abhijeet Kumar

