spark-user mailing list archives

From Divya Narayan <>
Subject hadoop replication property from spark code not working
Date Wed, 26 Jun 2019 12:22:15 GMT

I have a use case in which I want to override the default HDFS replication
factor from my Spark code. For this I have set the Hadoop replication and then
created the context like:

val sc = new SparkContext(conf)
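For context, a minimal sketch of the two usual ways to force a replication factor of 1 from Spark code; the property names (`spark.hadoop.dfs.replication`, `dfs.replication`) are the standard Spark/Hadoop ones, but the app name and surrounding setup here are illustrative assumptions, not the original poster's code:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Option 1: set it on the SparkConf before creating the context.
// Spark copies any "spark.hadoop.*" key into the job's Hadoop Configuration.
val conf = new SparkConf()
  .setAppName("hourly-output-job") // illustrative name
  .set("spark.hadoop.dfs.replication", "1")

val sc = new SparkContext(conf)

// Option 2 (equivalent): mutate the Hadoop Configuration on the live context.
sc.hadoopConfiguration.set("dfs.replication", "1")
```

Either form should cause files written through this context's Hadoop Configuration to request a replication factor of 1; files written by any other client or path will still get the cluster default.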

My Spark job runs as a cron job at a fixed interval and creates an output
directory for the corresponding hour. The problem I am facing is that for
about 80% of the runs, the files are created with replication factor 1 (which
is desired), but for the remaining 20%, the files are created with the default
replication factor of 2. I am not sure why that is happening. Any help would be
appreciated.

Thank you
