spark-dev mailing list archives

From "Ulanov, Alexander" <alexander.ula...@hp.com>
Subject Force Spark save parquet files with replication factor other than 3 (default one)
Date Tue, 23 Jun 2015 01:44:29 GMT
Hi,

My Hadoop cluster is configured with a replication factor of 2. I've added $HADOOP_HOME/config
to the PATH, as suggested in http://apache-spark-user-list.1001560.n3.nabble.com/hdfs-replication-on-saving-RDD-td289.html.
With this setup, Spark (1.4) runs rdd.saveAsTextFile with replication = 2. However, DataFrame.saveAsParquet
still writes with replication = 3. How can I force a Spark DataFrame to save Parquet files with a
replication factor other than the default of 3?

Best regards, Alexander
