spark-user mailing list archives

From "Haopu Wang" <HW...@qilinsoft.com>
Subject RE: SparkSQL: How to specify replication factor on the persisted parquet files?
Date Mon, 08 Jun 2015 06:46:26 GMT
Cheng, thanks for the response.

Yes, I was using HiveContext.setConf() to set "dfs.replication".
However, I cannot change the value in Hadoop's core-site.xml, because that
would affect every file on HDFS.
I only want to change the replication factor of some specific files.
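[Archive note: for readers landing here with the same question, one way to change the replication factor of specific files, without touching the cluster-wide default, is the HDFS FileSystem API after the write completes. A minimal sketch, assuming `sc` is the active SparkContext and `/user/hive/warehouse/my_table` is a hypothetical table location:]

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical path of the table just written; substitute the real
// warehouse location of your Hive table.
val tablePath = new Path("/user/hive/warehouse/my_table")

val fs = FileSystem.get(sc.hadoopConfiguration)

// Recursively set the replication factor of the files under the table
// directory, leaving the rest of HDFS untouched.
def setReplication(fs: FileSystem, path: Path, factor: Short): Unit =
  fs.listStatus(path).foreach { status =>
    if (status.isDirectory) setReplication(fs, status.getPath, factor)
    else fs.setReplication(status.getPath, factor)
  }

setReplication(fs, tablePath, 2.toShort)
```

[The same effect is available from the command line via `hdfs dfs -setrep -R 2 <path>`. Note this changes replication after the initial write, so the files are first written at the default factor.]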

-----Original Message-----
From: Cheng Lian [mailto:lian.cs.zju@gmail.com] 
Sent: Sunday, June 07, 2015 10:17 PM
To: Haopu Wang; user
Subject: Re: SparkSQL: How to specify replication factor on the
persisted parquet files?

Were you using HiveContext.setConf()?

"dfs.replication" is a Hadoop configuration property, but setConf() is only used 
to set Spark SQL specific configurations. You may set it in your 
Hadoop core-site.xml instead.
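[Archive note: since HiveContext.setConf() only reaches Spark SQL properties, Hadoop properties such as "dfs.replication" can instead be set on the underlying SparkContext's Hadoop configuration, which Spark propagates into the job configuration used when writing. A hedged sketch, assuming a Spark 1.3+ HiveContext named `sqlContext` and a DataFrame `df` (names are illustrative); whether the parquet writer honors the value this way is not confirmed in this thread:]

```scala
import org.apache.spark.sql.SaveMode

// Hadoop-level properties go on the SparkContext's Hadoop configuration,
// not through HiveContext.setConf (which covers Spark SQL keys only).
sqlContext.sparkContext.hadoopConfiguration.set("dfs.replication", "2")

// Subsequent saves read their job configuration from the above.
df.saveAsTable("my_table", "parquet", SaveMode.Overwrite)
```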

Cheng


On 6/2/15 2:28 PM, Haopu Wang wrote:
> Hi,
>
> I'm trying to save SparkSQL DataFrame to a persistent Hive table using
> the default parquet data source.
>
> I don't know how to change the replication factor of the generated
> parquet files on HDFS.
>
> I tried to set "dfs.replication" on HiveContext but that didn't work.
> Any suggestions are appreciated very much!
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

