spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: SparkSQL: How to specify replication factor on the persisted parquet files?
Date Mon, 08 Jun 2015 10:40:30 GMT
Then one possible workaround is to set "dfs.replication" in 
"sc.hadoopConfiguration".

However, this configuration is shared by all Spark jobs issued within 
the same application. Since different Spark jobs can be issued from 
different threads, you need to pay attention to synchronization.
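A minimal sketch of this workaround, assuming Spark 1.4's DataFrameWriter API, an existing SparkContext `sc`, and a DataFrame `df`; the table name and replication factor are illustrative:

```scala
// hadoopConfiguration is shared by every job in this application, so if
// jobs are submitted from multiple threads, guard the set-then-write pair
// with a lock to keep another thread from changing the value mid-write.
sc.hadoopConfiguration.synchronized {
  // Illustrative value: replicate the written parquet files twice.
  sc.hadoopConfiguration.setInt("dfs.replication", 2)
  df.write.format("parquet").saveAsTable("my_table")
}
```

Note that the setting applies to everything written through this SparkContext until it is changed again, not just to the one table.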

Cheng

On 6/8/15 2:46 PM, Haopu Wang wrote:
> Cheng, thanks for the response.
>
> Yes, I was using HiveContext.setConf() to set "dfs.replication".
> However, I cannot change the value in Hadoop core-site.xml because that
> will change every HDFS file.
> I only want to change the replication factor of some specific files.
>
> -----Original Message-----
> From: Cheng Lian [mailto:lian.cs.zju@gmail.com]
> Sent: Sunday, June 07, 2015 10:17 PM
> To: Haopu Wang; user
> Subject: Re: SparkSQL: How to specify replication factor on the
> persisted parquet files?
>
> Were you using HiveContext.setConf()?
>
> "dfs.replication" is a Hadoop configuration, but setConf() is only used
> to set Spark SQL-specific configurations. You may set it in your Hadoop
> core-site.xml instead.
>
> Cheng
>
>
> On 6/2/15 2:28 PM, Haopu Wang wrote:
>> Hi,
>>
>> I'm trying to save SparkSQL DataFrame to a persistent Hive table using
>> the default parquet data source.
>>
>> I don't know how to change the replication factor of the generated
>> parquet files on HDFS.
>>
>> I tried to set "dfs.replication" on HiveContext but that didn't work.
>> Any suggestions are appreciated very much!
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>>
>



