spark-user mailing list archives

From Hao Ren <inv...@gmail.com>
Subject Re: HiveContext setConf seems not stable
Date Thu, 02 Apr 2015 08:47:43 GMT
Hi,

Jira created: https://issues.apache.org/jira/browse/SPARK-6675

Thank you.
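
For anyone reproducing the report quoted below, here is a minimal sketch of a
check that makes the mismatch explicit instead of reading printed pairs by eye.
ConfCheck and mismatches are hypothetical names, not part of Spark; the sketch
only assumes the settings are available as a Map[String, String], which is what
the getAllConfs call in the snippets is used as.

```scala
// Hypothetical helper, not part of Spark: compares the settings that were
// requested via setConf with the settings the context reports afterwards.
object ConfCheck {

  /** Returns (key, expected, reported) for every key whose reported value
    * differs from the expected one. A missing key is reported as None.
    */
  def mismatches(
      expected: Map[String, String],
      reported: Map[String, String]): Seq[(String, String, Option[String])] =
    expected.toSeq.sorted.flatMap { case (key, want) =>
      val got = reported.get(key)
      if (got == Some(want)) None else Some((key, want, got))
    }
}
```

With the hc from the snippets, the call would be
ConfCheck.mismatches(wanted, hc.getAllConfs) right after the setConf calls,
where wanted holds the two key/value pairs passed to setConf. Fed the values
printed for snippet 2, it should return a single entry for
"hive.metastore.warehouse.dir".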


On Wed, Apr 1, 2015 at 7:50 PM, Michael Armbrust <michael@databricks.com>
wrote:

> Can you open a JIRA please?
>
> On Wed, Apr 1, 2015 at 9:38 AM, Hao Ren <invkrh@gmail.com> wrote:
>
>> Hi,
>>
>> I find HiveContext.setConf does not work correctly. Here are some code
>> snippets showing the problem:
>>
>> snippet 1:
>>
>> ----------------------------------------------------------------------------------------------------------------
>> import org.apache.spark.sql.hive.HiveContext
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> object Main extends App {
>>
>>   val conf = new SparkConf()
>>     .setAppName("context-test")
>>     .setMaster("local[8]")
>>   val sc = new SparkContext(conf)
>>   val hc = new HiveContext(sc)
>>
>>   hc.setConf("spark.sql.shuffle.partitions", "10")
>>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
>> }
>>
>> ----------------------------------------------------------------------------------------------------------------
>>
>> *Results:*
>> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
>> (spark.sql.shuffle.partitions,10)
>>
>> snippet 2:
>>
>> ----------------------------------------------------------------------------------------------------------------
>> ...
>>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>>   hc.setConf("spark.sql.shuffle.partitions", "10")
>>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
>> ...
>>
>> ----------------------------------------------------------------------------------------------------------------
>>
>> *Results:*
>> (hive.metastore.warehouse.dir,/user/hive/warehouse)
>> (spark.sql.shuffle.partitions,10)
>>
>> *Merely swapping the two setConf calls leads to two different Hive
>> configurations.*
>> *It seems that HiveContext cannot set a new value for the
>> "hive.metastore.warehouse.dir" key in a single setConf call, or in the
>> first one.*
>> *Another setConf call is needed before changing
>> "hive.metastore.warehouse.dir": either set a different key first, as in
>> snippet 1, or set "hive.metastore.warehouse.dir" twice, as in snippet 3
>> below.*
>>
>> snippet 3:
>>
>> ----------------------------------------------------------------------------------------------------------------
>> ...
>>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>> ...
>>
>> ----------------------------------------------------------------------------------------------------------------
>>
>> *Results:*
>> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
>>
>>
>> *You can reproduce this if you move to the latest branch-1.3
>> (1.3.1-snapshot, htag = 7d029cb1eb6f1df1bce1a3f5784fb7ce2f981a33)*
>>
>> *I have also tested the released 1.3.0 (htag =
>> 4aaf48d46d13129f0f9bdafd771dd80fe568a7dc). It has the same problem.*
>>
>> *Please tell me if I am missing something. Any help is highly
>> appreciated.*
>>
>> Hao
>>
>> --
>> Hao Ren
>>
>> {Data, Software} Engineer @ ClaraVista
>>
>> Paris, France
>>
>
>


-- 
Hao Ren

{Data, Software} Engineer @ ClaraVista

Paris, France
