spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: HiveContext setConf seems not stable
Date Wed, 01 Apr 2015 17:50:51 GMT
Can you open a JIRA please?

On Wed, Apr 1, 2015 at 9:38 AM, Hao Ren <invkrh@gmail.com> wrote:

> Hi,
>
> I find HiveContext.setConf does not work correctly. Here are some code
> snippets showing the problem:
>
> snippet 1:
>
> ----------------------------------------------------------------------------------------------------------------
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.{SparkConf, SparkContext}
>
> object Main extends App {
>
>   val conf = new SparkConf()
>     .setAppName("context-test")
>     .setMaster("local[8]")
>   val sc = new SparkContext(conf)
>   val hc = new HiveContext(sc)
>
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.setConf("hive.metastore.warehouse.dir",
> "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach
> println
> }
>
> ----------------------------------------------------------------------------------------------------------------
>
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
> (spark.sql.shuffle.partitions,10)
>
> snippet 2:
>
> ----------------------------------------------------------------------------------------------------------------
> ...
>   hc.setConf("hive.metastore.warehouse.dir",
> "/home/spark/hive/warehouse_test")
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach
> println
> ...
>
> ----------------------------------------------------------------------------------------------------------------
>
> Results:
> (hive.metastore.warehouse.dir,/user/hive/warehouse)
> (spark.sql.shuffle.partitions,10)
>
> You can see that I just swapped the order of the two setConf calls, and
> that leads to two different Hive configurations.
> It seems that HiveContext cannot set a new value for the
> "hive.metastore.warehouse.dir" key in the first "setConf" call.
> You need another "setConf" call before changing
> "hive.metastore.warehouse.dir". For example, set
> "hive.metastore.warehouse.dir" twice:
>
> snippet 3:
>
> ----------------------------------------------------------------------------------------------------------------
> ...
>   hc.setConf("hive.metastore.warehouse.dir",
> "/home/spark/hive/warehouse_test")
>   hc.setConf("hive.metastore.warehouse.dir",
> "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
> ...
>
> ----------------------------------------------------------------------------------------------------------------
>
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
>
>
> You can reproduce this on the latest branch-1.3
> (1.3.1-snapshot, htag = 7d029cb1eb6f1df1bce1a3f5784fb7ce2f981a33).
>
> I have also tested the released 1.3.0 (htag =
> 4aaf48d46d13129f0f9bdafd771dd80fe568a7dc). It has the same problem.
>
> Please tell me if I am missing something. Any help is highly appreciated.
>
> Hao
>
> --
> Hao Ren
>
> {Data, Software} Engineer @ ClaraVista
>
> Paris, France
>
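One way to sidestep the setConf ordering quirk described above, until a fix lands, is to set the warehouse location in a hive-site.xml on the application classpath, so HiveContext picks it up when it first initializes Hive state rather than through a later setConf call. A hedged sketch; the property name is standard Hive, and the path is the test directory from this thread:

```xml
<?xml version="1.0"?>
<!-- hive-site.xml placed on the classpath (e.g. in conf/) -->
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/home/spark/hive/warehouse_test</value>
  </property>
</configuration>
```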
