spark-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: How is hive-site.xml loaded?
Date Mon, 13 Apr 2015 20:04:21 GMT
There's some magic in the process that is worth knowing about, and being cautious of:

Those special HdfsConfiguration, YarnConfiguration, and HiveConf classes all do work in
their static initializers to call Configuration.addDefaultResource().

This puts their -default and -site XML files onto the list of default configuration resources.
Hadoop then runs through the list of Configuration instances it is tracking in a WeakHashMap
and, for those created with loadDefaults=true in their constructor, tells them to reload all
their "default" config properties (preserving anything set explicitly).

This means you can use (or abuse) this feature to force properties onto all Hadoop Configuration
instances that asked for the default values, though this doesn't guarantee the changes will
be picked up.

It's generally considered best practice for apps to create an instance of the configuration
classes whose defaults and site files they want picked up as soon as they can, even if they
discard the instance itself. The goal is to get those settings in early, so that stale
defaults don't get picked up elsewhere.
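The registry-and-reload behaviour described above can be sketched roughly as follows. This is a simplified, hypothetical model, not Hadoop's actual code: the class, its fields, and the use of plain maps in place of real *-default.xml / *-site.xml resources are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Simplified, hypothetical model of the mechanism described above.
public class DefaultResourceSketch {
    // Resource names registered via addDefaultResource(), in order.
    static final List<String> DEFAULT_RESOURCES = new ArrayList<>();
    // Weakly-referenced registry of live Conf instances.
    static final Map<Conf, Boolean> REGISTRY = new WeakHashMap<>();
    // Stand-in for the contents of each XML resource on the classpath.
    static final Map<String, Map<String, String>> RESOURCE_PROPS = new HashMap<>();

    static class Conf {
        final boolean loadDefaults;
        final Map<String, String> props = new HashMap<>();
        final Map<String, String> explicit = new HashMap<>();

        Conf(boolean loadDefaults) {
            this.loadDefaults = loadDefaults;
            synchronized (REGISTRY) { REGISTRY.put(this, Boolean.TRUE); }
            if (loadDefaults) reload();
        }

        void set(String key, String value) {
            explicit.put(key, value);
            props.put(key, value);
        }

        String get(String key) { return props.get(key); }

        // Re-read every registered default resource, then re-apply
        // explicitly set keys so they always win.
        void reload() {
            props.clear();
            for (String name : DEFAULT_RESOURCES) {
                Map<String, String> contents = RESOURCE_PROPS.get(name);
                if (contents != null) props.putAll(contents);
            }
            props.putAll(explicit);
        }
    }

    // Analogue of Configuration.addDefaultResource(): every tracked
    // instance that asked for defaults reloads its properties.
    static void addDefaultResource(String name) {
        DEFAULT_RESOURCES.add(name);
        synchronized (REGISTRY) {
            for (Conf conf : REGISTRY.keySet()) {
                if (conf.loadDefaults) conf.reload();
            }
        }
    }
}
```

In this model, once HiveConf's static initializer registers hive-site.xml, even instances created earlier pick up the Hive settings, which is why instantiating the config class early, as suggested above, is enough to get the resources registered.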
-steve

> On 13 Apr 2015, at 07:10, Raunak Jhawar <raunak.jhawar@gmail.com> wrote:
> 
> The most obvious path being /etc/hive/conf, but this can be changed to
> look up any other path.
> 
> --
> Thanks,
> Raunak Jhawar
> 
> On Mon, Apr 13, 2015 at 11:22 AM, Dean Chen <dean@ocirs.com> wrote:
> 
>> Ah ok, thanks!
>> 
>> --
>> Dean Chen
>> 
>> On Apr 12, 2015, at 10:45 PM, Reynold Xin <rxin@databricks.com> wrote:
>> 
>> It is loaded by Hive's HiveConf, which simply searches for hive-site.xml on
>> the classpath.
>> 
>> 
>> On Sun, Apr 12, 2015 at 10:41 PM, Dean Chen <dean@ocirs.com> wrote:
>> 
>>> The docs state that:
>>> Configuration of Hive is done by placing your `hive-site.xml` file in
>>> `conf/`.
>>> 
>>> I've searched the codebase for hive-site.xml and didn't find code that
>>> specifically loaded it anywhere so it looks like there is some magic to
>>> autoload *.xml files in /conf? I've skimmed through HiveContext
>>> <
>>> 
>> https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
>>>> 
>>> and didn't see anything obvious in there.
>>> 
>>> The reason I'm asking is that I am working on a feature that needs config
>>> in hbase-site.xml to be available in the spark context and would prefer to
>>> follow the convention set by hive-site.xml.
>>> 
>>> --
>>> Dean Chen
>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

