spark-user mailing list archives

From Jianshi Huang <jianshi.hu...@gmail.com>
Subject Re: Ephemeral Hive metastore for HiveContext?
Date Mon, 27 Oct 2014 15:58:57 GMT
Thanks Ted and Cheng for the in memory derby solution. I'll check it out. :)

And to me, using an in-memory metastore by default makes sense: if a user
wants a shared metastore, it should be specified explicitly. An 'embedded'
local metastore in the working directory barely has a use case.
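For reference, the in-memory Derby setup being discussed could be sketched
roughly like this in hive-site.xml (an untested sketch; the property names are
Hive's standard metastore connection settings, but the in-memory URL is the
part neither of us has verified yet):

```xml
<!-- Sketch: point the embedded metastore at an in-memory Derby database
     so nothing is written to the working directory. Untested. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:memory:metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
```

Since the database lives in the JVM heap, each process gets its own catalog
and nothing needs to be cleaned up afterwards.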

Jianshi



On Mon, Oct 27, 2014 at 9:57 PM, Cheng Lian <lian.cs.zju@gmail.com> wrote:

>  Thanks Ted, this is exactly what Spark SQL's LocalHiveContext does. To make
> an embedded metastore local to a single HiveContext, we must allocate a
> different Derby database directory for each HiveContext, which is also what
> Jianshi is trying to avoid.
>
>
> On 10/27/14 9:44 PM, Ted Yu wrote:
>
> Please see
> https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-EmbeddedMetastore
>
>  Cheers
>
> On Oct 27, 2014, at 6:20 AM, Cheng Lian <lian.cs.zju@gmail.com> wrote:
>
>  I have never tried this yet, but maybe you can use an in-memory Derby
> database as the metastore:
> https://db.apache.org/derby/docs/10.7/devguide/cdevdvlpinmemdb.html
>
> I'll investigate this when I'm free; I guess we could also use it for Spark
> SQL Hive support testing.
>
> On 10/27/14 4:38 PM, Jianshi Huang wrote:
>
> There's a small but annoying usability issue in HiveContext.
>
>
>  By default, it creates a local metastore, which prevents other processes
> using HiveContext from being launched from the same directory.
>
>
>  How can I make the metastore local to each HiveContext? Is there an
> in-memory metastore configuration? /tmp/xxxx temp folders are one solution,
> but it's not elegant, and I still need to clean up the files...
>
>
>  I can add a hive-site.xml and use a shared metastore, but then all
> contexts still operate in the same catalog space.
>
>
>  The plain SQLContext by default uses an in-memory catalog that is bound to
> each context. Since HiveContext is a subclass, it should have the same
> default semantics. Make sense?
>
>
>  Spark is very much functional and shared-nothing, and these are wonderful
> properties. Let's not make something global a dependency.
>
>
>
>  Cheers,
>
> --
>
> Jianshi Huang
>
>
>  LinkedIn: jianshi
>
> Twitter: @jshuang
>
> Github & Blog: http://huangjs.github.com/
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>
>


-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
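
P.S. The "/tmp/xxxx temp folders" workaround mentioned earlier in the thread
could look roughly like this (a hypothetical helper, not anything in Spark;
the resulting URL would be fed to javax.jdo.option.ConnectionURL), which also
takes care of the cleanup I complained about:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;

public class FreshMetastore {
    // Allocate a unique Derby metastore directory per process so concurrent
    // HiveContexts don't collide, and delete it again when the JVM exits.
    public static String freshMetastoreUrl() throws IOException {
        final Path dir = Files.createTempDirectory("hive-metastore-");
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                // Best-effort recursive delete: children before parents.
                Files.walk(dir)
                     .sorted(Comparator.reverseOrder())
                     .forEach(p -> p.toFile().delete());
            } catch (IOException ignored) { }
        }));
        // Derby connection URL pointing the embedded metastore at dir.
        return "jdbc:derby:;databaseName=" + dir.resolve("metastore_db")
             + ";create=true";
    }

    public static void main(String[] args) throws IOException {
        System.out.println(freshMetastoreUrl());
    }
}
```

It trades the in-memory elegance for on-disk isolation, but no state leaks
between runs.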
