spark-user mailing list archives

From ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
Subject Re: Can spark sql read existing tables created in hive
Date Fri, 27 Mar 2015 12:42:02 GMT
I can recreate the tables, but what about the data? This looks like an
obvious feature that Spark SQL must have. People will want to transform
tons of data stored in HDFS through Hive from Spark SQL.

The Spark programming guide suggests it's possible.


Spark SQL also supports reading and writing data stored in Apache Hive
<http://hive.apache.org/>.  .... Configuration of Hive is done by placing
your hive-site.xml file in conf/.
https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables

For some reason it's not working.
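
Since the guide says Hive is configured by placing hive-site.xml in conf/, one
thing that can be ruled out first is the file itself. Below is a minimal
sketch of such a check, using only the Python standard library. The function
names (read_hive_site, check_hive_site) and the SPARK_HOME fallback are
assumptions made for illustration, not part of Spark or Hive; the property
names are the ones from the snippet quoted in this thread. It only confirms
that a file with that exact name exists and carries the metastore connection
properties. It does not verify that Spark can actually reach the metastore.

```python
import os
import xml.etree.ElementTree as ET

# Property names taken from the hive-site.xml snippet quoted in this thread.
REQUIRED_PROPERTIES = (
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
)


def read_hive_site(path):
    """Return the <property> entries of a Hadoop-style XML config as a dict."""
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_hive_site(conf_dir):
    """List problems with conf_dir/hive-site.xml; an empty list means none found."""
    path = os.path.join(conf_dir, "hive-site.xml")
    if not os.path.isfile(path):
        return ["missing file: " + path]
    props = read_hive_site(path)
    return ["missing or empty property: " + name
            for name in REQUIRED_PROPERTIES if not props.get(name)]


if __name__ == "__main__":
    # Assumes SPARK_HOME is set in the environment; falls back to the current directory.
    conf_dir = os.path.join(os.environ.get("SPARK_HOME", "."), "conf")
    for problem in check_hive_site(conf_dir) or ["hive-site.xml looks complete"]:
        print(problem)
```

Run it on the machine that launches the Spark shell or job, since that is
where conf/ is read from.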


On Fri, Mar 27, 2015 at 3:35 PM, Arush Kharbanda <arush@sigmoidanalytics.com
> wrote:

> It seems Spark SQL accesses some more columns apart from those created by
> Hive.
>
> You can always recreate the tables. You would need to execute the table
> creation scripts, but it would be good to avoid recreation.
>
> On Fri, Mar 27, 2015 at 3:20 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
> wrote:
>
>> I did copy hive-conf.xml from the Hive installation into spark-home/conf.
>> It does have all the metastore connection details: host, username, passwd,
>> driver and others.
>>
>>
>>
>> Snippet
>> ======
>>
>>
>> <configuration>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionURL</name>
>>   <value>jdbc:mysql://host.vip.company.com:3306/HDB</value>
>> </property>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionDriverName</name>
>>   <value>com.mysql.jdbc.Driver</value>
>>   <description>Driver class name for a JDBC metastore</description>
>> </property>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionUserName</name>
>>   <value>hiveuser</value>
>>   <description>username to use against metastore database</description>
>> </property>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionPassword</name>
>>   <value>some-password</value>
>>   <description>password to use against metastore database</description>
>> </property>
>>
>> <property>
>>   <name>hive.metastore.local</name>
>>   <value>false</value>
>>   <description>controls whether to connect to remote metastore server or
>> open a new metastore server in Hive Client JVM</description>
>> </property>
>>
>> <property>
>>   <name>hive.metastore.warehouse.dir</name>
>>   <value>/user/hive/warehouse</value>
>>   <description>location of default database for the
>> warehouse</description>
>> </property>
>>
>> ......
>>
>>
>>
>> When I attempt to read a Hive table, it does not work: dw_bid does not
>> exist.
>>
>> I am sure there is a way to read tables stored in HDFS (Hive) from Spark
>> SQL. Otherwise how would anyone do analytics since the source tables are
>> always either persisted directly on HDFS or through Hive.
>>
>>
>> On Fri, Mar 27, 2015 at 1:15 PM, Arush Kharbanda <
>> arush@sigmoidanalytics.com> wrote:
>>
>>> Since Hive and Spark SQL both use HDFS and the Hive metastore internally,
>>> the only thing you want to change is the processing engine. You can try
>>> to bring your hive-site.xml to %SPARK_HOME%/conf/hive-site.xml. (Ensure
>>> that the hive-site.xml captures the metastore connection details.)
>>>
>>> It's a hack; I haven't tried it. I have played around with the metastore
>>> and it should work.
>>>
>>> On Fri, Mar 27, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
>>> wrote:
>>>
>>>> I have a few tables that were created in Hive. I want to transform data
>>>> stored in these Hive tables using Spark SQL. Is this even possible?
>>>>
>>>> So far I have seen that I can create new tables using the Spark SQL
>>>> dialect. However, when I run show tables or do desc hive_table, it says
>>>> table not found.
>>>>
>>>> I am now wondering whether this support is present in Spark SQL or not.
>>>>
>>>> --
>>>> Deepak
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> [image: Sigmoid Analytics] <http://htmlsig.com/www.sigmoidanalytics.com>
>>>
>>> *Arush Kharbanda* || Technical Teamlead
>>>
>>> arush@sigmoidanalytics.com || www.sigmoidanalytics.com
>>>
>>
>>
>>
>> --
>> Deepak
>>
>>
>
>



-- 
Deepak
