spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: insert hive partitioned table
Date Mon, 16 Mar 2015 14:49:50 GMT
I see. Since all Spark SQL queries must be issued from the driver side,
you'll have to first collect all the values of interest to the driver,
and then use them to compose one or more INSERT statements.
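
A minimal sketch of that approach, not a definitive implementation. It
assumes the temporary table "speeddata" itself carries year, month and
zone columns (the thread does not show its schema), and the names
PartitionInserts, PartitionKey, insertStatement and insertStatements are
hypothetical helpers introduced only for illustration. The statement
building is plain Scala; the commented lines at the end show how it
might be wired to the hiveContext from the thread.

```scala
// Sketch only: table and column names (speed, speeddata, key, speed,
// direction, year, month, zone) come from the thread; everything else
// is a hypothetical helper.
object PartitionInserts {
  // One combination of partition values, as collected to the driver.
  final case class PartitionKey(year: Int, month: Int, zone: Int)

  // Compose a static-partition INSERT for a single key. The WHERE clause
  // (an assumption about speeddata's schema) keeps only the rows that
  // belong to that partition.
  def insertStatement(p: PartitionKey): String =
    "INSERT OVERWRITE TABLE speed " +
      s"PARTITION (year=${p.year}, month=${p.month}, zone=${p.zone}) " +
      "SELECT key, speed, direction FROM speeddata " +
      s"WHERE year = ${p.year} AND month = ${p.month} AND zone = ${p.zone}"

  // One statement per distinct key, in first-seen order.
  def insertStatements(keys: Seq[PartitionKey]): Seq[String] =
    keys.distinct.map(insertStatement)
}

// On the driver, assuming the hiveContext from the thread:
//
//   import PartitionInserts._
//   val keys = hiveContext
//     .sql("SELECT DISTINCT year, month, zone FROM speeddata")
//     .collect()
//     .map(r => PartitionKey(r.getInt(0), r.getInt(1), r.getInt(2)))
//   insertStatements(keys).foreach(stmt => hiveContext.sql(stmt))
```

Each distinct combination costs one INSERT job, so this is only
reasonable when the number of partitions touched is small.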

Cheng

On 3/16/15 10:33 PM, patcharee wrote:
> I would like to insert into the table, and the values of the partition
> columns to be inserted must come from the temporary registered table/DataFrame.
>
> Patcharee
>
>
> On 16. mars 2015 15:26, Cheng Lian wrote:
>>
>> Not quite sure whether I understand your question properly. But if
>> you just want to read the partition columns, it's pretty easy. Taking
>> the "year" column as an example, you can do this in HiveQL:
>>
>>     hiveContext.sql("SELECT year FROM speed")
>>
>> or in DataFrame DSL:
>>
>>     hiveContext.table("speed").select("year")
>>
>> Cheng
>>
>> On 3/16/15 9:59 PM, patcharee wrote:
>>
>>> Hi,
>>>
>>> I tried to insert into a partitioned Hive table:
>>>
>>> val ZONE: Int = Integer.valueOf(args(2))
>>> val MONTH: Int = Integer.valueOf(args(3))
>>> val YEAR: Int = Integer.valueOf(args(4))
>>>
>>> val weightedUVToDF = weightedUVToRecord.toDF()
>>> weightedUVToDF.registerTempTable("speeddata")
>>> hiveContext.sql("INSERT OVERWRITE table speed partition (year=" +
>>>     YEAR + ",month=" + MONTH + ",zone=" + ZONE + ") " +
>>>     "select key, speed, direction from speeddata")
>>>
>>> First I registered a temporary table "speeddata". The values of the
>>> partition columns (year, month, zone) come from user input. If I
>>> would like to get the values of the partition columns from the
>>> temporary table instead, how can I do that?
>>>
>>> BR,
>>> Patcharee
>>
>

