spark-user mailing list archives

From Ajay Chander <itsche...@gmail.com>
Subject Re: Spark support for update/delete operations on Hive ORC transactional tables
Date Wed, 22 Jun 2016 16:55:15 GMT
Thanks for the confirmation Mich!

On Wednesday, June 22, 2016, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi Ajay,
>
> I am afraid that, for now, transaction heartbeats do not work through
> Spark, so I have no other solution.
>
> This is an interesting point: with Hive running on the Spark engine there
> is no issue, because Hive handles the transactions.
>
> I gather that, in its simplest form, Hive has to deal with its metadata
> for the transaction logic, but Spark somehow cannot do that.
>
> In short, that is it. You need to do that through Hive.
>
> Cheers,
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 22 June 2016 at 16:08, Ajay Chander <itschevva@gmail.com> wrote:
>
>> Hi Mich,
>>
>> Right now I have a similar use case where I have to delete some rows
>> from a Hive table. My Hive table is stored as ORC, bucketed, and has the
>> transactional property set. I can delete from the Hive shell but not from
>> my spark-shell or Spark app. Were you able to find any workaround? Thank
>> you.
>>
>> Regards,
>> Ajay
>>
>>
>> On Thursday, June 2, 2016, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>>
>>> Thanks for that.
>>>
>>> I will have a look.
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>> On 2 June 2016 at 10:46, Elliot West <teabot@gmail.com> wrote:
>>>
>>>> Related to this, there exists an API in Hive to simplify the
>>>> integration of other frameworks with Hive's ACID feature:
>>>>
>>>> See:
>>>> https://cwiki.apache.org/confluence/display/Hive/HCatalog+Streaming+Mutation+API
>>>>
>>>> It contains code for maintaining heartbeats, handling locks and
>>>> transactions, and submitting mutations in a distributed environment.
>>>>
>>>> We have used it to write to transactional tables from Cascading-based
>>>> processes.
>>>>
>>>> Elliot.
>>>>
>>>>
>>>> On 2 June 2016 at 09:54, Mich Talebzadeh <mich.talebzadeh@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Spark does not support transactions because, as I understand it, there
>>>>> is a piece on the execution side that needs to send heartbeats to the
>>>>> Hive metastore saying "a transaction is still alive". That has not been
>>>>> implemented in Spark yet, to my knowledge.
>>>>>
>>>>> Any idea of the timeline for when we are going to have support for
>>>>> transactions in Spark for Hive ORC tables? This would be really useful.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>> Dr Mich Talebzadeh
>>>>>
>>>>>
>>>>>
>>>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>
>>>>>
>>>>>
>>>>> http://talebzadehmich.wordpress.com
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>
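
The workaround agreed on in this thread is to issue the DELETE through Hive itself, so that Hive (not Spark) opens the transaction and keeps it alive. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not anything posted by the participants: the helper names and the table name in the usage note are hypothetical, the cursor is assumed to be any Python DB-API style cursor connected to HiveServer2 (for example one obtained from the PyHive package), and the two session settings are the client-side ones listed in the Hive transactions documentation. The metastore and compactor are assumed to be configured for ACID already.

```python
"""Sketch: route a DELETE on a Hive ACID table through Hive instead of Spark."""

# Client-side session settings from the Hive transactions documentation.
# Assumption: server-side ACID configuration (metastore, compactor) is in place.
ACID_SESSION_SETTINGS = (
    "SET hive.support.concurrency=true",
    "SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
)


def build_delete(table, predicate):
    """Return a HiveQL DELETE statement for `table` restricted by `predicate`."""
    if not predicate.strip():
        # Guard against accidentally deleting every row.
        raise ValueError("refusing to build a DELETE without a predicate")
    return f"DELETE FROM {table} WHERE {predicate}"


def delete_via_hive(cursor, table, predicate):
    """Run the DELETE in a Hive session so that Hive manages the transaction.

    `cursor` is assumed to expose execute(sql), as DB-API cursors do.
    Returns the DELETE statement that was submitted.
    """
    for setting in ACID_SESSION_SETTINGS:
        cursor.execute(setting)
    statement = build_delete(table, predicate)
    cursor.execute(statement)
    return statement
```

A Spark job could call `delete_via_hive(cursor, "sales_txn", "id = 42")` on the driver before or after its own read-only work, where `sales_txn` stands in for an ORC, bucketed table with the transactional property set, as described above. The predicate is interpolated as-is, so it should come from trusted code rather than user input.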
