spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject Re: Update MySQL table via Spark/SparkR?
Date Mon, 21 Aug 2017 22:58:46 GMT
How about an append plus a view simulating the update? Then you do not need
two processes...
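
For example, a rough sketch of that idea in SparkR plus DBI/RMySQL on the
driver. The table results_history, the key column id, the timestamp column
updated_at, and all connection details are made up for illustration:

  # assumes an active SparkR session and a SparkDataFrame `results_df`
  library(SparkR)
  library(DBI)

  jdbc_url <- "jdbc:mysql://dbhost:3306/mydb"

  # write every run as new row versions instead of updating in place
  write.jdbc(results_df, jdbc_url, tableName = "results_history",
             mode = "append", user = "myuser", password = "mypassword")

  # a view on the MySQL side exposes only the latest version per key,
  # so readers see "updated" data without any UPDATE statement
  con <- dbConnect(RMySQL::MySQL(), dbname = "mydb", host = "dbhost",
                   user = "myuser", password = "mypassword")
  dbExecute(con, "
    CREATE OR REPLACE VIEW results AS
    SELECT h.*
    FROM results_history h
    WHERE h.updated_at = (SELECT MAX(updated_at)
                          FROM results_history
                          WHERE id = h.id)")
  dbDisconnect(con)

Readers keep querying the view while the writes stay pure appends; old row
versions can be pruned periodically.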

On Tue, 22 Aug 2017 at 8:44 am, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi Jake,
>
> This is an issue across all RDBMSs, including Oracle. When you are
> updating, you have to commit or roll back in the RDBMS itself, and I am not
> aware of Spark doing that.
>
> The staging table is a safer method as it follows an ETL-type approach. You
> create the new data in a staging table in the RDBMS and do the DML in the
> RDBMS itself, where you can control commit or rollback. That is the way I
> would do it. A simple shell script can do both steps.
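
Roughly like this, for instance (a sketch of the staging-table approach only;
results, results_staging, the join key id, the column value, and the
credentials are invented). Spark only overwrites the staging table, and the
UPDATE runs inside a MySQL transaction where commit or rollback stays under
your control:

  # assumes an active SparkR session and a SparkDataFrame `results_df`
  library(SparkR)
  library(DBI)

  jdbc_url <- "jdbc:mysql://dbhost:3306/mydb"

  # step 1: Spark only (over)writes a staging table
  write.jdbc(results_df, jdbc_url, tableName = "results_staging",
             mode = "overwrite", user = "myuser", password = "mypassword")

  # step 2: the DML runs in MySQL, inside a transaction you control
  con <- dbConnect(RMySQL::MySQL(), dbname = "mydb", host = "dbhost",
                   user = "myuser", password = "mypassword")
  dbBegin(con)
  tryCatch({
    dbExecute(con, "
      UPDATE results r
      JOIN results_staging s ON r.id = s.id
      SET r.value = s.value")
    dbCommit(con)
  }, error = function(e) dbRollback(con))
  dbDisconnect(con)

Brand-new keys would need a matching INSERT ... SELECT from the staging
table; both statements can equally be run from a shell script with the mysql
client, as the previous paragraph suggests.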
>
> HTH
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 21 August 2017 at 15:50, Jake Russ <jruss@bloomintelligence.com> wrote:
>
>> Hi everyone,
>>
>>
>>
>> I’m currently using SparkR to read data from a MySQL database, perform
>> some calculations, and then write the results back to MySQL. Is it still
>> true that Spark does not support UPDATE queries via JDBC? I’ve seen many
>> posts reporting that Spark’s DataFrameWriter does not support UPDATE
>> queries via JDBC <https://issues.apache.org/jira/browse/SPARK-19335>;
>> it will only “append” to or “overwrite” existing tables. The best advice
>> I’ve found so far for performing this update is to write to a staging
>> table in MySQL
>> <https://stackoverflow.com/questions/34643200/spark-dataframes-upsert-to-postgres-table>
>> and then perform the UPDATE query on the MySQL side.
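
For reference, the write path in question looks roughly like this in SparkR
(table names and connection details invented); mode accepts only "append",
"overwrite", "error", or "ignore", and there is no update/upsert mode, which
is why the staging-table pattern sketched earlier in the thread is needed:

  library(SparkR)
  sparkR.session()

  jdbc_url <- "jdbc:mysql://dbhost:3306/mydb"

  # read, compute, write back -- but only whole-table append/overwrite
  df <- read.jdbc(jdbc_url, "results",
                  user = "myuser", password = "mypassword")
  # ... calculations on df ...
  write.jdbc(df, jdbc_url, tableName = "results_staging",
             mode = "overwrite", user = "myuser", password = "mypassword")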
>>
>>
>>
>> Ideally, I’d like to handle the update during the write operation. Has
>> anyone else encountered this limitation and have a better solution?
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Jake
>>
>
--
Best Regards,
Ayan Guha
