spark-user mailing list archives

From ed elliott <ed.elli...@outlook.com>
Subject Re: How can I use pyspark to upsert one row without replacing entire table
Date Wed, 12 Aug 2020 21:55:11 GMT
You’ll need to do a plain insert and use a trigger on the table to turn it into an upsert.
Also make sure your write mode is append rather than overwrite.

Ed
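
A minimal sketch of the trigger idea described above, not taken from the thread: SQLite (via Python's built-in sqlite3 module) stands in for the real database, and the table name `people`, the trigger name `people_upsert`, and the helper `append_rows` are hypothetical. The writer only issues plain inserts, and the trigger converts an insert for an existing id into an update.

```python
import sqlite3

# SQLite stands in for the real database; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, fname TEXT);
INSERT INTO people VALUES (1, 'john'), (2, 'Steve'), (3, 'Jack');

-- When an incoming id already exists, update that row and skip the insert.
CREATE TRIGGER people_upsert BEFORE INSERT ON people
WHEN EXISTS (SELECT 1 FROM people WHERE id = NEW.id)
BEGIN
    UPDATE people SET fname = NEW.fname WHERE id = NEW.id;
    SELECT RAISE(IGNORE);
END;
""")


def append_rows(connection, rows):
    """Issue plain inserts, the way an append-mode writer would."""
    connection.executemany(
        "INSERT INTO people (id, fname) VALUES (?, ?)", rows
    )
    connection.commit()


# Only the changed row is sent; the rest of the table is never read or replaced.
append_rows(conn, [(2, "Michael")])
rows = conn.execute("SELECT id, fname FROM people ORDER BY id").fetchall()
print(rows)
```

On the Spark side, the matching write would presumably be a DataFrame holding only the changed rows, written through the JDBC writer with mode("append"). Trigger syntax in Sybase differs from SQLite's, so treat the SQL above as an illustration of the pattern only.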

________________________________
From: Siavash Namvar <snsina@gmail.com>
Sent: Wednesday, August 12, 2020 4:09:07 PM
To: Sean Owen <srowen@gmail.com>
Cc: User <user@spark.apache.org>
Subject: Re: How can I use pyspark to upsert one row without replacing entire table

Thanks Sean,

Do you have any URL or reference that shows how to upsert in Spark? I need to update a Sybase
db.

On Wed, Aug 12, 2020 at 11:06 AM Sean Owen <srowen@gmail.com> wrote:
It's not so much Spark as the data format, and whether it supports
upserts. Parquet, CSV, JSON, etc. do not.
That is what Delta, Hudi et al. are for, and yes, you can upsert into them in Spark.
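
To illustrate the point about plain file formats, here is a small sketch (not from the thread) using Python's standard csv module. The helper `update_csv_row` is hypothetical. Changing a single row still means re-emitting every row, because the format has no notion of updating a record in place by key.

```python
import csv
import io


def update_csv_row(text, key, new_fname):
    """Return new CSV text with one row changed, plus the number of rows rewritten."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    rewritten = 0
    for row in reader:
        if row["id"] == key:
            row["fname"] = new_fname  # the only real change
        writer.writerow(row)          # but every row is written out again
        rewritten += 1
    return out.getvalue(), rewritten


original = "id,fname\n1,john\n2,Steve\n3,Jack\n"
updated, rows_written = update_csv_row(original, "2", "Michael")
print(updated)
print(rows_written)
```

Table formats such as Delta and Hudi add the bookkeeping that makes a keyed merge possible without the caller rewriting the whole dataset by hand.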

On Wed, Aug 12, 2020 at 9:57 AM Siavash Namvar <snsina@gmail.com> wrote:
>
> Hi,
>
> I have a use case where I read data from a db table and need to update a few rows based on
> the primary key, without replacing the entire table.
>
> For instance, if I have the following 3 rows:
>
> -------------------
> id | fname
> -------------------
>  1 | john
> -------------------
>  2 | Steve
> -------------------
>  3 | Jack
> -------------------
>
> And I would like to update the row with id=2 from Steve to Michael without replacing
> the entire table, so that the output looks like:
>
> -------------------
> id | fname
> -------------------
>  1 | john
> -------------------
>  2 | Michael
> -------------------
>  3 | Jack
> -------------------
>
> Keep in mind the actual db table is huge and the database is old, so I cannot read and replace
> the entire table.
>
> Thanks
