spark-user mailing list archives

From Deepak Sharma <>
Subject Merge query using spark sql
Date Mon, 02 Apr 2018 10:23:39 GMT
I am using Spark to run a merge query against PostgreSQL.
The way it is done now is to save the data to be merged into Postgres as
temp tables, and then run the merge queries in Postgres using a Java SQL
connection and statement.
So basically the query runs in Postgres.
The queries insert a row into the source table if it exists in the temp
table but not in the source table, and update the row otherwise.
The problem is that both tables have 400K records, so the whole query takes
20 hours to run.
Is there any way to do this in Spark itself, without running the query in PG,
so that it can complete in a reasonable time?
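
To make the merge semantics described above concrete, here is a minimal
sketch of the insert-else-update step written as one set-based statement.
It is not the actual query from this job. It uses Python's built-in sqlite3
module purely as a stand-in for Postgres so that it runs anywhere, and it
assumes SQLite 3.24 or newer for the ON CONFLICT clause. The table names
(source_table, temp_table), the columns (id, val) and the function names
are all hypothetical.

```python
import sqlite3


def merge_temp_into_source(conn):
    """Insert rows missing from source_table, update the ones already there.

    One set-based statement does both; no per-row round trips are made.
    Table and column names are hypothetical.
    """
    conn.execute(
        """
        INSERT INTO source_table (id, val)
        SELECT id, val FROM temp_table WHERE true  -- WHERE is needed by SQLite's parser
        ON CONFLICT (id) DO UPDATE SET val = excluded.val
        """
    )
    conn.commit()


def demo():
    # In-memory SQLite database standing in for the Postgres instance.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_table (id INTEGER PRIMARY KEY, val TEXT)")
    conn.execute("CREATE TABLE temp_table (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany(
        "INSERT INTO source_table VALUES (?, ?)", [(1, "old-a"), (2, "old-b")]
    )
    conn.executemany(
        "INSERT INTO temp_table VALUES (?, ?)", [(2, "new-b"), (3, "new-c")]
    )
    merge_temp_into_source(conn)
    return conn.execute("SELECT id, val FROM source_table ORDER BY id").fetchall()
```

In this sketch, demo() leaves id 1 untouched, updates id 2 and inserts id 3.
PostgreSQL 9.5 and later accept a similar INSERT ... ON CONFLICT ... DO UPDATE
form, provided there is a unique index or primary key on the conflict column.
Whether that applies to the tables in this job is an assumption.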

