spark-user mailing list archives

From: Tzahi File <>
Subject: Issue with pyspark query
Date: Wed, 10 Jun 2020 11:24:04 GMT

This is a general question about moving a Spark SQL query to PySpark; if
needed I can add more details from the error log and the query syntax.
I'm trying to move a Spark SQL query to run through PySpark.
The query syntax and Spark configuration are the same, but for some reason
the query fails in PySpark with a Java heap space error.
In the Spark SQL version I'm using INSERT OVERWRITE ... PARTITION, while in
PySpark I'm using a DataFrame write to a specific location in S3, as in the
sketch below.
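
For illustration, a minimal sketch of the two write paths being compared; the
table name, column names, and S3 path are placeholders, not the real ones
from the job:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-vs-df-write").getOrCreate()

# Toy data standing in for the real query result.
df = spark.createDataFrame(
    [(1, "a", "2020-06-10"), (2, "b", "2020-06-10")],
    ["id", "value", "dt"],
)
df.createOrReplaceTempView("src")

# Spark SQL path: INSERT OVERWRITE one partition of an existing table
# (assumes a partitioned table named `target` already exists).
# spark.sql("""
#     INSERT OVERWRITE TABLE target PARTITION (dt='2020-06-10')
#     SELECT id, value FROM src
# """)

# PySpark DataFrame path: overwrite a partitioned layout at an S3 location
# (requires S3 credentials and hadoop-aws on the classpath).
(spark.table("src")
      .write
      .mode("overwrite")
      .partitionBy("dt")
      .parquet("s3://my-bucket/target/"))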

Are there any configuration differences that you think I might need to
change?
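
For reference, a minimal sketch of how the same settings can be pinned and
inspected on the PySpark side; the values below are placeholders, and driver
memory generally has to be set at submit time (spark-submit) rather than in
the builder:

from pyspark.sql import SparkSession

# Placeholder values; match them to whatever the spark-sql job uses.
spark = (SparkSession.builder
         .appName("same-config-as-spark-sql")
         .config("spark.executor.memory", "8g")
         .config("spark.sql.shuffle.partitions", "200")
         .getOrCreate())

# Dump the effective configuration to diff against the spark-sql run.
for key, value in sorted(spark.sparkContext.getConf().getAll()):
    print(key, "=", value)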


Data Engineer
