spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: Facing memory leak with Pyarrow enabled and toPandas()
Date Fri, 22 Jan 2021 08:47:52 GMT
Hi,

Could you please mention the Spark version, share the code you use to set up the
Spark session, and describe the operation you are referring to? It would also
help to know how much memory your system has and how many executors you are
running per node.
In general, I have run into issues when doing group-bys or aggregates over
datasets larger than 2 GB on systems with less RAM than that.
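For reference, the session-setup detail being asked for might look like the
sketch below (Spark 3.x config keys; the app name, memory value, and batch
size are illustrative assumptions, not taken from this thread):

```python
from pyspark.sql import SparkSession

# Sketch of a session setup relevant to toPandas() + Arrow memory issues.
spark = (
    SparkSession.builder
    .appName("arrow-topandas-repro")  # hypothetical app name
    # toPandas() collects the entire DataFrame to the driver, so the
    # driver heap must be able to hold the full dataset:
    .config("spark.driver.memory", "8g")  # illustrative value
    # Arrow-based conversion (Spark 3.x key; Spark 2.x used
    # spark.sql.execution.arrow.enabled instead):
    .config("spark.sql.execution.arrow.pyspark.enabled", "true")
    # Smaller Arrow batches can lower peak memory during conversion:
    .config("spark.sql.execution.arrow.maxRecordsPerBatch", "10000")
    .getOrCreate()
)

print(spark.version)  # the Spark version asked about above
```

Including output like the printed version string, along with the driver-memory
setting, usually makes memory problems like this much easier to diagnose.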

Regards
Gourav

On Thu, 21 Jan 2021, 12:24 Divyanshu Kumar, <cool.ajourney@gmail.com> wrote:

> Hi, I am facing this issue while using toPandas() and Pyarrow
> simultaneously.
>
> pandas - toPandas() giving memory leak error with PySpark arrow enabled -
> Stack Overflow
> <https://stackoverflow.com/questions/65824894/topandas-giving-memory-leak-error-with-pyspark-arrow-enabled>
>
