spark-user mailing list archives

From "Afshin, Bardia" <>
Subject UDF issues with spark
Date Fri, 08 Dec 2017 19:54:22 GMT
Using the pyspark CLI on Spark 2.1.1, I'm getting out-of-memory errors when running a UDF
on a recordset of 10 rows, with a mapping that returns the same value (arbitrary, for testing purposes).
This is on Amazon EMR release label 5.6.0 with the following hardware specs:

32 vCPU, 64 GiB memory, EBS only storage
EBS Storage: 100 GiB
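
The original code was not included in the message, so the following is only a minimal sketch of the kind of test described: a UDF that maps each value to itself, applied to a 10-row DataFrame. The names identity_map and run_repro, the app name, and the use of spark.range with a LongType return are all assumptions made for illustration, not the code that produced the error.

```python
# Hypothetical reproduction sketch; not the original poster's code.

def identity_map(value):
    """Return the input unchanged, mirroring a 'mapping of the same value'."""
    return value


def run_repro(row_count=10):
    """Apply identity_map as a UDF to a small DataFrame and collect the rows."""
    # Imported lazily so identity_map can be checked without a Spark install.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    # In the pyspark shell a session named `spark` already exists;
    # getOrCreate() reuses it instead of starting a second one.
    spark = SparkSession.builder.appName("udf-oom-repro").getOrCreate()

    # spark.range yields a single LongType column named "id": 0..row_count-1.
    df = spark.range(row_count)

    # Wrap the plain Python function as a UDF with an explicit return type.
    identity_udf = udf(identity_map, LongType())

    rows = df.withColumn("mapped", identity_udf(df["id"])).collect()
    return [(row["id"], row["mapped"]) for row in rows]


if __name__ == "__main__":
    print(run_repro())
```

If a sketch this small also runs out of memory, the cause is more likely the driver or executor memory configuration than the data volume, although that is a guess without the actual error output.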

