spark-user mailing list archives

From Babak Alipour <babak.alip...@gmail.com>
Subject DataFrame Sort gives Cannot allocate a page with more than 17179869176 bytes
Date Fri, 30 Sep 2016 16:57:24 GMT
Greetings everyone,

I'm trying to read a single field of a Hive table stored as Parquet in
Spark (~140GB for the entire table, this single field should be just a few
GB) and look at the sorted output using the following:

sql("SELECT " + field + " FROM MY_TABLE ORDER BY " + field + " DESC")

But this simple line of code gives:

Caused by: java.lang.IllegalArgumentException: Cannot allocate a page with
more than 17179869176 bytes

Same error for:

sql("SELECT " + field + " FROM MY_TABLE").sort(field)

and:

sql("SELECT " + field + " FROM MY_TABLE").orderBy(field)


I'm running this on a machine with more than 200GB of RAM, running in local
mode with spark.driver.memory set to 64g.

I do not know why it cannot allocate a big enough page, or why it is
trying to allocate such a big page in the first place.
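
A side note on the number in the error: 17179869176 is exactly
(2^31 - 1) * 8 bytes, just under 16 GiB, which suggests a fixed per-page
cap rather than a shortage of RAM. Whether that is how Spark derives the
limit is an assumption here; the snippet below only checks the arithmetic
and can be pasted into the Scala REPL or spark-shell.

```scala
// Plain arithmetic on the figure from the error message; no Spark APIs are used.
val reported = 17179869176L

// Assumption for illustration: the cap is (2^31 - 1) eight-byte words.
val maxPageBytes = ((1L << 31) - 1) * 8L

// The two values match exactly.
assert(maxPageBytes == reported)

// Express the figure in GiB for readability.
val reportedGiB = reported.toDouble / (1L << 30)
println(f"$reportedGiB%.2f GiB")
```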

I hope someone with more knowledge of Spark can shed some light on this.
Thank you!


*Best regards,*
*Babak Alipour,*
*University of Florida*
