spark-user mailing list archives

From Babak Alipour <babak.alip...@gmail.com>
Subject Re: DataFrame Sort gives Cannot allocate a page with more than 17179869176 bytes
Date Sun, 02 Oct 2016 04:31:14 GMT
To add one more note: I tried running more, smaller executors, each with
32-64g of memory and spark.executor.cores set to 2-4 (with 2 workers as
well), and I'm still getting the same exception:

java.lang.IllegalArgumentException: Cannot allocate a page with more than 17179869176 bytes
        at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:241)
        at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:92)
        at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.growPointerArrayIfNecessary(UnsafeExternalSorter.java:343)
        at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:393)
        at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:94)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
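
For reference, the byte count in the exception appears to be exactly the hard cap quoted further down in this thread, ((1L << 31) - 1) * 8L. Here is a minimal Java sketch of that arithmetic; the class and method names are made up for illustration and are not part of Spark's API:

```java
public class PageSizeLimit {
    // Hypothetical helper: evaluates the cap expression quoted in the thread,
    // ((1L << 31) - 1) * 8L, i.e. (2^31 - 1) eight-byte longs.
    static long maxPageSizeBytes() {
        return ((1L << 31) - 1) * 8L;
    }

    public static void main(String[] args) {
        long limit = maxPageSizeBytes();
        // Same number that appears in the exception message.
        System.out.println(limit); // 17179869176
        // Expressed in GiB for readability (slightly less than 16).
        System.out.println(limit / (1024.0 * 1024 * 1024));
    }
}
```

That is just under 16 GiB: a Java long[] can hold at most Integer.MAX_VALUE (2^31 - 1) elements of 8 bytes each, which is consistent with the on-heap explanation quoted below.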

>Babak

*Babak Alipour ,*
*University of Florida*

On Sat, Oct 1, 2016 at 11:35 PM, Babak Alipour <babak.alipour@gmail.com>
wrote:

> Do you mean running a multi-JVM 'cluster' on the single machine? How would
> that affect performance/memory-consumption? If a multi-JVM setup can
> handle such a large input, then why can't a single-JVM break down the job
> into smaller tasks?
>
> I also found that SPARK-9411 mentions making the page size configurable,
> but it's hard-limited to ((1L << 31) - 1) * 8L [1]
>
> [1] https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java
>
> SPARK-9452 also talks about larger page sizes, but I don't know how that
> affects my use case. [2]
>
> [2] https://github.com/apache/spark/pull/7891
>
>
> The reason provided here is that the on-heap allocator's maximum page
> size is limited by the maximum amount of data that can be stored in a
> long[].
> Is it possible to force this specific operation to go off-heap so that it
> can possibly use a bigger page size?
>
>
>
> >Babak
>
>
> *Babak Alipour ,*
> *University of Florida*
>
> On Fri, Sep 30, 2016 at 3:03 PM, Vadim Semenov <
> vadim.semenov@datadoghq.com> wrote:
>
>> Run more smaller executors: change `spark.executor.memory` to 32g and
>> `spark.executor.cores` to 2-4, for example.
>>
>> Changing driver's memory won't help because it doesn't participate in
>> execution.
>>
>> On Fri, Sep 30, 2016 at 2:58 PM, Babak Alipour <babak.alipour@gmail.com>
>> wrote:
>>
>>> Thank you for your replies.
>>>
>>> @Mich, using LIMIT 100 in the query prevents the exception but given the
>>> fact that there's enough memory, I don't think this should happen even
>>> without LIMIT.
>>>
>>> @Vadim, here's the full stack trace:
>>>
>>> Caused by: java.lang.IllegalArgumentException: Cannot allocate a page with more than 17179869176 bytes
>>>         at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:241)
>>>         at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:121)
>>>         at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:374)
>>>         at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:396)
>>>         at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:94)
>>>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown Source)
>>>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown Source)
>>>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>>>         at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>>>         at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>>>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>>>         at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>>>         at org.apache.spark.scheduler.Task.run(Task.scala:85)
>>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>>
>>> I'm running Spark in local mode, so there is only one executor (the
>>> driver), and spark.driver.memory is set to 64g. Changing the driver's
>>> memory doesn't help.
>>>
>>> *Babak Alipour ,*
>>> *University of Florida*
>>>
>>> On Fri, Sep 30, 2016 at 2:05 PM, Vadim Semenov <
>>> vadim.semenov@datadoghq.com> wrote:
>>>
>>>> Can you post the whole exception stack trace?
>>>> What are your executor memory settings?
>>>>
>>>> Right now I assume that it happens in UnsafeExternalRowSorter ->
>>>> UnsafeExternalSorter:insertRecord
>>>>
>>>> Running more executors with lower `spark.executor.memory` should help.
>>>>
>>>>
>>>> On Fri, Sep 30, 2016 at 12:57 PM, Babak Alipour <
>>>> babak.alipour@gmail.com> wrote:
>>>>
>>>>> Greetings everyone,
>>>>>
>>>>> I'm trying to read a single field of a Hive table stored as Parquet in
>>>>> Spark (~140GB for the entire table, this single field should be just a
>>>>> few GB) and look at the sorted output using the following:
>>>>>
>>>>> sql("SELECT " + field + " FROM MY_TABLE ORDER BY " + field + " DESC")
>>>>>
>>>>> But this simple line of code gives:
>>>>>
>>>>> Caused by: java.lang.IllegalArgumentException: Cannot allocate a page
>>>>> with more than 17179869176 bytes
>>>>>
>>>>> Same error for:
>>>>>
>>>>> sql("SELECT " + field + " FROM MY_TABLE").sort(field)
>>>>>
>>>>> and:
>>>>>
>>>>> sql("SELECT " + field + " FROM MY_TABLE").orderBy(field)
>>>>>
>>>>>
>>>>> I'm running this on a machine with more than 200GB of RAM, running in
>>>>> local mode with spark.driver.memory set to 64g.
>>>>>
>>>>> I do not know why it cannot allocate a big enough page, or why it is
>>>>> trying to allocate such a big page in the first place.
>>>>>
>>>>> I hope someone with more knowledge of Spark can shed some light on
>>>>> this. Thank you!
>>>>>
>>>>>
>>>>> *Best regards,*
>>>>> *Babak Alipour ,*
>>>>> *University of Florida*
>>>>>
>>>>
>>>>
>>>
>>
>
