spark-dev mailing list archives

From: Joseph Bradley <jos...@databricks.com>
Subject: Re: [ML] Random Forest Error: Size exceeds Integer.MAX_VALUE
Date: Wed, 05 Oct 2016 06:41:51 GMT
Could you please file a JIRA bug report and include more info about
what you ran? In particular:
* Random Forest Param settings
* dataset dimensionality, number of partitions, etc.
A sketch of how to collect these is included below.
Thanks!
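
For reference, a minimal sketch of how those could be collected, assuming
the training set is a DataFrame named trainingData (the name is
hypothetical, adapt to your job):

    import org.apache.spark.ml.classification.RandomForestClassifier

    val rf = new RandomForestClassifier()       // configured as in the failing job
    println(rf.extractParamMap())               // all Params, including defaults
    println(trainingData.rdd.getNumPartitions)  // partition count of the input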

On Tue, Oct 4, 2016 at 10:44 PM, Samkit Shah <samkit993@gmail.com> wrote:

> Hello folks,
> I am running Random Forest from spark.ml on Spark 1.6.1 on the Bimbo [1]
> dataset with the following configuration:
>
> "-Xms16384M" "-Xmx16384M" "-Dspark.locality.wait=0s" "-Dspark.driver.extraJavaOptions=-Xss10240k
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=2
> -XX:-UseAdaptiveSizePolicy -XX:ConcGCThreads=2 -XX:-UseGCOverheadLimit
> -XX:CMSInitiatingOccupancyFraction=75 -XX:NewSize=8g -XX:MaxNewSize=8g
> -XX:SurvivorRatio=3 -DnumPartitions=36" "-Dspark.submit.deployMode=cluster"
> "-Dspark.speculation=true" "-Dspark.speculation.multiplier=2"
> "-Dspark.driver.memory=16g" "-Dspark.speculation.interval=300ms"
>  "-Dspark.speculation.quantile=0.5" "-Dspark.akka.frameSize=768"
> "-Dspark.driver.supervise=false" "-Dspark.executor.cores=6"
> "-Dspark.executor.extraJavaOptions=-Xss10240k -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
> -XX:-UseAdaptiveSizePolicy -XX:+UseParallelGC -XX:+UseParallelOldGC
> -XX:ParallelGCThreads=6 -XX:NewSize=22g -XX:MaxNewSize=22g
> -XX:SurvivorRatio=2 -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCDateStamps"
> "-Dspark.rpc.askTimeout=10" "-Dspark.executor.memory=40g"
> "-Dspark.driver.maxResultSize=3g" "-Xss10240k" "-XX:+PrintGCDetails"
> "-XX:+PrintGCTimeStamps" "-XX:+PrintTenuringDistribution"
> "-XX:+UseConcMarkSweepGC" "-XX:+UseParNewGC" "-XX:ParallelGCThreads=2"
> "-XX:-UseAdaptiveSizePolicy" "-XX:ConcGCThreads=2"
> "-XX:-UseGCOverheadLimit" "-XX:CMSInitiatingOccupancyFraction=75"
> "-XX:NewSize=8g" "-XX:MaxNewSize=8g" "-XX:SurvivorRatio=3"
> "-DnumPartitions=36" "org.apache.spark.deploy.worker.DriverWrapper"
> "spark://Worker@11.0.0.106:56419"
>
>
> I get the following error:
> 16/10/04 06:55:05 WARN TaskSetManager: Lost task 8.0 in stage 19.0 (TID 194, 11.0.0.106): java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:869)
> at org.apache.spark.storage.DiskStore$$anonfun$getBytes$2.apply(DiskStore.scala:127)
> at org.apache.spark.storage.DiskStore$$anonfun$getBytes$2.apply(DiskStore.scala:115)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250)
> at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:129)
> at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:136)
> at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:503)
> at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:420)
> at org.apache.spark.storage.BlockManager.get(BlockManager.scala:625)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:154)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
> at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
>
> I have varied the number of partitions from 24 to 48, but I still get the
> same error. How can this problem be tackled?
>
>
> Thanks,
> Samkit
>
> [1]: https://www.kaggle.com/c/grupo-bimbo-inventory-demand
>
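
On the workaround question above: this exception comes from DiskStore
memory-mapping a cached block via FileChannel.map, which cannot map more
than Integer.MAX_VALUE bytes (about 2 GB) at once. So any single cached
partition over 2 GB will fail this way, and going from 24 to 48 partitions
can still leave individual blocks too large on a dataset of this size. A
common mitigation is to repartition more aggressively before fit() and to
cap the aggregation buffer. A minimal sketch, assuming the training set is
a DataFrame named training (the name and the numbers are hypothetical
illustrations, not tuned values):

    import org.apache.spark.ml.classification.RandomForestClassifier

    // More, smaller partitions keep each cached block well under the
    // 2 GB FileChannel.map limit.
    val repartitioned = training.repartition(400)

    val rf = new RandomForestClassifier()
      .setMaxMemoryInMB(256)  // caps memory used per aggregation batch
    val model = rf.fit(repartitioned)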
