spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jiří Syrový <syrovy.j...@gmail.com>
Subject Can't zip RDDs with unequal numbers of partitions
Date Thu, 17 Mar 2016 17:03:57 GMT
Hi,

any idea what could be causing this issue? It started appearing after
changing parameter



*    spark.sql.autoBroadcastJoinThreshold to 100000*

Caused by: java.lang.IllegalArgumentException: Can't zip RDDs with unequal
numbers of partitions
        at
org.apache.spark.rdd.ZippedPartitionsBaseRDD.getPartitions(ZippedPartitionsRDD.scala:57)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at
org.apache.spark.rdd.PartitionCoalescer.<init>(CoalescedRDD.scala:172)
        at
org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:85)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
        at
org.apache.spark.sql.execution.Exchange.prepareShuffleDependency(Exchange.scala:220)
        at
org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:254)
        at
org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:248)
        at
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
        ... 28 more

Mime
View raw message