spark-user mailing list archives

From: Michael Malak
Subject: Re: Bug when zip with longs and too many partitions?
Date: Mon, 12 May 2014 23:45:48 GMT

I've found that this was already reported a year ago: RDD zip() does not work correctly when
the number of partitions does not evenly divide the total number of elements in the RDD: !msg/spark-users/demrmjHFnoc/Ek3ijiXHr2MJ

I will enter a JIRA ticket just as soon as the ASF Jira system will let me reset my password.

On Sunday, May 11, 2014 4:40 AM, Michael Malak wrote:

Is this a bug?

scala> sc.parallelize(1 to 2,4).zip(sc.parallelize(11 to 12,4)).collect
res0: Array[(Int, Int)] = Array((1,11), (2,12))

scala> sc.parallelize(1L to 2L,4).zip(sc.parallelize(11 to 12,4)).collect
res1: Array[(Long, Int)] = Array((2,11))
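
One possible explanation for the two results above, sketched in plain Scala with no Spark involved. It is an assumption and is not confirmed anywhere in this thread: suppose parallelize splits an Int range at proportional positions but splits a Long range into fixed-size chunks from the front. zip() pairs partition i with partition i, so any element whose partner landed in a different partition would be dropped. All names below (ZipSliceSketch, sliceByPosition, sliceByChunk, zipPartitions) are hypothetical and are not Spark APIs.

```scala
// Plain Scala, no Spark required. Every name here is hypothetical.
object ZipSliceSketch {
  // Assumed strategy A: slice i covers positions
  // floor(i * n / numSlices) until floor((i + 1) * n / numSlices).
  def sliceByPosition[T](xs: Seq[T], numSlices: Int): Seq[Seq[T]] =
    (0 until numSlices).map { i =>
      val start = ((i * xs.length.toLong) / numSlices).toInt
      val end = (((i + 1) * xs.length.toLong) / numSlices).toInt
      xs.slice(start, end)
    }

  // Assumed strategy B: fixed chunks of ceil(n / numSlices) elements,
  // filled from the front, leaving any trailing slices empty.
  def sliceByChunk[T](xs: Seq[T], numSlices: Int): Seq[Seq[T]] = {
    val sliceSize = (xs.length + numSlices - 1) / numSlices
    (0 until numSlices).map(i => xs.slice(i * sliceSize, (i + 1) * sliceSize))
  }

  // Pair partition i on the left with partition i on the right, then
  // zip element-wise inside each pair; unmatched elements are dropped.
  def zipPartitions[A, B](left: Seq[Seq[A]], right: Seq[Seq[B]]): Seq[(A, B)] =
    left.zip(right).flatMap { case (l, r) => l.zip(r) }
}
```

Under these assumptions, sliceByPosition(11 to 12, 4) places the elements in partitions 1 and 3, while sliceByChunk(1L to 2L, 4) places them in partitions 0 and 1. Only partition 1 holds an element on both sides, so zipPartitions yields the single pair (2,11), matching res1. Slicing both sides with the same strategy yields (1,11) and (2,12), matching res0.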
