spark-user mailing list archives

From Michael Malak <michaelma...@yahoo.com>
Subject Re: Bug when zip with longs and too many partitions?
Date Mon, 12 May 2014 23:45:48 GMT


It turns out this was already reported a year ago: RDD zip() gives wrong results when the
number of partitions does not evenly divide the total number of elements in the RDD:

https://groups.google.com/forum/#!msg/spark-users/demrmjHFnoc/Ek3ijiXHr2MJ
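To make the failure mode concrete, here is a rough sketch in plain Scala (no Spark needed). It assumes zip() pairs elements partition by partition, and that the two collections get split by two different slicing rules. The helper names sliceByPosition, sliceByChunk and zipPartitions are my own, and the two slicing rules are hypothetical, not copied from Spark's source; I picked them only because, for 2 elements over 4 partitions, they give different per-partition sizes.

```scala
// Hypothetical slicing rules (NOT taken from Spark's source). They only show
// how two collections with the same element count and the same partition
// count can still end up with different per-partition sizes.

// Rule A: partition i gets the elements in [i*len/n, (i+1)*len/n).
def sliceByPosition[T](xs: Seq[T], n: Int): Seq[Seq[T]] =
  (0 until n).map { i =>
    val start = ((i * xs.length.toLong) / n).toInt
    val end = (((i + 1) * xs.length.toLong) / n).toInt
    xs.slice(start, end)
  }

// Rule B: fixed chunks of ceil(len/n) elements, padded with empty partitions.
def sliceByChunk[T](xs: Seq[T], n: Int): Seq[Seq[T]] = {
  val size = math.max(1, (xs.length + n - 1) / n)
  val chunks = xs.grouped(size).toSeq
  chunks ++ Seq.fill(n - chunks.length)(Seq.empty[T])
}

// Partition-wise zip: elements are paired only within partitions that share
// an index, so anything without a counterpart in its partition is dropped.
def zipPartitions[A, B](a: Seq[Seq[A]], b: Seq[Seq[B]]): Seq[(A, B)] =
  a.zip(b).flatMap { case (pa, pb) => pa.zip(pb) }

val left = Seq(1L, 2L)
val right = Seq(11, 12)

// Same rule on both sides: partitions line up, every element finds its pair.
val sameRule = zipPartitions(sliceByPosition(left, 4), sliceByPosition(right, 4))

// Different rules: left is [1],[2],[],[] and right is [],[11],[],[12],
// so only partition 1 has an element on both sides.
val mixedRules = zipPartitions(sliceByChunk(left, 4), sliceByPosition(right, 4))

println(sameRule)
println(mixedRules)
```

With the same rule on both sides the pairs come out as (1,11) and (2,12); with mixed rules only (2,11) survives. Whether this is what actually happens inside Spark for Long versus Int ranges is a guess on my part.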

I will file a JIRA ticket as soon as the ASF JIRA system lets me reset my password.



On Sunday, May 11, 2014 4:40 AM, Michael Malak <michaelmalak@yahoo.com> wrote:

Is this a bug?

scala> sc.parallelize(1 to 2,4).zip(sc.parallelize(11 to 12,4)).collect
res0: Array[(Int, Int)] = Array((1,11), (2,12))

scala> sc.parallelize(1L to 2L,4).zip(sc.parallelize(11 to 12,4)).collect
res1: Array[(Long, Int)] = Array((2,11))
