spark-dev mailing list archives

From khayyatzy <>
Subject [GitHub] incubator-spark pull request: Adding RDD unique self cross product
Date Wed, 12 Feb 2014 13:43:26 GMT
GitHub user khayyatzy opened a pull request:

    Adding RDD unique self cross product

    I am using Spark in a data analysis project and I frequently require the unique self
cross product of a single RDD. Since I am using Spark's Java API, I added the new function
"selfCartesian" to JavaRDDLike.scala. I also modified RDD.scala, where it calls "CartesianRDD2".
"CartesianRDD2" has a similar implementation to "CartesianRDD", except that it only returns
elements (a, b) if a.index <= b.index. I have been using this modification of Spark for a couple
of months and the function has always returned correct results.
    I hope this small new feature will be useful to other Spark users.
    Zuhair Khayyat
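
The rule "(a, b) if a.index <= b.index" can be illustrated with a minimal sketch on a plain Java list. This is not the code from the pull request and it does not use Spark: it assumes that "index" means the position of an element in the collection, and the class name SelfCartesianSketch and the list-based selfCartesian helper are hypothetical names chosen for illustration only.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class SelfCartesianSketch {

    // Hypothetical helper (not part of Spark): returns every pair (a, b)
    // in which a's position in the list is <= b's position.
    static <T> List<Map.Entry<T, T>> selfCartesian(List<T> items) {
        List<Map.Entry<T, T>> pairs = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            // Starting j at i keeps (a, a) and (a, b) but never emits the mirror (b, a).
            for (int j = i; j < items.size(); j++) {
                pairs.add(new SimpleImmutableEntry<>(items.get(i), items.get(j)));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs =
                selfCartesian(Arrays.asList("x", "y", "z"));
        // n elements give n * (n + 1) / 2 pairs instead of the n * n
        // pairs of a full cartesian product.
        System.out.println(pairs.size() + " pairs: " + pairs);
    }
}
```

Under this assumption the result holds roughly half as many pairs as a full self cartesian product, which is likely the appeal for symmetric pairwise computations where (a, b) and (b, a) carry the same information.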

You can merge this pull request into a Git repository by running:

    $ git pull master

Alternatively you can review and apply these changes as the patch at:

commit 82a80ef264ad15fa706eb566691470308b30f63a
Author: Zuhair Khayyat <>
Date:   2014-02-12T12:46:05Z

    Adding unique self cross product of in a single RDD

commit fb8ad2eee1c4ce175f3cf4227492bbc9f3502db3
Author: Zuhair Khayyat <>
Date:   2014-02-12T12:54:11Z

    changing ClassManifest to ClassTag in unique self product classes

commit ea02451bb11274f55be3706ea86e21e43e54fd35
Author: Zuhair Khayyat <>
Date:   2014-02-12T12:59:16Z

    adding import scala.reflect.ClassTag to CartesianRDD2.scala

commit 8f81706f374773aeea0d608b6baa9d2164c8f364
Author: Zuhair Khayyat <>
Date:   2014-02-12T13:29:27Z

    removing unwanted text from CartesianRDD2.scala

