spark-user mailing list archives

From Nitin kak <nitinkak...@gmail.com>
Subject Re: Partition sorting by Spark framework
Date Wed, 05 Nov 2014 22:21:37 GMT
Great!! Will try it. Thanks for answering.

On Wed, Nov 5, 2014 at 5:19 PM, Vipul Pandey <vipandey@gmail.com> wrote:

> One option is to call setKeyOrdering explicitly on a new ShuffledRDD after
> partitioning:
>
> val rdd = // your rdd
>
> // assuming the type is Int
> val srdd = new org.apache.spark.rdd.ShuffledRDD(rdd, rdd.partitioner.get)
>   .setKeyOrdering(Ordering[Int])
>
> Give it a try and see if it works. I have only used it on a toy RDD (not a
> real one), and it worked there.
>
> Check it out here:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala
>
>
>
>
> On Nov 5, 2014, at 1:39 PM, nitinkak001 <nitinkak001@gmail.com> wrote:
>
> I need to sort my RDD partitions, but a whole partition might not fit
> into memory, so I cannot use the Collections.sort() method. Does Spark
> support sorting within partitions as part of its framework? I am working
> on version 1.1.0.
>
> I found a similar unanswered question:
>
> http://apache-spark-user-list.1001560.n3.nabble.com/sort-order-after-reduceByKey-groupByKey-td2959.html
>
> Thanks All!!
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Partition-sorting-by-Spark-framework-tp18213.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
>
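
For readers of the archive, here is a minimal, self-contained sketch of the
ShuffledRDD / setKeyOrdering suggestion quoted above. It is an illustration
under assumptions, not a tested recipe for the original poster's job: the
object name, the sample data, the (Int, String) pair type, the local[2] master
and the HashPartitioner(2) are all made up for the example. The explicit type
parameters [K, V, C] follow the ShuffledRDD signature as of Spark 1.1, and
ShuffledRDD is marked as a developer API, so it may change between releases.

```scala
import org.apache.spark.{HashPartitioner, Partitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.{RDD, ShuffledRDD}

object SortWithinPartitions {

  // Shuffle `pairs` with the given partitioner and ask the shuffle to order
  // each output partition by key. The key type Int is an assumption.
  def sortWithinPartitions(pairs: RDD[(Int, String)],
                           partitioner: Partitioner): RDD[(Int, String)] =
    new ShuffledRDD[Int, String, String](pairs, partitioner)
      .setKeyOrdering(Ordering[Int])

  def main(args: Array[String]): Unit = {
    // Hypothetical local setup, for illustration only.
    val conf = new SparkConf().setAppName("sort-within-partitions").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val pairs = sc.parallelize(Seq(5 -> "e", 1 -> "a", 4 -> "d", 2 -> "b", 3 -> "c"))
      val sorted = sortWithinPartitions(pairs, new HashPartitioner(2))

      // glom() turns each partition into an array so the per-partition
      // order can be inspected on the driver.
      sorted.glom().collect().zipWithIndex.foreach { case (part, i) =>
        println(s"partition $i: ${part.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

The ordering applies within each partition only. Keys are not globally sorted
across partitions unless the partitioner itself is range-based. From Spark 1.2
onward, OrderedRDDFunctions also offers repartitionAndSortWithinPartitions,
which covers the same use case through a public method.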
