spark-user mailing list archives

From lordjoe <lordjoe2...@gmail.com>
Subject Re: How to calculate percentiles with Spark?
Date Tue, 21 Oct 2014 20:39:37 GMT
A rather more general question: assume I have a JavaRDD<K> which is
sorted.
How can I convert this into a JavaPairRDD<Integer,K> where the Integer is
the index -> 0...N - 1?
This is easy to do on one machine:
    JavaRDD<K> values = ... // create here

    JavaPairRDD<Integer, K> positions = values.mapToPair(new PairFunction<K, Integer, K>() {
            private int index = 0; // counter held inside the function object
            @Override
            public Tuple2<Integer, K> call(final K t) throws Exception {
                return new Tuple2<Integer, K>(index++, t);
            }
        });
But will this code do the right thing on a cluster?
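
To make the concern concrete, here is a minimal plain-Java sketch with no Spark dependency. It assumes that each partition is processed by its own copy of the function object, so the counter starts from 0 in every partition. The class and method names (PartitionIndexSketch, naiveIndices, offsetIndices) are hypothetical and exist only for illustration; the nested lists stand in for the partitions of a sorted RDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionIndexSketch {

    // Stand-in for the stateful function in the message: one counter per instance.
    static final class Counter {
        private int index = 0;
        int next() { return index++; }
    }

    // Assumption: every partition gets a fresh copy of the function,
    // so numbering restarts at 0 in each partition.
    public static <K> List<Integer> naiveIndices(List<List<K>> partitions) {
        List<Integer> result = new ArrayList<>();
        for (List<K> partition : partitions) {
            Counter counter = new Counter(); // fresh copy per partition
            for (int i = 0; i < partition.size(); i++) {
                result.add(counter.next());
            }
        }
        return result;
    }

    // Two passes: first record each partition's size and turn the sizes into
    // starting offsets, then number elements as offset + local position.
    public static <K> List<Integer> offsetIndices(List<List<K>> partitions) {
        int[] offsets = new int[partitions.size()];
        int running = 0;
        for (int p = 0; p < partitions.size(); p++) {
            offsets[p] = running;
            running += partitions.get(p).size();
        }
        List<Integer> result = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            for (int i = 0; i < partitions.get(p).size(); i++) {
                result.add(offsets[p] + i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e"));
        System.out.println(naiveIndices(partitions));  // prints [0, 1, 2, 0, 1]
        System.out.println(offsetIndices(partitions)); // prints [0, 1, 2, 3, 4]
    }
}
```

Under that assumption the per-instance counter yields duplicate indices across partitions, while the two-pass variant yields 0...N - 1 as long as the partitions are visited in sorted order. On a real cluster, JavaRDD.zipWithIndex(), which returns a JavaPairRDD<K, Long> with the index as the value, may be worth checking before hand-rolling anything like this.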



