spark-user mailing list archives

From unk1102
Subject How to call mapPartitions on DataFrame?
Date Wed, 23 Dec 2015 17:43:56 GMT
Hi, I have the following code where I use mapPartitions on an RDD, but then
I need to convert the result into a DataFrame. Why do I need to convert the
DataFrame into an RDD and back into a DataFrame just to call mapPartitions?
Why can't I call it directly on the DataFrame?

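new FlatMapFunction<Iterator<Row>, Row>() {
   @Override
   public Iterable<Row> call(Iterator<Row> rowIterator) throws Exception {
        // Collect one updated Row per input Row in this partition
        List<Row> updatedRows = new ArrayList<>();
        while (rowIterator.hasNext()) {
          Row row = rowIterator.next();
          // iterate(...) is my own helper, defined elsewhere (not shown)
          List rowAsList = iterate(JavaConversions.seqAsJavaList(row.toSeq()));
          Row updatedRow = RowFactory.create(rowAsList.toArray());
          updatedRows.add(updatedRow);
        }
        return updatedRows;
   }
}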

When I look at the method signature on DataFrame, it is
mapPartitions(scala.Function1<Iterator<Row>,Iterator<R>> f,ClassTag<R>

How do I map the above code onto dataframe.mapPartitions? Please guide me,
I am new to Spark.
