spark-user mailing list archives

From Tanveer Ahmad - EWI
Subject Arrow RecordBatches to Spark Dataframe
Date Thu, 25 Jun 2020 03:35:01 GMT
Hi all,

I have a small question that I hope someone can help me with.

In this code snippet, Jether converts a prdd (an RDD of pd.DataFrame objects) to Arrow
RecordBatches (slices) and then, finally, to a Spark DataFrame. Similarly, the code in Scala
converts a JavaRDD to a Spark DataFrame.

If I already have an ardd (an RDD of pa.RecordBatch objects, i.e. Arrow RecordBatches), how
can I convert it to a Spark DataFrame directly in PySpark, without going through Pandas? Thanks.

Tanveer Ahmad
