spark-user mailing list archives

From Rishikesh Gawade <>
Subject How to combine all rows into a single row in DataFrame
Date Mon, 19 Aug 2019 20:23:51 GMT
Hi All,
I have been trying to serialize a dataframe in protobuf format. So far, I
have been able to serialize every row of the dataframe by using the map
function, with the serialization logic inside the lambda. The resulting
dataframe consists of rows in serialized form (1 row = 1 serialized
message).
I wish to form a single protobuf-serialized message for this dataframe, and
in order to do that I need to combine all the serialized rows using some
custom logic, very similar to the one used in the map operation.
I am assuming that this would be possible by using the reduce operation on
the dataframe; however, I am unaware of how to go about it.
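In case it helps frame the question, here is a minimal sketch of the combining step I have in mind, outside Spark for simplicity. The byte strings and the `combine` function are hypothetical placeholders; in Spark the rows would come from something like `df.rdd.map(serialize_row).reduce(combine)`. (Note that `RDD.reduce` expects a commutative and associative function, so the combining logic would need to tolerate arbitrary ordering.)

```python
from functools import reduce

# Hypothetical stand-ins for the serialized rows: in Spark these would be
# produced by df.rdd.map(serialize_row); here they are plain bytes.
serialized_rows = [b"\x0a\x03foo", b"\x0a\x03bar", b"\x0a\x03baz"]

def combine(left: bytes, right: bytes) -> bytes:
    # Custom combining logic goes here. Plain concatenation is one option:
    # concatenated serialized protobuf messages of the same type parse as a
    # merge (repeated fields are appended, scalar fields take the last value).
    return left + right

combined = reduce(combine, serialized_rows)
print(combined)  # b"\x0a\x03foo\x0a\x03bar\x0a\x03baz"
```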
Any suggestions/approach would be much appreciated.

