spark-user mailing list archives

From Bryan Cutler <cutl...@gmail.com>
Subject Re: Spark dataset to byte array over grpc
Date Mon, 23 Apr 2018 19:18:59 GMT
Hi Ashwin,

This sounds like a good use case for Apache Arrow, if you are flexible about
the exchange format.  As of Spark 2.3, Dataset has a method "toArrowPayload"
that will convert a Dataset of Rows to a byte array in Arrow format, although
the API is currently not public.  Your client could consume the Arrow data
directly, or perhaps use Spark SQL's ColumnarBatch to read it back as Rows.
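
For comparison, if Arrow is not an option, a plain hand-rolled encoding also
fits a proto "bytes" field.  The sketch below is only an illustration under
assumptions: RowBytes, encode and decode are hypothetical names, not Spark or
Arrow APIs, and it assumes the result has already been collected on the driver
(for example with collectAsList()) and every cell converted to a String.  It
uses only java.io and, unlike Arrow, it does not carry column types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark or Arrow): packs rows of string
// cells into a single byte array and unpacks them again on the client.
class RowBytes {

    // Layout: row count, column count, then for each cell a null flag
    // followed by the value. writeUTF caps each value at 65535 encoded bytes.
    static byte[] encode(List<String[]> rows, int numCols) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(rows.size());
            out.writeInt(numCols);
            for (String[] row : rows) {
                for (int c = 0; c < numCols; c++) {
                    boolean present = row[c] != null;
                    out.writeBoolean(present);
                    if (present) {
                        out.writeUTF(row[c]);
                    }
                }
            }
        }
        return buf.toByteArray();
    }

    // Reverses encode(): reads the two counts, then every cell in row order.
    static List<String[]> decode(byte[] bytes) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int numRows = in.readInt();
            int numCols = in.readInt();
            for (int r = 0; r < numRows; r++) {
                String[] row = new String[numCols];
                for (int c = 0; c < numCols; c++) {
                    row[c] = in.readBoolean() ? in.readUTF() : null;
                }
                rows.add(row);
            }
        }
        return rows;
    }
}
```

The resulting byte[] could then be wrapped for the proto field, e.g. with
protobuf's ByteString.copyFrom(payload), and decoded on the client with
RowBytes.decode.  This only suits results small enough to collect on the
driver; for larger or typed results the Arrow route above is likely the better
fit.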

Bryan

On Mon, Apr 23, 2018 at 11:49 AM, Ashwin Sai Shankar <
ashankar@netflix.com.invalid> wrote:

> Hi!
> I'm building a Spark app which runs a spark-sql query and sends the results
> to a client over gRPC (my proto file is configured to send the SQL output as
> "bytes"). The client then displays the output rows. When I run spark.sql, I
> get a Dataset<Row>. How do I convert this to a byte array?
> Also, is there a better way to send this output to the client?
>
> Thanks,
> Ashwin
>
>
