spark-user mailing list archives

From David Capwell <dcapw...@gmail.com>
Subject Re: how to add columns to row when column has a different encoder?
Date Wed, 28 Feb 2018 16:41:53 GMT
Does anyone know a way to do this right now? As best I can tell, I need a
custom Expression to pass to a udf to do this.

I just finished a protobuf encoder, and it feels like Expression is not meant
to be public (a good number of things are private[sql]). Am I wrong about
this? Am I looking at the right interface for adding such a UDF?

Thanks for your help!

On Mon, Feb 26, 2018, 3:50 PM David Capwell <dcapwell@gmail.com> wrote:

> I have a row that looks like the following pojo
>
> case class Wrapper(var id: String, var bytes: Array[Byte])
>
> Those bytes are a serialized pojo that looks like this
>
> case class Inner(var stuff: String, var moreStuff: String)
>
> I currently have encoders for both types, but I don't see how to merge
> the two into a unified row that looks like the following:
>
>
> struct<id: String, inner: struct<stuff: String, moreStuff: String>>
>
> If I know how to deserialize the bytes and have an encoder, how could I get
> the above schema? I was looking at ds.withColumn("inner", ???) but wasn't
> sure how to go from pojo + encoder to a column. Is there a better way to
> do this?
>
> Thanks for your time reading this email
>
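
For reference, here is a minimal sketch of one way the ds.withColumn("inner", ???) step could be filled in without a custom Expression. It assumes Inner is a plain case class, so that Spark can derive the struct schema from the udf's return type on its own. It does not use the custom protobuf encoder at all, so it may not apply if Inner is a generated protobuf class. The decodeInner function and its "stuff|moreStuff" UTF-8 format are hypothetical stand-ins for the real deserializer, and the object name and sample data are made up for illustration.

```scala
import java.nio.charset.StandardCharsets.UTF_8

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Same shapes as in the question, written as immutable case classes.
case class Wrapper(id: String, bytes: Array[Byte])
case class Inner(stuff: String, moreStuff: String)

object InnerColumnSketch {

  // Hypothetical stand-in for the real deserializer (assumption):
  // the bytes hold "stuff|moreStuff" encoded as UTF-8.
  def decodeInner(bytes: Array[Byte]): Inner = {
    val parts = new String(bytes, UTF_8).split("\\|", 2)
    Inner(parts(0), parts(1))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("inner-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample row.
    val ds = Seq(Wrapper("a", "x|y".getBytes(UTF_8))).toDS()

    // A udf that returns a case class produces a struct column whose
    // fields are derived from the case class.
    val toInner = udf((bytes: Array[Byte]) => decodeInner(bytes))

    // Add the decoded struct and drop the raw bytes.
    val unified = ds.withColumn("inner", toInner(col("bytes"))).drop("bytes")
    unified.printSchema()

    spark.stop()
  }
}
```

The printed schema should contain id plus a nested inner struct with stuff and moreStuff. If the typed API is preferred, mapping the Dataset to a case class that holds id and Inner would be another option under the same assumption.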
