spark-dev mailing list archives

From Michael Armbrust
Subject Re: Spark SQL UDF Returning Rows
Date Fri, 01 Apr 2016 20:26:27 GMT
> I haven't looked at Encoders or Datasets since we're bound to 1.6 for now
> but I'll look at encoders to see if that covers it. Datasets seems like it
> would solve this problem for sure.

There is an experimental preview of Datasets in Spark 1.6.

> I avoided returning a case object because, even if we use reflection to
> build byte code and do it efficiently, I still need to convert my Row to a
> case object manually within my UDF, just to have it converted to a Row
> again. Even if it's fast, it's still fairly unnecessary.

Even if you give us a Row, there's still a conversion into the binary format
of InternalRow.

> The thing I guess that threw me off was that UDF1/2/3 was in a "java"
> prefixed package although there was nothing that made it java specific and
> in fact was the only way to do what I wanted in Scala. For things like
> JavaRDD, etc., it makes sense, but for generic things like UDF is there a
> reason they get put into a package with "java" in the name?

This was before we decided to unify the APIs for Scala and Java, so it's
mostly historical.
