spark-user mailing list archives

From Chris Beavers <>
Subject Re: UDF returning generic Seq
Date Tue, 26 Jul 2016 15:07:55 GMT

Thanks for the response. While those are good examples, they are able to
leverage the key-type/value-type structure of Maps to specify an explicit
return type.

I guess maybe the more fundamental issue is that I want to support the
heterogeneous maps/arrays allowed by JSON: [1, "str", 2.345] or
{"name":"Chris","value":123}. Given the Spark SQL constraints that
ArrayType and MapType need explicit and consistent element types, I don't
see any way to support this in the current type system short of falling
back to binary data.
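For what it's worth, the binary fallback can be fairly mechanical: serialize the heterogeneous value to bytes inside the UDF (Spark maps an `Array[Byte]` return type to BinaryType on its own), and deserialize in downstream UDFs. A minimal sketch using plain Java serialization; the `udf(...)` wiring in the comment is illustrative, and any JSON/Kryo encoding could stand in for `writeObject`:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Serialize any serializable value (e.g. a heterogeneous Seq[Any]) to bytes.
// Returning Array[Byte] from a UDF yields a BinaryType column, which
// sidesteps the need for a concrete ArrayType element type.
def toBytes(value: Any): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(value)
  oos.close()
  bos.toByteArray
}

// Inverse: downstream UDFs recover the original structure from the bytes.
def fromBytes(bytes: Array[Byte]): Any = {
  val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
  try ois.readObject() finally ois.close()
}

// In Spark this would be registered roughly as:
//   val packUdf = udf((xs: Seq[String]) => toBytes(parse(xs)))   // parse(): hypothetical
// and subsequent UDFs would take Array[Byte] and call fromBytes.

val mixed: Seq[Any] = Seq(1, "str", 2.345)
val roundTripped = fromBytes(toBytes(mixed))
```

The obvious cost is that the column becomes opaque to Spark SQL (no pushdown, no `explode`, etc.), so this only pays off when every consumer of the column is another UDF.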

Open to other suggestions,

On Tue, Jul 26, 2016 at 9:42 AM Yong Zhang <> wrote:

> I don't know if "ANY" will work or not, but did you take a look at how
> the "map_values" UDF is implemented in Spark? It returns the map values as
> an array/seq of arbitrary type.
> Yong
> ------------------------------
> *From:* Chris Beavers <>
> *Sent:* Monday, July 25, 2016 10:32 PM
> *To:*
> *Subject:* UDF returning generic Seq
> Hey there,
> Interested in writing a UDF that returns an ArrayType column of unknown
> subtype. My understanding is that, JVM-type-wise, this translates to a Seq
> of generic templated type: Seq[Any]. I seem to be hitting a constraint that
> basically necessitates a fully qualified schema on the return type (i.e.
> the templated Any hits the default exception-throwing case at the end of
> schemaFor).
> Is there any more canonical way to have a UDF produce an ArrayType column
> of unknown type? Or is my only alternative here to reduce this to
> BinaryType and use whatever encoding/data structures I want under the
> covers there and in subsequent UDFs?
> Thanks,
> Chris
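If full binary opacity is too heavy a hammer, another sketch (illustrative, not from the thread): coerce each heterogeneous element to its own JSON string, so the UDF can declare `Seq[String]`, which the reflection machinery accepts as ArrayType(StringType). The hand-rolled encoder below covers only the primitive cases; a real implementation would use a JSON library:

```scala
// Encode each heterogeneous element as a standalone JSON string, so a UDF
// can return Seq[String] -- a type schemaFor resolves to ArrayType(StringType).
def toJsonString(v: Any): String = v match {
  case null      => "null"
  case s: String => "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
  case other     => other.toString // Int, Long, Double, Boolean render as valid JSON
}

// In Spark this would be registered roughly as:
//   val mixedUdf = udf((xs: Seq[Any]) => xs.map(toJsonString))
val encoded = Seq[Any](1, "str", 2.345).map(toJsonString)
```

The column stays a first-class array (filterable, explodable), at the cost of per-element re-parsing whenever a consumer needs the typed value back.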
