spark-user mailing list archives

From Jerry Lam <chiling...@gmail.com>
Subject Re: Dataset Question: No Encoder found for Set[(scala.Long, scala.Long)]
Date Wed, 01 Feb 2017 23:38:43 GMT
Hi Koert,

Thank you for your help! GOT IT!

Best Regards,

Jerry

On Wed, Feb 1, 2017 at 6:24 PM, Koert Kuipers <koert@tresata.com> wrote:

> you can still use it as Dataset[Set[X]]. all transformations should work
> correctly.
>
> however dataset.schema will show binary type, and dataset.show will show
> bytes (unfortunately).
>
> for example:
>
> scala> implicit def setEncoder[X]: Encoder[Set[X]] = Encoders.kryo[Set[X]]
> setEncoder: [X]=> org.apache.spark.sql.Encoder[Set[X]]
>
> scala> val x = Seq(Set(1,2,3)).toDS
> x: org.apache.spark.sql.Dataset[scala.collection.immutable.Set[Int]] =
> [value: binary]
>
> scala> x.map(_ + 4).collect
> res17: Array[scala.collection.immutable.Set[Int]] = Array(Set(1, 2, 3, 4))
>
> scala> x.show
> +--------------------+
> |               value|
> +--------------------+
> |[2A 01 03 02 02 0...|
> +--------------------+
>
>
> scala> x.schema
> res19: org.apache.spark.sql.types.StructType =
> StructType(StructField(value,BinaryType,true))
>
>
> On Wed, Feb 1, 2017 at 12:03 PM, Jerry Lam <chilinglam@gmail.com> wrote:
>
>> Hi Koert,
>>
>> Thanks for the tips. I tried to do that but the column's type is now
>> Binary. Do I get the Set[X] back in the Dataset?
>>
>> Best Regards,
>>
>> Jerry
>>
>>
>> On Tue, Jan 31, 2017 at 8:04 PM, Koert Kuipers <koert@tresata.com> wrote:
>>
>>> set is currently not supported by the built-in encoders. you can use the
>>> kryo encoder. there is no other workaround that i know of.
>>>
>>> import org.apache.spark.sql.{ Encoder, Encoders }
>>> implicit def setEncoder[X]: Encoder[Set[X]] = Encoders.kryo[Set[X]]
>>>
>>> On Tue, Jan 31, 2017 at 7:33 PM, Jerry Lam <chilinglam@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I got an exception like the following, when I tried to implement a user
>>>> defined aggregation function.
>>>>
>>>>  Exception in thread "main" java.lang.UnsupportedOperationException:
>>>> No Encoder found for Set[(scala.Long, scala.Long)]
>>>>
>>>> The Set[(Long, Long)] is a field in the case class which is the output
>>>> type for the aggregation.
>>>>
>>>> Is there a workaround for this?
>>>>
>>>> Best Regards,
>>>>
>>>> Jerry
>>>>
>>>
>>>
>>
>
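
For the original question (an aggregation whose output contains a
Set[(Long, Long)]), the Kryo workaround quoted above could be wired into a
typed Aggregator roughly as follows. This is only a sketch under assumptions:
the names CollectPairs and CollectPairsDemo, the local SparkSession and the
sample pairs are made up for illustration and do not come from the thread, and
the aggregator returns the Set directly rather than a case class wrapping it.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical aggregator (not from the thread): collects (Long, Long)
// pairs into a Set, using Kryo for both the buffer and the output.
object CollectPairs
    extends Aggregator[(Long, Long), Set[(Long, Long)], Set[(Long, Long)]] {

  // Empty buffer to start each group from.
  def zero: Set[(Long, Long)] = Set.empty[(Long, Long)]

  // Fold one input pair into the buffer.
  def reduce(acc: Set[(Long, Long)], pair: (Long, Long)): Set[(Long, Long)] =
    acc + pair

  // Combine two partial buffers.
  def merge(a: Set[(Long, Long)], b: Set[(Long, Long)]): Set[(Long, Long)] =
    a ++ b

  // The buffer is already the final result.
  def finish(acc: Set[(Long, Long)]): Set[(Long, Long)] = acc

  // Kryo encoders, as suggested in the thread.
  def bufferEncoder: Encoder[Set[(Long, Long)]] =
    Encoders.kryo[Set[(Long, Long)]]
  def outputEncoder: Encoder[Set[(Long, Long)]] =
    Encoders.kryo[Set[(Long, Long)]]
}

object CollectPairsDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-pairs-sketch")
      .getOrCreate()

    // Encoder for the input pairs, passed explicitly instead of importing
    // spark.implicits._ so that no other implicit encoder is in scope.
    val pairEncoder: Encoder[(Long, Long)] =
      Encoders.tuple(Encoders.scalaLong, Encoders.scalaLong)

    // Made-up sample data with one duplicate pair.
    val pairs =
      spark.createDataset(Seq((1L, 2L), (1L, 2L), (3L, 4L)))(pairEncoder)

    // collect deserializes the Kryo bytes back into ordinary Scala Sets.
    val result: Array[Set[(Long, Long)]] =
      pairs.select(CollectPairs.toColumn).collect()

    result.foreach(println)
    spark.stop()
  }
}
```

As in the REPL session quoted above, the aggregated column should still show
up as binary in schema and show; the Set values only reappear once the data is
brought back into Scala, for example through collect or inside a map.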