spark-user mailing list archives

From Tim Gautier <tim.gautier@gmail.com>
Subject Re: Undocumented left join constraint?
Date Fri, 27 May 2016 20:50:31 GMT
When I run it in 1.6.1, I get this:

java.lang.RuntimeException: Error while decoding:
java.lang.RuntimeException: Null value appeared in non-nullable field:
- field (class: "scala.Int", name: "id")
- root class: "$iwC.$iwC.Test"
If the schema is inferred from a Scala tuple/case class, or a Java bean,
please try to use scala.Option[_] or other nullable types (e.g.
java.lang.Integer instead of int/scala.Int).
newinstance(class scala.Tuple2,newinstance(class $iwC$$iwC$Test,assertnotnull(input[0, StructType(StructField(id,IntegerType,false))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test"),false,ObjectType(class $iwC$$iwC$Test),Some($iwC$$iwC@6e40bddd)),newinstance(class $iwC$$iwC$Test,assertnotnull(input[1, StructType(StructField(id,IntegerType,true))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test"),false,ObjectType(class $iwC$$iwC$Test),Some($iwC$$iwC@6e40bddd)),false,ObjectType(class scala.Tuple2),None)
:- newinstance(class $iwC$$iwC$Test,assertnotnull(input[0, StructType(StructField(id,IntegerType,false))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test"),false,ObjectType(class $iwC$$iwC$Test),Some($iwC$$iwC@6e40bddd))
:  +- assertnotnull(input[0, StructType(StructField(id,IntegerType,false))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test")
:     +- input[0, StructType(StructField(id,IntegerType,false))].id
:        +- input[0, StructType(StructField(id,IntegerType,false))]
+- newinstance(class $iwC$$iwC$Test,assertnotnull(input[1, StructType(StructField(id,IntegerType,true))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test"),false,ObjectType(class $iwC$$iwC$Test),Some($iwC$$iwC@6e40bddd))
   +- assertnotnull(input[1, StructType(StructField(id,IntegerType,true))].id,- field (class: "scala.Int", name: "id"),- root class: "$iwC.$iwC.Test")
      +- input[1, StructType(StructField(id,IntegerType,true))].id
         +- input[1, StructType(StructField(id,IntegerType,true))]
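
For what it's worth, the only workaround I've found is the one the error
hint suggests: make the fields Options. A rough, untested sketch (TestOpt is
just an illustrative name, and this assumes the shell's
sqlContext.implicits._ is in scope; only the right-hand side is changed
here, since that's the side the trace shows failing to decode):

    case class TestOpt(id: Option[Int])
    val test1 = Seq(Test(1), Test(2), Test(3)).toDS
    // Option[Int] is encoded as a nullable IntegerType, so an unmatched
    // right-side row can decode as TestOpt(None) instead of tripping the
    // non-nullable scala.Int assertion above.
    val test2 = Seq(TestOpt(Some(2)), TestOpt(Some(3)), TestOpt(Some(4))).toDS
    test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id",
      "left_outer").show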


On Fri, May 27, 2016 at 2:48 PM Tim Gautier <tim.gautier@gmail.com> wrote:

> Interesting. I did that on 1.6.1, Scala 2.10.
>
> On Fri, May 27, 2016 at 2:41 PM Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Which release did you use?
>>
>> I tried your example in master branch:
>>
>> scala> val test2 = Seq(Test(2), Test(3), Test(4)).toDS
>> test2: org.apache.spark.sql.Dataset[Test] = [id: int]
>>
>> scala>  test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id",
>> "left_outer").show
>> +---+------+
>> | _1|    _2|
>> +---+------+
>> |[1]|[null]|
>> |[2]|   [2]|
>> |[3]|   [3]|
>> +---+------+
>>
>> On Fri, May 27, 2016 at 1:01 PM, Tim Gautier <tim.gautier@gmail.com>
>> wrote:
>>
>>> Is it truly impossible to left join a Dataset[T] on the right if T has
>>> any non-Option fields? It seems Spark tries to create Ts with null values
>>> in all fields when left joining, which results in null pointer exceptions.
>>> So far the only way I've found to get around this is to make every field
>>> in T an Option. Is there any other way?
>>>
>>> Example:
>>>
>>>     case class Test(id: Int)
>>>     val test1 = Seq(Test(1), Test(2), Test(3)).toDS
>>>     val test2 = Seq(Test(2), Test(3), Test(4)).toDS
>>>     test1.as("t1").joinWith(test2.as("t2"), $"t1.id" === $"t2.id",
>>> "left_outer").show
>>>
>>>
>>
