spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-26984) Incompatibility between Spark releases - Some(null)
Date Wed, 27 Feb 2019 18:23:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-26984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-26984:
---------------------------------
    Description: 
Please refer to [https://stackoverflow.com/questions/54851205/why-does-somenull-throw-nullpointerexception-in-spark-2-4-but-worked-in-2-2/54861152#54861152].

NB: Not sure the priority is correct - no doubt someone will evaluate it.

The following code reproduces the issue:

{code}
val df = Seq(
  (1, Some("a"), Some(1)),
  (2, Some(null), Some(2)),
  (3, Some("c"), Some(3)),
  (4, None, None)).toDF("c1", "c2", "c3")
{code}

In Spark 2.2.1 (on MapR) the {{Some(null)}} works fine; in Spark 2.4.0 on Databricks an error ensues (stack trace below).
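
For reference, on 2.2.1 the same snippet presumably just produced a DataFrame with a null in {{c2}} for the second row; the illustration below is an assumption based on that description, not output captured from either environment:

{code}
// Assumed illustration of the Spark 2.2.1 behaviour described above:
// Some(null) was accepted and surfaced as a null in column c2.
df.show()
// +---+----+----+
// | c1|  c2|  c3|
// +---+----+----+
// |  1|   a|   1|
// |  2|null|   2|
// |  3|   c|   3|
// |  4|null|null|
// +---+----+----+
{code}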

{code}
java.lang.RuntimeException: Error while encoding: java.lang.NullPointerException
assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#6
staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, unwrapoption(ObjectType(class java.lang.String), assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2), true, false) AS _2#7
unwrapoption(IntegerType, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3) AS _3#8
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:293)
  at org.apache.spark.sql.SparkSession.$anonfun$createDataset$1(SparkSession.scala:472)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
  at scala.collection.immutable.List.foreach(List.scala:388)
  at scala.collection.TraversableLike.map(TraversableLike.scala:233)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
  at scala.collection.immutable.List.map(List.scala:294)
  at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:472)
  at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
  at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:228)
  ... 57 elided
Caused by: java.lang.NullPointerException
  at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:109)
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:289)
  ... 66 more
{code}

One can argue this is solvable in other ways, but there may well be existing code bases that would be affected.
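
For illustration, a minimal sketch of one such alternative, assuming {{Some(null)}} was only meant to represent a missing value, is to let {{Option(...)}} (or a plain {{None}}) do the null handling so the encoder never sees a present-but-null value; this is a workaround suggestion, not the fix for the regression itself:

{code}
// Workaround sketch (assumption: Some(null) was intended as "no value").
// Option(x) returns None when x is null, so no null ever reaches the encoder.
// Assumes spark.implicits._ is in scope, as in the snippet above.
val df = Seq(
  (1, Option("a"), Option(1)),
  (2, Option(null: String), Option(2)),  // Option(null) == None
  (3, Option("c"), Option(3)),
  (4, None, None)).toDF("c1", "c2", "c3")
{code}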

 

 


> Incompatibility between Spark releases - Some(null) 
> ----------------------------------------------------
>
>                 Key: SPARK-26984
>                 URL: https://issues.apache.org/jira/browse/SPARK-26984
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>         Environment: Linux CentOS, Databricks.
>            Reporter: Gerard Alexander
>            Priority: Minor
>              Labels: newbie
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


