spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-12754) Data type mismatch on two array<bigint> values when using filter/where
Date Tue, 10 Jan 2017 07:21:58 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-12754.
----------------------------------
    Resolution: Cannot Reproduce

I am resolving this as {{Cannot Reproduce}} because it has been fixed in master.

> Data type mismatch on two array<bigint> values when using filter/where
> ----------------------------------------------------------------------
>
>                 Key: SPARK-12754
>                 URL: https://issues.apache.org/jira/browse/SPARK-12754
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.6.0
>         Environment: OSX 10.11.1, Scala 2.11.7, Spark 1.5.0+
>            Reporter: Jesse English
>
> The following test produces the error _org.apache.spark.sql.AnalysisException: cannot resolve '(point = array(0,9))' due to data type mismatch: differing types in '(point = array(0,9))' (array<bigint> and array<bigint>)_
> This does not happen on 1.4.x; it was introduced in 1.5+. Is there a preferred method for making this sort of arbitrarily sized array comparison?
> {code:title=test.scala}
> // Imports the snippet relies on; `test` and `sc` come from the enclosing test suite (not shown).
> import org.apache.spark.rdd.RDD
> import org.apache.spark.sql.{Row, SQLContext}
> import org.apache.spark.sql.functions.{array, lit}
> import org.apache.spark.sql.types._
>
> test("test array comparison") {
>     val vectors: Vector[Row] =  Vector(
>       Row.fromTuple("id_1" -> Array(0L, 2L)),
>       Row.fromTuple("id_2" -> Array(0L, 5L)),
>       Row.fromTuple("id_3" -> Array(0L, 9L)),
>       Row.fromTuple("id_4" -> Array(1L, 0L)),
>       Row.fromTuple("id_5" -> Array(1L, 8L)),
>       Row.fromTuple("id_6" -> Array(2L, 4L)),
>       Row.fromTuple("id_7" -> Array(5L, 6L)),
>       Row.fromTuple("id_8" -> Array(6L, 2L)),
>       Row.fromTuple("id_9" -> Array(7L, 0L))
>     )
>     val data: RDD[Row] = sc.parallelize(vectors, 3)
>     val schema = StructType(
>       StructField("id", StringType, false) ::
>         StructField("point", DataTypes.createArrayType(LongType), false) ::
>         Nil
>     )
>     val sqlContext = new SQLContext(sc)
>     val dataframe = sqlContext.createDataFrame(data, schema)
>     val targetPoint: Array[Long] = Array(0L, 9L)
>     // This is the line where it fails:
>     // org.apache.spark.sql.AnalysisException: cannot resolve
>     // '(point = array(0,9))' due to data type mismatch:
>     // differing types in '(point = array(0,9))'
>     // (array<bigint> and array<bigint>).
>     val targetRow = dataframe.where(
>       dataframe("point") === array(targetPoint.map(value => lit(value)): _*)
>     ).first()
>     assert(targetRow != null)
>   }
> {code}
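
The error message quoted above prints both sides as array<bigint>, so whatever differs between them is not visible in the printed type. One property the printed form leaves out is the element nullability flag (containsNull) of ArrayType. The sketch below only shows that two ArrayType values can print identically yet compare unequal. Treating the literal side as containsNull = false is an assumption made for illustration; this message does not confirm it as the cause. The object name ArrayTypeNullability is hypothetical.

```scala
import org.apache.spark.sql.types.{ArrayType, DataTypes, LongType}

object ArrayTypeNullability {
  // Column type as declared in the test's schema.
  // DataTypes.createArrayType(elementType) sets containsNull = true.
  val declared: ArrayType = DataTypes.createArrayType(LongType)

  // ASSUMPTION (not confirmed by this thread): the type derived for
  // array(lit(0L), lit(9L)), whose elements are non-null literals.
  val assumedLiteral: ArrayType = ArrayType(LongType, containsNull = false)

  def main(args: Array[String]): Unit = {
    // Both render the same way, because simpleString omits containsNull.
    println(declared.simpleString)       // prints array<bigint>
    println(assumedLiteral.simpleString) // prints array<bigint>

    // Case-class equality does take containsNull into account.
    println(declared == assumedLiteral)  // prints false
  }
}
```

If nullability is what differs, declaring the column as ArrayType(LongType, containsNull = false) in the schema would be one thing to try on an affected version; that is untested here.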



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


