spark-issues mailing list archives

From "Enver Osmanov (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-34435) ArrayIndexOutOfBoundsException when select in different case
Date Sun, 14 Feb 2021 12:08:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-34435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enver Osmanov updated SPARK-34435:
----------------------------------
    Environment:     (was: Actual behavior:
Selecting a column with a different case after remapping fails with ArrayIndexOutOfBoundsException.

Expected behavior:

Spark shouldn't fail with ArrayIndexOutOfBoundsException.
Spark is case insensitive by default, so the select should return the selected column.

Test case:
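{code:java}
case class User(aA: String, bb: String)
// ...
// toDS() below needs a SparkSession in scope and `import spark.implicits._`
val user = User("John", "Doe")

// Remap the Dataset; the select fails only after this step
val ds = Seq(user).toDS().map(identity)

// The field is declared as "aA" but selected as "aa"
ds.select("aa").show(false)
{code}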
Additional notes:

The test case is reproducible with Spark 3.0.1. It works fine with Spark 2.4.7.

I believe the problem could be solved by changing the filter in the pruneDataSchema method of the
SchemaPruning object from this:
{code:java}
val dataSchemaFieldNames = dataSchema.fieldNames.toSet
val mergedDataSchema =
  StructType(mergedSchema.filter(f => dataSchemaFieldNames.contains(f.name)))
{code}
to this:
{code:java}
val dataSchemaFieldNames = dataSchema.fieldNames.map(_.toLowerCase).toSet
val mergedDataSchema =
  StructType(mergedSchema.filter(f => dataSchemaFieldNames.contains(f.name.toLowerCase)))
{code})
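
The difference between the two filters above can be seen without Spark. The following is only a minimal sketch: Field is a hypothetical stand-in for Spark's StructField, filterExact and filterIgnoreCase are illustrative names that do not exist in Spark, and the assumption that the merged schema carries the name as written in select("aa") is mine, not something confirmed by the report. It can be pasted into a Scala REPL or spark-shell.

```scala
// Hypothetical stand-in for Spark's StructField; only the name matters here.
final case class Field(name: String)

// Mirrors the current filter: exact, case-sensitive name matching.
def filterExact(merged: Seq[Field], data: Seq[Field]): Seq[Field] = {
  val names = data.map(_.name).toSet
  merged.filter(f => names.contains(f.name))
}

// Mirrors the proposed filter: lower-case both sides before comparing.
def filterIgnoreCase(merged: Seq[Field], data: Seq[Field]): Seq[Field] = {
  val names = data.map(_.name.toLowerCase).toSet
  merged.filter(f => names.contains(f.name.toLowerCase))
}

// Field names as declared in the User case class from the test case.
val dataSchema = Seq(Field("aA"), Field("bb"))

// Assumption: the requested field keeps the case used in select("aa").
val mergedSchema = Seq(Field("aa"))

val exact = filterExact(mergedSchema, dataSchema)
val ignoreCase = filterIgnoreCase(mergedSchema, dataSchema)
```

Under that assumption, the exact filter drops the requested field entirely while the case-insensitive one keeps it. Note that lower-casing unconditionally ignores the spark.sql.caseSensitive setting, so a real fix would probably need to take that setting into account.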

> ArrayIndexOutOfBoundsException when select in different case
> ------------------------------------------------------------
>
>                 Key: SPARK-34435
>                 URL: https://issues.apache.org/jira/browse/SPARK-34435
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer, SQL
>    Affects Versions: 3.0.1
>            Reporter: Enver Osmanov
>            Priority: Trivial
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

