spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Assigned] (SPARK-22329) Use NEVER_INFER for `spark.sql.hive.caseSensitiveInferenceMode` by default
Date Sun, 22 Oct 2017 18:09:01 GMT


Apache Spark reassigned SPARK-22329:

    Assignee: Apache Spark

> Use NEVER_INFER for `spark.sql.hive.caseSensitiveInferenceMode` by default
> --------------------------------------------------------------------------
>                 Key: SPARK-22329
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Dongjoon Hyun
>            Assignee: Apache Spark
>            Priority: Critical
> In Spark 2.2.0, `spark.sql.hive.caseSensitiveInferenceMode` has a critical issue.
> - SPARK-19611 made `INFER_AND_SAVE` the default in 2.2.0 because Spark 2.1.0 breaks
>   some Hive tables backed by case-sensitive data files:
>   "This situation will occur for any Hive table that wasn't created by Spark or that
>   was created prior to Spark 2.1.0. If a user attempts to run a query over such a table
>   containing a case-sensitive field name in the query projection or in the query filter,
>   the query will return 0 results in every case."
> - However, SPARK-22306 reports that this setting also corrupts the Hive Metastore schema
>   by removing bucketing information (BUCKETING_COLS, SORT_COLS) and changing the owner.
> - Since Spark 2.3.0 supports bucketing, BUCKETING_COLS and SORT_COLS look okay at least.
>   However, we still need to figure out why the owner changes. Also, we cannot backport
>   the bucketing patch into `branch-2.2`. We need more tests before releasing 2.3.0.
> Hive Metastore is a shared resource, and Spark should not corrupt it by default. This
> issue proposes to restore the default to `NEVER_INFER`, as it was before Spark 2.2.0.
> Users can take the risk of enabling `INFER_AND_SAVE` themselves.
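
Until the default changes, a deployment can pin the mode explicitly rather than rely on whatever the running Spark version ships with. The following is a minimal sketch, not part of the issue itself: `inference_mode_conf` is a hypothetical helper that validates the mode name against the three values the option accepts and builds the matching `--conf` arguments for `spark-submit`. It only assembles strings and does not talk to Spark or the metastore.

```python
# Configuration key discussed in this issue.
CONF_KEY = "spark.sql.hive.caseSensitiveInferenceMode"

# The three values the option accepts.
VALID_MODES = ("INFER_AND_SAVE", "INFER_ONLY", "NEVER_INFER")


def inference_mode_conf(mode="NEVER_INFER"):
    """Return spark-submit arguments that pin the inference mode.

    Hypothetical helper for illustration; it rejects misspelled mode
    names instead of passing them through to Spark.
    """
    normalized = mode.strip().upper()
    if normalized not in VALID_MODES:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {', '.join(VALID_MODES)}"
        )
    return ["--conf", f"{CONF_KEY}={normalized}"]


if __name__ == "__main__":
    # Prints the arguments to append to a spark-submit command line.
    print(" ".join(inference_mode_conf()))
```

Validating the name up front is a deliberate choice here: a misspelled mode is caught where the command line is built, before any job is submitted against the shared metastore.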
