spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (SPARK-12890) Spark SQL query related to only partition fields should not scan the whole data.
Date Thu, 23 Feb 2017 15:31:44 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-12890:
---------------------------------
    Comment: was deleted

(was: Actually, I still don't understand what the issue is here. This might not be related to
schema merging, as that is disabled by default, and no filter is being pushed down here.
As far as I know, Spark does not automatically create a filter for a function and push it down.

I mean, the referenced column would be {{date}} and the given filters would be empty. So it tries
to read all the files regardless of file format, as long as the format supports partitioned files.)

> Spark SQL query related to only partition fields should not scan the whole data.
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-12890
>                 URL: https://issues.apache.org/jira/browse/SPARK-12890
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Prakash Chockalingam
>
> I have a SQL query which references only partition fields. The query ends up scanning all
> the data, which is unnecessary.
> Example: select max(date) from table, where the table is partitioned by date.
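
To illustrate why the query in the description could in principle avoid a full scan, here is a minimal sketch in plain Python. It is not Spark's implementation. It assumes a Hive-style directory layout (one `date=<value>` directory per partition) and ISO-formatted date strings, which sort correctly as text. The helper names `partition_values`, `max_partition_value`, and `build_demo_table` are hypothetical and exist only for this example.

```python
import os
import tempfile


def partition_values(table_root, column):
    """Collect the values of a partition column from directory names only.

    Assumes a Hive-style layout such as <table_root>/date=2016-01-19/.
    No data file is opened.
    """
    prefix = column + "="
    values = []
    for entry in os.listdir(table_root):
        full_path = os.path.join(table_root, entry)
        if os.path.isdir(full_path) and entry.startswith(prefix):
            values.append(entry[len(prefix):])
    return values


def max_partition_value(table_root, column):
    """Return the largest partition value, or None if there are no partitions."""
    values = partition_values(table_root, column)
    return max(values) if values else None


def build_demo_table():
    """Create a throwaway partitioned table on disk (hypothetical demo data)."""
    root = tempfile.mkdtemp()
    for day in ("2016-01-17", "2016-01-18", "2016-01-19"):
        part_dir = os.path.join(root, "date=" + day)
        os.makedirs(part_dir)
        with open(os.path.join(part_dir, "part-00000"), "w") as handle:
            handle.write("row data that is never read\n")
    return root


if __name__ == "__main__":
    print(max_partition_value(build_demo_table(), "date"))
```

One caveat with this kind of metadata-only answer: a partition directory that exists but contains no rows would still be counted here, whereas a query that scans the data would ignore it, so the two results can differ.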



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

