spark-dev mailing list archives

From Anoop Johnson <anoop.k.john...@gmail.com>
Subject Query Optimizations for Data Sources
Date Tue, 03 Nov 2020 19:30:00 GMT
All --

I was reading through the Dynamic Partition Pruning (DPP) code [1] and, if I
understood it correctly, DPP works only for file system-backed tables. Is
it possible to extend DPP to work on partitioned data sources as well? If
so, is there a JIRA to track this?

I have another open-ended question: are there any other major optimizations
that currently don't work for data sources?

Thanks,
Anoop

[1]
https://github.com/apache/spark/blob/afa6aee4f5ea270db5331e48ad08e0b176cdd2a0/sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala#L59-L63
