spark-user mailing list archives

From Heng Su
Subject Datasource v2 can not prune file source partitions when readDataSchema is empty
Date Tue, 14 Sep 2021 06:00:15 GMT
Hi, community:

We use Spark 3.1.2.

In the PruneFileSourcePartitions rule, FileScan::withFilters is called to push down the partition
pruning filters (and this is the only place where this function is called), but the rule is guarded
by the constraint "scan.readDataSchema.nonEmpty".

We use Spark SQL with a custom catalog and run a count query such as: select count(*) from
catalog.db.tbl where dt='0812', in which dt is a partition key.

In this case scan.readDataSchema is indeed empty, so no partition pruning is applied to the scan,
which causes all partitions to be scanned in the end.
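
To make the behaviour concrete, here is a toy Python model of the guard as described above. It is only a sketch and not Spark code: ToyScan, prune and require_data_columns are hypothetical names invented for this illustration, and the partition values other than '0812' are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyScan:
    """Hypothetical stand-in for a file scan; this is not Spark's FileScan."""
    partitions: List[str]            # every dt partition that exists
    read_data_schema: List[str]      # non-partition columns the query reads
    partition_filter: Optional[str] = None

    def with_filters(self, dt: str) -> "ToyScan":
        # Return a copy of the scan that remembers the partition filter.
        return ToyScan(self.partitions, self.read_data_schema, dt)

    def partitions_to_scan(self) -> List[str]:
        # Without a pushed filter, every partition is listed.
        if self.partition_filter is None:
            return list(self.partitions)
        return [p for p in self.partitions if p == self.partition_filter]


def prune(scan: ToyScan, dt: str, require_data_columns: bool = True) -> ToyScan:
    """Models the guard: filters are pushed only if read_data_schema is non-empty."""
    if require_data_columns and not scan.read_data_schema:
        return scan  # guard fails, the scan is left unpruned
    return scan.with_filters(dt)


all_partitions = ["0810", "0811", "0812"]
count_star = ToyScan(all_partitions, read_data_schema=[])         # like count(*)
select_col = ToyScan(all_partitions, read_data_schema=["value"])  # reads a data column

print(prune(count_star, "0812").partitions_to_scan())  # ['0810', '0811', '0812']
print(prune(select_col, "0812").partitions_to_scan())  # ['0812']
# With the guard relaxed (an assumption about one possible fix), count(*) is pruned too.
print(prune(count_star, "0812", require_data_columns=False).partitions_to_scan())  # ['0812']
```

In this model the count(*) case reads no data columns, so the guard skips it and all three partitions are listed, which matches what we observe.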

Did I misunderstand something? Any help is appreciated.

Thank you.
