drill-dev mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3966) Metadata Cache + Partition Pruning not happening when the partition column is of type boolean
Date Thu, 22 Oct 2015 22:17:27 GMT
Rahul Challapalli created DRILL-3966:
----------------------------------------

             Summary: Metadata Cache + Partition Pruning not happening when the partition
column is of type boolean
                 Key: DRILL-3966
                 URL: https://issues.apache.org/jira/browse/DRILL-3966
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata, Query Planning & Optimization
            Reporter: Rahul Challapalli


git.commit.id.abbrev=19b4b79

I have partitioned parquet files whose partition column is of type boolean.
The plan below suggests that pruning does not take place when the partition column is of type
boolean and a metadata cache file exists. However, if I remove the metadata cache, partition
pruning appears to work correctly.

Query :
{code}
explain plan for select * from fewtypes_boolpartition where bool_col = false;

00-00    Screen
00-01      Project(*=[$0])
00-02        Project(T11¦¦*=[$0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($1, false)])
00-05              Project(T11¦¦*=[$0], bool_col=[$1])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet],
ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]],
selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true,
columns=[`*`]]])

{code}
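One way to read the plan above: the scan still lists both files (numFiles=2) with usedMetadataFile=true, and the filter on bool_col sits above the scan, so nothing was pruned. A minimal Java sketch that pulls those two attributes out of a plan string follows. The class and method names are hypothetical helpers for illustration, not part of Drill, and the regular expressions assume the attribute format shown in the plan text above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlanScanInfo {
    // Patterns for the two scan attributes as they appear in the plan text above.
    private static final Pattern NUM_FILES = Pattern.compile("numFiles=(\\d+)");
    private static final Pattern USED_METADATA = Pattern.compile("usedMetadataFile=(true|false)");

    /** Returns the numFiles value found in the plan text, or -1 if it is absent. */
    public static int numFiles(String plan) {
        Matcher m = NUM_FILES.matcher(plan);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Returns true only if the plan text contains usedMetadataFile=true. */
    public static boolean usedMetadataFile(String plan) {
        Matcher m = USED_METADATA.matcher(plan);
        return m.find() && Boolean.parseBoolean(m.group(1));
    }

    public static void main(String[] args) {
        // Scan attributes copied from the plan in this report.
        String scan = "selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, "
            + "numFiles=2, usedMetadataFile=true, columns=[`*`]";
        System.out.println("numFiles=" + numFiles(scan)
            + ", usedMetadataFile=" + usedMetadataFile(scan));
    }
}
```

Comparing numFiles with and without the metadata cache file in place would show whether the filter reduced the set of files scanned.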


Error from the log :
{code}
WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
 java.lang.UnsupportedOperationException: Unsupported type: BIT
 	at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
 	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
 	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
 	at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
 	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{code}
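The trace shows populatePruningVector rejecting the BIT type, and PruneScanRule logging the exception as a WARN and leaving the scan unpruned. The Java sketch below illustrates that failure mode in isolation. It is not the Drill source: the enum, class, and method names are hypothetical, the set of supported types is assumed, and representing a boolean partition value as 0/1 is only one possible way to handle BIT.

```java
public class PruningTypeSketch {
    // Hypothetical stand-in for a column type enum; only the name BIT comes from the log.
    public enum ColumnType { INT, BIGINT, VARCHAR, BIT }

    // Hypothetical converter with no BIT branch: any unlisted type is rejected,
    // producing the same message seen in the log ("Unsupported type: BIT").
    public static Object toPruningValue(ColumnType type, Object value) {
        switch (type) {
            case INT:
                return (Integer) value;
            case BIGINT:
                return (Long) value;
            case VARCHAR:
                return value.toString();
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    // Same converter with an added BIT branch (assumption: booleans stored as 0/1).
    public static Object toPruningValueWithBit(ColumnType type, Object value) {
        if (type == ColumnType.BIT) {
            return ((Boolean) value) ? 1 : 0;
        }
        return toPruningValue(type, value);
    }

    // Mirrors the behavior visible in the log: the exception is caught,
    // reported as a warning, and pruning is skipped rather than failing the query.
    public static boolean tryPrune(ColumnType type, Object value) {
        try {
            toPruningValue(type, value);
            return true;
        } catch (UnsupportedOperationException e) {
            System.err.println("WARN Exception while trying to prune partition: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("pruned with BIT column: " + tryPrune(ColumnType.BIT, Boolean.FALSE));
        System.out.println("BIT value with added branch: "
            + toPruningValueWithBit(ColumnType.BIT, Boolean.FALSE));
    }
}
```

Because the exception is swallowed at the rule level, the query still plans and runs, which is why the only visible symptoms are the unpruned scan and the WARN entry.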

I have attached the required data sets. Let me know if you need anything else.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
