drill-user mailing list archives

From Sonny Heer <sonnyh...@gmail.com>
Subject Drill where clause vs Hive on non-partition column
Date Sun, 13 Nov 2016 19:06:57 GMT
I'm running a Drill query with a where clause on a non-partition column
via the Hive storage plugin.  The query inspects all partitions (kind of
expected), but when I run the same query in Hive I can see the predicate
passed down in the query plan.  This particular query is much faster in
Hive than in Drill.  BTW, these are Parquet files.

Hive:

Stage-0
   Fetch Operator
      limit:-1
      Select Operator [SEL_2]
         outputColumnNames:["_col0"]
         Filter Operator [FIL_4]
            predicate:(my_column = 123) (type: boolean)
            TableScan [TS_0]
               alias:my_table

Any idea why this is?  My guess is that Hive stores Hive-specific info in
the Parquet files, since they were created through Hive, although it seems
the Drill Hive plugin should honor this too.  I'm not sure, but I'm willing
to look through the code if someone can point me in the right direction.
Thanks!
