drill-user mailing list archives

From "qihuang.zheng"<qihuang.zh...@fraudmetrix.cn>
Subject Reading Parquet's map column
Date Mon, 10 Aug 2015 09:43:07 GMT
Hi Drillers,
 I got hive_alltypes.parquet from here: https://issues.apache.org/jira/browse/DRILL-2005. I created
a table in Hive and queried it:
hive> desc alltypesparquet;
c1          int
c2          boolean
c3          double
c4          string
c5          array<int>
c6          map<int,string>
c7          map<string,string>
hive> select c6 from alltypesparquet;

and I can easily get the keys and values in a single row:
hive> select c6[1],c6[2] from alltypesparquet;

In Drill, I query it like this:
0: jdbc:drill:zk=local> select t.c6 from dfs.`/home/qihuang.zheng/hive_alltypes.parquet` t;

Not only has the structure changed, but the string values x and y also come back as eA== and eQ==.
Structure: t.c6.map is now an array, so I can't query t.c6.key directly.
I would have to use t.c6.map[0].key, but I need all the keys, not just the first one.
The only solution I can figure out for now is to use flatten:

0: jdbc:drill:zk=local> select tb.flat.key,tb.flat.`value` from (select flatten(t.c6.map) flat
from dfs.`/home/qihuang.zheng/hive_alltypes.parquet` t ) tb;
| EXPR$0 |  EXPR$1  |
| 1    | [B@2cf5c838 |
| 2    | [B@3c2beb97 |
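
As a side check on the values: eA== and eQ== appear to be the Base64 encodings of the single bytes x and y, which would suggest the strings come back as raw binary rather than being altered. A minimal Python sketch to confirm that, independent of Drill (the helper name decode_b64_utf8 is made up for illustration, and treating the bytes as UTF-8 is an assumption):

```python
import base64


def decode_b64_utf8(encoded: str) -> str:
    """Decode a Base64 string and interpret the resulting bytes as UTF-8 text.

    Hypothetical helper for illustration; UTF-8 is an assumed encoding.
    """
    return base64.b64decode(encoded).decode("utf-8")


# The two values reported for c6 when queried through Drill.
drill_values = ["eA==", "eQ=="]

# Decode each value back to the text it represents.
decoded = [decode_b64_utf8(v) for v in drill_values]
print(decoded)
```

If the decoded values match what Hive shows, the data itself is intact and only the type (binary instead of string) differs.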

This looks overly complicated: one row has become two rows, and I would then need a SQL
row-to-column pivot to turn the result back into a single row, which is far too much work.

Does anyone have a better solution? And why is Drill's map structure different from Hive's?
