drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andries Engelbrecht <aengelbre...@mapr.com>
Subject Re: Accessing json fields within CSV file
Date Thu, 08 Jun 2017 19:37:40 GMT
You can use convert_from and JSON data type.

0: jdbc:drill:> select t.col1, t.col2, t.conv.key1 as key1, t.conv.key2 as key2, t.col4
from
. . . . . . . > (select columns[0] as col1 , columns[1] as col2 , convert_from(columns[2],
'JSON') as conv  , columns[3] as col4 from `/flat/psv-json/json.tbl`) t;
+-------+-------+---------+---------+-------+
| col1  | col2  |  key1   |  key2   | col4  |
+-------+-------+---------+---------+-------+
| 1     | xyz   | value1  | value2  | abc   |




If you want to use functions like flatten you will need to make sure the JSON in represented
in an array.
i.e. [{"key":1, "value": 1},{"key":2, "value":2}]

0: jdbc:drill:> select t.col1, t.col2, t.conv.key as key, t.conv.`value` as `value`, t.col4
from
. . . . . . . > (select columns[0] as col1, columns[1]as col2, flatten((convert_from(columns[2],'JSON')))
as conv,  columns[3] as col4 from `/flat/psv-json/json.tbl`) t;
+-------+-------+------+--------+-------+
| col1  | col2  | key  | value  | col4  |
+-------+-------+------+--------+-------+
| 1     | xyz   | 1    | 1      | abc   |
| 1     | xyz   | 2    | 2      | abc   |
+-------+-------+------+--------+-------+



--Andries




On 6/8/17, 2:22 AM, "ankit jain" <ankitjainiway@gmail.com> wrote:

    Hi,
    I have a few psv file with a few of the columns being a json key value map.
    Example:
    
    > 1|xyz|{"key1":"value1", "key2":"value2"}|abc|
    
    
    I am converting these files to parquet format but want to convert the json
    key and values to different columns. How is that possible?
    
    end product being:
    id name key1 key2 description
    1 xyz value1 value2 abc
    
    Right now am doing something like this but the json column wont explode:
    
    CREATE TABLE dfs.data.`/logs/logsp/`  AS SELECT
    > CAST(columns[0] AS INT)  `id`,
    > columns[1] AS `name`,
    > columns[2] AS `json_column`,
    > columns[3] AS `description`,
    > from dfs.data.`logs/events.tbl`;
    
    
    And this is what I get
    
    id name json_column description
    1 xyz {"key1":"value1", "key2":"value2"} abc
    
    Thanks in advance,
    Ankit Jain
    

Mime
View raw message