drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kathleen Li <...@maprtech.com>
Subject Re: nested array flatten and then group by - Apache drill
Date Tue, 13 Dec 2016 17:17:16 GMT
Dechang provided method by using drill flatten function with mongdb plugin works fine in my
test env by using the json document Yousuf provided.

Details as below:

I am using drill 1.8 and mongodb 3.2.9

> db.test.find()
{ "_id" : ObjectId("584f3bad7161b4e2b9fa0a7d"), "twitter_entities" : { "hashtags" : [ { "text"
: "snow", "indices" : [ 66, 71 ] }, { "text" : "skiing", "indices" : [ 73, 80 ] } ], "trends"
: [ ], "urls" : [ { "url" : "http://xxxxxxx", "expanded_url" : "http://xxxxx", "display_url"
: "xxxxx", "indices" : [ 71, 93 ] } ] } }



0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> use mongo.test;
+-------+-----------------------------------------+
|  ok   |                 summary                 |
+-------+-----------------------------------------+
| true  | Default schema changed to [mongo.test]  |
+-------+-----------------------------------------+
1 row selected (0.376 seconds)
0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> select t1.c1.text, count(t1.c1.text) from
(select
. . . . . . . . . . . . . . . . . . . . . . .> flatten(tbl.twitter_entities.hashtags) as
c1 from test as tbl) t1 group
. . . . . . . . . . . . . . . . . . . . . . .> by t1.c1.text;
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| skiing  | 1       |
| snow    | 1       |
+---------+---------+
2 rows selected (0.372 seconds)





The only thing difference is that I noticed in the original document Yousuf provided, for
key/value pair "display_url": xxxxx”, “ is missing before xxxxx” and I was not able
to insert/import into mongodb, after I added “ —>"display_url": “xxxxx”, everything
works fine.

Regards,

Kathleen

On 12/13/16, 6:30 AM, "yousuf" <yousufuddin@css.org.sa> wrote:

>Hi Dechang,
>
>The query didn't work with mongodb.
>
>
>use mongo.test;
>
>select t1.c1.text, count(t1.c1.text) from (select 
>flatten(tbl.twitter_entities.hashtags) as c1 from test as tbl) t1 group 
>by t1.c1.text;
>+---------+---------+
>| EXPR$0  | EXPR$1  |
>+---------+---------+
>+---------+---------+
>
>Kind Regards
>
>Yousuf
>
>
>
>
>
>
>-------- Forwarded Message --------
>Subject: 	Re: nested array flatten and then group by - Apache drill
>Date: 	Mon, 12 Dec 2016 10:32:53 -0800
>From: 	Dechang Gu <dgu@maprtech.com>
>Reply-To: 	user@drill.apache.org
>To: 	user@drill.apache.org
>
>
>
>AFAIK,  for MongoDB plugin, we have very limited tests.
>Anyway, I  tried the following query on your sample json file:
>
>0: jdbc:drill:schema=dfs.tpchPar100> select t1.c1.text, count(t1.c1.text)
>from (select flatten(tbl.twitter_entities.hashtags) as c1 from
>dfs.`/jsondata/junk.json` as tbl) t1 group by t1.c1.text;
>
>+---------+---------+
>
>| EXPR$0  | EXPR$1  |
>
>+---------+---------+
>
>| skiing  | 1       |
>
>| snow    | 1       |
>
>+---------+---------+
>
>2 rows selected (0.542 seconds)
>
>
>is this the group/count you want to retrieve?
>
>
>HTH,
>Dechang
>
>On Sun, Dec 11, 2016 at 12:10 AM, yousuf <yousufuddin@css.org.sa> wrote:
>
>> Hi,
>>
>> How can I retrieve hashtags[n].text values as flatten from the following
>> json document (storage mongo). My final goal is to group text with count.
>>
>> Apache drill version : 1.8.0
>> MongoDB version: 3.2
>>
>> I've tried several examples but my final result is either empty array eg
>> [] or blank.
>>
>> {
>>     "twitter_entities": {
>>         "hashtags": [{
>>             "text": "snow",
>>             "indices": [
>>                 66,
>>                 71
>>             ]
>>         }, {
>>             "text": "skiing",
>>             "indices": [
>>                 73,
>>                 80
>>             ]
>>         }],
>>         "trends": [],
>>         "urls": [{
>>             "url": "http://xxxxxxx",
>>             "expanded_url": "http://xxxxx",
>>             "display_url": xxxxx",
>>             "indices": [
>>                 71,
>>                 93
>>             ]
>>         }]
>>     }
>> }
>>
>> Any help is appreciated
>> Regards
>> Yousuf
>>
>>
>>
>


Mime
View raw message