drill-user mailing list archives

From Charles Givre <cgi...@gmail.com>
Subject Re: Querying json files from multiple subdirectories
Date Sat, 18 Jan 2020 23:15:06 GMT
Hi Prabhakar, 
You'll need to find a common pattern in the paths of the files you want to query, for example a wildcard that matches only those folders.
It could be something like:

SELECT *
FROM dfs.`<path>/Year*/`
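
If a wildcard cannot single out the folders you want, another option is to filter on Drill's implicit directory column. A minimal sketch, assuming the query is rooted at the transactions folder so that dir0 holds the name of the first-level subdirectory (Year2012, Year2013, ...); <workspace> is a placeholder for your own workspace:

```sql
-- dir0 is the first directory level below the path named in FROM.
-- List only the year folders you want in this particular query.
SELECT t.*
FROM dfs.<workspace>.`transactions` AS t
WHERE t.dir0 IN ('Year2012', 'Year2013')
```

Changing which folders are read then only means changing the IN list.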

Alternatively, you could run multiple SELECT queries and combine them with a UNION statement.
For example:

SELECT * FROM dfs.`Year2013/trans.json`
UNION 
SELECT * FROM dfs.`Year2014/trans.json`
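
To join the combined result with another table in a single query, the union can be wrapped in a subquery. This is only a sketch: customers.json, customer_id, amount and name are hypothetical names, since I don't know your actual files or fields. Note also that UNION removes duplicate rows, so UNION ALL is used here to keep every transaction.

```sql
-- Hypothetical names: customers.json, customer_id, amount, name.
-- The two yearly files are combined first, then joined as one table.
SELECT t.customer_id, t.amount, c.name
FROM (
  SELECT customer_id, amount FROM dfs.`Year2013/trans.json`
  UNION ALL
  SELECT customer_id, amount FROM dfs.`Year2014/trans.json`
) AS t
JOIN dfs.`customers.json` AS c
  ON t.customer_id = c.customer_id
```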



-- C

> On Jan 17, 2020, at 11:07 PM, Prabhakar Bhosaale <bhosale.p.v@gmail.com> wrote:
> 
> Hi Charles,
> Thanks for your suggestion. Actually, the transactions folder will have more
> year-wise folders, but I want to query only a few folders at a time. The
> 
> Regards
> Prabhakar
> 
> On Fri, Jan 17, 2020, 20:01 Charles Givre <cgivre@gmail.com> wrote:
> 
>> Hi there,
>> If you have that directory structure, the following query should work:
>> 
>> SELECT *
>> FROM dfs.<workspace>.`transactions/` as t1
>> 
>> Obviously replacing <workspace> with your workspace.  You can then join
>> that with anything that Drill can query.
>> Best,
>> -- C
>> 
>> 
>> 
>>> On Jan 17, 2020, at 1:27 AM, Prabhakar Bhosaale <bhosale.p.v@gmail.com>
>> wrote:
>>> 
>>> Hi All,
>>> 
>>> I am new to Apache Drill and am trying to retrieve data from JSON files by
>>> querying the directories.
>>> 
>>> The directory structure is
>>> 
>>> transactions
>>>   |------> Year2012 ---> trans.json
>>>   |
>>>   |------> Year2013 ---> trans.json
>>> 
>>> I would like to query trans.json from both the sub-directories as one
>> table
>>> and then join the resultant table with another table in a single query.
>>> Please help with possible options. Thanks.
>>> 
>>> Regards
>> 
>> 

