Paul,
Actually, a recent PR added support for reading CSV data from APIs, so no coding is
necessary! You do need to be running the latest build of Drill, 1.18-SNAPSHOT.
Follow the instructions here to set up the HTTP plugin: [1]
The key parameter is `inputType`, which tells Drill whether to expect JSON or CSV from
the API. The default is JSON. The configuration below is an example that does what
you're describing.
"met": {
  "url": "https://media.githubusercontent.com/media/metmuseum/openaccess/master/MetObjects.csv",
  "method": "GET",
  "headers": null,
  "authType": "none",
  "userName": null,
  "password": null,
  "postBody": null,
  "params": null,
  "dataPath": null,
  "requireTail": false,
  "inputType": "csv"
}
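
Once the connection is saved, you could query it from a script through Drill's REST
API. The following is only a rough sketch, not something taken from the plugin docs:
it assumes the storage plugin is registered under the name `http` (yours may differ),
that a Drillbit is listening on the default web port at http://localhost:8047, and
that a connection with `requireTail` set to false can be queried as
`<plugin>.<connection>`. The helper names `build_query_payload` and `run_query` are
made up for illustration.

```python
import json
import urllib.request


def build_query_payload(plugin: str, connection: str, limit: int = 10) -> dict:
    """Build the JSON body for Drill's REST query endpoint (hypothetical helper)."""
    # Assumption: the connection is addressed as `<plugin>`.`<connection>`.
    sql = f"SELECT * FROM `{plugin}`.`{connection}` LIMIT {int(limit)}"
    return {"queryType": "SQL", "query": sql}


def run_query(payload: dict, drill_url: str = "http://localhost:8047") -> dict:
    """POST the payload to Drill's /query.json endpoint and decode the response."""
    request = urllib.request.Request(
        f"{drill_url}/query.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # "http" is an assumed plugin name; "met" is the connection defined above.
    payload = build_query_payload("http", "met")
    print(payload["query"])
    # Uncomment once Drill is running and the plugin is configured:
    # result = run_query(payload)
    # print(result)
```

The same SELECT statement can be pasted into the Drill web console or sqlline if you
would rather not go through the REST API.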
Good luck!
-- C
[1]: https://github.com/apache/drill/tree/master/contrib/storage-http
> On Jul 29, 2020, at 6:29 PM, Paul Rogers <par0328@gmail.com> wrote:
>
> Hi Faraz,
>
> The short answer is, "yes, but you have to write some code." Drill can
> process any tabular data, but needs a reader (a "storage plugin") to
> convert from the API's data format to Drill's value vector format. The good
> news is that, for most formats, readers already exist. Your file appears to
> be CSV: Drill provides a CSV reader. What Drill does not provide is a
> storage plugin to read CSV from a REST call. It should be easy to create
> one: just start with (or better, modify) the REST storage plugin. Instead
> of creating a JSON decoder for the data, create a CSV decoder.
>
> If you choose to go this route, we can give you pointers for how to
> proceed. Alternatively, you can use a script to download the data to a
> local file, then use the existing CSV reader to query the data. Not
> elegant, but may be fine if you do the query infrequently.
>
> - Paul
>
>
> On Wed, Jul 29, 2020 at 1:06 PM Faraz Ahmad <faraz.ahmad@outlook.com> wrote:
>
>> Hi Team,
>>
>> Is there any way to query CSV file data from GitHub using Apache Drill?
>>
>> Currently, I am able to pull this GitHub data into Power BI by using a Web
>> data connection with the URL below:
>>
>> https://raw.githubusercontent.com/itsnotaboutthecell/Power-BI-Sessions/master/An%20Introduction%20to%20Tabular%20Editor/Source%20Files/Customers.csv
>>
>> My goal is to pull this data outside of Power BI, mash it up with other
>> data, and then simply create a view within Drill. This view will then be
>> connected to Power BI through a Drill ODBC connection.
>>
>> Kindly let me know if this is possible. Thanks so much!
>>
>> Regards,
>> Faraz Ahmad