From: Charles Givre
Subject: Re: GitHub raw data as a Data source
Date: Wed, 29 Jul 2020 18:42:53 -0400
To: user

Paul,

Actually, there was a PR which added some code so that you can read CSVs
from APIs, so no coding necessary! You have to be using the latest
build of Drill 1.18-SNAPSHOT.

Follow the instructions here to set up the HTTP plugin: [1]

The key parameter is the `inputType` parameter which tells Drill to
expect JSON or CSV from the API. The default is JSON. The config
below is an example configuration to do exactly what you're describing.

"met": {
  "url": "https://media.githubusercontent.com/media/metmuseum/openaccess/master/MetObjects.csv",
  "method": "GET",
  "headers": null,
  "authType": "none",
  "userName": null,
  "password": null,
  "postBody": null,
  "params": null,
  "dataPath": null,
  "requireTail": false,
  "inputType": "csv"
}

Good luck!
-- C

[1]: https://github.com/apache/drill/tree/master/contrib/storage-http

> On Jul 29, 2020, at 6:29 PM, Paul Rogers wrote:
>
> Hi Faraz,
>
> The short answer is, "yes, but you have to write some code." Drill can
> process any tabular data, but needs a reader (a "storage plugin") to
> convert from the API's data format to Drill's value vector format.
> The good news is that, for most formats, readers already exist. Your
> file appears to be CSV: Drill provides a CSV reader. What Drill does
> not provide is a storage plugin to read CSV from a REST call. It should
> be easy to create one: just start with (or better, modify) the REST
> storage plugin. Instead of creating a JSON decoder for the data, create
> a CSV decoder.
>
> If you choose to go this route, we can give you pointers for how to
> proceed. Alternatively, you can use a script to download the data to a
> local file, then use the existing CSV reader to query the data. Not
> elegant, but may be fine if you do the query infrequently.
>
> - Paul
>
>
> On Wed, Jul 29, 2020 at 1:06 PM Faraz Ahmad wrote:
>
>> Hi Team,
>>
>> Is there any way we can able to query csv file data from GitHub using
>> Apache Drill?
>>
>> Currently, I can able to pull this GitHub data into Power BI by using Web
>> data connection with below URL:
>>
>> https://raw.githubusercontent.com/itsnotaboutthecell/Power-BI-Sessions/master/An%20Introduction%20to%20Tabular%20Editor/Source%20Files/Customers.csv
>>
>> My goal is to pull this data outside of Power BI, mash up with other data
>> and then simply create a view within Drill.
>>
>> This view will then be connected to Power BI thru Drill ODBC connection.
>>
>> Kindly let me know if this is possible. Thanks so much!
>>
>> Regards,
>>
>> Faraz Ahmad
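
For the fallback Paul mentions in the thread (use a script to download the data to a local file, then query it with the existing CSV reader), a minimal sketch in Python might look like the following. The function name `download_csv` and the destination directory in the usage comment are hypothetical, chosen only for illustration; the sketch uses nothing beyond the Python standard library and does not touch Drill itself.

```python
import shutil
import urllib.request
from pathlib import Path


def download_csv(url: str, dest: Path) -> Path:
    """Stream the resource at `url` into the local file `dest` and return `dest`.

    `download_csv` is a hypothetical helper, not part of Drill.
    """
    # Create the target directory if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy the response body to disk in chunks rather than loading it all in memory.
    with urllib.request.urlopen(url) as response, dest.open("wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example usage (assumes network access; /tmp/drill-data is a hypothetical
# directory that a Drill file-system workspace would need to point at):
#
# download_csv(
#     "https://raw.githubusercontent.com/itsnotaboutthecell/Power-BI-Sessions/"
#     "master/An%20Introduction%20to%20Tabular%20Editor/Source%20Files/Customers.csv",
#     Path("/tmp/drill-data/Customers.csv"),
# )
```

As Paul notes, this is not elegant, since the file has to be re-downloaded whenever the source changes, but it may be fine for infrequent queries or for Drill builds that predate the `inputType` option.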