drill-user mailing list archives

From Carol McDonald <cmcdon...@maprtech.com>
Subject Re: JDBC data source
Date Fri, 06 Mar 2015 14:16:29 GMT
JDBC connects you to a Drillbit. Drillbits use storage plugins to
connect to data stores. To read data from an RDBMS, there needs to be an
RDBMS storage plugin.
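
To make the first point concrete, here is a minimal sketch of a JDBC client
talking to a Drillbit. It assumes the Drill JDBC driver jar is on the
classpath and that ZooKeeper runs at localhost:2181 with the cluster
registered under /drill/drillbits1; the host, directory and cluster id are
placeholders, and zkUrl is just a helper name made up for this example. The
query reads the employee.json sample from Drill's classpath (cp) storage
plugin.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillJdbcSketch {
    // Builds a Drill JDBC URL that locates Drillbits through ZooKeeper.
    // All three arguments are deployment-specific (assumed values in main).
    static String zkUrl(String zkQuorum, String drillDir, String clusterId) {
        return "jdbc:drill:zk=" + zkQuorum + "/" + drillDir + "/" + clusterId;
    }

    public static void main(String[] args) {
        // Assumed local setup; adjust to your own cluster.
        String url = zkUrl("localhost:2181", "drill", "drillbits1");
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             // The Drillbit resolves "cp" to a storage plugin, not the JDBC client.
             ResultSet rs = stmt.executeQuery(
                     "SELECT * FROM cp.`employee.json` LIMIT 3")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            // Reached when the driver is missing or no Drillbit is reachable.
            System.err.println("Could not query Drill: " + e.getMessage());
        }
    }
}
```

Note that the client only ever names a storage plugin inside the SQL text;
which data stores are reachable is decided on the Drillbit side.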

http://drill.apache.org/architecture/

The flow of a Drill query typically involves the following steps:

   - The Drill client issues a query. A Drill client can be JDBC, ODBC,
   the command-line interface, or the REST API. Any Drillbit in the cluster
   can accept queries from the clients. There is no master-slave concept.
   - The Drillbit then parses the query, optimizes it, and generates a
   distributed query plan that is optimized for fast and efficient execution.
   - The Drillbit that accepts the query becomes the driving Drillbit node
   for the request. It gets a list of available Drillbit nodes in the cluster
   from ZooKeeper. The driving Drillbit determines the appropriate nodes to
   execute various query plan fragments to maximize data locality.
   - The Drillbit schedules the execution of query fragments on individual
   nodes according to the execution plan.
   - The individual nodes finish their execution and return data to the
   driving Drillbit.
   - The driving Drillbit streams results back to the client.
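
The locality step in that flow can be pictured with a toy model. This is
not Drill's planner code, only a sketch under simple assumptions: each
fragment reads one block that lives on one node, and a fragment whose
preferred node is not available is handed out round-robin. The names
QueryFlowSketch, Fragment and assign are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueryFlowSketch {
    // A plan fragment; preferredNode is the node that holds its data.
    static final class Fragment {
        final int id;
        final String preferredNode;
        Fragment(int id, String preferredNode) {
            this.id = id;
            this.preferredNode = preferredNode;
        }
    }

    // Assigns each fragment to the node holding its data when that node is
    // available; otherwise falls back to round-robin over available nodes.
    static Map<String, List<Integer>> assign(List<Fragment> fragments,
                                             List<String> availableNodes) {
        if (availableNodes.isEmpty()) {
            throw new IllegalArgumentException("no Drillbit nodes available");
        }
        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        for (String node : availableNodes) {
            plan.put(node, new ArrayList<>());
        }
        int next = 0;
        for (Fragment f : fragments) {
            String target = f.preferredNode;
            if (!plan.containsKey(target)) {
                target = availableNodes.get(next % availableNodes.size());
                next++;
            }
            plan.get(target).add(f.id);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Stand-in for the node list the driving Drillbit gets from ZooKeeper.
        List<String> nodes = List.of("node1", "node2");
        List<Fragment> fragments = List.of(
                new Fragment(0, "node1"),
                new Fragment(1, "node2"),
                new Fragment(2, "node3")); // node3 is not available
        System.out.println(assign(fragments, nodes)); // {node1=[0, 2], node2=[1]}
    }
}
```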

Storage plugin interfaces: Drill serves as a query layer on top of several
data sources. Storage plugins in Drill represent the abstractions that
Drill uses to interact with the data sources. Storage plugins provide Drill
with the following information:

• Metadata available in the source
• Interfaces for Drill to read from and write to data sources
• Location of data and a set of optimization rules to help with efficient
and faster execution of Drill queries on a specific data source
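
As a rough picture of those three responsibilities, here is a hypothetical
interface with an in-memory implementation. It is deliberately NOT Drill's
real storage plugin API; SourcePlugin, InMemoryPlugin and their methods are
invented for illustration, and an RDBMS plugin would have to supply the
same kinds of information by talking to the database instead of a map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StoragePluginSketch {
    // Hypothetical contract, not Drill's actual interface.
    interface SourcePlugin {
        List<String> tableNames();                 // metadata in the source
        List<String> readRows(String table);       // read interface
        List<String> dataLocations(String table);  // where the data lives
    }

    // Toy plugin backed by a map; all data sits on a single host.
    static final class InMemoryPlugin implements SourcePlugin {
        private final Map<String, List<String>> tables;
        private final String host;

        InMemoryPlugin(Map<String, List<String>> tables, String host) {
            this.tables = new TreeMap<>(tables); // sorted for stable output
            this.host = host;
        }

        @Override
        public List<String> tableNames() {
            return new ArrayList<>(tables.keySet());
        }

        @Override
        public List<String> readRows(String table) {
            return tables.getOrDefault(table, List.of());
        }

        @Override
        public List<String> dataLocations(String table) {
            return tables.containsKey(table) ? List.of(host) : List.of();
        }
    }

    public static void main(String[] args) {
        SourcePlugin plugin = new InMemoryPlugin(
                Map.of("orders", List.of("o1", "o2"), "customers", List.of("c1")),
                "node1");
        System.out.println(plugin.tableNames());            // [customers, orders]
        System.out.println(plugin.readRows("orders"));      // [o1, o2]
        System.out.println(plugin.dataLocations("orders")); // [node1]
    }
}
```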

In the context of Hadoop, Drill provides storage plugins for files and
HBase/M7. Drill also integrates with Hive as a storage plugin, since Hive
provides a metadata abstraction layer on top of files and HBase/M7, as
well as libraries to read data and operate on these sources (SerDes and
UDFs).

When users query files and HBase/M7 with Drill, they can do so directly or
go through Hive if they have metadata defined there. Drill's integration
with Hive covers metadata only. Drill does not invoke the Hive execution
engine for any requests.

On Fri, Mar 6, 2015 at 8:43 AM, Chevalier Julien <jc@numerigraphe.com>
wrote:

> Dear List
>
> I wonder if Drill can read data from an RDBMS through a good old JDBC driver.
>
> Examples are welcome.
>
> Best regards
>
> Juju
>
>
