drill-user mailing list archives

From Kunal Khatua <kkha...@mapr.com>
Subject RE: Apache Drill Question
Date Thu, 15 Jun 2017 06:54:17 GMT
I'm not familiar with the SSHFS or GlusterFS specs, but it should, in theory, work out of the box.


You can start Drill with the underlying storage plugins pointed at the local file system. I'm
presuming SSHFS / GlusterFS can expose the files through a local NFS-like mount.
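
As a rough illustration of pointing a storage plugin at a local mount, the sketch below builds a file-type plugin definition modeled on Drill's default `dfs` plugin. The mount point `/mnt/shared`, the workspace name `shared`, and the helper name `local_fs_plugin_config` are assumptions made for the example, not anything from this thread. The same path would need to be mounted on every node.

```python
import json


def local_fs_plugin_config(mount_point: str) -> dict:
    """Build a file-type storage plugin definition rooted at a local mount.

    mount_point is hypothetical: wherever SSHFS / GlusterFS is mounted locally.
    """
    return {
        "type": "file",
        # file:/// selects the local file system instead of an hdfs:// URI.
        "connection": "file:///",
        "workspaces": {
            # "shared" is an illustrative workspace name.
            "shared": {
                "location": mount_point,
                "writable": False,
                "defaultInputFormat": None,
            }
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the storage plugin editor.
    print(json.dumps(local_fs_plugin_config("/mnt/shared"), indent=2))
```

The printed JSON could then be pasted into the storage plugin page of the Drill web UI under a name of your choosing.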

However, if your three nodes allow their 3 local Drillbits to view the same file, it is likely
that, as a cluster, the Drillbits will interpret it as the same file (similar to HDFS). It's
something you'll need to try. A simple test would be to run a row count on a Parquet
file. If you get 3x the actual count, my theory is wrong and you'll need to figure out a
way to ensure that the 3 Drillbits don't each scan the file independently. Otherwise,
you're good.
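
The row-count test described above could be scripted roughly as follows. This is only a sketch: it assumes Drill's REST endpoint is reachable at `http://localhost:8047/query.json`, that the file is visible through the `dfs` plugin, and that you already know the true row count from another source. The path `/mnt/shared/test.parquet` and all function names are hypothetical.

```python
import json
import urllib.request


def count_query(table_path: str) -> str:
    """Build a row-count query against a file exposed through the dfs plugin."""
    return f"SELECT COUNT(*) AS row_count FROM dfs.`{table_path}`"


def run_drill_query(sql: str, host: str = "localhost", port: int = 8047) -> dict:
    """POST a SQL query to Drill's REST API (host and port are assumptions)."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    request = urllib.request.Request(
        f"http://{host}:{port}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def interpret(observed: int, expected: int, drillbits: int = 3) -> str:
    """Classify the observed count relative to the known row count."""
    if observed == expected:
        return "ok"          # the cluster treated the file as a single file
    if observed == expected * drillbits:
        return "duplicated"  # each Drillbit scanned the file independently
    return "unexpected"


if __name__ == "__main__":
    expected_rows = 1000  # hypothetical: the count you know to be correct
    result = run_drill_query(count_query("/mnt/shared/test.parquet"))
    # Assumes the response carries a "rows" list keyed by column alias.
    observed_rows = int(result["rows"][0]["row_count"])
    print(interpret(observed_rows, expected_rows))
```

A result of "duplicated" would correspond to the 3x case, and "ok" to the case where the shared mount behaves like HDFS from Drill's point of view.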

Let us know how it works out! :)

~ Kunal

P.S.: There's no such thing as a stupid question if you don't already know the answer to it.


-----Original Message-----
From: Edgardo Robles [mailto:edgardo.robles@outlook.com] 
Sent: Wednesday, June 14, 2017 5:04 PM
To: user@drill.apache.org
Subject: Apache Drill Question


Hi,

I set up a 3-node ZooKeeper/Drill cluster and would like to test Parquet files, but I do not
want to set up HDFS. Would Drill work if I used SSHFS or GlusterFS to store the Parquet files,
and would the cluster be able to query them with performance similar to HDFS? Or do SSHFS
and GlusterFS work fundamentally differently, so that I am trying to do something stupid?
Thank you for any feedback. I tried searching on Google but did not find any links about Drill
and GlusterFS or SSHFS.

-Edgardo Robles
