drill-user mailing list archives

From Chetan Kothari <chetan.koth...@oracle.com>
Subject RE: Query on performance using Drill and Amazon s3.
Date Mon, 20 Feb 2017 14:25:57 GMT
Hi Nitin


Where does the query execute?

Does Drill execute the query on AWS and then fetch the results to be displayed?

-----Original Message-----
From: Nitin Pawar [mailto:nitinpawar432@gmail.com] 
Sent: Monday, February 20, 2017 6:19 PM
To: user@drill.apache.org
Subject: Re: Query on performance using Drill and Amazon s3.


How are you running the select * query: through the Drill web UI or sqlline?

Where are you running it from?

Is Drill hosted in AWS or on your local machine?


If the Drill server is on AWS, I think the majority of the time is spent displaying the
result set rather than querying the file.
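
One way to check this is to time the same query with and without a large result set coming back. The sketch below is a minimal illustration, not a tested recipe. It assumes the Drill REST endpoint is reachable at http://localhost:8047/query.json (host and port are assumptions, adjust them for your drillbit), and the helper names build_request and time_query are made up for this example.

```python
import json
import time
import urllib.request

# Assumption: a drillbit's web server is reachable here; change as needed.
DRILL_URL = "http://localhost:8047/query.json"


def build_request(sql, url=DRILL_URL):
    """Build a POST request carrying a SQL query for Drill's REST API."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_query(sql, url=DRILL_URL):
    """Return wall-clock seconds to submit a query and read the full response."""
    start = time.monotonic()
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        resp.read()  # drain the response so result transfer is included
    return time.monotonic() - start
```

Comparing time_query("select count(*) from xxx") with time_query("select * from xxx") would give a rough split: if the count comes back quickly, the scan of the file is probably not the bottleneck and the time is going into moving or rendering the rows.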

If the Drill server is local, then the network may account for much of the time,
depending on the S3 bucket location relative to where your Drill server is.
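
For a sense of scale, the figures in the original report (1GB, 1,040,000 rows, 23 minutes) can be turned into an effective throughput. The sketch below is only back-of-the-envelope arithmetic. It assumes 1GB means 10^9 bytes, and effective_throughput is a name made up for this example.

```python
def effective_throughput(total_bytes, rows, seconds):
    """Return (MB per second, rows per second, average bytes per row)."""
    mb_per_s = total_bytes / 1_000_000 / seconds
    rows_per_s = rows / seconds
    bytes_per_row = total_bytes / rows
    return mb_per_s, rows_per_s, bytes_per_row


# Figures from the original report; 1GB taken as 10**9 bytes (assumption).
mb_per_s, rows_per_s, bytes_per_row = effective_throughput(
    total_bytes=10**9, rows=1_040_000, seconds=23 * 60
)
print(f"{mb_per_s:.2f} MB/s, {rows_per_s:.0f} rows/s, {bytes_per_row:.0f} bytes/row")
```

That comes to roughly 0.7 MB/s and about 750 rows/s, which is low enough that it is worth checking both the network path to S3 and the client that displays the rows.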


On Mon, Feb 20, 2017 at 5:37 PM, PROJJWAL SAHA <proj.saha@gmail.com> wrote:


> Hello all,
>
> I am using 1GB data in the form of .tsv file, stored in Amazon S3
> using Drill 1.8. I am using default configurations of Drill using S3
> storage plugin coming out of the box. The drill bits are configured on
> a 5 node cluster with 32GB RAM and 4VCPU.
>
> I see that select * from xxx; query takes 23 mins to fetch 1,040,000 rows.
>
> Is this the expected behaviour?
> I am looking for any quick tuning that can improve the performance or
> any other suggestions.
>
> Attached is the JSON profile for this query.
>
> Regards,
> Projjwal






Nitin Pawar

