drill-user mailing list archives

From Adam Gilmore <a...@pharmadata.net.au>
Subject Re: Query performance and clustering
Date Thu, 26 Mar 2015 06:15:34 GMT
The most time-consuming operator is the hash aggregate, followed by the
Parquet group scan, which makes a fair bit of sense.

The memory utilization on the boxes is about 6-7GB out of the possible 8GB
(so there is usually some free memory).  I checked the logs but didn't see any
alerts about GC.  Do I need to enable debug logging for this?
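
On the planner.width options suggested in the quoted thread below: a minimal Python sketch of the statements involved, assuming the usual Drill SQL forms (SELECT against sys.options, ALTER SYSTEM SET). The helper names are hypothetical, the 4-core limit assumes c3.xlarge (4 vCPUs), and the JSON payload shape is an assumption about the REST query endpoint that should be checked against the Drill version in use.

```python
import json

# Assumption: c3.xlarge exposes 4 vCPUs; change this for other instance types.
CORES_PER_NODE = 4


def show_width_options_sql():
    """SQL to list the current planner.width.* settings from sys.options."""
    return "SELECT * FROM sys.options WHERE name LIKE 'planner.width%'"


def set_max_per_node_sql(width, cores=CORES_PER_NODE):
    """SQL to set planner.width.max_per_node, capped at the node's core count."""
    if not 1 <= width <= cores:
        raise ValueError("width must be between 1 and %d" % cores)
    return "ALTER SYSTEM SET `planner.width.max_per_node` = %d" % width


def rest_payload(sql):
    """JSON body for submitting a statement over REST (assumed payload shape)."""
    return json.dumps({"queryType": "SQL", "query": sql})


if __name__ == "__main__":
    print(show_width_options_sql())
    print(set_max_per_node_sql(4))
    print(rest_payload(set_max_per_node_sql(4)))
```

The same two statements can be pasted into sqlline directly; the wrapper only guards against setting the width above the number of cores, which is the limit suggested in the thread.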


Regards,


*Adam Gilmore*

Director of Technology

adam@pharmadata.net.au


+61 421 997 655 (Mobile)

1300 733 876 (AU)

+617 3171 9902 (Intl)


*PharmaData*

Data Intelligence Solutions for Pharmacy

www.PharmaData.net.au <http://www.pharmadata.net.au/>







On Thu, Mar 26, 2015 at 1:38 AM, David Tucker <dtucker@maprtech.com> wrote:

> I’ll second Andries’ comment about measurable performance in AWS: you
> should not expect consistency there (especially with instance types that
> are smaller than a physical server, such as the c3.xlarge instances you’re
> using).
>
> How does the memory utilization look during your queries? Memory
> pressure often manifests as CPU load, especially in the pathological
> case of excessive Java garbage collection. Drill does an excellent job of
> separating the data being queried from the traditional Java heap, but
> there can still be some pressure there. Check the drillbit logs and see
> whether GCs occur more frequently as your query count goes up.
>
> — David
>
>
> On Mar 25, 2015, at 8:09 AM, Andries Engelbrecht <
> aengelbrecht@maprtech.com> wrote:
>
> > What version of Drill are you running?
> >
> > It sounds like you are CPU bound, and the query time increases 10x with
> > a 30x increase in concurrency (which looks pretty good at first glance).
> > At a high level this seems pretty reasonable; it is hard to give more
> > specifics without seeing the query profiles. What is consuming the most
> > time (and resources) in the query profiles? Perhaps there are some gains
> > to be had in optimizing the queries.
> >
> > If the cluster is primarily used for Drill you may want to adjust the
> > planner.width.max_per_node system parameter to consume more of the cores
> > on the nodes.
> > See what the current setting is in sys.options, and adjust it to no more
> > than the number of cores on the node. Experimenting with this may help a
> > bit.
> > You also may want to experiment with planner.width.max_per_query.
> > I have not looked into the queue mechanisms in detail yet, but it
> > doesn’t seem that the cluster is having issues with how it is managing
> > concurrency.
> >
> > Keep in mind AWS can be inconsistent in terms of performance, so it is
> > hard to measure exact numbers on a cloud platform.
> >
> > —Andries
> >
> > On Mar 25, 2015, at 5:44 AM, Adam Gilmore <dragoncurve@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I'm doing some testing on query performance, especially in a clustered
> >> environment.
> >>
> >> The test data is 5 Parquet files with 2.2 million records in each file
> >> (total of ~11m).
> >>
> >> The cluster is an Amazon EMR cluster with a total of 10 drillbits
> >> (c3.xlarge instances).
> >>
> >> A single SUM() with a GROUP BY results in a ~700ms query.
> >>
> >> We set up about 30 agents running a query every second (total 30 queries
> >> per second) and the performance drops to queries at about 6-7 seconds.
> >>
> >> The bottleneck seems to be entirely CPU based - all drillbits' CPUs are
> >> fairly swamped.
> >>
> >> Looking at the plans, the Parquet scan still performs fairly well, but
> >> the hash aggregate gets gradually slower and slower (obviously competing
> >> for CPU time).
> >>
> >> Are these the expected query times for such a setup?  Is there anything
> >> further I can investigate to gain more performance?
> >
>
>
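
For a rough sense of the load described in the original question, here is a back-of-envelope sketch in Python. It uses only the figures from the thread (30 agents, 6-7 second latency under load, 10 drillbits) plus the assumption that a c3.xlarge has 4 vCPUs. Two readings of "30 agents running a query every second" are shown, since the thread does not say whether agents wait for a result before sending the next query; the function names are hypothetical.

```python
def in_flight_open_loop(arrival_rate_qps, latency_s):
    """Little's law: average queries in flight = arrival rate * latency.

    Applies if agents fire one query per second regardless of completion
    and the system is in steady state.
    """
    return arrival_rate_qps * latency_s


def throughput_closed_loop(agents, latency_s, think_time_s=0.0):
    """Throughput if each agent waits for its result before re-issuing."""
    return agents / (latency_s + think_time_s)


if __name__ == "__main__":
    latency = 6.5          # midpoint of the reported 6-7 s
    total_vcpus = 10 * 4   # 10 drillbits, assuming 4 vCPUs each

    in_flight = in_flight_open_loop(30, latency)
    print(in_flight)                 # 195.0 queries in flight on average
    print(in_flight / total_vcpus)   # 4.875 in-flight queries per vCPU
    print(round(throughput_closed_loop(30, latency), 1))  # 4.6 queries/s
```

Under the open-loop reading the cluster would be juggling roughly 195 concurrent queries across 40 vCPUs; under the closed-loop reading concurrency stays at 30 and the achieved rate is closer to 4.6 queries per second. Which one applies changes how the 6-7 second figure should be interpreted.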
