drill-user mailing list archives

From Adam Gilmore <dragoncu...@gmail.com>
Subject Re: Query performance and clustering
Date Thu, 26 Mar 2015 06:15:44 GMT
The most time-consuming of the operators is the hash aggregate, followed by
the Parquet group scan, which makes a fair bit of sense.

The memory utilization on the boxes is about 6-7GB out of the possible 8GB
(so there's usually some free memory).  I checked the logs but didn't see any
alerts regarding GC.  Do I need to enable debug logging to see this?
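
Would querying the sys.memory system table (assuming it exists in the build
we're running) give more useful numbers than the OS-level ones?  For example:

  SELECT * FROM sys.memory;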

On Thu, Mar 26, 2015 at 1:38 AM, David Tucker <dtucker@maprtech.com> wrote:

> I’ll second Andries’ comment about measuring performance in AWS: you
> should not expect consistency there (especially with instance types that
> are smaller than a physical server, such as the c3.xlarge instances you’re
> using).
>
> How does the memory utilization look during your queries?  Memory
> pressure often manifests as CPU load, especially in the pathological
> case of excessive Java garbage collection.  Drill does an excellent job of
> separating the data being queried from the traditional Java heap … but
> there can still be some pressure there.  Check the drillbit logs and see
> if GCs are occurring more frequently as your query count goes up.
>
> — David
>
>
> On Mar 25, 2015, at 8:09 AM, Andries Engelbrecht <aengelbrecht@maprtech.com> wrote:
>
> > What version of Drill are you running?
> >
> > It sounds like you are CPU bound, and the query time increases 10x with
> > a 30x increase in concurrency (which looks pretty good at first glance).
> > At a high level this seems pretty reasonable; it’s hard to give more
> > specifics without seeing the query profiles. What is consuming the most
> > time (and resources) in the query profiles? Perhaps there are some gains
> > to be had in optimizing the queries.
> >
> > If the cluster is primarily used for Drill you may want to adjust the
> > planner.width.max_per_node system parameter to consume more of the cores
> > on the nodes.
> > See what the current setting is in sys.options, and adjust it to no more
> > than the number of cores on the node. Experimenting with this may help a
> > bit.
> > You may also want to experiment with planner.width.max_per_query.
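> >
> > For example (the values here are illustrative: c3.xlarge has 4 vCPUs, and
> > 4 x 10 nodes gives the per-query width):
> >
> >   SELECT name, num_val FROM sys.options WHERE name LIKE 'planner.width%';
> >
> >   ALTER SYSTEM SET `planner.width.max_per_node` = 4;
> >   ALTER SYSTEM SET `planner.width.max_per_query` = 40;
> >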
> > I have not looked into the queue mechanisms in detail yet, but it
> > doesn’t seem that the cluster is having issues with how it is managing
> > concurrency.
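> >
> > For reference, the queue-related options (assuming the exec.queue.*
> > family is present in this build) can be listed with:
> >
> >   SELECT name, bool_val, num_val
> >   FROM sys.options
> >   WHERE name LIKE 'exec.queue%';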
> >
> > Keep in mind AWS can be inconsistent in terms of performance, so it’s
> > hard to measure exact numbers on a cloud platform.
> >
> > —Andries
> >
> > On Mar 25, 2015, at 5:44 AM, Adam Gilmore <dragoncurve@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I'm doing some testing on query performance, especially in a clustered
> >> environment.
> >>
> >> The test data is 5 Parquet files with 2.2 million records in each file
> >> (total of ~11m).
> >>
> >> The cluster is an Amazon EMR cluster with a total of 10 drillbits
> >> (c3.xlarge instances).
> >>
> >> A single SUM() with a GROUP BY results in a ~700ms query.
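> >>
> >> The query is shaped roughly like this (table and column names here are
> >> placeholders, not the real schema):
> >>
> >>   SELECT region, SUM(amount) AS total
> >>   FROM dfs.`/path/to/parquet`
> >>   GROUP BY region;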
> >>
> >> We set up about 30 agents, each running a query every second (30 queries
> >> per second in total), and query times drop to about 6-7 seconds.
> >>
> >> The bottleneck seems to be entirely CPU-based - all drillbits' CPUs are
> >> fairly swamped.
> >>
> >> Looking at the plans, the Parquet scan still performs fairly well, but
> >> the hash aggregate gets gradually slower and slower (obviously competing
> >> for CPU time).
> >>
> >> Are these the expected query times for such a setup?  Is there anything
> >> further I can investigate to gain more performance?
> >
>
>
