drill-user mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: CPU Resource Management
Date Tue, 04 Aug 2015 19:01:44 GMT
Does planner.width.max_per_node basically set the maximum number of CPU
cores it can use? So let's say I have a node with 20 physical cores (40
vcores), and I want my drill bit to use 20 of them: is it as simple as
planner.width.max_per_node=20?  I guess I am trying to figure out a way to
tell the bit, for all queries, that 20 is the max it can use, because
that's where I am going to line things up with Mesos. Additionally, I
think setting that at a "query" level is not good. I could have a
homogeneous cluster where a system-wide value of 20 would work, but what if
I have some drill bits that are set to 10 cores and others set to 20
because of the difference in node sizes?  That's where having a "bit level"
limit on the maximum CPU resources a bit can take could be advantageous,
especially considering a framework that may be able to spin nodes up and
down based on cluster resource management.
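
For anyone scripting this, here is a minimal Python sketch that only builds
the ALTER SYSTEM statement quoted further down in the thread. The function
name width_statement is made up for illustration and is not part of Drill.
As far as I know, ALTER SYSTEM options apply across the cluster rather than
to a single drill bit, so this alone would not give the per-bit limit asked
about above.

```python
def width_statement(max_threads: int) -> str:
    """Build the ALTER SYSTEM statement for planner.width.max_per_node.

    Hypothetical helper for illustration only; it formats text and does
    not talk to Drill.
    """
    if max_threads < 1:
        raise ValueError("max_threads must be a positive integer")
    return f"ALTER SYSTEM SET `planner.width.max_per_node` = {int(max_threads)}"


if __name__ == "__main__":
    # 20 matches the example above (20 of the node's 40 vcores).
    print(width_statement(20) + ";")
```

The printed statement could be saved to a script file and run through
sqlline with --run=<path/to/script file>, as suggested in the quoted reply.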

On Tue, Aug 4, 2015 at 11:54 AM, Andries Engelbrecht <
aengelbrecht@maprtech.com> wrote:

> It is probably best to control things more carefully when using more
> specialized environments such as Mesos, rather than relying on default
> install options.
> Since the CPU/execution threads in Drill are dynamic, you are probably
> better off just using
> alter system set `planner.width.max_per_node` = <thread count>
> to control the CPU utilization.
>
> Do keep in mind the suggestions by Jacques to take concurrency into
> account, etc when using the queue and width parameters.
>
> For scripting you can also use sqlline --run=<path/to/script file> to
> change the drill config for dynamic options on the fly.
>
> I have not tried multiple small drillbits, but that is unlikely to be
> optimal for resource usage, and management/configuration will be more
> challenging.
>
> --Andries
>
>
>
> > On Aug 4, 2015, at 9:30 AM, Timothy Chen <tnachen@gmail.com> wrote:
> >
> > Hi John,
> >
> > I think Drill will not detect the number of CPUs that it was limited
> > to by Mesos, since Mesos uses cgroup limits and doesn't really limit
> > the number of processors that it can run on.
> >
> > And yes, I think a custom per-node drill bit setting is required, which
> > is a perfect motivation to have a Drill Mesos framework that can
> > automatically set these configurations for you.
> >
> > Tim
> >
> >
> >
> > On Tue, Aug 4, 2015 at 8:23 AM, John Omernik <john@omernik.com> wrote:
> >> This is interesting, but also leads to more questions. :) I hope you
> >> don't mind.
> >>
> >> If I execute Drill using cgroups isolation with Marathon/Mesos, and tell
> >> a certain bit to use 4 CPU shares on an 8 CPU node, is Drill going to be
> >> aware that it's limited to 4 CPUs and plan accordingly, or will it use
> >> some sort of system call to determine the number of cores, not the
> >> number of cores/shares it has access to?  I could see that being an
> >> issue in the default calculation.
> >>
> >> So that leads me to the next question: if I am running Drill in a shared
> >> environment like this, to actually work with this I have to do a custom
> >> per_node setting per drill bit and have that line up with my cgroup
> >> resource allocation with Marathon/Mesos... correct?
> >>
> >> Are there any plans to make this more of a hard env variable that can be
> >> passed to the drill bit on start up?  That would seem to make the
> >> coordination a lot easier.  Any other options that may make sense?
> >>
> >> That leads me to another question: is it better to have one big drill
> >> bit per node for multiple users to work with, or smaller, say
> >> per-department, drill bits (but multiple of them) per node?  Just asking
> >> for planning purposes.
> >>
> >> Thanks for your help!!
> >>
> >> John
> >>
> >> On Tue, Aug 4, 2015 at 9:18 AM, Jacques Nadeau <jacques@dremio.com> wrote:
> >>
> >>> Internally, there are also some soft capabilities.  These include using
> >>> planner.width.max_per_node and queues:
> >>>
> >>>
> >>> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
> >>>
> >>> --
> >>> Jacques Nadeau
> >>> CTO and Co-Founder, Dremio
> >>>
> >>> On Tue, Aug 4, 2015 at 6:38 AM, John Omernik <john@omernik.com> wrote:
> >>>
> >>>> I am looking to work with Drill in a managed cluster (having it play
> >>>> nice with Mesos).  While I can limit the RAM in drill-env.sh, the CPU
> >>>> is not limitable; therefore, Drill can just grab all the CPU resources
> >>>> it wants.  Are there any plans to include some self-limiting of CPU
> >>>> resources in Drill?  In the docs it says to use cgroups, which I need
> >>>> to read up on, but frameworks like Spark and Impala allow you to set
> >>>> the CPU resources in the framework.  Is cgroups going to get me
> >>>> similar behavior to those?  Are there disadvantages to setting these
> >>>> resources in Drill itself?
> >>>>
> >>>> Thanks
> >>>>
> >>>> John
> >>>>
> >>>
>
>
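
On the point in the quoted thread that Drill does not see the cgroup limit:
below is a rough Python sketch of how a launcher script might turn a CFS CPU
quota into a thread count to feed into planner.width.max_per_node. The
function names and the cgroup v1 file paths are assumptions for
illustration, and the calculation only applies when a CFS quota is actually
set. With plain CPU shares there is no hard cap to read, so the host CPU
count is returned unchanged.

```python
import math
import os
from typing import Optional


def effective_cpus(quota_us: int, period_us: int, host_cpus: int) -> int:
    """Return the CPU count a CFS quota allows, capped at the host CPU count.

    A non-positive quota (cgroup v1 uses -1) means "no quota"; in that case
    the host CPU count is returned as-is.
    """
    if quota_us <= 0 or period_us <= 0:
        return host_cpus
    return max(1, min(host_cpus, math.ceil(quota_us / period_us)))


def read_int(path: str) -> Optional[int]:
    """Read a single integer from a file, or return None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # Assumed cgroup v1 paths; they vary by distribution and cgroup version.
    quota = read_int("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
    period = read_int("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
    host = os.cpu_count() or 1
    if quota is None or period is None:
        print(host)
    else:
        print(effective_cpus(quota, period, host))
```

For the case discussed in the thread (a quota worth 4 CPUs on an 8 CPU
node), effective_cpus(400000, 100000, 8) gives 4, which is the value one
would then line up with the width setting.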
