drill-user mailing list archives

From Andries Engelbrecht <aengelbre...@maprtech.com>
Subject Re: CPU Resource Management
Date Tue, 04 Aug 2015 20:24:07 GMT
It is a bit more involved than just setting the one parameter.

See the link Jacques posted earlier for a better explanation.
https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
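
As a rough illustration of why more than one parameter is involved: planner.width.max_per_node caps parallelism per query on each node, so concurrent queries multiply it, which is where the queue options come in. Below is a minimal Python sketch that generates the relevant ALTER SYSTEM statements. The function name drill_cpu_options and the queue sizes are hypothetical placeholders, not recommendations; the option names are the ones described on the docs page linked above.

```python
def drill_cpu_options(max_threads_per_node, enable_queues=True,
                      large_queue_size=2, small_queue_size=10):
    """Build ALTER SYSTEM statements for a per-node thread budget.

    The queue sizes are illustrative assumptions; tune them to the
    expected number of concurrent queries.
    """
    if max_threads_per_node < 1:
        raise ValueError("max_threads_per_node must be >= 1")

    # Per-query, per-node parallelism cap.
    options = [("planner.width.max_per_node", str(max_threads_per_node))]

    if enable_queues:
        # Queues limit how many queries run at once, which bounds the
        # total thread count (roughly width x concurrent queries).
        options += [
            ("exec.queue.enable", "true"),
            ("exec.queue.large", str(large_queue_size)),
            ("exec.queue.small", str(small_queue_size)),
        ]

    return ["ALTER SYSTEM SET `{}` = {};".format(name, value)
            for name, value in options]


if __name__ == "__main__":
    # Print the statements so they can be redirected into a script file.
    print("\n".join(drill_cpu_options(20)))
```

The printed statements can be saved to a file and run with sqlline's --run option.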

—Andries

> On Aug 4, 2015, at 12:01 PM, John Omernik <john@omernik.com> wrote:
> 
> Does planner.width.max_per_node basically set the max CPU cores it can
> use? So let's say I have a node with 20 physical cores (40 vcores), and I
> want my drill bit to use 20 of them. Is it as simple as
> planner.width.max_per_node=20? I guess I am trying to figure out a way to
> tell the bit, for all queries, that 20 is the max it can use,
> because that's where I am going to line things up with Mesos. Additionally, I
> think setting that at a "query" level is not good. I could have a
> homogeneous cluster, where a system-wide value of 20 would work, but what if I
> have some drill bits that are set to 10 cores and others set to 20
> because of the difference in sizes? That's where having a "bit level"
> limit on the maximum CPU resources a bit can take could be advantageous,
> especially considering a framework that may be able to spin up and spin
> down nodes based on cluster resource management.
> 
> On Tue, Aug 4, 2015 at 11:54 AM, Andries Engelbrecht <
> aengelbrecht@maprtech.com> wrote:
> 
>> It is probably best to control things more carefully when using more
>> specialized environments such as Mesos, rather than relying on default
>> install options.
>> Since the CPU/execution threads in Drill are dynamic, you are probably
>> better off just using
>> alter system set `planner.width.max_per_node` = <thread count>
>> to control the CPU utilization.
>> 
>> Do keep in mind the suggestions by Jacques to take concurrency into
>> account, etc when using the queue and width parameters.
>> 
>> For scripting you can also use sqlline --run=<path/to/script file> to
>> change the Drill config for dynamic options on the fly.
>> 
>> I have not tried multiple small drillbits, but that will likely not be
>> optimal for resource utilization, and management/configuration will be
>> more challenging.
>> 
>> —Andries
>> 
>> 
>> 
>>> On Aug 4, 2015, at 9:30 AM, Timothy Chen <tnachen@gmail.com> wrote:
>>> 
>>> Hi John,
>>> 
>>> I think Drill will not detect the number of cpus that it was limited
>>> to by Mesos, since Mesos uses cgroup limits and doesn't really limit
>>> the number of processors that it can run on.
>>> 
>>> And yes I think a custom per node drill bit setting is required, which
>>> is a perfect motivation to have a Drill Mesos Framework that can
>>> automatically set this configuration for you.
>>> 
>>> Tim
>>> 
>>> 
>>> 
>>> On Tue, Aug 4, 2015 at 8:23 AM, John Omernik <john@omernik.com> wrote:
>>>> This is interesting, but also leads to more questions. :) I hope you
>>>> don't mind.
>>>> 
>>>> If I execute Drill using cgroups isolation with Marathon/Mesos, and tell a
>>>> certain bit to use 4 CPU shares on an 8 CPU node, is Drill going to be aware
>>>> that it's limited to 4 CPUs and plan accordingly, or will it use some sort of
>>>> system call to determine the number of cores, not the number of
>>>> cores/shares it has access to?  I could see that being an issue in the
>>>> default calculation.
>>>> 
>>>> So that leads me to the next question: if I am running Drill in a shared
>>>> environment like this, to actually work with this, I have to do a custom
>>>> per_node setting per drill bit and have that line up with my cgroup
>>>> resource allocation with Marathon/Mesos... correct?
>>>> 
>>>> Are there any plans to make this more of a hard env variable that can be
>>>> passed to the drill bit on startup?  That would seem to make the
>>>> coordination a lot easier.  Any other options that may make sense?
>>>> 
>>>> That leads me to another question: is it better to have one big drill bit
>>>> per node for multiple users to work with, or smaller, say per-department,
>>>> drill bits (but multiple of them) per node?  Just asking for planning
>>>> purposes.
>>>> 
>>>> Thanks for your help!!
>>>> 
>>>> John
>>>> 
>>>> On Tue, Aug 4, 2015 at 9:18 AM, Jacques Nadeau <jacques@dremio.com> wrote:
>>>> 
>>>>> Internally, there are also some soft capabilities.  These include using
>>>>> planner.width.max_per_node and queues:
>>>>> 
>>>>> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
>>>>> 
>>>>> --
>>>>> Jacques Nadeau
>>>>> CTO and Co-Founder, Dremio
>>>>> 
>>>>> On Tue, Aug 4, 2015 at 6:38 AM, John Omernik <john@omernik.com> wrote:
>>>>> 
>>>>>> I am looking to work with Drill in a managed cluster (having it play nice
>>>>>> with Mesos).  While I can limit the RAM in drill-env.sh, the CPU is not
>>>>>> limitable; therefore, Drill can just grab all the CPU resources it wants.
>>>>>> Are there any plans to include some self-limiting in Drill for CPU
>>>>>> resources? In the docs it says to use cgroups, which I need to read up
>>>>>> on, but frameworks like Spark and Impala allow you to set the CPU
>>>>>> resources in the framework. Are cgroups going to get me similar behavior
>>>>>> to those? Are there disadvantages to setting these resources in Drill
>>>>>> itself?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> John
>>>>>> 
>>>>> 
>> 
>> 
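
On the question in the thread of whether a process sees the host core count or its cgroup allocation: the two can be compared directly. The following is a minimal Python sketch, assuming a cgroup v1 layout mounted at /sys/fs/cgroup/cpu and a CFS quota enforced by the resource manager (with CPU shares alone there is no hard cap to read). The helper names effective_cpus and read_int are hypothetical, and this is not how Drill itself computes its defaults.

```python
import os


def effective_cpus(quota_us, period_us, host_cpus):
    """Whole CPUs usable under a CFS quota.

    Falls back to the host count when no quota is set (cgroup v1
    reports -1 for an unlimited quota) or the values are unavailable.
    """
    if quota_us is None or period_us is None:
        return host_cpus
    if quota_us <= 0 or period_us <= 0:
        return host_cpus
    # Round down, but never report fewer than one CPU or more than the host has.
    return max(1, min(host_cpus, quota_us // period_us))


def read_int(path):
    """Read a single integer from a file, or return None if not possible."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    base = "/sys/fs/cgroup/cpu"  # assumed cgroup v1 mount point
    quota = read_int(os.path.join(base, "cpu.cfs_quota_us"))
    period = read_int(os.path.join(base, "cpu.cfs_period_us"))
    host = os.cpu_count() or 1
    print("host cpus:", host)
    print("effective cpus:", effective_cpus(quota, period, host))
```

A value derived this way could be fed into planner.width.max_per_node when the drillbit is launched, which is roughly what a per-node setting tied to the cgroup allocation would amount to.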

