drill-user mailing list archives

From Andries Engelbrecht <aengelbre...@maprtech.com>
Subject Re: CPU Resource Management
Date Tue, 04 Aug 2015 16:54:10 GMT
It is probably best to control things more carefully when using more specialized environments
such as Mesos, rather than relying on default install options.
Since the number of CPU/execution threads in Drill is dynamic, you are probably better off just using
alter system set `planner.width.max_per_node` = <thread count>
to control the CPU utilization.

Do keep in mind the suggestions by Jacques to take concurrency into account, etc., when using
the queue and width parameters.

For scripting you can also use sqlline --run=<path/to/script file> to change the Drill
config for dynamic options on the fly.
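
As a rough sketch of what such a script file might contain (the file name set_width.sql and the
value 4 are illustrative assumptions, not recommendations; pick a value that matches the CPU
allocation given to the drillbit):

```sql
-- set_width.sql (hypothetical file name), passed to sqlline via --run=<path/to/script file>
-- Cap the per-node degree of parallelism; 4 is an assumed example value.
ALTER SYSTEM SET `planner.width.max_per_node` = 4;
-- Check the value that is now in effect.
SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';
```

Note that ALTER SYSTEM changes the option for the whole cluster; ALTER SESSION SET with the same
option name can be used to try a value for a single session first.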

I have not tried multiple small drillbits, but that will likely not be optimal for resource utilization,
and management/configuration will be more challenging.

—Andries



> On Aug 4, 2015, at 9:30 AM, Timothy Chen <tnachen@gmail.com> wrote:
> 
> Hi John,
> 
> I think Drill will not detect the number of cpus that it was limited
> to by Mesos, since Mesos uses cgroup limits and doesn't really limit
> the number of processors that it can run on.
> 
> And yes I think a custom per node drill bit setting is required, which
> is a perfect motivation to have a Drill Mesos Framework that can
> automatically set these configuration for you.
> 
> Tim
> 
> 
> 
> On Tue, Aug 4, 2015 at 8:23 AM, John Omernik <john@omernik.com> wrote:
>> This is interesting, but also leads to more questions. :) I hope you don't
>> mind.
>> 
>> If I execute Drill using cgroups isolation with Marathon/Mesos, and tell a
>> certain bit to use 4 CPU shares on a 8 CPU node, Is drill going to be aware
>> that it's limited to 4 CPUS and plan accordingly, or will use some sort of
>> system call to determine the number of cores, not the number of
>> cores/shares it has access to?  I could see that being an issue in the
>> default calculation.
>> 
>> So that leads me to the next question, if I am running Drill in a shared
>> environment like this, to actually work with this, I have to do a custom
>> per_node setting per drill bit and have that line up with my cgroup
>> resource allocation with Marathon Mesos... correct?
>> 
>> Is there any plans to making this more of a hard env variable that can be
>> passed to the drill bit on start up?  This seems to make the coordination a
>> lot easier.  Any other options that may make sense?
>> 
>> That leads me to another question?  Is it better to have one big drill bit
>> per node for multiple users to work with, or smaller, say per department
>> drill bits (but multiple of them) per node.   Just looking for planning
>> purposes.
>> 
>> Thanks for your help!!
>> 
>> John
>> 
>> On Tue, Aug 4, 2015 at 9:18 AM, Jacques Nadeau <jacques@dremio.com> wrote:
>> 
>>> Internally, there are also some soft capabilities.  These include using
>>> planner.width.max_per_node and queues:
>>> 
>>> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Tue, Aug 4, 2015 at 6:38 AM, John Omernik <john@omernik.com> wrote:
>>> 
>>>> I am looking to work with drill in a managed cluster (having it play nice
>>>> with Mesos).  While I can limit the ram in the drill-env.sh, the CPU is
>>> not
>>>> limitable, therefore, drill can just grab all the CPU resources it wants.
>>>> Is there any plans to include some self limiting to Drill on CPU
>>> resources?
>>>> In the docs it says use CGroups, which I need to read up on, but
>>> frameworks
>>>> like Spark and Impala allow you to set the CPU resources in the
>>> framework.
>>>> Is CGroups going to get me similar behavior to those? Are there
>>>> disadvantages to setting these resources in drill itself?
>>>> 
>>>> Thanks
>>>> 
>>>> John
>>>> 
>>> 

