drill-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: query performance with unequal drillbits
Date Mon, 27 Aug 2018 22:33:11 GMT
Paul,

Thanks for the reality side of this. Configuring a system to handle unusual
setups can definitely be a challenge.

Btw, the general term for running several sub-scale workers on each node to
allow more flexibility is "micro-sharding".



On Mon, Aug 27, 2018 at 3:24 PM Paul Rogers <par0328@yahoo.com.invalid>
wrote:

> Hi All,
>
> For those following along who have not tried Ted's idea (running multiple
> Drillbits per host), note that when running two or more Drillbits per node,
> the admin is responsible for choosing non-conflicting port numbers.
>
> The port numbers are configured in drill-override.conf. See
> drill-override-example.conf for more info. By default, drill-override.conf
> is in $DRILL_HOME/conf, which would seem to imply that you must create a
> separate copy of the Drill distro for each Drillbit on each node. You'd
> then start Drill by pointing to the Drillbit-specific distro:
>
> $DRILL_HOME1/bin/drillbit.sh start
>
> For Drillbits 1, 2, 3...
>
> An alternative is to use the site directory feature. You still need a
> separate site directory per Drillbit, but they can share the Drill distro.
>
> $DRILL_HOME/bin/drillbit.sh start --site $DRILL_SITE1
>
> For a common $DRILL_HOME but separate sites for 1, 2, 3...
>
> Yet another approach is to pass the ports on the command line. The config
> system is supposed to allow this. I've not personally tested this, so
> caveat emptor:
>
> $DRILL_HOME/bin/drillbit.sh start -Ddrill.exec.rpc.user.server.port=31110
>
> You could wrap the above in a script so you can share both the Drill
> distro and config across Drillbits.
>
> Thanks,
> - Paul
>
>
>
> On Monday, August 27, 2018, 6:17:11 AM PDT, John Omernik <john@omernik.com> wrote:
>
>  I will +1 Ted's idea. By doing small drillbits, it does take a bit more
> overhead, but you also have an ability to scale your Drill cluster size
> (especially using the Drillbit shutdown features added recently).
>
>
>
> On Wed, Aug 22, 2018 at 8:23 PM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
>
> > Cool
> >
> > On Wed, Aug 22, 2018, 17:07 scott <tcots8888@gmail.com> wrote:
> >
> > > Thanks Ted and Paul. I've been experimenting with the "hack" method. It
> > > works somewhat, and I guess will have to do.
> > >
> > > > On Tue, Aug 21, 2018 at 2:50 PM Ted Dunning <ted.dunning@gmail.com>
> > > > wrote:
> > >
> > > > > A cheap hack is to use multiple smaller drillbits. Put more drillbits
> > > > > on the hefty machines and fewer on the weaker ones.
> > > >
> > > > This increases overheads, but it might help you out.
> > > >
> > > >
> > > >
> > > > On Tue, Aug 21, 2018 at 1:48 PM scott <tcots8888@gmail.com> wrote:
> > > >
> > > > > Hi community,
> > > > > > I am trying to find a way to tune Drill so that weaker drillbits get
> > > > > > less data to work on so that the weak link doesn't drag my performance
> > > > > > down. I have drillbits running on a variety of hardware and sometimes
> > > > > > these shared resources get really slow. It seems that the query plan
> > > > > > always evenly divides the data fragments so that each drillbit gets the
> > > > > > same data to chew on. How do I make it give weaker drillbits less data?
> > > > > >
> > > > > > Alternatively, is there a way to limit and queue fragments of the query
> > > > > > and leave them unassigned, then assign to drillbits as their resources
> > > > > > become free, similar to MapReduce?
> > > > > >
> > > > > > Thanks for your time,
> > > > > > Scott
> > > > > Scott
> > > > >
> > > >
> > >
> >
>
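
The wrapper script Paul mentions for the command-line approach could be sketched
roughly as below. This is a dry run only: it prints one start command per Drillbit
instead of launching anything. Assumptions, none of which come from the thread:
the /opt/drill fallback path, the stride of 100 between instances (chosen so that
instance 1 lands on the 31110 port used in Paul's example), the helper names
ports_for and start_command, and the base values 31010, 31011 and 8047, which are
the usual Drill defaults for the user, bit and HTTP ports. As Paul says, passing
ports with -D has not been tested, so treat this as a starting point.

```shell
#!/bin/sh
# Dry-run sketch: print one start command per Drillbit on this node.
# Nothing is launched; copy the printed commands once they look right.

DRILL_HOME="${DRILL_HOME:-/opt/drill}"   # hypothetical install path
STRIDE=100                               # hypothetical port gap between instances

# Echo "user_port bit_port http_port" for instance number $1 (0-based).
# Base values are assumed to be the stock Drill defaults.
ports_for() {
  i="$1"
  echo "$((31010 + STRIDE * i)) $((31011 + STRIDE * i)) $((8047 + i))"
}

# Echo the start command for instance number $1 without running it.
start_command() {
  # Split the three ports into $1 $2 $3.
  set -- $(ports_for "$1")
  echo "$DRILL_HOME/bin/drillbit.sh start" \
    "-Ddrill.exec.rpc.user.server.port=$1" \
    "-Ddrill.exec.rpc.bit.server.port=$2" \
    "-Ddrill.exec.http.port=$3"
}

COUNT="${1:-2}"   # number of Drillbits wanted on this node
n=0
while [ "$n" -lt "$COUNT" ]; do
  start_command "$n"
  n=$((n + 1))
done
```

Running it with an argument of 4 on a hefty machine and 1 on a weak one gives the
uneven layout Ted describes. Drillbits sharing one $DRILL_HOME will probably also
need separate log and pid locations, which this sketch does not handle.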
