drill-user mailing list archives

From Oleksandr Kalinin <alexk...@gmail.com>
Subject Re: Drill Hangout tomorrow 08/21
Date Tue, 21 Aug 2018 16:04:04 GMT
Hi Volodymyr,

Recalling the recent discussions on the DEV list, it would be interesting
to see whether the following topics are addressed in the Drill metadata
management project:

1. Avoiding a repetition of Hive's mistakes (mainly reliance on an RDBMS)
To substantiate this point of view from practical experience, and given the
ambition to integrate and operate Drill in mission-critical environments,
the following aspects could be listed:
  - Need for DBA support if the cluster is subject to service level
objectives/agreements, which is somewhat remote from the Hadoop world. Strong
DBA skills are also needed if the resulting DB workload is challenging in
terms of performance tuning.
  - Common RDBMS setups offer an active-standby HA model. In secure
environments, e.g. environments subject to PCI-DSS compliance, that implies
frequent OS patching and reboots (in reality at least once every 30 days),
causing additional coordination effort and a service outage for the
duration of each failover.
  - Active-active HA clusters like Galera / Percona are free of the above
disadvantage, but require a specific skill set that is not widespread in the
DBA community. They also depend on even disk IO performance across the
cluster, which may require additional hardware adjustment and IO isolation.
  - Need for a backup / restore mechanism, which is probably the least of
the concerns

2. Bottleneck in the foreman when performing initial metadata collection
(and eventually pruning) on a large number of Parquet files
  - From the discussion on the mailing list, it was not fully clear whether
the metastore will address it
  - Or, from your point of view, should this discussion be continued outside
of the metastore initiative?

I hope it would be OK with you and Vitalii to share some thoughts on this.

Thanks & Best Regards,

On Mon, Aug 20, 2018 at 10:50 PM Volodymyr Vysotskyi <volodymyr@apache.org>
wrote:

> Hi all,
> Vitalii Diravka and I would like to give a presentation of our ideas
> for the Drill Metadata management project (DRILL-6552
> <https://issues.apache.org/jira/browse/DRILL-6552>).
> We will be happy to discuss it and choose the right way for further
> development.
> Kind regards,
> Volodymyr Vysotskyi
> On Mon, Aug 20, 2018 at 10:35 PM Hanumath Rao Maduri <hanu.ncr@gmail.com>
> wrote:
> > The Apache Drill Hangout will be held tomorrow at 10:00am PST; please let
> > us know should you have a topic for tomorrow's hangout. We will also ask
> > for topics at the beginning of the hangout.
> >
> > Hangout Link -
> > https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
> >
> > Regards,
> > Hanu
> >
