drill-dev mailing list archives

From Paul Rogers <prog...@mapr.com>
Subject "Batch size control" project update
Date Wed, 08 Nov 2017 22:08:29 GMT
Hi All,

The “batch size control” project seeks to further improve Drill’s internal memory management
by controlling the size of batches created in Drill’s scan and “combining” operators.
We’ve provided updates in the Hangouts. This note is for folks who may have missed those
discussions.

Drill currently faces two pesky problems with certain data sets:

* Readers consume a fixed number of records (often 1K, 4K, or 64K). If an individual column
happens to be “wide”, then Drill may allocate a vector larger than 16 MB. Because of how
Drill uses the Netty memory allocator, such “jumbo” vectors can lead to direct memory
fragmentation and premature OOM (out-of-memory) errors. (See the sizing sketch after this list.)

* Even if no individual column is wide, the overall row may be, resulting in record batches
too large for the new memory-limited sort and hash agg operators to consume.
(We’ve seen 100 MB batches sent to a sort limited to 30 MB of memory.) This leads to
an OOM error because the downstream operator exceeds the memory limit set by the planner.
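
As a concrete illustration of the first problem, consider the arithmetic below. The row
count and column width are hypothetical, chosen only to show how an ordinary-looking
column can overflow the 16 MB limit:

    // Hypothetical sizing example; the numbers are illustrative, not
    // taken from any particular data set.
    public class VectorSizing {
      public static void main(String[] args) {
        int rowCount = 64 * 1024;   // 65,536 records per batch
        int avgWidth = 300;         // average bytes per VARCHAR value
        long dataBytes = (long) rowCount * avgWidth;
        System.out.println(dataBytes);                       // 19,660,800 bytes
        System.out.println(dataBytes / (double) (1 << 20));  // 18.75 MB
      }
    }

At 18.75 MB, this single column exceeds the 16 MB limit and becomes a “jumbo”
vector from the perspective of the Netty allocator.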

The same problems can arise in “combining” operators such as flatten, project, and join.
These operators take incoming batches and produce new, possibly larger, batches, and so
can cause the same memory issues described above for readers.

The goal of the “batch size control” project is to limit individual vectors to no more
than 16 MB, and to limit overall batch size to some configured maximum. In practice, the maximum
batch size will eventually be set by the Drill planner as part of its memory allocation planning.

The design is inspired by the set of “complex (vector) writers” that already exist in
Drill. The existing vector writers (and thus the new set as well) are modeled after JSON structure
builders. They write scalars, tuples and arrays. (“Tuple” is a fancy name for what Hive
calls a “struct” and Drill calls a “map”. Both tuples and arrays occur in JSON-like
data sources.)
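
To give a feel for the JSON-builder writing style, here is a rough sketch. The class and
method names below are simplified for illustration, not the exact Drill interfaces; see
PR #914 for the real ones.

    // Illustrative only: names are simplified, not the exact Drill API.
    RowSetLoader writer = loader.writer();
    writer.start();                             // begin a row (a tuple)
    writer.scalar("name").setString("fred");    // scalar column
    TupleWriter addr = writer.tuple("address"); // nested tuple (Drill map)
    addr.scalar("city").setString("Boston");
    ArrayWriter tags = writer.array("tags");    // array column
    tags.scalar().setString("a");               // append array elements
    tags.scalar().setString("b");
    writer.save();                              // finish the row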

Unlike the existing set, the new set also monitors memory use and takes corrective action
when a limit is reached. The key challenge in limiting size is that we don’t know ahead of
time the size of the data we wish to write to a vector, nor can most readers “backtrack”
and rewrite a record if we discover that one of its values causes a vector to overflow.

The key contribution of the new vector writers is to monitor vector and batch sizes. They
employ an algorithm to handle “overflow” when a value is discovered to be too large for
the remaining space. The vector writers roll the current row over to a new batch, allow the
reader to continue writing the current record, then tell the reader that the batch is full
and should be sent downstream. When the reader starts the next batch, that batch will
already contain the “overflow” row that ended the prior batch.
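
A sketch of how a reader might drive this mechanism appears below. The names here
(including the helpers hasNextRecord, readRecordInto, and sendDownstream) are
illustrative, not the actual API; the point is that the reader simply writes rows and
asks the writer when to stop.

    // Illustrative reader loop; names are simplified. The writer
    // tracks vector and batch sizes internally, so the reader never
    // counts bytes itself.
    while (hasNextRecord()) {
      writer.start();               // begin the next row
      readRecordInto(writer);       // write all column values; on
                                    // overflow, the writer quietly
                                    // rolls this row into a new batch
      writer.save();                // commit the row
      if (writer.isFull()) {        // batch reached its size limit
        sendDownstream(harvest());  // ship the full batch; the next
        break;                      // batch already holds the
      }                             // rolled-over "overflow" row
    }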

By using the new mechanism, readers and other operators gain control over their batch sizes
without the need for each reader to implement its own solution.

There is much more to the solution (such as consolidating projection handling in the scan
operator). These topics can be the subject of future updates.

For more information, see:

* DRILL-5211: https://issues.apache.org/jira/browse/DRILL-5211
* Drill PR #914: https://github.com/apache/drill/pull/914

We welcome your comments, questions, and suggestions. Please post them in the JIRA ticket
or PR above, or to this dev list.

Thanks,

- Paul

