drill-dev mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-5023) ExternalSortBatch does not spill fully, throws off spill calculations
Date Fri, 17 Feb 2017 22:22:44 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved DRILL-5023.
    Resolution: Fixed

> ExternalSortBatch does not spill fully, throws off spill calculations
> ---------------------------------------------------------------------
>                 Key: DRILL-5023
>                 URL: https://issues.apache.org/jira/browse/DRILL-5023
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
> The {{ExternalSortBatch}} (ESB) operator sorts records, spilling to disk as needed to
operate within a defined memory budget.
> When needed, ESB spills accumulated record batches to disk. However, when doing so, the
ESB carves off the first spillable batch and holds it in memory:
> {code}
>     // 1 output container is kept in memory, so we want to hold on to it and transferClone
>     // allows keeping ownership
>     VectorContainer c1 = VectorContainer.getTransferClone(outputContainer, oContext);
>     c1.buildSchema(BatchSchema.SelectionVectorMode.NONE);
>     c1.setRecordCount(count);
> ...
>     BatchGroup newGroup = new BatchGroup(c1, fs, outputFile, oContext);
> {code}
> When the spill batch size gets larger (to fix DRILL-5022), the result is that nothing
is spilled as the first spillable batch is simply stored back into memory on the (supposedly)
spilled batches list.
> The desired behavior is for all spillable batches to be written to disk. If the first
batch is held back to work around some issue (to keep a schema, say?), then find a different
solution that allows the actual data to spill.
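
The accounting problem described above can be illustrated with a small, simplified model. This is only a sketch under stated assumptions: the class SpillSketch, the method spill, and the flag holdFirstInMemory are hypothetical names invented for illustration and are not part of Drill, and sizes are in arbitrary units rather than real vector memory. It assumes the buffered batches are merged into output batches of at most the spill batch size, and that the first output batch is either held in memory (the reported behavior) or written to disk (the desired behavior).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the spill accounting described in this issue.
// None of these names come from Drill; sizes are in arbitrary units.
class SpillSketch {

    // Outcome of one spill pass.
    static final class SpillResult {
        final long onDisk;    // units actually written to the spill file
        final long inMemory;  // units still held in memory after the "spill"

        SpillResult(long onDisk, long inMemory) {
            this.onDisk = onDisk;
            this.inMemory = inMemory;
        }
    }

    // Merges the buffered batches into output batches of at most spillBatchSize
    // units. If holdFirstInMemory is true, the first output batch stays in memory
    // (the reported behavior); otherwise every output batch goes to disk.
    static SpillResult spill(List<Long> bufferedBatchSizes, long spillBatchSize,
                             boolean holdFirstInMemory) {
        if (spillBatchSize <= 0) {
            throw new IllegalArgumentException("spillBatchSize must be positive");
        }
        long remaining = 0;
        for (long size : bufferedBatchSizes) {
            remaining += size;
        }
        long onDisk = 0;
        long inMemory = 0;
        boolean first = true;
        while (remaining > 0) {
            long out = Math.min(spillBatchSize, remaining);
            if (first && holdFirstInMemory) {
                inMemory += out;   // held back: no memory is released
            } else {
                onDisk += out;     // truly spilled
            }
            first = false;
            remaining -= out;
        }
        return new SpillResult(onDisk, inMemory);
    }

    public static void main(String[] args) {
        List<Long> buffered = Arrays.asList(4L, 4L, 4L); // 12 units buffered

        // Spill batch size (16) exceeds the buffered total, so there is a single
        // output batch; holding it back means nothing reaches disk.
        SpillResult held = spill(buffered, 16L, true);
        System.out.println("held-back: disk=" + held.onDisk + " memory=" + held.inMemory);

        // Desired behavior: every spillable batch is written out.
        SpillResult full = spill(buffered, 16L, false);
        System.out.println("full spill: disk=" + full.onDisk + " memory=" + full.inMemory);
    }
}
```

In this model, with a small spill batch size the held-back batch is only a fraction of the data, so the discrepancy is easy to miss; once the spill batch size is at least as large as the buffered data, the single output batch is the held-back one and the spill frees no memory at all.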

This message was sent by Atlassian JIRA
