tez-dev mailing list archives

From Kuhu Shukla <kshu...@yahoo-inc.com.INVALID>
Subject Re: DefaultSorter/OrderedWriter and Empty Partitions
Date Fri, 03 Feb 2017 20:19:08 GMT
Thank you, Siddharth, for your input. I have opened https://issues.apache.org/jira/browse/TEZ-3605 to
track this change.


    On Friday, February 3, 2017 12:52 PM, Siddharth Seth <sseth@apache.org> wrote:

It makes sense not to write out unnecessary data. There's also a case where DefaultSorter writes
out empty files when there's no output. We have to be careful about the configuration that enables
empty partitions via events.

On Thu, Feb 2, 2017 at 3:57 PM, Kuhu Shukla <kshukla@yahoo-inc.com.invalid> wrote:

Hello all,

In DefaultSorter, for the OrderedPartitioned case, is the inclusion of empty partition headers
and segments in the file.out.index and file.out needed?

In the Unordered case, we prune out the empty partitions in the writer and it might be worth
doing the same in the Ordered case unless this behavior was intentional. Would appreciate
some comments on this.

Thanks and Regards,
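
The pruning asked about above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the DefaultSorter or Tez writer code: the class EmptyPartitionPruning, its IndexEntry and Result types, and the prune method are all hypothetical names. It assumes each partition's segment length is known up front, that a zero length means "empty", and that consumers learn about empty partitions through a separate bitmap rather than through index records.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; none of these names come from the Tez code base.
final class EmptyPartitionPruning {

    /** One index record: which partition it is, where its segment starts, and its length. */
    static final class IndexEntry {
        final int partition;
        final long offset;
        final long length;

        IndexEntry(int partition, long offset, long length) {
            this.partition = partition;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Outcome of pruning: the surviving index records plus a bitmap of skipped partitions. */
    static final class Result {
        final List<IndexEntry> entries;
        final BitSet emptyPartitions;
        final long totalBytes;

        Result(List<IndexEntry> entries, BitSet emptyPartitions, long totalBytes) {
            this.entries = entries;
            this.emptyPartitions = emptyPartitions;
            this.totalBytes = totalBytes;
        }
    }

    /**
     * Builds index records only for partitions that hold data.
     * Assumption: a length of 0 marks an empty partition (no header, no segment).
     */
    static Result prune(long[] partitionLengths) {
        List<IndexEntry> entries = new ArrayList<>();
        BitSet empty = new BitSet(partitionLengths.length);
        long offset = 0;
        for (int p = 0; p < partitionLengths.length; p++) {
            long length = partitionLengths[p];
            if (length == 0) {
                // Record the partition as empty instead of writing a record for it.
                empty.set(p);
                continue;
            }
            entries.add(new IndexEntry(p, offset, length));
            offset += length;
        }
        return new Result(entries, empty, offset);
    }
}
```

For example, calling EmptyPartitionPruning.prune(new long[] {10, 0, 5, 0}) yields two index records (partitions 0 and 2) and a bitmap with bits 1 and 3 set. The bitmap is the part that matters for the caution raised in the reply: if empty partitions are dropped from the files, consumers need some other way to learn which partitions to skip.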
