tez-dev mailing list archives

From "Travis Woodruff (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-3582) Exception swallowed in PipelinedSorter causing incorrect results
Date Wed, 18 Jan 2017 15:21:26 GMT
Travis Woodruff created TEZ-3582:
------------------------------------

             Summary: Exception swallowed in PipelinedSorter causing incorrect results
                 Key: TEZ-3582
                 URL: https://issues.apache.org/jira/browse/TEZ-3582
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.8.4
            Reporter: Travis Woodruff


I've run into a potentially serious issue with yarn-tez mapreduce.

We've recently moved from using classic mapreduce on hadoop 1.0.3 to using Tez, and a user
noticed a data inconsistency in some results calculated via yarn-tez.

On investigation, I've determined that an error occurred during key deserialization while
sorting. 

In this case, {{PipelinedSorter.SpanMerger.ready()}} caught the resulting {{ExecutionException}},
logged only its message (it should really log the stack trace as well), and returned false.
{{PipelinedSorter.spill()}} interpreted the returned false as an empty spill and continued
with no indication that an error had occurred. As a result, any data in the sort buffer
after the record that triggered the error was lost.
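
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the Tez source: the class and method names ({{SwallowedExceptionSketch}}, {{readySwallowing}}, {{readyPropagating}}, {{failedSort}}) are hypothetical, and a pre-failed {{CompletableFuture}} stands in for the background sort task. The first method maps the failure to false, which a caller cannot tell apart from "nothing to spill"; the second shows one possible alternative, assuming the caller can handle an {{IOException}}, that rethrows with the original cause attached.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from the Tez code base.
class SwallowedExceptionSketch {

    // Stand-in for a background sort that failed while deserializing a key.
    static Future<Boolean> failedSort() {
        CompletableFuture<Boolean> f = new CompletableFuture<>();
        f.completeExceptionally(new IllegalStateException("bad key bytes"));
        return f;
    }

    // The pattern described in the report: the failure is logged (message only)
    // and turned into "false", which looks the same as an empty result.
    static boolean readySwallowing(Future<Boolean> sortTask) throws InterruptedException {
        try {
            return sortTask.get();
        } catch (ExecutionException e) {
            System.err.println("sort failed: " + e.getMessage());
            return false;
        }
    }

    // One possible alternative: propagate the failure and keep the cause,
    // so the stack trace survives and the caller cannot continue silently.
    static boolean readyPropagating(Future<Boolean> sortTask)
            throws InterruptedException, IOException {
        try {
            return sortTask.get();
        } catch (ExecutionException e) {
            throw new IOException("sort failed", e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        // Swallowing variant: the caller only ever sees "false".
        System.out.println("swallowing returned " + readySwallowing(failedSort()));
        try {
            readyPropagating(failedSort());
        } catch (IOException e) {
            // Propagating variant: the original exception is available as the cause.
            System.out.println("propagated cause: " + e.getCause());
        }
    }
}
```

With the second variant, a caller in the position of {{spill()}} would fail the task instead of treating the span as empty, which is the behavior this report is asking for.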

I suspect that there may also be an error somewhere else in the sort code that is causing
buffer corruption (or index corruption), since we've been using this mapreduce code for years
and have never seen a deserialization error here; however, I can't confirm that there isn't
a subtle error on our side.

In any case, the fact that Tez is silently swallowing errors is a critical issue for us, as
we can't trust the results it produces.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
