sqoop-dev mailing list archives

From "Boglarka Egyed (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SQOOP-3243) Importing LOB data causes "Stream closed" error on encrypted HDFS
Date Tue, 24 Oct 2017 10:05:00 GMT
Boglarka Egyed created SQOOP-3243:

             Summary: Importing LOB data causes "Stream closed" error on encrypted HDFS
                 Key: SQOOP-3243
                 URL: https://issues.apache.org/jira/browse/SQOOP-3243
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.6
            Reporter: Boglarka Egyed

Importing LOB data into an encrypted zone causes a "Stream closed" error:

17/10/12 07:16:04 INFO mapreduce.Job: Running job: job_1507777811520_5091
17/10/12 07:16:13 INFO mapreduce.Job: Job job_1507777811520_5091 running in uber mode : false
17/10/12 07:16:13 INFO mapreduce.Job: map 0% reduce 0%
17/10/12 07:22:37 INFO mapreduce.Job: Task Id : attempt_1507777811520_5091_m_000000_0, Status
Error: java.io.IOException: Stream closed
at org.apache.hadoop.crypto.CryptoOutputStream.checkStream(CryptoOutputStream.java:268)
at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:255)
at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:141)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at org.apache.commons.io.output.ProxyOutputStream.close(ProxyOutputStream.java:117)
at org.apache.sqoop.io.LobFile$V0Writer.close(LobFile.java:1669)
at org.apache.sqoop.lib.LargeObjectLoader.close(LargeObjectLoader.java:96)
at org.apache.sqoop.mapreduce.AvroImportMapper.cleanup(AvroImportMapper.java:79)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The root cause of this issue seems to be in the LobFile close method (LobFile$V0Writer.close in
the stack trace), which is invoked from the map task's cleanup. At line 1669, per the stack
trace, it tries to close the countingOut output stream, but the out output stream has already
been closed at line 1664. Because out is just a wrapper around countingOut, both ultimately
point to the same CryptoOutputStream instance, so by the time the call reaches line 1669 that
instance has already been closed by line 1664. The error occurs because, when it is closed,
java.io.BufferedOutputStream calls flush on the underlying stream it wraps (in this case,
CryptoOutputStream), reaching line 255 of CryptoOutputStream, where the check for a closed
stream (checkStream, line 268) throws the IOException.
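
The double-close pattern described above can be reproduced in isolation. The following is only a
minimal sketch and is not Sqoop or Hadoop code: StrictStream, FlushingWrapper and DoubleCloseDemo
are hypothetical names. StrictStream stands in for a stream that rejects flush once closed (as
CryptoOutputStream does in the stack trace), and FlushingWrapper stands in for a wrapper that
flushes its delegate on every close call without an "already closed" guard. Closing only the
outermost wrapper is shown as one possible way to avoid the error; it is not necessarily the fix
applied in Sqoop.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for a stream that rejects flush() once closed,
// as CryptoOutputStream does in the stack trace.
class StrictStream extends OutputStream {
    private boolean closed = false;

    @Override
    public void write(int b) throws IOException {
        checkStream();
    }

    @Override
    public void flush() throws IOException {
        checkStream();
    }

    @Override
    public void close() {
        closed = true;
    }

    private void checkStream() throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
    }
}

// Hypothetical wrapper that flushes and then closes its delegate on every
// close() call, with no "already closed" guard.
class FlushingWrapper extends OutputStream {
    private final OutputStream delegate;

    FlushingWrapper(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        delegate.write(b);
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        delegate.flush();
        delegate.close();
    }
}

class DoubleCloseDemo {
    // Closes the outer wrapper, then the inner one.
    // Returns the error message, or null if both closes succeed.
    static String closeOuterThenInner() {
        OutputStream countingOut = new FlushingWrapper(new StrictStream());
        OutputStream out = new FlushingWrapper(countingOut);
        try {
            out.close();         // also closes countingOut and the StrictStream
            countingOut.close(); // flushes the already-closed StrictStream
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    // Closes only the outermost wrapper, which closes the whole chain once.
    static String closeOuterOnly() {
        OutputStream countingOut = new FlushingWrapper(new StrictStream());
        OutputStream out = new FlushingWrapper(countingOut);
        try {
            out.close();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(closeOuterThenInner());
        System.out.println(closeOuterOnly());
    }
}
```

In this sketch, closeOuterThenInner returns "Stream closed" because the second close flushes a
stream that is already closed, while closeOuterOnly returns null.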

