hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11270) Seek behavior difference between NativeS3FsInputStream and DFSInputStream
Date Fri, 01 Jul 2016 11:16:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358813#comment-15358813 ]

Steve Loughran commented on HADOOP-11270:

Usual S3x patch question: which S3 installation have you run the full hadoop-aws test suite
against? Jenkins doesn't test that bit, see.

Also: is there a seek test that we need? I've done a lot of extra work on seek tests on S3A,
and actually hoped that I'd fixed this issue there. If S3n still has it, then the other S3
and object store clients may still have it too. 

Could you see what you can add to {{AbstractContractSeekTest}} in branch-2 or trunk to reproduce
the problem before your patch goes in, and show that it goes away afterwards? And, if s3a, s3, swift
and azure have the issue, have their subclasses skip that test for now ... that'd be extra patches.
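
To make the request concrete, here is a minimal, self-contained sketch of the kind of check such a seek test might make. It does not use the real Hadoop classes: {{FakeStore}}, {{EagerSeekStream}}, {{LazySeekStream}} and the helper methods are all hypothetical names invented for illustration, and the fake store only mimics the ranged-GET behaviour described in the issue below. It assumes the desired outcome is that seeking to the file length succeeds and the following read returns -1.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class SeekToEofSketch {

    /** Hypothetical stand-in for an object store that serves ranged GETs. */
    static final class FakeStore {
        private final byte[] data;
        FakeStore(byte[] data) { this.data = data; }
        long length() { return data.length; }

        /** Mimics a ranged GET: a range starting at or past the object length is rejected. */
        byte[] rangedGet(long start) throws EOFException {
            if (start < 0 || start >= data.length) {
                throw new EOFException("Attempted to seek or read past the end of the file");
            }
            return Arrays.copyOfRange(data, (int) start, data.length);
        }
    }

    /** Issues the ranged GET inside seek(), so seek(length) fails immediately. */
    static final class EagerSeekStream {
        private final FakeStore store;
        EagerSeekStream(FakeStore store) { this.store = store; }

        void seek(long pos) throws IOException {
            store.rangedGet(pos); // the request is made even if no read ever follows
        }
    }

    /** Only records the position in seek(); the store is contacted on read(). */
    static final class LazySeekStream {
        private final FakeStore store;
        private long pos;
        private byte[] buf;     // bytes from bufStart to the end of the object
        private long bufStart;
        LazySeekStream(FakeStore store) { this.store = store; }

        void seek(long target) throws IOException {
            if (target < 0 || target > store.length()) {
                throw new EOFException("Cannot seek to " + target);
            }
            pos = target;
            buf = null; // drop the open range; it is reopened on the next read
        }

        int read() throws IOException {
            if (pos >= store.length()) {
                return -1; // at EOF: report end of stream without touching the store
            }
            if (buf == null) {
                buf = store.rangedGet(pos);
                bufStart = pos;
            }
            int b = buf[(int) (pos - bufStart)] & 0xff;
            pos++;
            return b;
        }
    }

    /** True if the eager stream raises EOFException when seeking to the file length. */
    static boolean eagerSeekToEofThrows(FakeStore store) {
        try {
            new EagerSeekStream(store).seek(store.length());
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Seeks the lazy stream to pos and returns the result of the next read(). */
    static int lazyReadAt(FakeStore store, long pos) {
        try {
            LazySeekStream in = new LazySeekStream(store);
            in.seek(pos);
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore(new byte[] {10, 20, 30});
        System.out.println("eager seek(len) throws EOFException: " + eagerSeekToEofThrows(store));
        System.out.println("lazy seek(len) then read(): " + lazyReadAt(store, store.length()));
    }
}
```

A real contract test would make the equivalent assertions against the filesystem under test rather than against these stand-ins.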

> Seek behavior difference between NativeS3FsInputStream and DFSInputStream
> -------------------------------------------------------------------------
>                 Key: HADOOP-11270
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11270
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.5.1
>            Reporter: Venkata Puneet Ravuri
>            Assignee: Venkata Puneet Ravuri
>         Attachments: HADOOP-11270.02.patch, HADOOP-11270.03.patch, HADOOP-11270.04.patch,
> There is a difference in behavior while seeking a given file present
> in S3 using NativeS3FileSystem$NativeS3FsInputStream and a file present in HDFS using
> If we seek to the end of the file in the case of NativeS3FsInputStream, it fails with the exception
> "java.io.EOFException: Attempted to seek or read past the end of the file". That is because
> a getObject request is issued on the S3 object with the range start set to the length of the file.
> This is the complete exception stack:-
> Caused by: java.io.EOFException: Attempted to seek or read past the end of the file
> at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
> at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
> at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
> at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:205)
> at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96)
> at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:67)
> at java.io.DataInputStream.skipBytes(DataInputStream.java:220)
> at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:739)
> at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1720)
> at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1898)
> at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:149)
> at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:44)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
> ... 15 more

This message was sent by Atlassian JIRA

