hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
Date Wed, 10 Jul 2013 15:07:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704636#comment-13704636
] 

Steve Loughran commented on HADOOP-9361:
----------------------------------------

The latest patch now has tests for: create, open, delete, mkdir and seek. I'm ignoring the
rename tests as I need to fully understand what HADOOP-6240 has defined first.

h3. seek
# I've been through the code and changed every place where a negative seek was either ignored
or raised an {{IOException}}, so that it now raises an {{EOFException}}. This
included changes to {{ChecksumFileSystem}}, {{RawLocalFileSystem}}, {{BufferedFSInputStream}}
(which also handles a null inner stream without NPEing), and {{FSInputChecker.java}}.
# Pulled in the test from HADOOP-9307 to do many random seeks and reads; the number of seeks is
configurable, so that remote blobstore tests don't take forever unless you want them to (or
are running them in-cluster).
# Some filesystems let you seek on a closed stream. I've fixed the NPE in {{BufferedFSInputStream}},
but I'm not sure it is worth the effort of fixing this everywhere.
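
A minimal sketch of the seek behaviour described above, written against a hypothetical in-memory stream rather than any of the Hadoop classes named in the list. The class name {{SketchSeekableStream}} is made up for illustration, and the rule for seeking past the end of the data is an assumption of the sketch, not something the patch is known to enforce.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical in-memory stream; illustrative only, not a Hadoop class. */
class SketchSeekableStream {
    private byte[] data;   // set to null once the stream is closed
    private long pos;

    SketchSeekableStream(byte[] data) {
        this.data = data;
    }

    /** Move the read position; a negative offset always raises EOFException. */
    void seek(long newPos) throws IOException {
        if (data == null) {
            // Closed stream: fail with a clear IOException instead of an NPE.
            throw new IOException("Stream is closed");
        }
        if (newPos < 0) {
            throw new EOFException("Cannot seek to a negative offset: " + newPos);
        }
        if (newPos > data.length) {
            // Assumption of this sketch: seeking past the end is also an EOF.
            throw new EOFException("Cannot seek past the end: " + newPos);
        }
        pos = newPos;
    }

    long getPos() {
        return pos;
    }

    /** Read one byte, or return -1 at the end of the data. */
    int read() throws IOException {
        if (data == null) {
            throw new IOException("Stream is closed");
        }
        return pos < data.length ? (data[(int) pos++] & 0xff) : -1;
    }

    void close() {
        data = null;
    }

    public static void main(String[] args) throws IOException {
        SketchSeekableStream in = new SketchSeekableStream(new byte[] {1, 2, 3});
        try {
            in.seek(-1);
        } catch (EOFException expected) {
            System.out.println("negative seek rejected: " + expected.getMessage());
        }
    }
}
```

Because {{EOFException}} is a subclass of {{IOException}}, callers that already catch {{IOException}} keep working, while tests can assert on the narrower type.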

h3. NativeS3 issues/changes
* {{Jets3tNativeFileSystemStore}} converts the relevant S3 error code {{"InvalidRange"}} into
an EOFException
* Amazon S3 rejects a seek(0) in a zero-byte file; not fixed yet as you need to know the file
length to do it up front. Maybe an EOFException on a seek could be downgraded to a no-op if
the seek offset is 0.
* throws a {{FileAlreadyExistsException}} if trying to create a file over an existing one,
and {{!overwrite}}
* I'm deliberately skipping the test where we expect creating a file over a dir to fail even
if overwrite is true, because blobstores use 0-byte files as a pretend directory. 
* It's failing a test which overwrites a directory that has children. This could
be picked up (look for children if overwriting a 0-byte file).
* It fails a test that a newly created file exists while the write is still in progress; as
the blobstores only write the data when the file is closed, it doesn't. This is potentially a
race condition; we could create a marker file here and overwrite it on the close.
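
The "look for children if overwriting a 0-byte file" idea could be sketched roughly as below. This is an in-memory model, not the Jets3t store: {{SketchBlobStore}} and its methods are hypothetical, and {{java.nio.file.FileAlreadyExistsException}} stands in for Hadoop's own exception so that the example has no dependencies.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.TreeMap;

/** Hypothetical in-memory blobstore model; illustrative only. */
class SketchBlobStore {
    // Sorted keys make the prefix lookup in hasChildren() a single call.
    private final TreeMap<String, byte[]> blobs = new TreeMap<>();

    void put(String key, byte[] data) {
        blobs.put(key, data);
    }

    /** True if at least one blob key starts with key + "/". */
    boolean hasChildren(String key) {
        String prefix = key + "/";
        String next = blobs.ceilingKey(prefix);
        return next != null && next.startsWith(prefix);
    }

    /** Create a blob, refusing to replace a 0-byte marker that has children. */
    void create(String key, byte[] data, boolean overwrite) throws IOException {
        byte[] existing = blobs.get(key);
        if (existing != null) {
            if (!overwrite) {
                throw new FileAlreadyExistsException(key);
            }
            if (existing.length == 0 && hasChildren(key)) {
                // The 0-byte blob is acting as a non-empty pretend directory.
                throw new IOException("Cannot overwrite non-empty directory: " + key);
            }
        }
        blobs.put(key, data);
    }

    public static void main(String[] args) throws IOException {
        SketchBlobStore store = new SketchBlobStore();
        store.put("dir", new byte[0]);
        store.put("dir/child", new byte[] {1});
        try {
            store.create("dir", new byte[] {2}, true);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch still lets an empty 0-byte marker be overwritten, which matches the reason given above for skipping the file-over-directory test on blobstores. Against a real store the prefix lookup is an extra remote listing call, so the check is not free.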

h3. FTP
I'll cover that in HADOOP-9712 as it's mostly bugs in a niche FS.

h3. LocalFS

* throws {{FileNotFoundException}} when attempting to create a dir where the destination or
a parent is a directory. This happens inside the JDK and has to be a WONTFIX, unless it is
caught and wrapped.
{code}
testOverwriteNonEmptyDirectory(org.apache.hadoop.fs.contract.localfs.TestLocalCreateContract)
 Time elapsed: 38 sec  <<< ERROR!
java.io.FileNotFoundException: /Users/stevel/Projects/hadoop-trunk/hadoop-common-project/hadoop-common/target/test/data/testOverwriteNonEmptyDirectory
(File exists)
	at java.io.FileOutputStream.open(Native Method)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:227)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:223)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:286)
	at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:273)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:384)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
	at org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset(ContractTestUtils.java:130)
	at org.apache.hadoop.fs.contract.AbstractCreateContractTest.testOverwriteNonEmptyDirectory(AbstractCreateContractTest.java:115)
{code}
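
If the "caught and wrapped" option were taken, it might look something like this sketch. It is not the {{RawLocalFileSystem}} code: {{SketchLocalCreate}} is a hypothetical helper, and {{java.nio.file.FileAlreadyExistsException}} is used only as a stand-in for whichever exception the contract ends up requiring.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;

/** Hypothetical helper; illustrative only, not the RawLocalFileSystem code. */
class SketchLocalCreate {

    /** Open a file for writing, translating the JDK's error for directories. */
    static OutputStream create(File target) throws IOException {
        try {
            return new FileOutputStream(target);
        } catch (FileNotFoundException e) {
            if (target.isDirectory()) {
                // The JDK reports "file not found" even though the path exists.
                FileAlreadyExistsException wrapped = new FileAlreadyExistsException(
                        target.getPath(), null, "destination is a directory");
                wrapped.initCause(e);
                throw wrapped;
            }
            throw e; // genuinely missing parent, permissions, and so on
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("sketch-create").toFile();
        try {
            create(dir).close();
        } catch (FileAlreadyExistsException e) {
            System.out.println("wrapped: " + e.getMessage());
        } finally {
            dir.delete();
        }
    }
}
```

The original {{FileNotFoundException}} is kept as the cause, so the JDK's message is not lost.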

* If you call {{mkdir(path-to-a-file)}} you get a 0 return code, but no exception is thrown.
This is inconsistent with HDFS.
{code}
testNoMkdirOverFile(org.apache.hadoop.fs.contract.localfs.TestLocalDirectoryContract)  Time
elapsed: 46 sec  <<< FAILURE!
java.lang.AssertionError: mkdirs succeeded over a file: ls file:/Users/stevel/Projects/hadoop-trunk/hadoop-common-project/hadoop-common/target/test/data/testNoMkdirOverFile[00]
RawLocalFileStatus{path=file:/Users/stevel/Projects/hadoop-trunk/hadoop-common-project/hadoop-common/target/test/data/testNoMkdirOverFile;
isDirectory=false; length=1024; replication=1; blocksize=33554432; modification_time=1373457007000;
access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}

	at org.junit.Assert.fail(Assert.java:93)
	at org.apache.hadoop.fs.contract.AbstractDirectoryContractTest.testNoMkdirOverFile(AbstractDirectoryContractTest.java:68)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)

{code}
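
For comparison, a stricter local mkdir could be sketched as follows. {{SketchStrictMkdirs}} is a hypothetical helper built on {{java.io.File}}, and the choice of {{java.nio.file.FileAlreadyExistsException}} is an assumption made to keep the example free of Hadoop dependencies.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;

/** Hypothetical helper; illustrative only. */
class SketchStrictMkdirs {

    /** Like File.mkdirs(), but fail loudly if the destination is a file. */
    static boolean mkdirs(File dir) throws IOException {
        if (dir.isFile()) {
            throw new FileAlreadyExistsException(
                    dir.getPath(), null, "cannot mkdir over a file");
        }
        // An existing directory counts as success, as does creating one.
        return dir.isDirectory() || dir.mkdirs();
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("sketch-mkdirs", ".tmp").toFile();
        try {
            mkdirs(file);
        } catch (FileAlreadyExistsException e) {
            System.out.println("rejected: " + e.getMessage());
        } finally {
            file.delete();
        }
    }
}
```

The sketch only checks the destination itself; a file somewhere in the parent chain would still surface as a plain {{false}} from {{File.mkdirs()}}.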


h3. HDFS Ambiguities

* You can't {{rm /}} an empty root dir, or {{rm -rf /}} a non-empty root dir. This may be a
good design choice for safety, but it is not consistent with
the behaviours of all (tested) filesystems. I haven't tested FTP or local FS though, for obvious
reasons (these tests are only run if you subclass the relevant test, *and* explicitly enable
it).

* {{FileAlreadyExistsException}} is thrown instead of {{ParentNotDirectoryException}}
when a {{mkdir}} is made under a parent that is a file
{code}
testMkdirOverParentFile(org.apache.hadoop.fs.contract.hdfs.TestHDFSDirectoryContract)  Time
elapsed: 48 sec  <<< ERROR!
org.apache.hadoop.fs.FileAlreadyExistsException: Parent path is not a directory: /test/testMkdirOverParentFile
testMkdirOverParentFile
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.mkdirs(FSDirectory.java:1906)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3182)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3141)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3114)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:692)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48089)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:605)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1033)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1880)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1876)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1489)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1874)

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2324)
	at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2293)
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:568)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1915)
	at org.apache.hadoop.fs.contract.AbstractDirectoryContractTest.testMkdirOverParentFile(AbstractDirectoryContractTest.java:95)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException):
Parent path is not a directory: /test/testMkdirOverParentFile testMkdirOverParentFile
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.mkdirs(FSDirectory.java:1906)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3182)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3141)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3114)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:692)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48089)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:605)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1033)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1880)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1876)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1489)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1874)

	at org.apache.hadoop.ipc.Client.call(Client.java:1314)
	at org.apache.hadoop.ipc.Client.call(Client.java:1266)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy16.mkdirs(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:163)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:82)
	at com.sun.proxy.$Proxy16.mkdirs(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
	at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2322)
	... 37 more

{code}
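
The distinction the test is looking for can be illustrated with a toy namespace. This is a sketch only: {{SketchNamespace}} and the two {{Sketch...Exception}} classes are hypothetical stand-ins, not the NameNode's {{FSDirectory}} logic, and paths are assumed to be absolute and already normalised.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for the two Hadoop exceptions. */
class SketchFileAlreadyExistsException extends IOException {
    SketchFileAlreadyExistsException(String path) {
        super("Path is a file: " + path);
    }
}

class SketchParentNotDirectoryException extends IOException {
    SketchParentNotDirectoryException(String path) {
        super("Parent path is not a directory: " + path);
    }
}

/** Toy namespace: maps a path to true (directory) or false (file). */
class SketchNamespace {
    private final Map<String, Boolean> entries = new HashMap<>();

    SketchNamespace() {
        entries.put("/", true);
    }

    void addFile(String path) {
        entries.put(path, false);
    }

    boolean isDirectory(String path) {
        return Boolean.TRUE.equals(entries.get(path));
    }

    /** Create path and any missing ancestors; assumes an absolute path. */
    void mkdirs(String path) throws IOException {
        List<String> toCreate = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String[] parts = path.substring(1).split("/");
        for (int i = 0; i < parts.length; i++) {
            current.append('/').append(parts[i]);
            String p = current.toString();
            Boolean isDir = entries.get(p);
            if (isDir == null) {
                toCreate.add(p);
            } else if (!isDir) {
                if (i == parts.length - 1) {
                    // The destination itself is a file.
                    throw new SketchFileAlreadyExistsException(p);
                }
                // An ancestor of the destination is a file.
                throw new SketchParentNotDirectoryException(p);
            }
        }
        // Only mutate the namespace once the whole path has been validated.
        for (String p : toCreate) {
            entries.put(p, true);
        }
    }

    public static void main(String[] args) throws IOException {
        SketchNamespace ns = new SketchNamespace();
        ns.mkdirs("/test");
        ns.addFile("/test/file");
        try {
            ns.mkdirs("/test/file/child");
        } catch (SketchParentNotDirectoryException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Validating the whole path before creating anything means a rejected call leaves no partially created directories behind.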
                
> Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9361
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9361
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, test
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch,
HADOOP-9361-004.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets
tested downstream, other filesystems, such as blobstore bindings, don't.
> The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258
shows is incomplete.
> I propose 
> # writing more tests which clarify expected behavior
> # testing operations in the interface being in their own JUnit4 test classes, instead
of one big test suite. 
> # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename,
atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases
if a feature is missing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
