hadoop-mapreduce-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5125) TestDFSIO should write less compressible data
Date Wed, 03 Apr 2013 17:39:16 GMT
Todd Lipcon created MAPREDUCE-5125:

             Summary: TestDFSIO should write less compressible data
                 Key: MAPREDUCE-5125
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5125
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: test
    Affects Versions: 1.1.2, 2.0.3-alpha
            Reporter: Todd Lipcon
            Priority: Minor

Currently, TestDFSIO writes a short repeating sequence of bytes, (byte)0 through (byte)50.
This makes its output highly compressible (I measured 250:1 when compressing the resulting
file with LZO). As a result, TestDFSIO results are very hard to compare between HDFS and other
file systems that may compress data on the network, on disk, or both: what is ostensibly a
benchmark of IO throughput is skewed heavily in favor of the system with compression.
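
The effect described above can be illustrated with a minimal sketch. This is not TestDFSIO code: the function names are hypothetical, the pattern is assumed to be bytes 0 through 50 inclusive as the description suggests, and zlib (DEFLATE) stands in for LZO, so the exact ratio will differ from the 250:1 measured above. It only compares how well a repeating buffer and a random buffer compress.

```python
import os
import zlib

# Assumption: the repeating pattern is bytes 0..50 inclusive, per the report's wording.
PATTERN = bytes(range(51))


def repeating_buffer(size: int) -> bytes:
    """Return `size` bytes made by repeating PATTERN (hypothetical helper)."""
    reps = size // len(PATTERN) + 1
    return (PATTERN * reps)[:size]


def random_buffer(size: int) -> bytes:
    """Return `size` bytes from the OS random source, which barely compress."""
    return os.urandom(size)


def compression_ratio(data: bytes) -> float:
    """Uncompressed size divided by zlib-compressed size (zlib stands in for LZO)."""
    return len(data) / len(zlib.compress(data))


if __name__ == "__main__":
    size = 1_000_000
    print("repeating pattern ratio:", compression_ratio(repeating_buffer(size)))
    print("random data ratio:      ", compression_ratio(random_buffer(size)))
```

On a typical run the repeating buffer should compress by a large factor while the random buffer stays close to 1:1, which is the gap that lets a compressing file system look faster in the benchmark.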
