Yes, I have tried various settings for setCaching(), and I do have setCacheBlocks(false) set. For reference, I've put a rough TestDFSIO invocation and a stripped-down sketch of the no-op scan job at the bottom of this message.

On Apr 30, 2013, at 9:17 PM, Ted Yu wrote:

> From http://hbase.apache.org/book.html#mapreduce.example :
>
> scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
> scan.setCacheBlocks(false);  // don't set to true for MR jobs
>
> I guess you have used the above settings.
>
> 0.94.x releases are compatible with each other. Have you considered upgrading to, say, 0.94.7, which was recently released?
>
> Cheers
>
> On Tue, Apr 30, 2013 at 9:01 PM, Bryan Keller wrote:
>
>> I have been attempting to speed up my HBase map-reduce scans for a while now. I have tried just about everything without much luck, I'm running out of ideas, and I was hoping for some suggestions. This is HBase 0.94.2 and Hadoop 2.0.0 (CDH4.2.1).
>>
>> The table I'm scanning:
>> 20 million rows
>> Hundreds of columns per row
>> Column keys can be 30-40 bytes
>> Column values are generally not large; 1 KB would be on the large side
>> 250 regions
>> Snappy compression
>> 8 GB region size
>> 512 MB memstore flush
>> 128 KB block size
>> 700 GB of data on HDFS
>>
>> My cluster has 8 datanodes, which are also regionservers. Each has 8 cores (16 with hyperthreading), 64 GB of RAM, and 2 SSDs. The network is 10 Gbit. I have a separate machine acting as namenode, HMaster, and ZooKeeper (single instance). I have disk-local reads turned on.
>>
>> I'm seeing around 5 Gbit/sec of network I/O on average. Each disk gets 400 MB/sec of read I/O, so theoretically I could get 400 MB/sec * 16 = 6.4 GB/sec.
>>
>> Using Hadoop's TestDFSIO tool, I'm seeing around 1.4 GB/sec read speed. Not that great compared to the theoretical I/O, but far better than what I am seeing with HBase map-reduce scans of my table.
>>
>> I have a simple no-op map-only job (using TableInputFormat) that scans the table and does nothing with the data. It takes 45 minutes, which works out to about 260 MB/sec read speed. That is over 5x slower than straight HDFS; basically, with HBase my 16-SSD cluster is reading nearly 35% slower than a single SSD.
>>
>> Here are some things I have changed, to no avail:
>> Scan caching values
>> HDFS block sizes
>> HBase block sizes
>> Region file sizes
>> Memory settings
>> GC settings
>> Number of mappers per node
>> Compressed vs. not compressed
>>
>> One thing I notice is that the regionserver uses quite a bit of CPU during the map-reduce job. When I dump a jstack of the process, it is usually in some kind of memory allocation or decompression routine, neither of which seems abnormal.
>>
>> I can't seem to pinpoint the bottleneck. CPU use by the regionserver is high but not maxed out, disk I/O and network I/O are low, and I/O wait is low. I'm on the verge of just writing the dataset out to sequence files once a day for scan purposes. Is that what others are doing?
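
As promised above, the TestDFSIO read test was along these lines (the jar path depends on the CDH layout, and the file count/size here are illustrative rather than the exact values I ran):

  hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO -write -nrFiles 16 -fileSize 1000
  hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO -read -nrFiles 16 -fileSize 1000

(The -write pass just generates the files that the -read pass then scans.)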
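
And the no-op scan job, trimmed down to a sketch. Table name and class names are placeholders, but the real job is equivalent: it sets the scan caching/cacheBlocks values from the book example, runs map-only with one mapper per region, and emits nothing.

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
  import org.apache.hadoop.hbase.mapreduce.TableMapper;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

  public class NoOpScan {

    // Mapper that touches every row but emits nothing, so the job only measures scan throughput.
    static class NoOpMapper extends TableMapper<NullWritable, NullWritable> {
      @Override
      protected void map(ImmutableBytesWritable row, Result value, Context context)
          throws IOException, InterruptedException {
        // no-op
      }
    }

    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      Job job = new Job(conf, "no-op scan");
      job.setJarByClass(NoOpScan.class);

      Scan scan = new Scan();
      scan.setCaching(500);        // per the book example; 1 is the default and is bad for MR
      scan.setCacheBlocks(false);  // don't fill the block cache from a full scan

      TableMapReduceUtil.initTableMapperJob(
          "mytable",               // placeholder table name
          scan,
          NoOpMapper.class,
          NullWritable.class,
          NullWritable.class,
          job);

      job.setNumReduceTasks(0);    // map-only
      job.setOutputFormatClass(NullOutputFormat.class);

      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }

With one map task per region (the TableInputFormat default), that is 250 mappers over the 700 GB table, which is where the 45 minutes / ~260 MB/sec figure comes from.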