Hey rock stars,
I'm having problems loading a large amount of data into a table (about
120 GB, 250 million rows). My map tasks run fine, but when it comes to
reducing, things start burning. 'top' indicates that I only have
~100 MB of RAM free on my datanodes, and every process starts thrashing
... even ssh and ping. Then I start to get errors like:
"org.apache.hadoop.hbase.client.RegionOfflineException: region
offline: joinedcontent,,1244513452487"
and:
"Task attempt_200906082135_0001_r_000002_0 failed to report status for
603 seconds. Killing!"
I'm running Hadoop 0.19.1 and HBase 0.19.3, with 1 master/name node and
8 regionservers. 2 x Dual Core Intel 3.2 GHz procs, 4 GB of RAM. 16
map tasks, 8 reducers. I've set the max heap (HADOOP_HEAPSIZE) in
hadoop-env to 768 MB, and the one in hbase-env is at its default of
1000 MB. I've also done all the performance enhancements in the Wiki:
the file handle limits, the garbage collection settings, and the epoll
limits.
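For what it's worth, here is a rough sketch of how the heap settings above might add up on one 4 GB node. The 768 and 1000 figures are the ones I set; everything else is an assumption for illustration: that both a DataNode and a TaskTracker run on each node with the hadoop-env heap, that the 16 maps / 8 reducers are spread evenly over the 8 nodes (3 task slots per node), and that each child task JVM gets 200 MB. None of this is measured.

```python
# Back-of-the-envelope per-node heap budget, in MB.
NODE_RAM_MB = 4 * 1024          # 4 GB of RAM per node
HADOOP_DAEMON_HEAP_MB = 768     # hadoop-env heap setting, per Hadoop daemon
HBASE_HEAP_MB = 1000            # hbase-env default


def committed_heap_mb(task_slots, child_heap_mb, hadoop_daemons=2):
    """Sum of the maximum JVM heaps that could be live on one node.

    hadoop_daemons=2 assumes a DataNode and a TaskTracker on the node;
    task_slots and child_heap_mb are hypothetical inputs, not known values.
    """
    daemons = hadoop_daemons * HADOOP_DAEMON_HEAP_MB + HBASE_HEAP_MB
    return daemons + task_slots * child_heap_mb


# Assumed: 2 map slots + 1 reduce slot per node, 200 MB per child JVM.
budget = committed_heap_mb(task_slots=3, child_heap_mb=200)
headroom = NODE_RAM_MB - budget  # what is left for the OS, page cache, JVM overhead
print(f"max heap committed: {budget} MB, headroom: {headroom} MB")
```

This counts only maximum heap, not JVM overhead outside the heap or the OS page cache, so the real headroom would be smaller than what it prints. If the slot counts are actually per node rather than cluster-wide, the task term grows a lot.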
What am I missing? :)
Cheers,
Bradford