Hi all,
I have a small HBase cluster that I recently loaded with about 500M records (some of
them quite large). One thing I notice when I run different types of MapReduce
jobs over my table is that the network becomes a bottleneck. Currently the cluster runs on a
single gigabit Ethernet link per machine, but the machines each have 4 network ports.
My question is this: is it possible to set up Hadoop/HBase to take advantage of multiple networks
connecting the computers?
Could I specify multiple network connections in the config file?
Would it make sense to put the region servers on a different network than the data nodes?
Would it be more efficient to bond multiple channels at the OS level?
Thanks for the suggestions,
Dave