hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Settings
Date Thu, 27 Aug 2009 07:41:51 GMT
> dfs.datanode.socket.write.timeout => 0

This isn't needed any more given the locally patched Hadoop jar we
distribute containing the fix for HDFS-127.




________________________________
From: stack <stack@duboce.net>
To: hbase-user@hadoop.apache.org
Sent: Thursday, August 27, 2009 6:29:38 AM
Subject: Re: Settings

On Wed, Aug 26, 2009 at 7:40 AM, Lars George <lars@worldlingo.com> wrote:

> Hi,
>
> It seems that over the years I have tried various settings in both Hadoop
> and HBase, and when redoing a cluster it is always a question whether we
> should keep a given setting or not - since the issue it "suppressed" may
> already have been fixed. Maybe we should have a wiki page with the current
> settings and the more advanced ones, and when and how to use them. I often
> find that the descriptions in the various default files are as ambiguous
> as the setting keys themselves.



I'd rather fix the descriptions so they are clear than add extra info out
in a wiki; wiki pages tend to rot.



> - fs.default.name => hdfs://<master-hostname>:9000/
>
> This is usually in core-site.xml in Hadoop. Does the client or the server
> need this key at all? Did I copy it into the hbase site file by mistake?
>


There probably was a reason long ago but, yeah, you shouldn't need this (as
Schubert says).
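
For reference, this is roughly what the entry looks like in Hadoop's
core-site.xml (where it belongs, rather than in hbase-site.xml); the
hostname and port below are placeholders for your namenode:

  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.org:9000/</value>
    <!-- placeholder host/port; point this at your namenode -->
  </property>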



> - hbase.cluster.distributed => true
>
> For truly distributed operation and standalone ZK installations.
>
> - dfs.datanode.socket.write.timeout => 0
>
> This is used in the DataNode but, more importantly here, in the DFSClient.
> Its default is apparently fixed at 8 minutes; no default file (I would
> have assumed hdfs-default.xml) lists it.
>
> We set it to 0 to avoid the socket timing out under low use and the like,
> because the DFSClient reconnect is not handled gracefully. I trust setting
> it to 0 is still what we recommend for HBase, and is still valid?
>


For background on this, see
http://wiki.apache.org/hadoop/Hbase/Troubleshooting#6.  It shouldn't be
needed anymore, especially with HADOOP-4681 in place, but IIRC apurtell
once had trouble bringing up a cluster when it shouldn't have been needed,
and the only way to get it up was to set this to zero.  We should test.
BTW, this is a client-side config.  You also have it below under Hadoop;
it shouldn't be needed there, not by HBase at least (maybe you have other
HDFS clients that hit this issue?).
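
If you do still hit the timeout, this is roughly the client-side override
being discussed; it would go into hbase-site.xml (or the hadoop config on
the client's classpath) -- a sketch, only worth adding if you actually see
the problem:

  <!-- disable the DFSClient socket write timeout entirely -->
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>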



>
> - hbase.regionserver.lease.period => 600000
>
> The default was changed from 60 to 120 seconds. Over time I had issues and
> have set it to 10 minutes. Good or bad?



There is an issue open to check whether this is even used any more.  Leases
are in ZK now.  I don't think this has any effect any more.


>
> - hbase.hregion.memstore.block.multiplier => 4
>
> This is up from the default 2. Good or bad?
>


Means that we'll fill more RAM before we bring down the writes gate: a
region blocks writes once its memstore reaches multiplier x flush size (so
with the default 64MB flush size, the default multiplier of 2 keeps taking
on writes till we get to 2x64MB; your 4 takes you to 4x64MB).  2x is good
for the 64M default I'd say -- especially during virulent upload with lots
of Stores.
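
To make the arithmetic concrete (assuming the default 64MB
hbase.hregion.memstore.flush.size):

  <!-- writes to a region block once its memstore reaches
       multiplier x flush size, i.e. 4 x 64MB = 256MB here -->
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value>
  </property>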



>
> - hbase.hregion.max.filesize => 536870912
>
> Again twice as much as the default. Opinions?


Means you should have fewer regions overall, for perhaps some small
compromise in performance (TBD).  I think that in 0.21 we'll likely up the
default region size to this or larger.  Need to test.  Leave it I'd say if
performance is OK for you and you have lots of regions.
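
In hbase-site.xml that would be (the value is in bytes, i.e. 512MB, twice
the 256MB default):

  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>536870912</value>
  </property>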


> - hbase.regions.nobalancing.count => 20
>
> This seems to be missing from hbase-default.xml but is set to 4 in the
> code if not specified. The above value I got from Ryan to improve HBase
> startup. It means that a RS which is still opening regions can keep
> receiving rebalanced regions as long as it has no more than 20 opens
> pending. Handled by the ServerManager during message processing. Opinions?
>


If it works for you, keep it.  This whole startup and region reassignment is
going to be redone in 0.21.  These configurations will likely change at that
time.



>
> - hbase.regions.percheckin => 20
>
> This is the count of regions assigned in one go. Handled in RegionManager,
> and the default is 10. Here we tell it to assign regions in larger batches
> to speed up the cluster start. Opinions?


See previous note.
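
Taken together, the two startup knobs from this and the previous question
would look like this in hbase-site.xml (values as Lars has them; neither
key ships in hbase-default.xml):

  <!-- keep balancing to a RS while it has up to 20 opens pending -->
  <property>
    <name>hbase.regions.nobalancing.count</name>
    <value>20</value>
  </property>
  <!-- assign regions in batches of 20 instead of the default 10 -->
  <property>
    <name>hbase.regions.percheckin</name>
    <value>20</value>
  </property>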



>
>
> - hbase.regionserver.handler.count => 30
>
> Up from 10, as I often had the problem that the UI was not responsive
> while an import MR job was running. All handlers were busy doing the
> inserts. JD mentioned it may be set to a higher default value?
>

No harm here.  Do the math.  Is it likely that you'll have 30 clients
concurrently trying to get stuff out of a regionserver?  If so, keep it I'd
say.
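
For completeness, the hbase-site.xml entry in question:

  <!-- more RPC handler threads per regionserver, up from the default 10 -->
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>30</value>
  </property>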



>
>
> Hadoop:
> ----------
>
> - dfs.block.size => 134217728
>
> Up from the default 64MB. I have done this in the past as my data size per
> "cell" is larger than the usual few bytes. I can have a few KB up to just
> above 1MB per value. Does this still make sense?



No opinion.  Whatever works for you.
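
The hdfs-site.xml entry, for reference (134217728 bytes = 128MB):

  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>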


>
>
> - dfs.namenode.handler.count => 20
>
> This was upped from the default 10 quite some time ago (more than a year
> ago). So is this still required?
>

Probably.  Check it during a time of high load.  Are all the handlers in
use?
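
The hdfs-site.xml entry being discussed:

  <!-- more NameNode RPC handler threads, up from the default 10 -->
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>20</value>
  </property>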



>
> - dfs.datanode.socket.write.timeout => 0
>
> This is the matching entry to the one above, I suppose, this time for the
> DataNode. Still required?


See comment near top.



>
>
> - dfs.datanode.max.xcievers => 4096
>
> The default is 256 and often way too low. What is a good value you would
> use? What is the drawback of setting it high?
>
>
It's effectively a max on how many threads can run in a datanode.  Our load
on datanodes is a good deal less in 0.20.0 HBase.  My sense is that the
default is still too low (it's too low for Hadoop in general, I've heard
from MapReduce heads).
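
The hdfs-site.xml entry, for reference (note the property name really is
spelled "xcievers" in Hadoop):

  <!-- effectively caps the number of threads a datanode will run -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>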


St.Ack



      