hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: High ingest rate and FIN_WAIT1 problems
Date Fri, 16 Jul 2010 19:56:04 GMT
I've been running with this setting on both the HDFS side and the
HBase side for over a year now. It's a bit of voodoo, but you might be
running into the well-known suckage of HDFS.  Try this one and restart
your HBase & HDFS.

The FIN_WAIT2/TIME_WAIT problem happens more with large concurrent gets,
not so much with inserts.

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>
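To check whether the change actually helps, the per-state socket counts can be watched with something like the following (a diagnostic sketch, assuming the net-tools `netstat`, whose Linux `-tan` output has the state in field 6):

```shell
# Diagnostic sketch: count TCP sockets per state on the datanode host.
# If the write-timeout change helps, the FIN_WAIT1 count should stop growing.
netstat -tan | awk 'NR > 2 { counts[$6]++ } END { for (s in counts) print counts[s], s }' | sort -rn
```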

-ryan


On Fri, Jul 16, 2010 at 9:33 AM, Thomas Downing
<tdowning@proteus-technologies.com> wrote:
> Thanks for the response.
>
> My understanding is that TCP_FIN_TIMEOUT affects only FIN_WAIT2,
> my problem is with FIN_WAIT1.
>
> While I do see some sockets in TIME_WAIT, they are only a few, and the
> number is not growing.
>
> On 7/16/2010 12:07 PM, Hegner, Travis wrote:
>>
>> Hi Thomas,
>>
>> I ran into a very similar issue when running slony-I on postgresql to
>> replicate 15-20 databases.
>>
>> Adjusting the TCP_FIN_TIMEOUT parameter for the kernel may help to slow
>> (or hopefully stop) the leaking sockets. I found some notes about adjusting
>> TCP parameters here:
>> http://www.hikaro.com/linux/tweaking-tcpip-syctl-conf.html
>>
>> with the specific excerpt regarding the TCP_FIN_TIMEOUT:
>>
>> TCP_FIN_TIMEOUT
>> This setting determines the time that must elapse before TCP/IP can
>> release a closed connection and reuse its resources. During this TIME_WAIT
>> state, reopening the connection to the client costs less than establishing a
>> new connection. By reducing the value of this entry, TCP/IP can release
>> closed connections faster, making more resources available for new
>> connections. Adjust this in the presence of many connections sitting in the
>> TIME_WAIT state:
>>
>>
>> # echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
>>
>> Try setting this lower on your master. You may also consider these on the
>> same link:
>>
>> TCP_TW_RECYCLE
>> It enables fast recycling of TIME_WAIT sockets. The default value is 0
>> (disabled). The sysctl documentation incorrectly states the default as
>> enabled. It can be changed to 1 (enabled) in many cases. It is known to cause
>> some issues with hoststated (load balancing and failover) when enabled, so it
>> should be used with caution.
>>
>>
>> echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
>>
>> TCP_TW_REUSE
>> This allows reusing sockets in the TIME_WAIT state for new connections when
>> it is safe from a protocol viewpoint. The default value is 0 (disabled). It
>> is generally a safer alternative to tcp_tw_recycle.
>>
>>
>> echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
>>
>>
>> The above commands will not persist across reboots, but the link explains
>> how to make them permanent.
>> The list experts may be able to give more insight on which, if any, of these
>> settings are safe to manipulate, and what risks or issues you may encounter
>> specifically with Hbase while adjusting these settings.
>>
>> Hope This Helps,
>>
>> Travis Hegner
>>
>>
>> -----Original Message-----
>> From: Thomas Downing [mailto:tdowning@proteus-technologies.com]
>> Sent: Friday, July 16, 2010 10:33 AM
>> To: user@hbase.apache.org
>> Subject: High ingest rate and FIN_WAIT1 problems
>>
>> Hi,
>>
>> I am a complete HBase and HDFS newbie, so I apologize in advance for
>> the inevitable bloopers.
>>
>> We are doing feasibility testing on NoSql data store options, with rather
>> high ingest rate requirements.  So far, HBase is looking good, with only
>> one issue identified. Running at an ingest rate of ~30K rows per second
>> on a machine with 4 CPUs at 2.2 GHz and 8 GB RAM, I am slowly leaking sockets.
>>
>> This is a single node setup - no replication.  The CPU load is only about
>> 50%-60%, with the majority of that in userland; system and iowait are
>> averaging less than 3%.  There is no swapping going on.
>>
>> The problem is that on the datanode there are a large number of sockets
>> in FIN_WAIT1, with corresponding peers on master in ESTABLISHED.
>> These pairs hang around for quite some time, and at my ingest rate this
>> means that the total sockets held by datanode and master is slowly going
>> up.
>>
>> If my understanding of TCP is correct, then this indicates that the master
>> peer has stopped reading incoming data from the datanode - i.e., it is
>> advertising a zero window; and that the datanode has called close(2) on
>> its side of the connection.
>>
>> There was a thread some time ago:
>>
>> http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg03329.html
>>
>> There was no real conclusion.  I have played with the config params as
>> suggested on that thread, but no luck yet.  Also, in that case the problem
>> seemed to be between datanodes for replication operations - not the case
>> with me.  Changing timeouts to avoid the slow increase might not really
>> solve the problem if the master peer has in fact ceased to read its
>> socket.  The data outstanding in the TCP stack buffer would be lost.
>> Whether that would imply data loss is beyond me.
>>
>> I am posting this here as although the only logs with errors/exceptions
>> are the datanode logs, netstat and Wireshark seem to indicate that the
>> problem is on the master side.
>>
>> The master, namenode, regionserver and zookeeper logs show no
>> warnings or errors.  The datanode log shows this, over and over:
>>
>> 2010-07-16 00:33:09,269 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(127.0.0.1:50010,
>> storageID=DS-1028643313-10.1.1.200-50010-1279026099917, infoPort=50075,
>> ipcPort=50020):Got exception while serving blk_3684861726145519813_22386
>> to /127.0.0.1:
>> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
>> channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:50010
>> remote=/127.0.0.1:54774]
>>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> 2010-07-16 00:33:09,269 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(127.0.0.1:50010,
>> storageID=DS-1028643313-10.1.1.200-50010-1279026099917, infoPort=50075,
>> ipcPort=50020):DataXceiver
>> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
>> channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/127.0.0.1:50010
>> remote=/127.0.0.1:54774]
>>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> If there is any other info that might help, or any steps you would like
>> me to take, just let me know.
>>
>> Thanks
>>
>> Thomas Downing
>>
>
>
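The /proc writes suggested in the thread take effect immediately but do not survive a reboot. A sketch of persisting them (standard /etc/sysctl.conf location assumed; run as root; values are the ones suggested above):

```shell
# Sketch: persist the thread's suggested sysctl values across reboots.
# Append to /etc/sysctl.conf, then reload with sysctl -p.
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_reuse = 1
EOF
sysctl -p
```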
