hbase-user mailing list archives

From Thomas Downing <tdown...@proteus-technologies.com>
Subject Re: High ingest rate and FIN_WAIT1 problems
Date Fri, 16 Jul 2010 16:33:16 GMT
Thanks for the response.

My understanding is that TCP_FIN_TIMEOUT affects only FIN_WAIT2;
my problem is with FIN_WAIT1.

While I do see some sockets in TIME_WAIT, they are only a few, and the
number is not growing.
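
For reference, here is roughly how I am counting socket states (a minimal
sketch; it assumes Linux net-tools netstat, where the state is the sixth
column):

# netstat -tan | awk 'NR > 2 { count[$6]++ } END { for (s in count) print s, count[s] }'   # assumes $6 = State

The FIN_WAIT1 count climbs steadily over time, while TIME_WAIT stays small.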

On 7/16/2010 12:07 PM, Hegner, Travis wrote:
> Hi Thomas,
>
> I ran into a very similar issue when running Slony-I on PostgreSQL to replicate 15-20
> databases.
>
> Adjusting the TCP_FIN_TIMEOUT parameter for the kernel may help to slow (or hopefully
> stop) the socket leak. I found some notes about adjusting TCP parameters here: http://www.hikaro.com/linux/tweaking-tcpip-syctl-conf.html
>
> with the specific excerpt regarding the TCP_FIN_TIMEOUT:
>
> TCP_FIN_TIMEOUT
> This setting determines the time that must elapse before TCP/IP can release a closed
> connection and reuse its resources. During this TIME_WAIT state, reopening the connection
> to the client costs less than establishing a new connection. By reducing the value of this
> entry, TCP/IP can release closed connections faster, making more resources available for new
> connections. Adjust this in the presence of many connections sitting in the TIME_WAIT state:
>
>
> # echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
>
> Try setting this lower on your master. You may also consider these on the same link:
>
> TCP_TW_RECYCLE
> It enables fast recycling of TIME_WAIT sockets. The default value is 0 (disabled). The
> sysctl documentation incorrectly states the default as enabled. It can be changed to 1 (enabled)
> in many cases. It is known to cause some issues with hoststated (load balancing and failover)
> if enabled, so it should be used with caution.
>
>
> # echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
>
> TCP_TW_REUSE
> This allows reusing sockets in TIME_WAIT state for new connections when it is safe from the
> protocol viewpoint. The default value is 0 (disabled). It is generally a safer alternative to
> tcp_tw_recycle.
>
>
> # echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
>
>
> The above commands will not persist across reboots, but the link above explains how to make
> them permanent (see the sketch below). The experts on this list may be able to give more insight
> into which, if any, of these settings are safe to manipulate, and what risks or issues you may
> encounter specifically with HBase while adjusting these settings.
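>
> As a minimal sketch of persisting them (assuming the settings live in a standard /etc/sysctl.conf;
> include only the settings you decide are safe for your setup):
>
> # echo "net.ipv4.tcp_fin_timeout = 30" >> /etc/sysctl.conf   # assumes /etc/sysctl.conf is used
> # echo "net.ipv4.tcp_tw_reuse = 1" >> /etc/sysctl.conf
> # sysctl -p   # reload the file so the values take effect now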
>
> Hope This Helps,
>
> Travis Hegner
>
>
> -----Original Message-----
> From: Thomas Downing [mailto:tdowning@proteus-technologies.com]
> Sent: Friday, July 16, 2010 10:33 AM
> To: user@hbase.apache.org
> Subject: High ingest rate and FIN_WAIT1 problems
>
> Hi,
>
> I am a complete HBase and HDFS newbie, so I apologize in advance for
> the inevitable bloopers.
>
> We are doing feasibility testing on NoSQL data store options, with rather
> high ingest rate requirements.  So far, HBase is looking good, with only
> one issue identified. Running at an ingest rate of ~30K rows per second
> on a machine with four 2.2 GHz CPUs and 8 GB of RAM, I am slowly leaking sockets.
>
> This is a single node setup - no replication.  The CPU load is only about
> 50%-60%, with the majority of that in userland; system and iowait are
> averaging less than 3%.  There is no swapping going on.
>
> The problem is that on the datanode there are a large number of sockets
> in FIN_WAIT1, with corresponding peers on the master in ESTABLISHED.
> These pairs hang around for quite some time, and at my ingest rate this
> means that the total number of sockets held by the datanode and master is
> slowly going up.
>
> If my understanding of TCP is correct, then this indicates that the master
> peer has stopped reading incoming data from the datanode - i.e., it is
> advertising a window of zero - and that the datanode has called close(2) on
> its end of the connection.
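>
> One rough way to check that (a sketch; 50010 is the datanode port from the logs below, and the
> column layout assumes Linux net-tools netstat) is to watch the Send-Q of the FIN_WAIT1 sockets -
> a large Send-Q that never drains would mean the remote side really has stopped reading:
>
> # netstat -tan | awk '$6 == "FIN_WAIT1" && $4 ~ /:50010$/ { print $2, $3, $4, $5 }'   # Recv-Q Send-Q Local Foreign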
>
> There was a thread some time ago:
>
> http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg03329.html
>
> There was no real conclusion.  I have played with the config params as
> suggested on that thread, but no luck yet.  Also, in that case the problem
> seemed to be between datanodes for replication operations - not the case
> with me.  Changing timeouts to avoid the slow increase might not really
> solve the problem if the master peer has in fact ceased to read its
> socket.  The data outstanding in the TCP stack buffer would be lost.
> Whether that would imply data loss is beyond me.
>
> I am posting this here because, although the only logs with errors/exceptions
> are the datanode logs, netstat and wireshark seem to indicate that the
> problem is on the master side.
>
> The master, namenode, regionserver, and zookeeper logs show no
> warnings or errors.  The datanode log shows this, over and over:
>
> 2010-07-16 00:33:09,269 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(127.0.0.1:50010,
> storageID=DS-1028643313-10.1.1.200-50010-1279026099917, infoPort=50075,
> ipcPort=50020):Got exception while serving blk_3684861726145519813_22386
> to /127.0.0.1:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/127.0.0.1:50010
> remote=/127.0.0.1:54774]
>           at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>           at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>           at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>           at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>           at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>           at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>           at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>           at java.lang.Thread.run(Thread.java:619)
>
> 2010-07-16 00:33:09,269 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(127.0.0.1:50010,
> storageID=DS-1028643313-10.1.1.200-50010-1279026099917, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/127.0.0.1:50010
> remote=/127.0.0.1:54774]
>           at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>           at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>           at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>           at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>           at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>           at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>           at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>           at java.lang.Thread.run(Thread.java:619)
>
> If there is any other info that might help, or any steps you would like
> me to take, just let me know.
>
> Thanks
>
> Thomas Downing
>
>

