hbase-user mailing list archives

From Luke Forehand <luke.foreh...@networkedinsights.com>
Subject Re: Hanging regionservers
Date Fri, 16 Jul 2010 22:01:47 GMT
Using Ryan Rawson's suggested config tweaks, we have just completed a successful job run with
a 15GB sequence file, with no hang.  I'm setting up multiple files to process this weekend
with the new settings.  :-)  I believe setting dfs.datanode.socket.write.timeout to 0 (no
timeout) was what did the trick.

I'll post my results on Monday.  Thanks for the support thus far!

-Luke

On 7/15/10 10:17 PM, "Ryan Rawson" <ryanobjc@gmail.com> wrote:

I'm not seeing anything in that logfile. You are seeing compactions
for various regions, but I'm not seeing flushes (typical during insert
loads) or anything else. One thing we look for is the log message
"Blocking updates", which indicates that a particular region has
decided to hold up updates to avoid taking too many inserts.

Like I said, you could be seeing this on a different regionserver. If
all the clients are blocked on one regionserver and can't get to the
others, then most will look idle and only one will actually show
anything interesting in the log.

Can you check for this behaviour?
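
A minimal sketch of that check, assuming the regionserver logs are
plain-text files you can read from wherever you run this. The script
and its function names are illustrative only and not part of HBase; it
simply looks for the "Blocking updates" substring and makes no
assumption about the rest of the log line format.

```python
import sys

# Substring to look for; the rest of the log line format is not assumed.
MARKER = "Blocking updates"


def find_blocking_updates(lines):
    """Return (line_number, text) for every line containing MARKER."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if MARKER in line
    ]


def scan_files(paths):
    """Map each log file path to its list of matching lines."""
    hits = {}
    for path in paths:
        # errors="replace" keeps the scan going if a log has odd bytes.
        with open(path, errors="replace") as handle:
            hits[path] = find_blocking_updates(handle)
    return hits


if __name__ == "__main__":
    # Pass one log file per regionserver on the command line.
    for path, matches in scan_files(sys.argv[1:]).items():
        print(f"{path}: {len(matches)} match(es)")
        for number, text in matches:
            print(f"  line {number}: {text}")
```

Running it over the logs from every regionserver, not just one, should
show whether a single server is the one holding everything up.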

Also, tweaking the config with the values I pasted should help.

On Thu, Jul 15, 2010 at 7:25 PM, Luke Forehand
<luke.forehand@networkedinsights.com> wrote:
> It looks like we are running straight from the default config, with no explicit settings.
>
> On 7/15/10 9:03 PM, "Ryan Rawson" <ryanobjc@gmail.com> wrote:
>
> In this case the regionserver isn't actually doing anything - all the
> IPC thread handlers are waiting in their queue handoff thingy (how
> they get socket/work to do).
>
> Something elsewhere perhaps?  Check the logs of your jobs, there might
> be something interesting there.
>
> One thing that frequently happens is that you overrun one regionserver
> with edits and it isn't flushing fast enough, so it pauses updates and
> all clients end up stuck on it.
>
> What was that config again?  I use these settings:
>
> <property>
>  <name>hbase.hstore.blockingStoreFiles</name>
>  <value>15</value>
> </property>
>
> <property>
>  <name>dfs.datanode.socket.write.timeout</name>
>  <value>0</value>
> </property>
>
> <property>
>  <name>hbase.hregion.memstore.block.multiplier</name>
>  <value>8</value>
> </property>
>
> Perhaps try these?
>
> -ryan
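
To get a feel for what the quoted multiplier means in practice, here is
a rough sketch. It assumes the three properties are wrapped in a
<configuration> root, as in a Hadoop-style site file, and that updates
to a region block once its memstore reaches
hbase.hregion.memstore.block.multiplier times
hbase.hregion.memstore.flush.size. The 64 MB flush size is an assumption
for illustration only; check the value your cluster actually uses. The
function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three properties from the thread, wrapped in a <configuration> root.
SITE_XML = """
<configuration>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>15</value>
  </property>
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>8</value>
  </property>
</configuration>
"""

# ASSUMPTION: flush size for illustration; not taken from the thread.
ASSUMED_FLUSH_SIZE_BYTES = 64 * 1024 * 1024


def load_properties(xml_text):
    """Parse <property> entries into a {name: value} dict of strings."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def memstore_blocking_bytes(flush_size_bytes, multiplier):
    """Memstore size at which updates to a region start blocking."""
    return flush_size_bytes * multiplier


if __name__ == "__main__":
    props = load_properties(SITE_XML)
    multiplier = int(props["hbase.hregion.memstore.block.multiplier"])
    limit = memstore_blocking_bytes(ASSUMED_FLUSH_SIZE_BYTES, multiplier)
    print(f"updates block at {limit // (1024 * 1024)} MB per region")
```

Under that assumed flush size, a multiplier of 8 puts the blocking
threshold at 512 MB per region, so heap sizing is worth checking before
raising it.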
