hbase-user mailing list archives

From mukund murrali <mukundmurra...@gmail.com>
Subject Re: HConnection thread waiting on blocking queue indefinitely
Date Tue, 09 Jun 2015 06:05:11 GMT
Hi

I wrote a sample program with the default client configuration and a
single connection. The application spawns more client threads than
hbase.hconnection.threads.max, and each thread inserts data into the HBase
cluster. Once a region split happened, all the hconnection threads (core and
max pool sizes were both kept at 256) stalled indefinitely in
BoundedCompletionService.take(). They never resumed, even after the split
completed.
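
To illustrate the kind of wait described above, here is a minimal sketch
using only the JDK's ExecutorCompletionService as a stand-in. It is not
HBase's BoundedCompletionService, and the class and method names are
hypothetical. It assumes a task that never returns on its own (like an RPC
that never gets a reply) and shows that a bounded poll() gives up, where an
unbounded take() would keep waiting without raising anything.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TakeVsPoll {
    /**
     * Submits one task that blocks until released, then waits for a result
     * with a bounded poll. Returns true if the poll timed out (returned null).
     */
    static boolean pollTimesOut(long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        CountDownLatch release = new CountDownLatch(1);
        try {
            cs.submit(() -> {
                release.await();   // stand-in for a call that never gets a reply
                return "done";
            });
            // cs.take() at this point would block until the task completes:
            // no timeout, no exception. poll() with a timeout returns null instead.
            Future<String> finished = cs.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return finished == null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while polling", e);
        } finally {
            release.countDown();   // let the blocked task finish
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("poll timed out: " + pollTimesOut(200));
    }
}
```

With a plain take() in place of the poll(), the calling thread would look
like the stalled hconnection threads in a thread dump: parked in take() for
as long as the task does not complete.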

So does this mean I have to create more connection instances per cluster in
such scenarios (which should not really be needed)? There was also no
exception on the client side (I expected a RejectedExecutionException). Can
changing hbase.hconnection.threads.max or hbase.hconnection.threads.core
cause this kind of problem?
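
On the missing rejection: a minimal sketch with a plain JDK
ThreadPoolExecutor, not HBase code, and with hypothetical class and method
names. How the hconnection pool's work queue is actually sized is not
established in this thread, so the queue choice here is an assumption. The
sketch only shows that with the default rejection policy a pool throws
RejectedExecutionException when its threads are busy and a bounded queue is
full, and that with an unbounded queue the extra tasks wait silently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
    /**
     * Submits `tasks` blocking tasks to a 2-thread pool backed by `queue`.
     * Returns true if any submission was rejected.
     */
    static boolean rejects(BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool =
            new ThreadPoolExecutor(2, 2, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();   // keep the worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return false;                  // every task was accepted
        } catch (RejectedExecutionException e) {
            return true;                   // threads busy and queue full
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Unbounded queue: the extra tasks simply wait, nothing is rejected.
        System.out.println("unbounded queue rejects: "
            + rejects(new LinkedBlockingQueue<>(), 100));
        // Bounded queue of 4: 2 tasks run, 4 queue up, the next one is rejected.
        System.out.println("bounded queue rejects:   "
            + rejects(new ArrayBlockingQueue<>(4), 100));
    }
}
```

If the real pool behaves like the unbounded case, having more client threads
than pool threads would lead to queued work rather than an exception.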



On Sat, Jun 6, 2015 at 5:02 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> I am not sure what the problem could have been when the meta update happened.
> I would think that when the region split happened, there was some issue with
> the meta update (as you said in the later mail). The split regions would
> not have been updated properly in META, so any client updates/reads
> going to this region would have stalled, and hence your client
> application also stalled.
>
> As I said, the logs would be important here to know what happened. This
> is one possible cause, and the logs could confirm it.
>
> Regards
> Ram
>
> On Sat, Jun 6, 2015 at 1:25 PM, mukund murrali <mukundmurrali9@gmail.com>
> wrote:
>
> > Sorry for the confusion in calling it a meta split. It was a meta update
> > during a user region split, which probably caused the stall. We have now
> > reverted the client configs, and so far we have not hit the issue again.
> > We expected those changes to cause exceptions or timeouts; clients
> > stalling indefinitely is what worries us.
> >
> > On Friday 5 June 2015, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> >
> > > > I would suggest reverting the client config changes back to defaults. At
> > > > least we will know if the issue is somehow related to client config changes.
> > > > On Jun 5, 2015 6:15 AM, "ramkrishna vasudevan" <
> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > > > > hbase:meta getting split? It may be some user region; can you check that?
> > > > > If your meta was splitting, then something is wrong.
> > > > > Can you attach the log snippets?
> > > >
> > > > Sent from phone. Excuse typos.
> > > > > On Jun 5, 2015 6:00 PM, "mukund murrali" <mukundmurrali9@gmail.com>
> > > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > > In our case, at the instant the client thread stalled, there was an
> > > > > > hbase:meta region split happening. So what went wrong? If there is a
> > > > > > split, why should the hconnection thread stall? Did our client
> > > > > > configuration changes cause this? Here again are the client-related
> > > > > > changes we made:
> > > > >
> > > > > hbase.client.retries.number => 5
> > > > > zookeeper.recovery.retry => 0
> > > > > zookeeper.session.timeout => 1000
> > > > > > zookeeper.recovery.retry.intervalmilli => 1
> > > > > > hbase.rpc.timeout => 30000
> > > > >
> > > > > Is zk timeout too low?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Fri, Jun 5, 2015 at 11:37 AM, ramkrishna vasudevan <
> > > > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > >
> > > > > > > When you started your client server, was the META table assigned?
> > > > > > > Maybe something happened around that time and the client app was just
> > > > > > > waiting on the meta table to be assigned. It would have retried. Can
> > > > > > > you check the logs?
> > > > > >
> > > > > > > So the good part here is that the standalone client was successful,
> > > > > > > which means new clients were able to talk to the server. Hence
> > > > > > > restarting your client solved your problem. It may be difficult to
> > > > > > > troubleshoot the exact issue with the limited info, but if your client
> > > > > > > app regularly stalls, it is better to troubleshoot your app and the
> > > > > > > way it accesses the server.
> > > > > >
> > > > > > > On Fri, Jun 5, 2015 at 11:21 AM, PRANEESH KUMAR <praneesh.sankar@gmail.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > > The client connection was in a stalled state, but our thread dump showed
> > > > > > > > only one hconnection thread, which was waiting indefinitely in the
> > > > > > > > BoundedCompletionService.take call. Meanwhile we ran a standalone test
> > > > > > > > program, which was successful.
> > > > > > >
> > > > > > > Once we restarted the client server, the problem got resolved.
> > > > > > >
> > > > > > > > The basic question is: when the hconnection thread stalled, why did the
> > > > > > > > HBase client fail to create any more hconnections (max pool size was
> > > > > > > > 10)? And if there was a problem with the table/meta regions, how did
> > > > > > > > the test program succeed?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Praneesh
> > > > > > >
> > > > > > > > On Fri, Jun 5, 2015 at 10:21 AM, ramkrishna vasudevan <
> > > > > > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > > >
> > > > > > > > > Can you tell us more? Is your client not working at all, i.e. stalled?
> > > > > > > > > Or are you seeing some results, but slower than you expected?
> > > > > > > > >
> > > > > > > > > What type of workload are you running? Are all the tables healthy? Are
> > > > > > > > > you able to read or write to them individually using the hbase shell?
> > > > > > > >
> > > > > > > > > On Fri, Jun 5, 2015 at 10:18 AM, PRANEESH KUMAR <praneesh.sankar@gmail.com>
> > > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Ram,
> > > > > > > > >
> > > > > > > > > > The cluster ran without any problem for about 2 to 3 days with low
> > > > > > > > > > load; once we enabled it for high load, we immediately faced this issue.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Praneesh.
> > > > > > > > >
> > > > > > > > > > On Thursday 4 June 2015, ramkrishna vasudevan <
> > > > > > > > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > > Is your cluster in working condition? Can you check whether META has
> > > > > > > > > > > been assigned properly? If the META table is not initialized and
> > > > > > > > > > > opened, your client thread will hang.
> > > > > > > > > >
> > > > > > > > > > Regards
> > > > > > > > > > Ram
> > > > > > > > > >
> > > > > > > > > > > On Thu, Jun 4, 2015 at 9:05 PM, PRANEESH KUMAR <praneesh.sankar@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > > We are using HBase 1.0.0. We are also facing the same issue: the client
> > > > > > > > > > > > connection thread is waiting at
> > > > > > > > > > > > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1200).
> > > > > > > > > > >
> > > > > > > > > > > Any help is appreciated.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Praneesh
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
