hbase-user mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: Query on OutOfOrderScannerNextException
Date Sun, 07 Jun 2015 14:07:45 GMT
The reason we throw this exception is as below

Yes, looking for a few rows from a big region. It takes time to fill the number
of rows requested by the client side, and by that time the client gets an RPC
timeout. So the client side will retry the call on the same scanner. Remember,
with this next call the client says "give me the next N rows from where you
are." The old failed call was in progress and would have advanced some rows, so
this retry call would miss those rows. To avoid this, and to distinguish this
case, we have the scan sequence number and this exception. On seeing it, the
client will close the scanner and create a new one with the proper start row.
But this retry happens only one more time, and that call might also time out.
So you have to adjust the timeout and/or the scan caching value. Yes, the
heartbeat mechanism avoids such timeouts for long-running scans. Hope this
explanation helps.
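To make the sequence-number check concrete, here is a minimal toy model (not
actual HBase code; class and method names like ToyScanner are illustrative
only) of the handshake described above: the server advances nextCallSeq on
every next() call, so a stale retry after a client timeout is rejected rather
than silently skipping the rows the first call already advanced past.

```java
// Toy model of the nextCallSeq handshake; not HBase internals.
public class ToyScanner {
    private long nextCallSeq = 0;

    // Server-side check: the client must echo the sequence number the server
    // expects. A retry of an already-advanced call (stale seq) is rejected
    // instead of silently returning rows from the wrong position.
    public String[] next(long clientCallSeq, String[] rows) {
        if (clientCallSeq != nextCallSeq) {
            throw new IllegalStateException(
                "Expected nextCallSeq: " + nextCallSeq
                + " But the nextCallSeq got from client: " + clientCallSeq);
        }
        nextCallSeq++; // advances even if the reply is later lost to a timeout
        return rows;
    }

    public static void main(String[] args) {
        ToyScanner s = new ToyScanner();
        s.next(0, new String[] {"row1", "row2"}); // succeeds server-side...
        // ...but suppose the client timed out waiting and retries with seq 0:
        try {
            s.next(0, new String[] {"row3"});
            throw new AssertionError("stale retry should have been rejected");
        } catch (IllegalStateException e) {
            // mirrors the OutOfOrderScannerNextException message in this thread
            System.out.println(e.getMessage());
        }
    }
}
```

On seeing this rejection, the real client closes the scanner and reopens a new
one at the proper start row, as described above.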


On Sunday, June 7, 2015, Arun Mishra <arunmishra@me.com> wrote:
> Thanks Vladimir. I am using option 2 as a short-term fix for now. I will
> definitely look into key design.
> Regards,
> Arun.
>> On Jun 6, 2015, at 3:18 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>> The scanner fails at the very beginning. The reason is that it needs
>> very few rows from a large file, and HBase needs to fill the RPC buffer
>> (which is 100 rows, yes?) before it can return the first batch. This
>> takes more than 60 sec and the scanner fails (do not ask me why it is
>> not the timeout exception)
>> 1. HBASE-13090 will help (can be back ported I presume to 1.0 and 0.98.x)
>> 2. Smaller region size will help
>> 3. Smaller hbase.client.scanner.caching will help
>> 4. Larger hbase.client.scanner.timeout.period will help
>> 5. Better data store design (rowkeys) is preferred.
>> Too many options to choose from.
>> -Vlad
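Options 3 and 4 above map to the two client-side settings already named in this
thread. A sketch of what that could look like in the client's hbase-site.xml
(the values here are illustrative starting points, not recommendations):

```xml
<!-- client-side hbase-site.xml; values are illustrative -->
<property>
  <name>hbase.client.scanner.caching</name>
  <!-- fewer rows per next() RPC, so each call returns sooner -->
  <value>10</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <!-- milliseconds; raised from the 60s that is timing out here -->
  <value>300000</value>
</property>
```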
>>> On Sat, Jun 6, 2015 at 3:04 PM, Arun Mishra <arunmishra@me.com> wrote:
>>> Thanks TED.
>>> Regards,
>>> Arun.
>>>> On Jun 6, 2015, at 2:34 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>> HBASE-13090 'Progress heartbeats for long running scanners' solves the
>>>> problem you faced.
>>>> It is in the 1.1.0 release.
>>>> FYI
>>>>> On Sat, Jun 6, 2015 at 12:54 PM, Arun Mishra <arunmishra@me.com> wrote:
>>>>> Hello,
>>>>> I have a query on OutOfOrderScannerNextException. I am using hbase
>>>>> 0.98.6 with 45 nodes.
>>>>> I have a mapreduce job which scans 1 table for the last 1 day worth of
>>>>> data using a timerange. It had been running fine for months without any
>>>>> failure, but for the last couple of days it has been failing with the
>>>>> below exception. I have traced the failure to a single region. This
>>>>> region has 1 store and 1 hfile of 5+GB. What we realized was that we
>>>>> were writing some bulk data which used to land on this region. After we
>>>>> stopped writing that data, this region has been receiving very few
>>>>> writes per day.
>>>>> When the mapreduce job runs, it creates a map task for this region and
>>>>> the task fails with OutOfOrderScannerNextException. I was able to
>>>>> reproduce this error by running a scan command with the same start/stop
>>>>> row and timerange option. Finally, we split this region to be small
>>>>> enough for the scan command to work.
>>>>> My query is whether there is any option, apart from increasing the
>>>>> timeout, which can solve this use case? I am thinking of a use case
>>>>> where data comes in for 3 days a week in bulk and then nothing for the
>>>>> next 3 days, kind of creating a data hole in the region.
>>>>> My understanding is that I am hit with this error because I have big
>>>>> store files and the timerange scan is reading the entire file even
>>>>> though it needs very few rowkeys for that timerange.
>>>>> hbase.client.scanner.caching = 100
>>>>> hbase.client.scanner.timeout.period = 60s
>>>>> scan 'dummytable',{ STARTROW=>'dummyrowkey-start',
>>>>> STOPROW=>'dummyrowkey-end', LIMIT=>1000,
>>>>> TIMERANGE=>[1433462400000,1433548800000]}
>>>>> ROW                                           COLUMN+CELL
>>>>> ERROR:
>>> org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
>>>>> Expected nextCallSeq: 1 But the nextCallSeq got from client: 0;
>>>>> request=scanner_id: 33648 number_of_rows: 100 close_scanner: false
>>>>> next_call_seq: 0
>>>>> at
>>>>> at
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>> at
>>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>> Regards,
>>>>> Arun
