kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: [KUDU Tablet]unrecoverable crash
Date Fri, 19 Feb 2016 20:49:43 GMT
Hi Nick,

Are you able to determine the ID of the tablet that is failing to
restart? The fatal log line shows that it was written by thread ID
6285. If you look further up the log with
'grep " 6285 " kudu-tserver.INFO', you should see a log message
indicating that this thread is starting to bootstrap a particular
tablet.
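
The thread-based search above can also be sketched in Python, in case the plain grep picks up unrelated lines that happen to contain " 6285 ". This is only a rough sketch: the regular expression is inferred from the single fatal line quoted below, the function name lines_for_thread is made up for illustration, and it makes no assumption about the exact wording of the bootstrap message, so you still have to read the output to spot the tablet ID.

```python
import re

# Log prefix as it appears in the fatal line quoted in this thread:
#   <level><MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>] <message>
# Assumption: every line in kudu-tserver.INFO follows this same layout.
GLOG_PREFIX = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\s\]]+:\d+)\] (?P<message>.*)$"
)


def lines_for_thread(lines, thread_id):
    """Return (level, source, message) for each log line written by thread_id.

    Lines that do not start with the expected prefix (for example the
    continuation lines of a multi-line message) are skipped.
    """
    matches = []
    for line in lines:
        m = GLOG_PREFIX.match(line.rstrip("\n"))
        if m and int(m.group("thread")) == thread_id:
            matches.append((m.group("level"), m.group("source"), m.group("message")))
    return matches
```

To use it on the real log, pass an open file, e.g. lines_for_thread(open("kudu-tserver.INFO", errors="replace"), 6285), and scan the returned messages from the top for the one that names a tablet.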

Is this a replicated table, or num_replicas=1? If it's replicated, we
can probably recover by removing the corrupt replica and letting it
grab a new copy from one of the other replicas. Otherwise, we'll have
to do some more serious "surgery" which we can assist you with.

Either way, see if you can figure out the bad tablet ID. Then, if it's
possible to send a copy of the WAL directory for this tablet to me off
list, I can try to do some post-mortem analysis to see what went
wrong.

Thanks
-Todd

On Fri, Feb 19, 2016 at 12:37 PM, Nick Wolf <nickwolf7@gmail.com> wrote:
> KUDU Tablet crashed with the following fatal error.
>
> F0219 12:15:11.389806  6285 mvcc.cc:542] Check failed: _s.ok() Bad status:
> Illegal state: Timestamp: 5963266013874102274 is already committed. Current
> Snapshot: MvccSnapshot[committed={T|T < 5963266013874118554 or (T in
> {5963266013874118554})}]
>
> It throws the same fatal error and crashes immediately no matter how many
> times I try to restart the service.
>
> Any ideas on how to get out of this situation? I don't want to lose the data.
>
>
> --Nick
>
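
For readers of the archive, the fatal message quoted above can be checked by hand. Reading the snapshot notation literally, a timestamp counts as committed if it is below the bound 5963266013874118554 or is in the explicit set. The sketch below encodes only that literal reading; the function and parameter names are illustrative and are not taken from Kudu's mvcc.cc.

```python
def is_committed(timestamp, all_committed_before, committed_set):
    """Literal reading of MvccSnapshot[committed={T|T < X or (T in {...})}]."""
    return timestamp < all_committed_before or timestamp in committed_set


# Values copied from the fatal log line.
failing_timestamp = 5963266013874102274
bound = 5963266013874118554

# The failing timestamp is below the bound, so the snapshot already treats
# it as committed, which matches the "is already committed" complaint.
already_committed = is_committed(failing_timestamp, bound, {bound})
```

Under this reading the failing timestamp sits below the bound, so the snapshot already considers it committed by the time the check runs.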



-- 
Todd Lipcon
Software Engineer, Cloudera
