hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: random timestamp insert
Date Tue, 16 Jun 2009 20:58:46 GMT
The opposite of optimistic locking is 'pessimistic locking', which means
'explicit locks'.  When you expect the number of concurrent writes to the
same row to be low, optimistic locking performs vastly better.

Generally you can use optimistic locking in nearly all cases.  Even huge
ordering systems use it.  I'm sure you can make it fit your application
needs.
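
A minimal sketch of the optimistic pattern described above, for illustration
only.  It is an in-memory stand-in written in Python, not the HBase client
API: the names VersionedRow, check_and_save, optimistic_update and run_demo
are hypothetical.  The idea is to read the value, compute the new one, and
write only if the row still holds what was read, retrying on conflict.

```python
import threading


class VersionedRow:
    """In-memory stand-in for a single row (hypothetical, not HBase)."""

    def __init__(self, value):
        self._value = value
        # Models the short-lived row lock a single mutation takes and releases.
        self._guard = threading.Lock()

    def read(self):
        with self._guard:
            return self._value

    def check_and_save(self, expected, new_value):
        """Write new_value only if the row still holds `expected`."""
        with self._guard:
            if self._value != expected:
                return False  # another writer got there first
            self._value = new_value
            return True


def optimistic_update(row, transform):
    """Read, compute, conditionally write; retry until the write lands.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        current = row.read()
        if row.check_and_save(current, transform(current)):
            return attempts


def run_demo(workers=4, increments=100):
    """Several threads bump one counter without holding a lock across the read."""
    row = VersionedRow(0)

    def work():
        for _ in range(increments):
            optimistic_update(row, lambda v: v + 1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return row.read()
```

No lock is held between the read and the write, so a waiting client never
parks on someone else's lock.  A conflict costs one retry, which is cheap as
long as concurrent writes to the same row are rare.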

Also remember that HBase is not really a transactional DB: you get row locks
and atomic updates on rows, but that is about it.

-ryan

On Tue, Jun 16, 2009 at 1:51 PM, Alexandre Jaquet <alexjaquet@gmail.com> wrote:

> checkAndSave looks nice, but
>
> optimistic concurrency control is based on the assumption that most
> database transactions <http://en.wikipedia.org/wiki/Database_transaction>
> don't conflict with other transactions
>
> That holds in most cases, but what happens if we are in a non-optimistic
> mode?
>
>
> 2009/6/16 Ryan Rawson <ryanobjc@gmail.com>
>
> > The IPC threading can become an issue on a really busy server.  There are
> > by default 10 IPC listener threads; once you have 10 concurrent operations
> > you must wait for one to end before the next one can start.  You can raise
> > this limit if it ends up becoming a problem.  It has to be bounded or else
> > resource consumption will eventually crash the server.
> >
> > The only area where this becomes a problem is explicit row locking: if you
> > take out a lock in one client and a different client then comes to get the
> > same lock, the second client has to wait, and while waiting it consumes an
> > IPC thread.
> >
> > But you shouldn't need to use explicit row locking.
> > - Mutations (puts, deletes) take out a row lock then release it.
> > - There is a checkAndSave() which allows you to get some kinds of
> >   optimistic concurrency
> > - you can use the multi-version mechanism to test for optimistic lock
> >   failure
> > - atomicIncrement allows you to maintain sequences/counters without the
> >   use of locks.
> >
> > I would recommend against designing a schema/application that uses row
> > locks.  Use one of the other excellent mechanisms provided.  If your needs
> > are really above and beyond those, let's talk in detail.  A column-oriented
> > store has all sorts of powerful things available to it that RDBMSs don't
> > have.
> >
> > On Tue, Jun 16, 2009 at 1:22 PM, Alexandre Jaquet <alexjaquet@gmail.com> wrote:
> >
> > > Thanks Ryan for your explanation,
> > >
> > > But as I understand it, IPC calls can cause a deadlock through
> > > overconsumption of services? What is the exact role of a region server?
> > >
> > > Thanks again.
> > >
> > > 2009/6/16 Ryan Rawson <ryanobjc@gmail.com>
> > >
> > > > Hey,
> > > >
> > > > So the issue there was that when you are using the built-in row-lock
> > > > support, the waiters for a row lock use up an IPC responder thread.
> > > > There are only so many of them.  Then your clients start failing as
> > > > regionservers are busy waiting for locks to be released.
> > > >
> > > > The suggestion there was to use zookeeper-based locks.  The suggestion
> > > > is still valid.
> > > >
> > > > I don't get your question about whether a timestamp is better than
> > > > "Long versioning".  A timestamp is a long; its default value is
> > > > System.currentTimeMillis(), thus it's the milliseconds since the 1970
> > > > epoch, a slight variation on time_t.
> > > >
> > > > Generally I would recommend people avoid setting timestamps unless
> > > > they have special needs.  Timestamps order multiple versions for a
> > > > given row/column, thus if you 'mess it up', you get wrong data returned.
> > > >
> > > > I personally believe that timestamps are not necessarily the best way
> > > > to store time-series data.  While in 0.20 we have better query
> > > > mechanisms (all values between X and Y is the general mechanism), you
> > > > can probably do better with indexes.
> > > >
> > > > -ryan
> > > >
> > > > On Tue, Jun 16, 2009 at 1:04 PM, Alexandre Jaquet <alexjaquet@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'm also evaluating hbase for some applications and found an old post
> > > > > about transactions and concurrent access
> > > > >
> > > > > http://osdir.com/ml/java.hadoop.hbase.user/2008-05/msg00169.html
> > > > >
> > > > > Is a timestamp really better than Long versioning?
> > > > >
> > > > > Any workaround?
> > > > >
> > > > > 2009/6/16 Xinan Wu <wuxinan@gmail.com>
> > > > >
> > > > > > I am aware that inserting data into hbase in random timestamp order
> > > > > > gives indeterminate results.
> > > > > >
> > > > > > e.g. comments here
> > > > > > https://issues.apache.org/jira/browse/HBASE-1249#action_12682369
> > > > > >
> > > > > > I've personally experienced indeterminate results before when I
> > > > > > insert in random timestamp order (i.e., multiple versions with the
> > > > > > same timestamp in the same cell, out-of-order timestamps when
> > > > > > getting multiple versions).
> > > > > >
> > > > > > In other words, we don't want to go back in time when inserting
> > > > > > cells.  Deletion is ok. But is updating pretty much the same story
> > > > > > as inserting?
> > > > > >
> > > > > > i.e., if I make sure the timestamp does exist in the cell, and then
> > > > > > I _update_ it with that timestamp (and same value length), sometimes
> > > > > > hbase still just inserts a new version without touching the old
> > > > > > one, and of course the timestamps of this cell become out of order.
> > > > > > Even if I delete all versions in that cell and reinsert in time
> > > > > > order, the result is still out of order. I assume if I do a major
> > > > > > compact between delete all and reinsert, it would be ok, but that's
> > > > > > not a good solution. Is there any good way to update a version of a
> > > > > > cell in the past? Or does that simply not work?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > >
> > > >
> > >
> >
>
