hbase-user mailing list archives

From Manjeet Singh <manjeet.chand...@gmail.com>
Subject Re: How can I achieve HBase row level atomicity?
Date Mon, 07 Nov 2016 17:39:13 GMT
Yes Anoop, you are right.
My input source is a Kafka pipeline. We have 7 Spark ETL jobs that are
responsible for aggregation: each one gets the last value from HBase and
puts the updated one.
We set zero versions, so we always have only one copy in HBase.
The problem is that if two different ETL jobs work on the same rowkey, one
ETL process can get the value while the second gets that same value and
updates it; the first ETL job will then replace that updated value.

It can happen within the same ETL job too.
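
A minimal sketch of the read-modify-write race described above, and of the
check-and-put retry pattern that Ted suggests in the quoted thread below.
FakeTable is a hypothetical in-memory stand-in, not the HBase client API;
against a real table the guarded write would be the
checkAndPut(row, family, qualifier, value, put) call quoted below. Storing
the counter as a decimal string is an assumption made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CheckAndPutSketch {
    static byte[] encode(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    static long decode(byte[] b) {
        // A missing cell is treated as zero.
        return b == null ? 0L : Long.parseLong(new String(b, StandardCharsets.UTF_8));
    }

    /** Get, aggregate, then write only if nobody changed the cell in between; retry otherwise. */
    static void addAtomically(FakeTable table, String row, long delta) {
        while (true) {
            byte[] seen = table.get(row);
            byte[] updated = encode(decode(seen) + delta);
            if (table.checkAndPut(row, seen, updated)) {
                return; // our write was based on the latest value
            }
            // Another writer updated the row first: re-read and try again.
        }
    }

    /** Simulates several ETL jobs updating the same rowkey at once; returns the final value. */
    static long runConcurrentJobs(int jobs, int updatesPerJob) {
        FakeTable table = new FakeTable();
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        for (int j = 0; j < jobs; j++) {
            pool.submit(() -> {
                for (int i = 0; i < updatesPerJob; i++) {
                    addAtomically(table, "row-1", 1);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return decode(table.get("row-1"));
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentJobs(3, 1000));
    }
}

/** Hypothetical stand-in holding one value per rowkey; NOT the HBase API. */
class FakeTable {
    private final ConcurrentHashMap<String, byte[]> cells = new ConcurrentHashMap<>();

    synchronized byte[] get(String row) {
        return cells.get(row);
    }

    /** Writes newValue only if the stored value still equals expected (null means absent). */
    synchronized boolean checkAndPut(String row, byte[] expected, byte[] newValue) {
        if (!Arrays.equals(cells.get(row), expected)) {
            return false;
        }
        cells.put(row, newValue);
        return true;
    }
}
```

With a plain get followed by an unconditional put, two jobs that read the
same value would each write their own result and one update would be lost.
The conditional write makes the loser of the race notice and retry instead.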

Thanks
Manjeet

On Mon, Nov 7, 2016 at 7:07 PM, Anoop John <anoop.hbase@gmail.com> wrote:

> It seems you want to get an already present row, do some op, and put the
> updated value.  What is that op?  If you can explain it, we can try to
> help you with ways (if available).  As such, the above kind of API does
> not guarantee you any atomicity.
>
> -Anoop-
>
> On Fri, Nov 4, 2016 at 4:12 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > bq. from some api its going for update (means get is performed)
> >
> > Update on hbase would correlate with Put or Delete (not sure what 'get'
> > above means).
> >
> > Looks like your concern is that two concurrent updates may overwrite the
> > data for same rowkey.
> >
> > Have you considered using:
> >
> >   public boolean checkAndPut(final byte [] row,
> >       final byte [] family, final byte [] qualifier, final byte [] value,
> >       final Put put)
> >
> > Cheers
> >
> > On Thu, Nov 3, 2016 at 10:57 AM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
> >
> >> Hi Ted,
> >>
> >> The code is not required for this case, and how can MVCC help?
> >>
> >> The question is: if I have a record that comes into HBase and from some
> >> API it is going for update (meaning a get is performed), and another
> >> operation also performs a get, both update the same rowkey and at the
> >> end we will not have the correct value.
> >>
> >> In a separate mail thread I asked the same type of question regarding
> >> rowkey locks, but I still didn't get a correct answer.
> >>
> >> Thanks
> >> Manjeet
> >>
> >> On Wed, Nov 2, 2016 at 11:36 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >>
> >> > Were you including code in the image (which didn't come through) ?
> >> >
> >> > MultiVersionConcurrencyControl is involved in answering your question.
> >> >
> >> > See http://hbase.apache.org/book.html#hregion.scans
> >> >
> >> > Cheers
> >> >
> >> > On Wed, Nov 2, 2016 at 10:57 AM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
> >> >
> >> > > Hi All
> >> > >
> >> > > I have an ETL process for inserting data into HBase. For this I have
> >> > > Spark jobs that are responsible for reading data from Kafka topics.
> >> > > My question is: if some rowkey already exists in HBase and I have 3
> >> > > Spark jobs running that all try to update the same rowkey, how does
> >> > > HBase deal with atomicity?
> >> > >
> >> > > For more clarity: I have 3 rowkeys coming from 3 separate Spark jobs,
> >> > > all trying to update the same rowkey, which already exists in the
> >> > > HBase table.
> >> > >
> >> > > [image: Inline image 1]
> >> > >
> >> > > --
> >> > > luv all
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> luv all
> >>
>



-- 
luv all
