hbase-user mailing list archives

From Srikanth Srungarapu <srikanth...@gmail.com>
Subject Re: hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted
Date Thu, 21 Aug 2014 20:13:46 GMT
Hi,
Did you try taking a look at
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/RowMutations.html?

Thanks,
Srikanth.
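
A note on the interleaving described in the quoted message below: the hazard comes from sending the Put and the Delete as two separate calls, and it goes away once both are applied to the row as one unit, which is what the RowMutations class linked above is meant for. The following is a minimal sketch of that idea in plain Java. It is a toy in-memory model, not HBase code: ToyRow, ToyRowDemo, mutateRow, and snapshot are hypothetical names invented for illustration, and the lock only stands in for whatever row-level atomicity the server provides.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy in-memory model of one row. This is NOT HBase code; every name is hypothetical.
class ToyRow {
    private final Map<String, String> cells = new TreeMap<>();
    private final ReentrantLock rowLock = new ReentrantLock();

    ToyRow(Map<String, String> initial) {
        cells.putAll(initial);
    }

    // Applies the puts, then the deletes, while holding the row lock, so no
    // other mutateRow call on this row can run in between.
    void mutateRow(Map<String, String> puts, List<String> deletes) {
        rowLock.lock();
        try {
            cells.putAll(puts);
            for (String column : deletes) {
                cells.remove(column);
            }
        } finally {
            rowLock.unlock();
        }
    }

    // Returns a copy of the current cells, taken under the same lock.
    Map<String, String> snapshot() {
        rowLock.lock();
        try {
            return new TreeMap<>(cells);
        } finally {
            rowLock.unlock();
        }
    }
}

class ToyRowDemo {
    // The previous record stored for the row.
    static final Map<String, String> OLD = Map.of("A", "0", "X", "0", "Y", "0");

    // Writer 1 wants {A=1, Y=1} with X removed; writer 2 wants {A=2, X=2} with Y removed.
    // Each writer sends its put and its delete as separate calls, and the calls interleave.
    static Map<String, String> separateCalls() {
        ToyRow row = new ToyRow(OLD);
        row.mutateRow(Map.of("A", "2", "X", "2"), List.of()); // writer 2: put
        row.mutateRow(Map.of("A", "1", "Y", "1"), List.of()); // writer 1: put
        row.mutateRow(Map.of(), List.of("Y"));                // writer 2: delete
        // writer 1 dies here, so its delete of X is never sent
        return row.snapshot();
    }

    // Each writer sends its put and delete together as one unit.
    static Map<String, String> groupedCalls() {
        ToyRow row = new ToyRow(OLD);
        row.mutateRow(Map.of("A", "2", "X", "2"), List.of("Y")); // writer 2
        row.mutateRow(Map.of("A", "1", "Y", "1"), List.of("X")); // writer 1
        return row.snapshot();
    }

    public static void main(String[] args) {
        System.out.println("separate: " + separateCalls());
        System.out.println("grouped:  " + groupedCalls());
    }
}
```

With separate calls the toy row ends up as {A=1, X=2}, a mix of both writers that matches neither new record nor the old one. With grouped calls it ends up as {A=1, Y=1}, which is exactly the record of whichever writer was applied last.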


On Thu, Aug 21, 2014 at 1:03 PM, Armaselu, Cristian <carmaselu@epsilon.com>
wrote:

> Consider 2 JVMs executing the following on the same row key:
> JVM1
>
>     (1) Put to update new "columns",
>     (2) Delete to remove "column" X value (since incoming data is null for
> column X)
>
> JVM2
>
>     (1) Put to update new "columns",
>     (2) Delete to remove "column" Y value (since incoming data is null for
> column Y)
>
> What happens if the hbase API requests are applied in this order (since 2
> JVM, 2 threads of execution)
> JVM1 (1) Put
> JVM2 (2) Delete
> JVM1 dies
> JVM2 (2) Delete
> Or you can think of any other combination with 4 operations instead of 2
>
> The data is corrupted: neither of the 2 new records was applied, and we no
> longer have the previous record stored in HBase.
>
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: Wednesday, August 20, 2014 1:58 PM
> To: user@hbase.apache.org
> Subject: Re: hbase is not deleting the cell when a Put with a KeyValue,
> KeyValue.Type.Delete is submitted
>
> bq. Batch does not guarantee the order of the mutations sent over
>
> Did you get the above from the javadoc of the method? The javadoc gives an
> example of ordering between Get and Put.
>
> In your case, the Put and Delete are for the same row. Therefore they would
> be executed atomically.
>
>
> On Wed, Aug 20, 2014 at 7:43 AM, Armaselu, Cristian <carmaselu@epsilon.com>
> wrote:
>
> > Batch does not guarantee the order of the mutations sent over
> > (Put/Delete,etc).
> > We need an atomic change of a row.
> >
> > Cristian Armaselu
> > Solution Architect
> > Shared Technology Services
> >
> > 6021 Connection Drive
> > Irving, TX 75039
> > carmaselu@epsilon.com
> >
> > The information contained in this communication is confidential, and
> > is intended only for the sole use of the recipient named above. If the
> > reader of this message is not the intended recipient, you are hereby
> > notified that any dissemination, distribution, or copying of this
> > communication is strictly prohibited. If you have received this
> > communication in error, please re-send this communication to the
> > sender and delete the original message or any copy of it from your
> > computer system. Thank you.
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > Sent: Wednesday, August 20, 2014 7:51 AM
> > To: user@hbase.apache.org
> > Cc: user@hbase.apache.org
> > Subject: Re: hbase is not deleting the cell when a Put with a
> > KeyValue, KeyValue.Type.Delete is submitted
> >
> > Can you use this API ?
> >
> > https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#batch(java.util.List,%20java.lang.Object[])
> >
> > On Aug 20, 2014, at 5:41 AM, "Armaselu, Cristian"
> > <carmaselu@epsilon.com>
> > wrote:
> >
> > > That is not atomic.
> > > A single Put is atomic, while a Put followed by a Delete is not.
> > >
> > >
> > > Cristian Armaselu
> > > Solution Architect
> > > Shared Technology Services
> > >
> > > 6021 Connection Drive
> > > Irving, TX 75039
> > > carmaselu@epsilon.com
> > >
> > >
> > > -----Original Message-----
> > > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > > Sent: Tuesday, August 19, 2014 11:29 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: hbase is not deleting the cell when a Put with a
> > > KeyValue, KeyValue.Type.Delete is submitted
> > >
> > > Here is a slightly modified test case where I used a Delete for column C;
> > > the test passed on the master branch:
> > > http://pastebin.com/LPQ6XfUD
> > >
> > > Just wondering whether this formulation can unblock you.
> > >
> > > Cheers
> > >
> > >
> > > On Tue, Aug 19, 2014 at 1:11 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > >> Can you log a JIRA and attach your test there?
> > >>
> > >> BTW, the table should be created in the test using (sample) code such as
> > >> the following:
> > >>
> > >>     HTableDescriptor desc = new HTableDescriptor(TableName.valueOf(TABLENAME));
> > >>     desc.addFamily(hcd);
> > >>     TEST_UTIL.getHBaseAdmin().createTable(desc);
> > >>
> > >> I also slightly modified the assertions so that they compile:
> > >>
> > >>     assertEquals("Column A value should be a",
> > >>         Bytes.toString(result.getValue(familly, Bytes.toBytes("A"))).equals("a"), true);
> > >>
> > >>
> > >> Cheers
> > >>
> > >>
> > >> On Tue, Aug 19, 2014 at 12:45 PM, Armaselu, Cristian <
> > >> carmaselu@epsilon.com> wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>>
> > >>>
> > >>> To clarify the subject further, I made another, simpler test case:
> > >>>
> > >>>
> > >>>
> > >>> *Code executed:*
> > >>>
> > >>>    @Test
> > >>>    public void testHbasePutDeleteCell() throws Exception {
> > >>>        TableName tableName = TableName.valueOf("my_test");
> > >>>        Configuration configuration = HBaseConfiguration.create();
> > >>>        HTableInterface table = new HTable(configuration, tableName);
> > >>>        final String rowKey = "12345";
> > >>>        final byte[] familly = Bytes.toBytes("default");
> > >>>
> > >>>        // put one row
> > >>>        Put put = new Put(Bytes.toBytes(rowKey));
> > >>>        put.add(familly, Bytes.toBytes("A"), Bytes.toBytes("a"));
> > >>>        put.add(familly, Bytes.toBytes("B"), Bytes.toBytes("b"));
> > >>>        put.add(familly, Bytes.toBytes("C"), Bytes.toBytes("c"));
> > >>>        table.put(put);
> > >>>
> > >>>        // get row back and assert the values
> > >>>        Get get = new Get(Bytes.toBytes(rowKey));
> > >>>        Result result = table.get(get);
> > >>>        Assert.isTrue(Bytes.toString(result.getValue(familly, Bytes.toBytes("A"))).equals("a"),
> > >>>                "Column A value should be a");
> > >>>        Assert.isTrue(Bytes.toString(result.getValue(familly, Bytes.toBytes("B"))).equals("b"),
> > >>>                "Column B value should be b");
> > >>>        Assert.isTrue(Bytes.toString(result.getValue(familly, Bytes.toBytes("C"))).equals("c"),
> > >>>                "Column C value should be c");
> > >>>
> > >>>        // put the same row again with C column deleted
> > >>>        put = new Put(Bytes.toBytes(rowKey));
> > >>>        put.add(familly, Bytes.toBytes("A"), Bytes.toBytes("a"));
> > >>>        put.add(familly, Bytes.toBytes("B"), Bytes.toBytes("b"));
> > >>>        put.add(new KeyValue(Bytes.toBytes(rowKey), familly, Bytes.toBytes("C"),
> > >>>                HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn));
> > >>>        table.put(put);
> > >>>
> > >>>        // get row back and assert the values
> > >>>        get = new Get(Bytes.toBytes(rowKey));
> > >>>        result = table.get(get);
> > >>>        Assert.isTrue(Bytes.toString(result.getValue(familly, Bytes.toBytes("A"))).equals("a"),
> > >>>                "Column A value should be a");
> > >>>        Assert.isTrue(Bytes.toString(result.getValue(familly, Bytes.toBytes("B"))).equals("b"),
> > >>>                "Column B value should be b");
> > >>>        Assert.isTrue(result.getValue(familly, Bytes.toBytes("C")) == null,
> > >>>                "Column C should not exist");
> > >>>    }
> > >>>
> > >>>
> > >>>
> > >>> The last assertion fails: the cell is not deleted; instead it is left
> > >>> with an empty value:
> > >>>
> > >>> hbase(main):029:0> scan 'my_test'
> > >>> ROW       COLUMN+CELL
> > >>>  12345    column=default:A, timestamp=1408473082290, value=a
> > >>>  12345    column=default:B, timestamp=1408473082290, value=b
> > >>>  12345    column=default:C, timestamp=1408473082290, value=
> > >>>
> > >>>
> > >>>
> > >>> This behavior differs from the previous Cloudera 4.8.x version and is
> > >>> currently corrupting all Hive queries that use the IS NULL or
> > >>> IS NOT NULL operators on columns mapped to HBase.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> *Cristian Armaselu*
> > >>>
> > >>> Solution Architect
> > >>>
> > >>> Shared Technology Services
> > >>>
> > >>>
> > >>>
> > >>> 6021 Connection Drive
> > >>>
> > >>> Irving, TX 75039
> > >>>
> > >>> carmaselu@epsilon.com
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> *From:* Armaselu, Cristian
> > >>> *Sent:* Monday, August 18, 2014 4:20 PM
> > >>> *To:* 'user@hbase.apache.org'
> > >>> *Subject:* hbase is not deleting the cell when a Put with a
> > >>> KeyValue, KeyValue.Type.Delete is submitted
> > >>>
> > >>>
> > >>>
> > >>> Hello,
> > >>>
> > >>>
> > >>>
> > >>> We’re running HBase 0.96.1.1 under CDH 5.0.2 and we’re seeing
> > >>> different behavior than with CDH 4.8 (HBase 0.94.xx).
> > >>>
> > >>> Running the code below creates an empty cell, instead of no cell, for
> > >>> the column added with KeyValue.Type.Delete:
> > >>>
> > >>>        String tname = "my_test";
> > >>>        Configuration configuration = HBaseConfiguration.create();
> > >>>        HBaseAdmin baseAdmin = new HBaseAdmin(configuration);
> > >>>        baseAdmin.disableTable(tname);
> > >>>        baseAdmin.deleteTable(tname);
> > >>>
> > >>>        TableName tableName = TableName.valueOf(tname);
> > >>>        HTableDescriptor tableDescriptor = new HTableDescriptor(tableName);
> > >>>        HColumnDescriptor columnDescriptor = new HColumnDescriptor("default");
> > >>>        tableDescriptor.addFamily(columnDescriptor);
> > >>>        baseAdmin.createTable(tableDescriptor);
> > >>>
> > >>>        final String rowKey = "12345";
> > >>>        final Put p = new Put(Bytes.toBytes(rowKey));
> > >>>        for (int j = 0; j < 3; j++) {
> > >>>            for (int i = 0; i < 6; i++) {
> > >>>                final byte[] family = Bytes.toBytes("default");
> > >>>                final byte[] column = Bytes.toBytes("c" + i);
> > >>>                if (i == 5) {
> > >>>                    p.add(new KeyValue(p.getRow(), family, column,
> > >>>                            HConstants.LATEST_TIMESTAMP, KeyValue.Type.Delete));
> > >>>                } else {
> > >>>                    p.add(family, column, Bytes.toBytes("c" + i + "_value_" + (j+1)));
> > >>>                }
> > >>>            }
> > >>>        }
> > >>>
> > >>>        HTableInterface table = new HTable(configuration, tableName);
> > >>>        table.put(p);
> > >>>
> > >>>
> > >>>
> > >>> *hbase shell*
> > >>>
> > >>>
> > >>>
> > >>> hbase(main):003:0* scan 'my_test'
> > >>> ROW       COLUMN+CELL
> > >>>  12345    column=default:c0, timestamp=1408396641845, value=c0_value_3
> > >>>  12345    column=default:c1, timestamp=1408396641845, value=c1_value_3
> > >>>  12345    column=default:c2, timestamp=1408396641845, value=c2_value_3
> > >>>  12345    column=default:c3, timestamp=1408396641845, value=c3_value_3
> > >>>  12345    column=default:c4, timestamp=1408396641845, value=c4_value_3
> > >>>  12345    column=default:c5, timestamp=1408396641845, value=
> > >>> 1 row(s) in 0.3580 seconds
> > >>>
> > >>>
> > >>>
> > >>> Is there any way we can get the old behavior back?
> > >>>
> > >>> To be more exact, I expect the hbase shell scan to return the following:
> > >>>
> > >>> hbase(main):003:0* scan 'my_test'
> > >>> ROW       COLUMN+CELL
> > >>>  12345    column=default:c0, timestamp=1408396641845, value=c0_value_3
> > >>>  12345    column=default:c1, timestamp=1408396641845, value=c1_value_3
> > >>>  12345    column=default:c2, timestamp=1408396641845, value=c2_value_3
> > >>>  12345    column=default:c3, timestamp=1408396641845, value=c3_value_3
> > >>>  12345    column=default:c4, timestamp=1408396641845, value=c4_value_3
> > >>> 1 row(s) in 0.3580 seconds
> > >>>
> > >>>
> > >>>
> > >>> Thanks,
> > >>>
> > >>> *Cristian Armaselu*
> > >>>
> > >>> Solution Architect
> > >>>
> > >>> Shared Technology Services
> > >>>
> > >>>
> > >>>
> > >>> 6021 Connection Drive
> > >>>
> > >>> Irving, TX 75039
> > >>>
> > >>> carmaselu@epsilon.com
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> ------------------------------
> > >>>
> > >>> This e-mail and files transmitted with it are confidential, and
> > >>> are intended solely for the use of the individual or entity to
> > >>> whom this e-mail is addressed. If you are not the intended
> > >>> recipient, or the employee or agent responsible to deliver it to
> > >>> the intended recipient, you are hereby notified that any
> > >>> dissemination, distribution or copying of this communication is
> > >>> strictly prohibited. If you are not one of the named
> > >>> recipient(s) or otherwise have reason to believe that you received
> > >>> this message in error, please immediately notify sender by e-mail,
> > >>> and destroy the original message. Thank You.
> > >
> >
> >
>
>
