hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Delete.deleteColumn not working with HFileOutputFormat?
Date Wed, 22 Oct 2014 12:47:57 GMT
Once you come up with a unit test, file a JIRA. 


On Oct 22, 2014, at 1:59 AM, Jan Lukavský <jan.lukavsky@firma.seznam.cz> wrote:

> Hi Ted,
> sure, there was a typo in the subject. The problem is with
> Delete#deleteColumn; I have fixed that in the subject. Since we are not
> planning to upgrade our CDH4 distribution (we are planning to upgrade to
> CDH5 as a next step), I'm afraid I cannot simply test this on the version
> you mentioned. I can try to create a unit test for this. Should I file a
> JIRA?
> Thanks,
> Jan
> On 10/21/2014 06:05 PM, Ted Yu wrote:
>> bq. When using Delete#deleteColumns everything seems to be working fine
>> Please confirm that the issue you observe was with Delete#deleteColumn
>> (different from the method mentioned in subject).
>> Can you try with 0.94.24 (the latest 0.94 release)?
>> If you can capture this using a unit test, that would be great.
>> Thanks
>> On Tue, Oct 21, 2014 at 8:23 AM, Jan Lukavský <jan.lukavsky@firma.seznam.cz>
>> wrote:
>>> Hi all,
>>> we are using HBase version 0.94.6-cdh4.3.1 and I have a suspicion that a
>>> Delete written to hbase through HFileOutputFormat might be ignored (and not
>>> delete any data) in the following scenario:
>>>  * a Delete object is used to delete the data at the client side
>>>  * a call to "deleteColumn" instead of "deleteColumns" is used, which means
>>> that the underlying KeyValue will not have an associated timestamp (it will
>>> have HConstants.LATEST_TIMESTAMP)
>>>  * the Delete object is then converted to KeyValues and these are written
>>> into the output format's record writer
>>> I think (our systems seem to behave this way) the problem is in the way
>>> the KeyValue is processed in the RegionServer, even though I was not able
>>> to track the problem in the source code. Can anyone else confirm this? When
>>> using Delete#deleteColumns everything seems to be working fine (the
>>> KeyValues have a different type). Is this expected or should it be considered
>>> a bug? And if so, where should it be fixed? I think it could be on the side
>>> of the record writer (maybe by throwing an exception), or in the region
>>> server (if possible, this might be non-trivial, because of the
>>> Delete#deleteColumn semantics).
>>> Any opinions?
>>> Thanks,
>>>  Jan
> -- 
> Jan Lukavský
> Development Team Lead
> Seznam.cz, a.s.
> Radlická 3494/10
> 15000, Praha 5
> jan.lukavsky@firma.seznam.cz
> http://www.seznam.cz
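
For readers following the thread: the behaviour Jan describes is consistent
with the documented difference between the two marker types. A marker of
type Delete (written by Delete#deleteColumn) masks only the cell version
whose timestamp equals the marker's timestamp, while a marker of type
DeleteColumn (written by Delete#deleteColumns) masks every version at or
below it. The sketch below is a simplified, HBase-free model of that rule,
not the RegionServer code. The class and method names (DeleteMarkerSketch,
masks, visibleVersions) are hypothetical; only the value of
LATEST_TIMESTAMP (Long.MAX_VALUE) mirrors HConstants. It is an assumption,
not something confirmed in this thread, that the timestamp of a marker
written through HFileOutputFormat stays at LATEST_TIMESTAMP because the
normal write path, which would resolve it to the newest existing version,
is bypassed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of how delete markers mask cell versions (not HBase code). */
public class DeleteMarkerSketch {
    // Same value as HConstants.LATEST_TIMESTAMP in HBase.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // DELETE models the marker from deleteColumn, DELETE_COLUMN the one from deleteColumns.
    enum MarkerType { DELETE, DELETE_COLUMN }

    /** Returns true if a marker of the given type and timestamp hides a cell at cellTs. */
    static boolean masks(MarkerType type, long markerTs, long cellTs) {
        switch (type) {
            case DELETE:
                return cellTs == markerTs;   // exactly one version
            case DELETE_COLUMN:
                return cellTs <= markerTs;   // all versions up to the marker
            default:
                throw new IllegalArgumentException("unknown marker type");
        }
    }

    /** Returns the cell timestamps that remain visible after applying one marker. */
    static List<Long> visibleVersions(List<Long> cellTimestamps, MarkerType type, long markerTs) {
        List<Long> visible = new ArrayList<>();
        for (long ts : cellTimestamps) {
            if (!masks(type, markerTs, ts)) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Long> cells = Arrays.asList(100L, 200L, 300L);
        // Unresolved DELETE marker: no cell has timestamp Long.MAX_VALUE, nothing is hidden.
        System.out.println(visibleVersions(cells, MarkerType.DELETE, LATEST_TIMESTAMP));        // [100, 200, 300]
        // DELETE_COLUMN marker at the same timestamp hides every version.
        System.out.println(visibleVersions(cells, MarkerType.DELETE_COLUMN, LATEST_TIMESTAMP)); // []
        // DELETE marker resolved to the newest version hides only that version.
        System.out.println(visibleVersions(cells, MarkerType.DELETE, 300L));                    // [100, 200]
    }
}
```

If that assumption holds, a possible workaround when writing deletes through
HFileOutputFormat would be to pass an explicit timestamp to deleteColumn, or
to use deleteColumns, so that the marker never carries LATEST_TIMESTAMP.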
