hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: question about writing to columns with lots of versions in map task
Date Mon, 03 Oct 2011 18:31:45 GMT
I would advise against setting the timestamps yourself; instead, use a
reduce step to prune the versions you don't need before inserting into
HBase.
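J-D's suggestion can be sketched as a reduce-style dedup step. The following is a minimal, illustrative simulation (plain Python, not the HBase or MapReduce API; `prune_versions` and `max_versions` are names invented for this sketch): group records by (row, column) and emit only the distinct values you actually want to keep, so uniqueness no longer depends on timestamps at all.

```python
# Illustrative sketch of pruning versions in a reduce step, as J-D
# suggests, instead of forcing unique timestamps. Plain Python, not
# the HBase/MapReduce API.
from collections import defaultdict

def prune_versions(records, max_versions=3):
    """records: iterable of (row, column, value) tuples.
    Returns one entry per (row, column) holding at most max_versions
    distinct values, mimicking what a reducer would emit."""
    grouped = defaultdict(list)
    for row, column, value in records:
        bucket = grouped[(row, column)]
        if value not in bucket:      # drop duplicate values outright
            bucket.append(value)
    return {key: vals[:max_versions] for key, vals in grouped.items()}

triples = [
    ("s1", "p:knows", "o1"),
    ("s1", "p:knows", "o2"),
    ("s1", "p:knows", "o1"),   # duplicate, pruned
]
print(prune_versions(triples))
# {('s1', 'p:knows'): ['o1', 'o2']}
```

With the duplicates gone before the Put, the default timestamps are harmless: nothing you care about shares a cell coordinate.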

J-D

On Sat, Oct 1, 2011 at 11:05 AM, Christopher Dorner
<christopher.dorner@gmail.com> wrote:
> Hi again,
>
> I think I solved my issue.
>
> I simply use the byte offset of the row currently being read by the Mapper
> as the timestamp for the Put. Since my input file contains one triple per
> row, each offset, and therefore each timestamp, is unique.
>
> Regards,
> Christopher
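Christopher's workaround can be illustrated with a small simulation (a plain dict standing in for a versioned table; `load_with_offsets` is a name invented for this sketch, not an HBase call): because each row of the input file starts at a distinct byte offset, using that offset as the version timestamp keeps every triple even when they are written within the same millisecond.

```python
# Sketch of the byte-offset-as-timestamp workaround: a versioned store
# keyed by (row, column, timestamp) keeps one cell per distinct
# timestamp. Plain-dict simulation, not the HBase client API.
def load_with_offsets(lines):
    """lines: raw rows of the input file, one triple per row.
    Uses each row's byte offset (as a TextInputFormat key would be)
    as the version timestamp."""
    store = {}   # (row_key, column, timestamp) -> value
    offset = 0
    for line in lines:
        subject, predicate, obj = line.split()
        store[(subject, predicate, offset)] = obj
        offset += len(line) + 1   # +1 for the newline
    return store

rows = ["s1 p:knows o1", "s1 p:knows o2"]
store = load_with_offsets(rows)
print(len(store))   # 2 -- both values survive as separate versions
```

Note the caveat in J-D's reply above: this makes the cell "timestamps" meaningless as times, which is why he recommends pruning in a reducer instead.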
>
>
> On 01.10.2011 13:19, Christopher Dorner wrote:
>>
>> Hello,
>>
>> I am reading a file containing RDF triples in a map job. The RDF triples
>> are then stored in a table whose columns can have many versions, so I
>> need to store many values for one row key in the same column.
>>
>> I observed that reading the file is very fast, so some values are put
>> into the table with the same timestamp, overwriting an existing value.
>>
>> How can I avoid that? The timestamps are not needed for later use.
>>
>> Could I simply use some sort of custom counter?
>>
>> How would that work in fully distributed mode? I am working in
>> pseudo-distributed mode for testing purposes right now.
>>
>> Thank you and regards,
>> Christopher
>
>
