hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: question about writing to columns with lots of versions in map task
Date Mon, 03 Oct 2011 18:31:45 GMT
I would advise against setting the timestamps yourself; instead, use a
reduce step to prune the versions you don't need before inserting.
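The reduce-side pruning suggested here can be sketched without a cluster. The following is a minimal, hypothetical standalone version of the idea: group the values that share a (row, column) coordinate and keep only the versions you actually want to write, before ever building a Put. A real job would do this inside a Hadoop `Reducer` and emit `org.apache.hadoop.hbase.client.Put` objects; the class and method names below are illustrative, not part of any HBase API.

```java
import java.util.*;

// Sketch of reduce-side pruning: instead of writing every triple as a
// separate version, collect the values for one (row, column) key and
// keep only the versions needed before issuing Puts.
public class VersionPruner {
    // Keep at most maxVersions distinct values per key, in first-seen order.
    static List<String> prune(List<String> values, int maxVersions) {
        // LinkedHashSet removes duplicates while preserving input order.
        List<String> distinct = new ArrayList<>(new LinkedHashSet<>(values));
        return distinct.size() <= maxVersions
                ? distinct
                : distinct.subList(0, maxVersions);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "a", "c", "d");
        System.out.println(prune(values, 3)); // prints [a, b, c]
    }
}
```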


On Sat, Oct 1, 2011 at 11:05 AM, Christopher Dorner
<christopher.dorner@gmail.com> wrote:
> Hi again,
> I think I solved my issue.
> I simply use the byte offset of the row currently being read by the Mapper
> as the timestamp for the Put. This is unique within my input file, which
> contains one triple per row, so the timestamps are unique.
> Regards,
> Christopher
> On 01.10.2011 at 13:19, Christopher Dorner wrote:
>> Hello,
>> I am reading a file containing RDF triples in a map job. The RDF triples
>> are then stored in a table whose columns can have lots of versions,
>> so I need to store many values for one rowKey in the same column.
>> I observed that reading the file is very fast, so some values are put
>> into the table with the same timestamp and therefore overwrite an
>> existing value.
>> How can I avoid that? The timestamps are not needed for later use.
>> Could I simply use some sort of custom counter?
>> How would that work in fully distributed mode? I am working in
>> pseudo-distributed mode for testing purposes right now.
>> Thank You and Regards,
>> Christopher
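Why the byte-offset trick works can be illustrated with a toy in-memory model: HBase stores versions keyed by (row, column, timestamp), so two Puts with identical coordinates and the same timestamp collide, and one overwrites the other. A unique per-line byte offset as the timestamp keeps every version distinct. This is a hypothetical sketch using a plain `TreeMap` to stand in for one cell's version store, not the actual HBase client API, and the concrete offsets are made up.

```java
import java.util.*;

// Toy model of one HBase cell's versions: timestamp -> value.
// Equal timestamps collide; unique timestamps preserve all versions.
public class TimestampDemo {
    public static void main(String[] args) {
        TreeMap<Long, String> cell = new TreeMap<>();

        // Same timestamp for both Puts: the second overwrites the first.
        cell.put(1000L, "triple-1");
        cell.put(1000L, "triple-2");
        System.out.println(cell.size()); // prints 1

        // Unique byte offsets as timestamps: both versions survive.
        cell.clear();
        cell.put(0L, "triple-1");  // line at byte offset 0
        cell.put(42L, "triple-2"); // line at byte offset 42
        System.out.println(cell.size()); // prints 2
    }
}
```

Note the caveat from the thread: byte offsets are only unique within a single input file, so this works for a one-file load but needs care with multiple inputs.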
