hbase-user mailing list archives

From ri...@laposte.net
Subject Re: Read access pattern
Date Mon, 29 Apr 2013 17:05:06 GMT

Thanks for the quick answer.

> For the next key, I think you can simply use your current key as your
> scanner first key. You will then find the one which is just after.
> Then you will have to verify the MD5 hash to make sure it's still for
> the same object.
Right, this is basically easy.
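
For the record, a minimal sketch of that "next key" lookup. It assumes a plain sorted TreeSet standing in for the table, so it runs without a cluster; NextKeySketch and nextKey are hypothetical names, not HBase API. The keys are ASCII, so String ordering matches the byte ordering of row keys. It mimics a scan whose start row is the current key (start rows are inclusive), skips that row, and checks the 32-character MD5 prefix of the one that follows.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

class NextKeySketch {
    static final int HASH_LEN = 32; // an MD5 digest is 32 hex characters

    // Mimics a forward scan starting at `current` (inclusive start row):
    // skip `current` itself, then accept the following row only if it
    // carries the same MD5 prefix. Returns null when there is no such row.
    static String nextKey(TreeSet<String> table, String current) {
        Iterator<String> scanner = table.tailSet(current, true).iterator();
        while (scanner.hasNext()) {
            String row = scanner.next();
            if (row.equals(current)) {
                continue; // the start row itself
            }
            boolean sameObject = row.regionMatches(0, current, 0, HASH_LEN);
            return sameObject ? row : null;
        }
        return null;
    }

    public static void main(String[] args) {
        String h = "00003db1b6c1e7e7d2ece41ff2184f76*";
        // Sample keys from this thread, held in sorted order.
        TreeSet<String> table = new TreeSet<>(Arrays.asList(
                h + "9223370673172227807",
                h + "9223370674468022807",
                h + "9223370674468862807",
                h + "9223370674984237807",
                h + "9223370674987271807"));
        System.out.println(nextKey(table, h + "9223370674468862807"));
    }
}
```

With a real scanner the same two steps apply: read at most two rows from the start row, then compare the hash prefix.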

> First, if you know that you are storing data about every 10 seconds,
> set the startRow with something like
> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> (Long.MAX_VALUE - (changeDate.getTime() + 60000))) then read the few
> lines you will have until you find your current line, and keep the
> last one.

Actually, it is impossible to know the time range in which there will be a next entry.

>
> Else, if you don't know, you will have to start with
> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId))); but you
> might have to skip MANY lines before finding the right one. So I don't
> really recommend that.

Ouch, obviously not very efficient. I assume that holds even with a filter?
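
To make the trade-off between the two "previous key" options concrete, here is a rough sketch, again with a sorted TreeSet standing in for the table. PreviousKeySketch, previousKey and windowStart are hypothetical names, and the single separator character after the hash is an assumption taken from the sample keys. Because the timestamp is stored reversed, the rows just before the current one have a smaller suffix, so the bounded start row subtracts the window from the reversed value. If the window is too short the lookup finds nothing, and the fallback is to start at the bare hash and walk every row of the object.

```java
import java.util.Arrays;
import java.util.TreeSet;

class PreviousKeySketch {
    static final int HASH_LEN = 32; // an MD5 digest is 32 hex characters

    // Walks forward from startRow and remembers the last row of the same
    // object seen before `current`. Returns null if there is none.
    static String previousKey(TreeSet<String> table, String startRow, String current) {
        String last = null;
        for (String row : table.tailSet(startRow, true)) {
            if (row.compareTo(current) >= 0) {
                break; // reached the current row (or went past it)
            }
            if (row.regionMatches(0, current, 0, HASH_LEN)) {
                last = row;
            }
        }
        return last;
    }

    // Start row for a bounded look-back of windowMillis.
    // Assumes one separator character between hash and reversed timestamp.
    static String windowStart(String current, long windowMillis) {
        String prefix = current.substring(0, HASH_LEN + 1);
        long reversed = Long.parseLong(current.substring(HASH_LEN + 1).trim());
        return prefix + String.format("%19d", reversed - windowMillis);
    }

    public static void main(String[] args) {
        String h = "00003db1b6c1e7e7d2ece41ff2184f76*";
        TreeSet<String> table = new TreeSet<>(Arrays.asList(
                h + "9223370673172227807",
                h + "9223370674468022807",
                h + "9223370674468862807",
                h + "9223370674984237807",
                h + "9223370674987271807"));
        String current = h + "9223370674468862807";
        // Option 1: bounded window (may miss when entries are sparse).
        System.out.println(previousKey(table, windowStart(current, 60000L), current));
        // Option 2: start at the bare hash and walk all rows of the object.
        System.out.println(previousKey(table, current.substring(0, HASH_LEN), current));
    }
}
```

In the sample data the previous entry is 14 minutes away, so a 60-second window misses it and only the full walk (or a wider window) finds it.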
> Message of 29/04/13 18:18
> From: "Jean-Marc Spaggiari"
> To: user@hbase.apache.org
> Cc:
> Subject: Re: Read access pattern
>
> Hum.
>
> For the next key, I think you can simply use your current key as your
> scanner first key. You will then find the one which is just after.
> Then you will have to verify the MD5 hash to make sure it's still for
> the same object.
>
> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId)) +
> String.format("%19d\n", (Long.MAX_VALUE - changeDate.getTime())));
>
> If you want to find the one just before, quickly, I see 2 options.
>
> First, if you know that you are storing data about every 10 seconds,
> set the startRow with something like
> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> (Long.MAX_VALUE - (changeDate.getTime() + 60000))) then read the few
> lines you will have until you find your current line, and keep the
> last one.
>
> Else, if you don't know, you will have to start with
> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId))); but you
> might have to skip MANY lines before finding the right one. So I don't
> really recommend that.
>
> JM
>
> 2013/4/29 Shahab Yunus :
> > I think you cannot simply use the scanner to do a range scan here, as your
> > keys are not monotonically increasing. You need to apply logic to
> > decode/reverse the mechanism that you used to hash your keys at the
> > time of writing. You might want to check out the Sematext library, which
> > does distributed scans and seems to handle the scenarios that you want to
> > implement.
> > http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> >
> >
> > On Mon, Apr 29, 2013 at 11:03 AM, wrote:
> >
> >> Hi,
> >>
> >> I have a rowkey defined by :
> >> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> >> (Long.MAX_VALUE - changeDate.getTime()));
> >>
> >> How could I get the previous and next row for a given rowkey ?
> >> For instance, I have the following ordered keys :
> >>
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >> >00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
> >>
> >> If I choose the rowkey :
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
> >> correct scan to get the previous and next key ?
> >> Result would be :
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>
> >> Thank you !
> >> R.