hbase-user mailing list archives

From Mingtao Zhang <mail2ming...@gmail.com>
Subject Re: Rowkey, Consistent Hashing, MD5?
Date Wed, 23 Jul 2014 18:34:43 GMT
Thank you all. I have moved to MurmurHash.

Best Regards,
Mingtao


On Mon, Jul 21, 2014 at 10:58 PM, Ishan Chhabra <ichhabra@rocketfuel.com>
wrote:

> No *guarantees* on collisions, but yes, it is a deterministic mapping, and
> you are very unlikely to see collisions at that scale (provided you keep
> enough bits of the hash).
>
> See MurmurHash here: http://en.wikipedia.org/wiki/MurmurHash
>
> and to understand collision probabilities, read this:
> http://en.wikipedia.org/wiki/Birthday_problem
>
>
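
To put a number on the birthday-problem point above, here is a minimal sketch
using the standard approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n keys and
a b-bit hash. The function name is made up for illustration, and the formula
assumes the hash output is uniformly distributed.

```python
import math


def collision_probability(n_keys: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_keys share a hash value.

    Birthday-problem approximation; assumes a uniformly distributed hash.
    """
    exponent = -n_keys * (n_keys - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, collision_probability(2 ** 10, bits))
```

For 2^10 keys this gives roughly 0.012% with a 32-bit hash, while a 16-bit
prefix makes a collision almost certain, so the number of bits kept matters
more than the choice of hash function.
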
> On Mon, Jul 21, 2014 at 7:55 PM, Mingtao Zhang <mail2mingtao@gmail.com>
> wrote:
>
> > Thank you all!
> >
> > Sorry, I think 'consistent hashing' is the wrong term.
> >
> > For my use case, I need to store this 'prefix' (hashed or not) in
> > another table.
> >
> > Will MurmurHash guarantee that the same string maps to the same bytes
> > next time? And that there are no collisions for around 2^10 records?
> >
> > Mingtao Sent from iPhone
> >
> > > On Jul 21, 2014, at 10:28 PM, Ishan Chhabra <ichhabra@rocketfuel.com>
> > > wrote:
> > >
> > > Mingtao,
> > > If I understand correctly, you want to prefix the key with a hash (as
> > > mentioned in the book) to get a good distribution. Use MurmurHash (there
> > > is an implementation in HBase code itself) as it is fast and gives a
> > > uniform distribution.
> > >
> > > "Consistent Hashing" is not the correct term to use here if I understand
> > > your intent correctly.
> > >
> > >
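
The hash-prefix idea described above could look roughly like the sketch below.
It is a pure-Python MurmurHash3 (x86, 32-bit) written out for illustration; the
implementation bundled with HBase is Java and may be a different Murmur
variant, so its output should not be expected to match. `prefixed_row_key` and
the 2-byte prefix width are assumptions, not anything prescribed in this
thread.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit; returns an unsigned 32-bit integer."""
    mask = 0xFFFFFFFF
    c1, c2 = 0xCC9E2D51, 0x1B873593

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    def mix_k(k: int) -> int:
        k = (k * c1) & mask
        k = rotl(k, 15)
        return (k * c2) & mask

    h = seed & mask
    n = len(data)
    full = n - (n % 4)

    # Body: process the input in 4-byte little-endian blocks.
    for i in range(0, full, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[full:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization: mix in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def prefixed_row_key(key: str, prefix_bytes: int = 2) -> bytes:
    """Hypothetical helper: prepend the leading bytes of the hash to the key."""
    raw = key.encode("utf-8")
    prefix = murmur3_32(raw).to_bytes(4, "big")[:prefix_bytes]
    return prefix + raw


if __name__ == "__main__":
    print(prefixed_row_key("user-42").hex())
```

Because the function is deterministic, the same string always yields the same
prefix, so the prefix can be recomputed from the key later instead of being
looked up.
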
> > >> On Mon, Jul 21, 2014 at 2:44 PM, Liam Slusser <lslusser@gmail.com>
> > >> wrote:
> > >>
> > >> MD5 isn't a consistent hashing algorithm.  Consistent hashing is a scheme
> > >> that provides a hash table functionality in a way that the adding or
> > >> removing of one slot does not significantly change the mapping of keys to
> > >> slots.  With that said, a lot of consistent hashing algorithms USE
> > >> md5...but it alone won't get you all the way there.
> > >>
> > >> Some light bedtime reading:
> > >> http://en.wikipedia.org/wiki/Consistent_hashing
> > >>
> > >> liam
> > >>
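
To illustrate the distinction drawn above, here is a minimal sketch of a
consistent-hash ring that uses MD5 only to place nodes and keys on the ring.
The class name, the node names and the choice of 100 virtual nodes per slot
are all assumptions made for this example.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a label to a position on the ring using the first 8 bytes of MD5."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node is placed at several positions to even out the load.
        for i in range(self.vnodes):
            p = _point(f"{node}#{i}")
            self._owner[p] = node
            bisect.insort(self._points, p)

    def lookup(self, key: str) -> str:
        # The owner is the first ring position clockwise from the key.
        idx = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    keys = [f"row-{i}" for i in range(1000)]
    ring = HashRing(["rs1", "rs2", "rs3", "rs4"])
    before = {k: ring.lookup(k) for k in keys}
    ring.add("rs5")
    moved = sum(1 for k in keys if ring.lookup(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding one slot")
```

Only keys claimed by the new slot change owner, which is the property that
plain `md5(key) % slots` lacks: changing the slot count there remaps most
keys.
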
> > >>
> > >> On Mon, Jul 21, 2014 at 7:18 AM, Mingtao Zhang <mail2mingtao@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I am trying to find a consistent hashing algorithm for the first
> > >>> portion of the row key.
> > >>>
> > >>> The documentation/book I read mentions MD5 everywhere.
> > >>>
> > >>> But I have trouble persuading myself that MD5
> > >>> (http://en.wikipedia.org/wiki/MD5) counts as consistent hashing.
> > >>>
> > >>> Could any of you point me to a library that contains the hashing you
> > >>> are using?
> > >>>
> > >>> Thanks in advance!
> > >>>
> > >>> Best Regards,
> > >>> Mingtao
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>
