hbase-user mailing list archives

From Tim Robertson <timrobertson...@gmail.com>
Subject Re: Geospatial Key Design ?
Date Tue, 13 Apr 2010 06:12:14 GMT
I've not used geohash much but thinking aloud...

I think the final stage of geohash is base 32 encoding, so would it not
partition naturally? If you leave it in the non-encoded form, then I
believe highly dense data would be reflected in dense regions.
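
For reference, here is a minimal sketch of the usual geohash construction:
longitude and latitude bits are interleaved by repeated bisection, and every
5 bits become one base 32 character. The function name and the default
precision are illustrative choices, not anything from the thread.

```python
# Standard geohash base 32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a (lat, lon) pair in degrees as a geohash string.

    `precision` is the number of base 32 characters to produce
    (an illustrative default, pick whatever suits the data).
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the current 5-bit group
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)
```

Note that the base 32 step only changes the alphabet: a shorter geohash is
always a prefix of a longer one for the same point, so points in the same
cell still sort next to each other after encoding.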

What are your search requirements?  If you are doing a lot of buffered
point search (all records within 0.01 degree of x/y) then I would
suspect (but don't know for sure) that you would benefit from data
locality during server side scanning if they were together in the
region; arguing against the spatial partitioning strategy.
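
To illustrate the locality argument, here is a small sketch of how a geohash
prefix maps to one contiguous row key range. The helper names are
hypothetical, and the sorted Python list only stands in for a table ordered
by row key; it is not HBase client code.

```python
from bisect import bisect_left


def prefix_scan_range(prefix: bytes):
    """Return (start_row, stop_row) covering every key that starts with prefix.

    stop_row is exclusive. An empty stop_row means "no upper bound", which
    this sketch assumes matches how an open-ended scan is expressed.
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, b""
    stop = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop


def simulate_scan(sorted_keys, prefix: bytes):
    """Stand-in for a range scan over keys kept in sorted order."""
    start, stop = prefix_scan_range(prefix)
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, stop) if stop else len(sorted_keys)
    return sorted_keys[lo:hi]


keys = sorted([b"u4pq9", b"u4pr1", b"u4pruy", b"u4ps0", b"u4pt"])
print(simulate_scan(keys, b"u4pr"))
```

One caveat: two points that are close together but sit on opposite sides of
a cell boundary do not share a prefix, so a buffered point search generally
has to cover the neighbouring cells as well.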

(Completely off topic: I do a lot with Google density map overlays at
the moment, and have a MapReduce-portable version of it if you are
interested. It can do density by pixel, 2x2 pixels, 4x4 pixels, etc.)


On Tue, Apr 13, 2010 at 12:18 AM, Wade Arnold <wade.arnold@t8webware.com> wrote:
> We have been working on using Hbase for some geospatial queries for
> agronomic data. Via mapreduce we have created a secondary index to point at
> the raw records. Our issue is that geohash/UTM/Zip/(lat,long) data sets
> are naturally dense. For our use case the Midwest is
> very dense and New York and San Francisco don't exist. I am sure for 4sqr
> and localized advertising engines this is the opposite. Due to the density of
> the key we keep on having region server density issues. I was wondering if
> anyone on the list has added any additional dimension on top of a geohash in
> order to create better partitioning?
> Wade Arnold
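
One common way to picture an "additional dimension" on top of a geohash is a
small salt (bucket) prefix derived from the record id. The sketch below is
only an illustration under assumed names: the bucket count, the `|`
separator, and both function names are hypothetical, and nothing in the
thread says this is what was eventually used.

```python
import zlib

NUM_BUCKETS = 16  # hypothetical; would be tuned to the cluster size


def salted_key(geohash: str, record_id: str, buckets: int = NUM_BUCKETS) -> str:
    """Build a row key of the form '<bucket>|<geohash>|<record_id>'.

    The bucket comes from a stable hash of the record id, so records in
    the same dense geohash cell are spread across `buckets` key ranges.
    """
    bucket = zlib.crc32(record_id.encode("utf-8")) % buckets
    return f"{bucket:02d}|{geohash}|{record_id}"


def scan_prefixes(geohash_prefix: str, buckets: int = NUM_BUCKETS):
    """Return the key prefixes a spatial query must scan, one per bucket."""
    return [f"{b:02d}|{geohash_prefix}" for b in range(buckets)]


key = salted_key("9zvxvf", "field-42")
print(key)
print(scan_prefixes("9zv")[:3])
```

The trade-off is the one raised in the reply above: salting spreads a dense
area over more regions, but a spatial query now needs one scan per bucket
instead of a single contiguous scan.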
