hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Spatial data posting in HBase
Date Sat, 12 Oct 2013 18:33:03 GMT

In terms of efficiency... 

A general solution that can be applied to all problems in all areas is going to be best. 
Geohash gets ugly when you're around the equator.  You can have two points literally a couple
of km apart that would have two very different geohashes. 
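
The equator effect can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard base-32 geohash encoding (alternating longitude and latitude bits); the function names `geohash_encode` and `common_prefix_len` and the two sample coordinates are illustrative and not taken from any HBase API.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits per base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def common_prefix_len(a, b):
    """Count how many leading characters two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


north = geohash_encode(0.01, 10.0)   # just north of the equator
south = geohash_encode(-0.01, 10.0)  # roughly 2.2 km further south
print(north, south, common_prefix_len(north, south))
```

The two points share no prefix at all (one hash starts with `s`, the other with `k`), so a prefix scan on either hash will never see the other point.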

So if you tile the globe, depending on the size of the tile, you calculate the tile, its surrounding
tiles (if necessary) and then sweep through the data to find your object. 
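
The tile lookup described above can be sketched as follows. This is only an illustration under stated assumptions: a fixed grid of 0.1-degree tiles, an in-memory dict standing in for the tile index table, and hypothetical helper names (`tile_of`, `tile_and_neighbors`, `find_within`). The 3x3 neighborhood is only sufficient while the search radius is smaller than one tile, and the poles are not handled.

```python
import math
from collections import defaultdict

TILES_PER_DEG = 10          # hypothetical choice: 0.1-degree tiles
COLS = 360 * TILES_PER_DEG  # number of tile columns around the globe


def tile_of(lat, lon):
    """Return the (row, col) of the tile containing the point."""
    return (math.floor(lat * TILES_PER_DEG), math.floor(lon * TILES_PER_DEG))


def tile_and_neighbors(lat, lon):
    """Return the point's tile plus its eight surrounding tiles."""
    row, col = tile_of(lat, lon)
    half = COLS // 2
    return [
        (row + dr, (col + dc + half) % COLS - half)  # wrap at +/-180 degrees
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    ]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def build_index(points):
    """Map each tile to the points it contains (stand-in for an index table)."""
    index = defaultdict(list)
    for name, (lat, lon) in points.items():
        index[tile_of(lat, lon)].append((name, lat, lon))
    return index


def find_within(index, lat, lon, radius_km):
    """Sweep the 3x3 tile neighborhood and keep points inside the radius."""
    hits = []
    for tile in tile_and_neighbors(lat, lon):
        for name, plat, plon in index.get(tile, []):
            if haversine_km(lat, lon, plat, plon) <= radius_km:
                hits.append(name)
    return sorted(hits)


points = {"north": (0.01, 10.0), "south": (-0.01, 10.0), "far": (1.0, 10.0)}
index = build_index(points)
print(find_within(index, 0.01, 10.0, 5.0))
```

Unlike the geohash prefix, the neighboring tile across the equator is always part of the sweep, so both nearby points are found.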

I'm not suggesting that you not use geohash, just that it's not going to be the most efficient.

Note that the downside to tiling is that if you're doing a geospatial index... your data
volume explodes because you are storing references to the data at different tile levels.
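
A rough sketch of why the index grows: if every point is referenced once per tile level, the number of index entries is the number of points times the number of levels. The level scheme (2^level by 2^level tiles), the key layout `level|row|col|point_id`, and the helper names are assumptions made for illustration, not a recommended HBase row key design.

```python
def tile_at_level(lat, lon, level):
    """Return (row, col) in a grid of 2**level by 2**level tiles."""
    n = 2 ** level
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return row, col


def index_keys(point_id, lat, lon, levels):
    """Build one hypothetical index key per tile level for a single point."""
    keys = []
    for level in levels:
        row, col = tile_at_level(lat, lon, level)
        keys.append(f"{level:02d}|{row}|{col}|{point_id}")
    return keys


points = {"p1": (0.01, 10.0), "p2": (-0.01, 10.0), "p3": (48.85, 2.35)}
levels = range(4, 16, 2)  # six levels: 4, 6, 8, 10, 12, 14

entries = [
    key
    for point_id, (lat, lon) in points.items()
    for key in index_keys(point_id, lat, lon, levels)
]
print(len(points), "points ->", len(entries), "index entries")
```

With six levels, three points already produce 18 index entries, in addition to the rows holding the points themselves.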

It's a trade-off. 

On Oct 12, 2013, at 2:34 AM, Adrien Mogenet <adrien.mogenet@gmail.com> wrote:

> Michael, don't you think Geohashes can be satisfying and well-suited for
> many cases anyway? Searching in a bounding box or arbitrary polygon is not
> that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
> to have to deal with tons of invalid extra points if the geohash query is
> not accurate enough and your point distribution is very sparse, so that
> many points will be found in a geohash even though they don't match your
> query criteria.
> However, if your query embeds enough bits of precision, Geohashes offer
> some nice guarantees for distributed databases and your queries should
> remain efficient enough.
> Another worst case of course is to look for K-NN since Geohash is not a
> real longest-common-prefix algorithm, but once again, if your point
> distribution is reasonably well balanced, this does not work too badly
> without doing lots of recursive queries or fetching tons of useless data
> (but I do agree looking into your tiles would probably be more appropriate
> in that case).
> I'm planning to write an article on these points, so further technical
> arguments are welcome :-}
> On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
>> HBase in Action goes into great depth showing you how you could
>> implement GIS information in HBase.
>> Unfortunately there are issues with Geohash and edge conditions which make
>> it difficult to use when you're dealing with data on an edge of a quadrant.
>> A better way would be to create a point (geospatial point object) and
>> store it in a single column.
>> (This goes beyond the example of what's in the book. ) And then index the
>> data by tiles.
>> The downside is that you end up creating a lot more data…
>> Take a look at some of the stuff Boris Lublinsky published on InfoQ. There
>> are also other articles on the net….
>> On Oct 9, 2013, at 1:35 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com>
>> wrote:
>>> The point is that there are options (multiple different hammers) if
>>> HBase support for geospatial is not there or doesn't meet OP's needs.
>>> Otis
>>> --
>>> Solr & ElasticSearch Support -- http://sematext.com/
>>> Performance Monitoring -- http://sematext.com/spm
>>> On Wed, Oct 9, 2013 at 11:14 AM, Michael Segel
>>> <msegel_hadoop@hotmail.com> wrote:
>>>> And Solr has what to do with storing data in HBase?
>>>> I guess it's true… if all you have is a hammer…
>>>> The point I was raising was that geohash isn't the most efficient way
>>>> to go when you look at the problem at a global level…
>>>> On Oct 9, 2013, at 9:34 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
>>>>> Consider using Solr, which provides a lot of geospatial search support.
>>>>> Otis
>>>>> Solr & ElasticSearch Support
>>>>> http://sematext.com/
>>>>> On Sep 24, 2013 8:29 AM, "cto" <ankur5.c@tcs.com> wrote:
>>>>>> Hi,
>>>>>> I am very new to HBase. Could you please let me know how to insert
>>>>>> spatial data (latitude/longitude) into HBase using Java?
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
> -- 
> Adrien Mogenet
> http://www.borntosegfault.com

The opinions expressed here are mine, while they may reflect a cognitive thought, that is
purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com
