hbase-user mailing list archives

From Michael Segel <msegel_had...@hotmail.com>
Subject Re: Spatial data posting in HBase
Date Sun, 13 Oct 2013 13:04:15 GMT
Yes, you can, but you're doing more work to calculate the geohash when you don't have to.


On Oct 13, 2013, at 5:33 AM, Adrien Mogenet <adrien.mogenet@gmail.com> wrote:

> This is also what I had in mind. Computing the neighbors and/or the higher
> level of a "tile" is fairly easy bit manipulation. Dealing with equator
> corner cases should not be considered an issue.
> 
> 
> On Sun, Oct 13, 2013 at 1:16 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> 
>> You can treat a geohash of a fixed precision as a tile and calculate the
>> neighbors of that tile. This is precisely what I did in the chapter in
>> HBaseIA. In that way, it's no different than a tile system.
>> 
>> 
>> On Sat, Oct 12, 2013 at 11:33 AM, Michael Segel
>> <michael_segel@hotmail.com> wrote:
>> 
>>> Adrien,
>>> 
>>> In terms of efficiency...
>>> 
>>> A general solution that can be applied to all problems in all areas is
>>> going to be best.
>>> Geohash gets ugly when you're around the equator. You can have two points
>>> literally a couple of km away that would have two very different geohashes.
>>> 
>>> So if you tile the globe, depending on the size of the tile, you calculate
>>> the tile, its surrounding tiles (if necessary) and then sweep through the
>>> data to find your object.
>>> 
>>> I'm not suggesting that you not use geohash, just that it's not going to be
>>> the most efficient.
>>> 
>>> Note that the downside to tiling is that if you're doing a geospatial
>>> index... your data volume explodes because you are storing references to
>>> the data at different tile levels.
>>> 
>>> It's a trade-off.
>>> 
>>> 
>>> 
>>> On Oct 12, 2013, at 2:34 AM, Adrien Mogenet <adrien.mogenet@gmail.com>
>>> wrote:
>>> 
>>>> Michael, don't you think Geohashes can be satisfying and well-suited for
>>>> many cases anyway? Searching in a bounding box or arbitrary polygon is not
>>>> that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
>>>> having to deal with tons of invalid extra points if the geohash query is
>>>> not accurate enough and your points distribution is very sparse, so that
>>>> many points will be found in a geohash even though they don't match your
>>>> query criteria.
>>>> 
>>>> However, if your query embeds enough bits of precision, Geohashes offer
>>>> some nice guarantees for distributed databases and your queries should
>>>> remain efficient enough.
>>>> 
>>>> Another worst case of course is to look for K-NN since Geohash is not a
>>>> real longest-common-prefix algorithm, but once again, if your points
>>>> distribution is approximately well balanced, this works reasonably well
>>>> without doing lots of recursive queries or fetching tons of useless data
>>>> (but I do agree looking into your tiles would probably be more appropriate
>>>> in that case).
>>>> 
>>>> I'm planning to write an article on these points, so further technical
>>>> arguments are welcome :-}
>>>> 
>>>> On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel
>>>> <msegel_hadoop@hotmail.com> wrote:
>>>> 
>>>>> HBase in Action goes into great depth showing you how you could
>>>>> implement GIS information in HBase.
>>>>> 
>>>>> Unfortunately there are issues with Geohash and edge conditions which make
>>>>> it difficult to use when you're dealing with data on an edge of a quadrant.
>>>>> 
>>>>> A better way would be to create a point (geospatial point object) and
>>>>> store it in a single column.
>>>>> (This goes beyond the example of what's in the book.) And then index the
>>>>> data by tiles.
>>>>> 
>>>>> 
>>>>> The downside is that you end up creating a lot more data…
>>>>> 
>>>>> Take a look at some of the stuff Boris Lublinsky published on InfoQ. There
>>>>> are also other articles on the net….
>>>>> 
>>>>> On Oct 9, 2013, at 1:35 PM, Otis Gospodnetic
>>>>> <otis.gospodnetic@gmail.com> wrote:
>>>>> 
>>>>>> The point is that there are options (multiple different hammers) if
>>>>>> HBase support for geospatial is not there or doesn't meet OP's needs.
>>>>>> 
>>>>>> Otis
>>>>>> --
>>>>>> Solr & ElasticSearch Support -- http://sematext.com/
>>>>>> Performance Monitoring -- http://sematext.com/spm
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Oct 9, 2013 at 11:14 AM, Michael Segel
>>>>>> <msegel_hadoop@hotmail.com> wrote:
>>>>>>> And Solr has what to do with storing data in HBase?
>>>>>>> 
>>>>>>> I guess it's true… if all you have is a hammer…
>>>>>>> 
>>>>>>> The point I was raising was that geohash isn't the most efficient way
>>>>>>> to go when you look at the problem at a global level…
>>>>>>> 
>>>>>>> On Oct 9, 2013, at 9:34 AM, Otis Gospodnetic
>>>>>>> <otis.gospodnetic@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Consider using Solr, which provides a lot of geospatial search support.
>>>>>>>> 
>>>>>>>> Otis
>>>>>>>> Solr & ElasticSearch Support
>>>>>>>> http://sematext.com/
>>>>>>>> On Sep 24, 2013 8:29 AM, "cto" <ankur5.c@tcs.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am very new to HBase. Could you please let me know how to insert
>>>>>>>>> spatial data (Latitude / Longitude) in HBase using Java?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> View this message in context:
>>>>>>>>> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
>>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Adrien Mogenet
>>>> http://www.borntosegfault.com
>>> 
>>> The opinions expressed here are mine, while they may reflect a cognitive
>>> thought, that is purely accidental.
>>> Use at your own risk.
>>> Michael Segel
>>> michael_segel (AT) hotmail.com
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Adrien Mogenet
> http://www.borntosegfault.com
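
For readers following the geohash side of the thread, here is a minimal sketch of the standard geohash encoding (interleaved longitude/latitude bits, base-32 alphabet). The class and method names (GeohashSketch, encode, parent) are hypothetical and are not taken from HBase in Action or any library. It illustrates two points made above: truncating a hash gives the coarser enclosing cell, and two points a few hundred metres apart on either side of the equator share no prefix at all.

```java
public class GeohashSketch {
    // Standard geohash base-32 alphabet (no a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair into a geohash of the given length. */
    static String encode(double lat, double lon, int length) {
        double latLo = -90.0, latHi = 90.0;
        double lonLo = -180.0, lonHi = 180.0;
        StringBuilder hash = new StringBuilder(length);
        boolean useLon = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else            { value <<= 1;              lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else            { value <<= 1;              latHi = mid; }
            }
            useLon = !useLon;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /** The coarser cell containing this one is simply the hash minus its last character. */
    static String parent(String geohash) {
        return geohash.substring(0, geohash.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(encode(42.6, -5.6, 5));
        // Two points roughly 200 m apart, on either side of the equator:
        System.out.println(encode(0.001, 10.0, 5));
        System.out.println(encode(-0.001, 10.0, 5));
    }
}
```

A fixed-length hash like this could be used as a row-key prefix so that a prefix scan returns one cell; that usage is an assumption here, not something the thread spells out.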

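The tiling approach described in the thread (compute the tile, then its surrounding tiles, then sweep the data) could look roughly like the sketch below. It assumes a simple fixed-size latitude/longitude grid; the grid layout, the key format and the names (TileSketch, row, col, key, tileAndNeighbors) are all hypothetical, since the thread does not specify a tile scheme.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TileSketch {
    /** Grid row for a latitude, counting up from the south pole. */
    static int row(double lat, double tileDeg) {
        int rows = (int) Math.ceil(180.0 / tileDeg);
        int r = (int) Math.floor((lat + 90.0) / tileDeg);
        return Math.min(r, rows - 1); // lat == 90 belongs to the top row
    }

    /** Grid column for a longitude, counting east from -180. */
    static int col(double lon, double tileDeg) {
        int cols = (int) Math.ceil(360.0 / tileDeg);
        int c = (int) Math.floor((lon + 180.0) / tileDeg);
        return Math.floorMod(c, cols); // lon == 180 wraps to column 0
    }

    /** Zero-padded key so that lexicographic order matches numeric order. */
    static String key(int row, int col) {
        return String.format(Locale.ROOT, "%05d:%05d", row, col);
    }

    /** Keys of the tile containing the point plus its (up to 8) neighbors. */
    static List<String> tileAndNeighbors(double lat, double lon, double tileDeg) {
        int rows = (int) Math.ceil(180.0 / tileDeg);
        int cols = (int) Math.ceil(360.0 / tileDeg);
        int r = row(lat, tileDeg);
        int c = col(lon, tileDeg);
        List<String> keys = new ArrayList<>();
        for (int dr = -1; dr <= 1; dr++) {
            int rr = r + dr;
            if (rr < 0 || rr >= rows) {
                continue; // no rows beyond the poles
            }
            for (int dc = -1; dc <= 1; dc++) {
                // Columns wrap around the antimeridian.
                keys.add(key(rr, Math.floorMod(c + dc, cols)));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // 1-degree tiles around a point just north of the equator.
        System.out.println(tileAndNeighbors(0.001, 10.0, 1.0));
    }
}
```

With this layout, a point just south of the equator falls in a tile that is among the neighbors of the tile just north of it, so a lookup across the boundary needs no special casing. Each key in the returned list would then drive one scan or get against the index; how those keys map onto HBase rows is left open here.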
