hbase-user mailing list archives

From "tim robertson" <timrobertson...@gmail.com>
Subject Re: Lucene from HBase - raw values in Lucene index or not?
Date Wed, 17 Dec 2008 21:33:36 GMT
Hi St.Ack,

I'm in Copenhagen, Denmark - you?
Would be great to have some people interested in helping / offering
advice - thank you for the interest.
I will start getting the code in a better state and get a wiki going.

Hmmm... most of the stuff I have come across in my work requires
"balancing" rather than ranking as such - I presume they are
different.  E.g. I get 100,000 points to display on a map but the
mapping server works only for 1000 points in real time.  Therefore we
need to limit to 1000, but we want them "geospatially balanced" and
not all bunched up in one corner of the map to be a little more
representative of the spread.  Typically we have tackled this with
grouping to grids.  E.g. a map generated with the OGC WMS protocol:
Of course I wondered about MapReduce generating all the images, but
there are way too many tiles to process for all zoom levels for all
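
A minimal sketch of the "grouping to grids" idea described above, assuming points are plain (lon, lat) pairs in degrees. The function name `balance_points` and the `cells_per_side` parameter are hypothetical, chosen only for illustration, and this is not the code used in the project discussed here. It buckets points into a fixed world grid, then takes one point per occupied cell per pass until the limit is reached, so a dense corner of the map cannot use up the whole budget.

```python
from collections import defaultdict


def balance_points(points, max_points=1000, cells_per_side=32):
    """Return at most max_points (lon, lat) pairs, spread across a world grid."""
    # Bucket every point into a grid cell; clamp so lon=180 / lat=90 stay in range.
    cells = defaultdict(list)
    for lon, lat in points:
        col = min(int((lon + 180.0) / 360.0 * cells_per_side), cells_per_side - 1)
        row = min(int((lat + 90.0) / 180.0 * cells_per_side), cells_per_side - 1)
        cells[(row, col)].append((lon, lat))

    # Round-robin over occupied cells: one point per cell per pass.
    selected = []
    buckets = list(cells.values())
    depth = 0
    while len(selected) < max_points and buckets:
        # Keep only cells that still have an unused point at this depth.
        buckets = [b for b in buckets if len(b) > depth]
        for bucket in buckets:
            if len(selected) == max_points:
                break
            selected.append(bucket[depth])
        depth += 1
    return selected
```

With 100,000 input points and `max_points=1000`, sparse cells contribute all of their points before dense cells get a second pick. A coarser or finer grid trades evenness of spread against how many distinct cells can be represented.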

Baah - I am abusing the list to talk about Lucene indexes... sorry.


On Wed, Dec 17, 2008 at 9:00 PM, stack <stack@duboce.net> wrote:
> Adding to Jon's comments:
> It looks like the katta searchers have support for distributed idf.  Not
> sure about solr (though seems to be talk of it around SOLR-303).  My guess
> is that soon after you get searching working, you'd miss it if it wasn't
> there (Results warped by uneven term distribution across your shards).
> I for one would be very interested in helping such a project along.  Where
> are you located Tim?
> St.Ack
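
To illustrate the point quoted above about results being warped by uneven term distribution, here is a small sketch comparing per-shard idf with a global idf. It assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1)); the shard sizes and document frequencies are made-up numbers, and this is not how katta or Solr actually implement distributed idf.

```python
import math


def idf(num_docs, doc_freq):
    # Classic Lucene-style idf; an assumption here, other scoring models differ.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical shards: (documents in shard, documents containing the term).
shards = [(1000, 500), (1000, 5)]

# Each shard scoring on its own statistics gets a different weight for the term.
local_idfs = [idf(n, df) for n, df in shards]

# Distributed idf: sum the statistics across shards first, then compute once.
global_idf = idf(sum(n for n, _ in shards), sum(df for _, df in shards))
```

The same term weighs much more on the shard where it is rare than on the shard where it is common, so scores merged from the two shards are not directly comparable unless every shard uses the shared global value.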
