spark-user mailing list archives

From Jeetendra Gangele <>
Subject Re: Need help in SparkSQL
Date Thu, 23 Jul 2015 05:14:08 GMT
The queries will be something like this:

1. How many users visited a 1 BHK flat in the last hour in a given area?
2. How many visitors were there for flats in a given area?
3. List all users who bought a given property in the last 30 days.
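As a rough sketch, the three queries above can be written as plain SQL with AND predicates. The example below runs them on an in-memory SQLite database from the Python standard library, purely as a stand-in so that it is self-contained; it is not Spark SQL. The `events` table, its columns, and the sample values (`area_a`, `p1`, `u1`, and so on) are assumptions made for illustration and do not come from the actual data.

```python
import sqlite3

NOW = 10_000_000          # hypothetical "current time" in epoch seconds
HOUR, DAY = 3600, 86400

# Hypothetical sample rows: (user_id, action, flat_type, area, property_id, ts)
ROWS = [
    ("u1", "visit",    "1BHK", "area_a", "p1", NOW - 600),
    ("u1", "visit",    "1BHK", "area_a", "p1", NOW - 100),      # repeat visit
    ("u2", "visit",    "1BHK", "area_a", "p1", NOW - 2 * HOUR),  # older than 1 hour
    ("u3", "visit",    "2BHK", "area_a", "p2", NOW - 300),
    ("u6", "visit",    "1BHK", "area_a", "p1", NOW - 1200),
    ("u7", "visit",    "1BHK", "area_b", "p3", NOW - 200),
    ("u4", "purchase", "1BHK", "area_a", "p1", NOW - 5 * DAY),
    ("u5", "purchase", "1BHK", "area_a", "p1", NOW - 40 * DAY),  # older than 30 days
]


def build_demo_db():
    """Create an in-memory database holding the hypothetical events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (user_id TEXT, action TEXT, flat_type TEXT, "
        "area TEXT, property_id TEXT, ts INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def visitors_last_hour(conn, flat_type, area, now=NOW):
    """Query 1: distinct users who visited a flat type in an area in the last hour."""
    sql = ("SELECT COUNT(DISTINCT user_id) FROM events "
           "WHERE action = 'visit' AND flat_type = ? AND area = ? AND ts >= ?")
    return conn.execute(sql, (flat_type, area, now - HOUR)).fetchone()[0]


def visitors_in_area(conn, area):
    """Query 2: distinct visitors for flats in an area."""
    sql = ("SELECT COUNT(DISTINCT user_id) FROM events "
           "WHERE action = 'visit' AND area = ?")
    return conn.execute(sql, (area,)).fetchone()[0]


def buyers_last_30_days(conn, property_id, now=NOW):
    """Query 3: users who bought a property in the last 30 days."""
    sql = ("SELECT DISTINCT user_id FROM events "
           "WHERE action = 'purchase' AND property_id = ? AND ts >= ? "
           "ORDER BY user_id")
    return [row[0] for row in conn.execute(sql, (property_id, now - 30 * DAY))]
```

Each query filters on a different combination of fields (flat type, area, property, time), which is the multi-field pattern the rest of this message is concerned with.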

Later the queries may become more complex, involving multiple parameters.

The problem with HBase is designing a row key that retrieves this data efficiently.

Since I have multiple fields to query on, HBase may not be a good choice?

I don't want to iterate over the result set that HBase returns to produce
the result, because that will kill performance.
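To make the row key concern concrete, here is a toy sketch in plain Python. It does not use the HBase API; `SortedStore` is a hypothetical stand-in for any store that keeps rows sorted by key, and the key layout `area|flat_type|timestamp` and the sample events are assumptions for illustration. A key designed for one access pattern allows a narrow range scan, while a query on a field outside the key falls back to iterating over every row.

```python
from bisect import bisect_left


def row_key(area, flat_type, ts):
    """Composite key; ts is zero-padded so string order matches time order."""
    return f"{area}|{flat_type}|{ts:012d}"


class SortedStore:
    """Toy stand-in for a store that keeps rows sorted by key (not HBase itself)."""

    def __init__(self, events):
        rows = [(row_key(e["area"], e["flat_type"], e["ts"]), e) for e in events]
        self.rows = sorted(rows, key=lambda r: r[0])
        self.keys = [k for k, _ in self.rows]

    def range_scan(self, start_key, stop_key):
        """Return events with start_key <= key < stop_key, reading only that slice."""
        lo = bisect_left(self.keys, start_key)
        hi = bisect_left(self.keys, stop_key)
        return [e for _, e in self.rows[lo:hi]]

    def full_scan(self, predicate):
        """Fallback: iterate over every row and filter on the client side."""
        return [e for _, e in self.rows if predicate(e)]


NOW = 10_000_000  # hypothetical "current time" in epoch seconds

# Hypothetical sample events
EVENTS = [
    {"user": "u1", "area": "area_a", "flat_type": "1BHK", "property": "p1", "ts": NOW - 600},
    {"user": "u2", "area": "area_a", "flat_type": "1BHK", "property": "p1", "ts": NOW - 7200},
    {"user": "u3", "area": "area_a", "flat_type": "2BHK", "property": "p2", "ts": NOW - 300},
    {"user": "u4", "area": "area_b", "flat_type": "1BHK", "property": "p3", "ts": NOW - 100},
]

store = SortedStore(EVENTS)

# Query 1 fits the key: 1 BHK visits in area_a during the last hour.
recent = store.range_scan(
    row_key("area_a", "1BHK", NOW - 3600),
    row_key("area_a", "1BHK", NOW + 1),
)

# A query by property does not fit the key, so every row must be examined.
by_property = store.full_scan(lambda e: e["property"] == "p1")
```

Supporting the property-based query efficiently in this model would need a second key layout or a secondary index, which is the trade-off behind the question above.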

On 23 July 2015 at 01:02, Jörn Franke <> wrote:

> Can you provide an example of an AND query? If you only do look-ups, you
> should try HBase/Phoenix; otherwise you can try ORC with storage indexes
> and/or compression, but this depends on what your queries look like.
> On Wed, 22 Jul 2015 at 14:48, Jeetendra Gangele <> wrote:
>> Hi All,
>> I have data in MongoDB (a few TBs) which I want to migrate to HDFS to run
>> complex analytical queries on it, such as AND queries involving
>> multiple fields.
>> So my question is: in which format should I store the data in HDFS so
>> that processing will be fast for this kind of query?
>> Regards
>> Jeetendra



