sqoop-dev mailing list archives

From Jagat <jagatsi...@gmail.com>
Subject Re: Is my Use Case possible with Hive?
Date Mon, 14 May 2012 15:13:34 GMT

I would attack the problem as follows:

Do as many of the aggregation calculations as possible inside Pig, then
store the final results in a table that you query with Hive.

Partition the Hive tables, choosing the partition columns based on the
WHERE clauses of your queries.

Use indexes to improve performance further.
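
As a rough illustration of the partitioning and indexing suggestions, here is a minimal HiveQL sketch. The table name (sales_summary), the column names, and the index name are hypothetical and do not come from the original question. It assumes the queries filter on a date column, and that the Hive version in use still supports CREATE INDEX (the feature was removed in Hive 3.0).

```sql
-- Hypothetical summary table holding pre-aggregated results.
-- The partition column is chosen because the queries are assumed
-- to filter on it in their WHERE clause.
CREATE TABLE sales_summary (
  region       STRING,
  total_amount DOUBLE
)
PARTITIONED BY (sale_date STRING);

-- A filter on the partition column lets Hive read only the matching
-- partition instead of scanning the whole table.
SELECT region, total_amount
FROM sales_summary
WHERE sale_date = '2012-05-01';

-- Optional compact index on a non-partition column that is filtered often.
-- WITH DEFERRED REBUILD creates the index empty; REBUILD populates it.
CREATE INDEX idx_sales_region
ON TABLE sales_summary (region)
AS 'COMPACT' WITH DEFERRED REBUILD;

ALTER INDEX idx_sales_region ON sales_summary REBUILD;
```

Whether the index helps depends on the data and the queries, so it is worth measuring before and after. Partition pruning is usually the larger gain when the WHERE clause matches the partition column.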

Sent from mobile, short and crisp.
On 14-May-2012 12:39 PM, "Bhavesh Shah" <bhavesh25shah@gmail.com> wrote:

> Hello all,
> My use case is:
> 1) I have a relational database (MS SQL Server) that holds a very large
> amount of data.
> 2) I want to analyze this data and generate reports from the results.
> I have to generate various reports in this way, each based on a
> different analysis.
> I tried to implement this using Hive. What I did is:
> 1) I imported all tables into Hive from MS SQL Server using Sqoop.
> 2) I wrote many Hive queries, which are executed over JDBC against the
> Hive Thrift Server.
> 3) I am getting the correct results in table form, as expected.
> 4) But the problem is that execution takes far too long.
>    (My complete program runs for about 3-4 hours on a *small
> amount of data*.)
>    I decided to do this using Hive, and as I said above, its execution
> time is the problem: my organization expects this task to complete in
> less than about half an hour.
> Having already spent so much time on a complete run of this task, what
> should I do?
> I want to ask one thing:
> *Is this use case possible with Hive?* If so, what should I change in
> my program to improve performance?
> *And if not, what is another good way to implement this use case?*
> Please reply.
> Thanks
> --
> Regards,
> Bhavesh Shah
