spark-user mailing list archives

From Zhiliang Zhu <zchl.j...@yahoo.com.INVALID>
Subject Re: the spark job is so slow - almost frozen
Date Mon, 18 Jul 2016 10:33:53 GMT
Thanks a lot for your reply.
We tried running the SQL on Kettle, on Hive, and on Spark (through HiveContext) respectively;
the job seems to freeze and never finishes.
The query reads different columns from each of the 6 tables to get specific information,
then does some simple calculations before writing the output. Joins are the most-used
operation in the SQL.
Best wishes! 

 

    On Monday, July 18, 2016 6:24 PM, Chanh Le <giaosudau@gmail.com> wrote:
 

 Hi, what about the network (bandwidth) between Hive and Spark? Did the query run in Hive
before you moved it to Spark? Because it is complex, you can use something like the EXPLAIN
command to show what is going on.
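
As a rough illustration of the EXPLAIN suggestion above, here is a minimal PySpark sketch that prefixes a query with EXPLAIN EXTENDED and returns the plan text. The helper names (`explain_statement`, `plan_text`, `main`) and the tables `table_a` and `table_b` with their columns are hypothetical placeholders, not taken from the thread, and the sketch assumes a Spark 1.x installation with Hive support where `HiveContext` is available.

```python
def explain_statement(query, extended=True):
    """Prefix a SQL query with EXPLAIN or EXPLAIN EXTENDED."""
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    # Drop surrounding whitespace and a trailing semicolon, if any.
    return "{} {}".format(keyword, query.strip().rstrip(";"))


def plan_text(sql_context, query, extended=True):
    """Run the EXPLAIN statement and return the plan as a single string.

    `sql_context` is any object whose .sql(str) returns something with
    .collect(), for example a pyspark.sql.HiveContext.
    """
    rows = sql_context.sql(explain_statement(query, extended)).collect()
    # The plan may come back as one row or as one row per line.
    return "\n".join(str(row[0]) for row in rows)


def main():
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="explain-slow-join")
    hive_ctx = HiveContext(sc)

    # Hypothetical two-table join; replace with the real query.
    query = """
        SELECT a.id, a.col_a, b.col_b
        FROM table_a a
        JOIN table_b b ON a.id = b.id
    """
    print(plan_text(hive_ctx, query))
    sc.stop()

# Call main() from a script launched with spark-submit.
```

The printed plan shows which join strategy is chosen for each pair of tables, which is one place to start looking when a join-heavy query stalls.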



 
On Jul 18, 2016, at 5:20 PM, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID> wrote:
The SQL logic in the program is very complex, so I will not describe the detailed code
here.

    On Monday, July 18, 2016 6:04 PM, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID> wrote:
 

 Hi All,
We have an application that needs to extract different columns from 6 Hive tables and
then do some simple calculations. Each table has around 100,000 rows. Finally, it needs
to write the result to another table or file (with a consistent column format).
However, after many days of trying, the Spark Hive job is unbelievably slow, sometimes almost
frozen. The Spark cluster has 5 nodes. Could anyone offer some help? Any idea or
clue would also be welcome.
Thanks in advance~
Zhiliang 

   



  