From: Zhiliang Zhu
Reply-To: Zhiliang Zhu
To: user@spark.apache.org
Date: Tue, 19 Jul 2016 03:52:27 +0000 (UTC)
Message-ID: <1118481725.1482959.1468900347670.JavaMail.yahoo@mail.yahoo.com>
Subject: the spark job is so slow during shuffle -
almost frozen

Hi All,

While referring to the Spark UI, one stage is displayed as 198/200 - almost frozen. During the shuffle stage of that task, most of the executors show 0 bytes, but just one executor shows 1 GB.

Moreover, in the several join operations, some cases are like this: one table or pair RDD has only 40 keys, but the other table has 10,000 keys...

Then, could this be diagnosed as some issue of data skew?

Any help or comment will be deeply appreciated. Thanks in advance~

--------------------------------------------------------------------------------

Here we have one application: it needs to extract different columns from 6 Hive tables and then do some easy calculation; there are around 100,000 rows in each table. Finally it needs to output another table or file (with a consistent column format).

However, after many days of trying, the Spark Hive job is unthinkably slow - sometimes almost frozen. There are 5 nodes in the Spark cluster.

Could anyone offer some help? Some idea or clue is also good.

Thanks in advance~

On Tuesday, July 19, 2016 11:05 AM, Zhiliang Zhu wrote:

Hi Mungeol,

Thanks a lot for your help. I will try that.

On Tuesday, July 19, 2016 9:21 AM, Mungeol Heo wrote:

Try to run an action at an intermediate stage of your job process, like save, insertInto, etc. Hope it can help you out.

On Mon, Jul 18, 2016 at 7:33 PM, Zhiliang Zhu wrote:
> Thanks a lot for your reply.
>
> In effect, here we tried to run the SQL on Kettle, Hive, and Spark Hive (by
> HiveContext) respectively; the job seems frozen and never finishes.
>
> In the 6 tables, we need to respectively read different columns from
> different tables for specific information, then do some simple calculation
> before output. The join operation is used most in the SQL.
>
> Best wishes!
>
>
> On Monday, July 18, 2016 6:24 PM, Chanh Le wrote:
>
> Hi,
> What about the network (bandwidth) between Hive and Spark?
> Did it run in Hive before you moved to Spark?
> Because it's complex, you can use something like the EXPLAIN command to show
> what's going on.
>
>
> On Jul 18, 2016, at 5:20 PM, Zhiliang Zhu wrote:
>
> The SQL logic in the program is very complex, so I do not describe the
> detailed code here.
>
>
> On Monday, July 18, 2016 6:04 PM, Zhiliang Zhu wrote:
>
> Hi All,
>
> Here we have one application: it needs to extract different columns from 6
> Hive tables and then do some easy calculation; there are around 100,000
> rows in each table. Finally it needs to output another table or file
> (with a consistent column format).
>
> However, after many days of trying, the Spark Hive job is unthinkably slow
> - sometimes almost frozen. There are 5 nodes in the Spark cluster.
>
> Could anyone offer some help? Some idea or clue is also good.
>
> Thanks in advance~
>
> Zhiliang
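Before tuning anything, it is worth confirming the skew numerically: count rows per join key and see whether one key dominates. In Spark this would be roughly a `groupBy` on the join key followed by `count`; the sketch below shows the same idea in plain Python. The key names, sample counts, and the `top=3` cutoff are invented for illustration, not taken from the thread.

```python
from collections import Counter

def skew_report(keys, top=3):
    """Return the most frequent join keys and their share of all rows.

    A single key holding almost all rows is the classic signature of the
    "198/200 tasks done, one executor stuck with all the data" symptom.
    """
    counts = Counter(keys)
    total = sum(counts.values())
    return [(key, n, n / total) for key, n in counts.most_common(top)]

# Invented sample: one hot key dominates, as in a skewed join.
keys = ["hot"] * 9800 + ["warm"] * 150 + ["cold"] * 50
for key, n, share in skew_report(keys):
    print(f"{key}: {n} rows ({share:.1%})")  # hot: 9800 rows (98.0%), ...
```

If the top key holds the vast majority of rows, the stuck executor is almost certainly the one processing that key's partition.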
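Since one side of the join has only about 40 keys, broadcasting the small side (a map-side join) is usually the first fix. When both sides are large, a common alternative is key salting: split each hot key across several buckets so no single partition receives all of its rows. This is a minimal sketch of the salting idea in plain Python; the data, the salt count, and the function name are illustrative assumptions, not anything from the thread.

```python
import random
from collections import defaultdict

def salted_join(big, small, num_salts=4):
    """Hash-join lists of (key, value) pairs after salting the skewed side.

    The small side is replicated once per salt value, so a hot key on the
    big side is spread across num_salts buckets instead of landing on one
    partition (the "one executor with 1 GB" symptom).
    """
    # Replicate the small side: one copy of each row per salt value.
    small_salted = defaultdict(list)
    for key, val in small:
        for salt in range(num_salts):
            small_salted[(key, salt)].append(val)

    # Salt the big side randomly and join on the (key, salt) pair.
    result = []
    for key, val in big:
        salt = random.randrange(num_salts)
        for sval in small_salted.get((key, salt), []):
            result.append((key, val, sval))
    return result

# Invented skewed input: one hot key repeated many times.
big = [("hot", i) for i in range(1000)] + [("cold", 0)]
small = [("hot", "h"), ("cold", "c")]
print(len(salted_join(big, small)))  # 1001: each big-side row matches once
```

In Spark itself, the same effect comes from broadcasting the small table (e.g. the `broadcast()` hint in Spark SQL) rather than hand-rolling salts; the salt count here would be tuned to the skew actually observed.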