spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Spark SQL with Thrift Server is very very slow and finally failing
Date Wed, 10 Jun 2015 14:44:54 GMT
Seems that Spark SQL can't retrieve table size statistics and doesn't 
enable broadcast join in your case. Would you please try `ANALYZE TABLE 
<table-name>` for both tables to generate table statistics information?
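
For example, from the same beeline session (a sketch only; I'm assuming 
the statistics statement accepted here is the Hive-style `COMPUTE 
STATISTICS noscan` form, and I'm taking the table names from the plan 
you posted):

    USE sourav_ikb_hs;
    ANALYZE TABLE ikb_fp_pog_pre_ext COMPUTE STATISTICS noscan;
    ANALYZE TABLE ikb_project_calendar_ext COMPUTE STATISTICS noscan;

Once the size of the small table is known and falls below 
spark.sql.autoBroadcastJoinThreshold (10 MB by default), the planner can 
pick a broadcast join instead of the CartesianProduct shown in your 
physical plan. One more thing to check: the query as posted (FROM 
POG_PRE_EXT P, PROJECT_CALENDAR_EXT C WHERE PROJECT_TYPE = 'CR') has no 
join predicate between P and C, so the result is a full cross product of 
the two tables. If they are meant to be joined on a key, an explicit 
condition such as P.<key> = C.<key> (whatever the actual key column is) 
would shrink the result dramatically.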

Cheng

On 6/10/15 10:26 PM, Sourav Mazumder wrote:
> Here is the physical plan.
>
>
> Also attaching the executor log from one of the executors. You can see 
> that memory consumption slowly rises and reaches around 10.5 GB. It 
> stays there for around 5 minutes (06-50-36 to 06-55-00), and then this 
> executor gets killed. The configured executor memory is 10 GB.
>
> Regards,
> Sourav
>
>
> ---------------
>
>  plan
>  ------------------------------------------------------------------------------------------------------------------------------------------
>  == Parsed Logical Plan ==
>  'Project ['IKB_PROJECT_LIVE_DT,'FLOORPLAN_NM,'FLOORPLAN_DBKEY]
>   'Filter ('IKB_PROJECT_TYPE = CR)
>    'Join Inner, None
>     'UnresolvedRelation [IKB_FP_POG_PRE_EXT], Some(P)
>     'UnresolvedRelation [IKB_PROJECT_CALENDAR_EXT], Some(C)
>
>  == Analyzed Logical Plan ==
>  Project [IKB_PROJECT_LIVE_DT#31,FLOORPLAN_NM#20,FLOORPLAN_DBKEY#17]
>   Filter (IKB_PROJECT_TYPE#29 = CR)
>    Join Inner, None
>     MetastoreRelation sourav_ikb_hs, ikb_fp_pog_pre_ext, Some(P)
>     MetastoreRelation sourav_ikb_hs, ikb_project_calendar_ext, Some(C)
>
>  == Optimized Logical Plan ==
>  Project [IKB_PROJECT_LIVE_DT#31,FLOORPLAN_NM#20,FLOORPLAN_DBKEY#17]
>   Join Inner, None
>    Project [FLOORPLAN_NM#20,FLOORPLAN_DBKEY#17]
>     MetastoreRelation sourav_ikb_hs, ikb_fp_pog_pre_ext, Some(P)
>    Project [IKB_PROJECT_LIVE_DT#31]
>     Filter (IKB_PROJECT_TYPE#29 = CR)
>      MetastoreRelation sourav_ikb_hs, ikb_project_calendar_ext, Some(C)
>
>  == Physical Plan ==
>  Project [IKB_PROJECT_LIVE_DT#31,FLOORPLAN_NM#20,FLOORPLAN_DBKEY#17]
>   CartesianProduct
>    HiveTableScan [FLOORPLAN_NM#20,FLOORPLAN_DBKEY#17], 
> (MetastoreRelation sourav_ikb_hs, ikb_fp_pog_pre_ext, Some(P)), None
>    Project [IKB_PROJECT_LIVE_DT#31]
>     Filter (IKB_PROJECT_TYPE#29 = CR)
>      HiveTableScan [IKB_PROJECT_LIVE_DT#31,IKB_PROJECT_TYPE#29], 
> (MetastoreRelation sourav_ikb_hs, ikb_project_calendar_ext, Some(C)), None
>
>  Code Generation: false
>  == RDD ==
>
> -------
>
>
>
>
> On Wed, Jun 10, 2015 at 12:59 AM, Cheng Lian <lian.cs.zju@gmail.com> wrote:
>
>     Would you mind providing the executor output so that we can check
>     why the executors died?
>
>     And you may run EXPLAIN EXTENDED to find out the physical plan of
>     your query, something like:
>
>     0: jdbc:hive2://localhost:10000> explain extended select * from foo;
>     +-------------------------------------------------------------------------+
>     |                                  plan                                   |
>     +-------------------------------------------------------------------------+
>     | == Parsed Logical Plan ==                                               |
>     | 'Project [*]                                                            |
>     |  'UnresolvedRelation [foo], None                                        |
>     |                                                                         |
>     | == Analyzed Logical Plan ==                                             |
>     | i: string                                                               |
>     | Project [i#6]                                                           |
>     |  Subquery foo                                                           |
>     |   Relation[i#6] org.apache.spark.sql.parquet.ParquetRelation2@517574b8  |
>     |                                                                         |
>     | == Optimized Logical Plan ==                                            |
>     | Relation[i#6] org.apache.spark.sql.parquet.ParquetRelation2@517574b8    |
>     |                                                                         |
>     | == Physical Plan ==                                                     |
>     | PhysicalRDD [i#6], MapPartitionsRDD[2] at                               |
>     |                                                                         |
>     | Code Generation: false                                                  |
>     | == RDD ==                                                               |
>     +-------------------------------------------------------------------------+
>
>     On 6/10/15 1:28 PM, Sourav Mazumder wrote:
>
>>     From the log file I noticed that the ExecutorLostFailure happens
>>     after the memory used by the executor exceeds the configured
>>     executor memory. However, even if I increase the executor memory
>>     the executor still fails; it just takes longer.
>>
>>     I'm wondering why, for joining 2 Hive tables, one with 100 MB of
>>     data (around 1 M rows) and another with 20 KB of data (around 100
>>     rows), an executor is consuming so much memory. Even if I
>>     increase the memory to 20 GB, the same failure happens.
>>
>>     Regards,
>>     Sourav
>>
>>     On Tue, Jun 9, 2015 at 12:58 PM, Sourav Mazumder
>>     <sourav.mazumder00@gmail.com> wrote:
>>
>>         Hi,
>>
>>         I'm just doing a select statement which is supposed to return
>>         10 MB data maximum. The driver memory is 2G and executor
>>         memory is 20 G.
>>
>>         The query I'm trying to run is something like
>>
>>         SELECT PROJECT_LIVE_DT, FLOORPLAN_NM, FLOORPLAN_DB_KEY
>>         FROM POG_PRE_EXT P, PROJECT_CALENDAR_EXT C
>>         WHERE PROJECT_TYPE = 'CR'
>>
>>         Not sure what exactly you mean by physical plan.
>>
>>         Here is the stack trace from the machine where the thrift
>>         process is running.
>>
>>         Regards,
>>         Sourav
>>
>>         On Mon, Jun 8, 2015 at 11:18 PM, Cheng, Hao
>>         <hao.cheng@intel.com> wrote:
>>
>>             Is it the large result set returned from the Thrift Server?
>>             And can you paste the SQL and physical plan?
>>
>>             *From:* Ted Yu [mailto:yuzhihong@gmail.com]
>>             *Sent:* Tuesday, June 9, 2015 12:01 PM
>>             *To:* Sourav Mazumder
>>             *Cc:* user
>>             *Subject:* Re: Spark SQL with Thrift Server is very very
>>             slow and finally failing
>>
>>             Which Spark release are you using ?
>>
>>             Can you pastebin the stack trace w.r.t. ExecutorLostFailure ?
>>
>>             Thanks
>>
>>             On Mon, Jun 8, 2015 at 8:52 PM, Sourav Mazumder
>>             <sourav.mazumder00@gmail.com> wrote:
>>
>>                 Hi,
>>
>>                 I am trying to run a SQL form a JDBC driver using
>>                 Spark's Thrift Server.
>>
>>                 I'm doing a join between a Hive table of size around
>>                 100 GB and another Hive table of 10 KB, with a
>>                 filter on a particular column.
>>
>>                 The query takes more than 45 minutes and then I get
>>                 ExecutorLostFailure. That seems to be memory related,
>>                 as when I increase the memory the failure still
>>                 happens, just after a longer time.
>>
>>                 I'm using executor memory 20 GB, Spark driver memory
>>                 2 GB, executor instances 2 and executor cores 2.
>>
>>                 Running the job using YARN with master as 'yarn-client'.
>>
>>                 Any idea if I'm missing any other configuration?
>>
>>                 Regards,
>>
>>                 Sourav
>>
>>
>>
>
>

