spark-dev mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: Spark driver main thread hanging after SQL insert
Date Fri, 02 Jan 2015 17:13:38 GMT
Hi Alessandro,

Can you create a JIRA for this rather than reporting it on the dev
list? That's where we track issues like this. Thanks!

- Patrick

On Wed, Dec 31, 2014 at 8:48 PM, Alessandro Baretta
<alexbaretta@gmail.com> wrote:
> Here's what the console shows:
>
> 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0,
> whose tasks have all completed, from pool
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at
> ParquetTableOperations.scala:326) finished in 5493.549 s
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at
> ParquetTableOperations.scala:326, took 5493.747061 s
>
> It is now 01:40:03, so the driver has been hanging for the last 28 minutes.
> The web UI on the other hand shows that all tasks completed successfully,
> and the output directory has been populated--although the _SUCCESS file is
> missing.
>
> It is worth noting that my code started this job as its own thread. The
> actual code looks like the following snippet, modulo some simplifications.
>
>   def save_to_parquet(allowExisting: Boolean = false) = {
>     val threads = tables.map(table => {
>       val thread = new Thread {
>         override def run(): Unit = {
>           table.insertInto(table.table_name)
>         }
>       }
>       thread.start()
>       thread
>     })
>     threads.foreach(_.join())
>   }
>
> As far as I can see, the insertInto call never returns. Any idea why?
>
> Alex
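[Editor's note: a minimal, self-contained sketch of how the join step in the snippet above could use a bounded wait, so a hung worker thread is reported (with a stack dump) rather than blocking the driver indefinitely. This is not Spark code; `runAll` and the `insert-` thread names are illustrative, and the Runnable stands in for the real insertInto call.]

```scala
object JoinWithTimeout {
  // Start each named unit of work in its own thread, then join with a
  // timeout. Returns (name, finished) pairs so a stuck thread becomes
  // visible instead of hanging the caller forever.
  def runAll(work: Seq[(String, Runnable)], timeoutMs: Long): Seq[(String, Boolean)] = {
    val threads = work.map { case (name, body) =>
      val t = new Thread(body, s"insert-$name")
      t.start()
      (name, t)
    }
    threads.map { case (name, t) =>
      t.join(timeoutMs) // bounded wait instead of an unbounded join()
      if (t.isAlive)
        // Still running after the timeout: dump its stack to see where
        // it is blocked.
        t.getStackTrace.foreach(frame => println(s"$name at $frame"))
      (name, !t.isAlive)
    }
  }

  def main(args: Array[String]): Unit = {
    val fast: Runnable = () => Thread.sleep(50) // stand-in for the real work
    runAll(Seq("a" -> fast, "b" -> fast), timeoutMs = 5000L)
      .foreach { case (name, ok) => println(s"$name finished: $ok") }
  }
}
```

With a pattern like this, a hung insertInto would show up as a stack trace in the driver log after the timeout, which narrows down whether the thread is blocked inside Spark or in user code.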

