spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aram Mkrtchyan <aram.mkrtchyan...@gmail.com>
Subject Parallel actions from driver
Date Thu, 26 Mar 2015 15:54:51 GMT
Hi.

I'm trying to trigger DataFrame's save method in parallel from my driver.
For that purposes I use ExecutorService and Futures, here's my code:


val futures = [1,2,3].map( t => pool.submit( new Runnable {

override def run(): Unit = {
    val commons = events.filter(_._1 == t).map(_._2.common)
    saveAsParquetFile(sqlContext, commons, s"$t/common")
    EventTypes.all.foreach { et =>
        val eventData = events.filter(ev => ev._1 == t && ev._2.eventType
== et).map(_._2.data)
        saveAsParquetFile(sqlContext, eventData, s"$t/$et")
    }
}

}))
futures.foreach(_.get)

It throws "Task is not Serializable" exception. Is it legal to use threads
in driver to trigger actions?

Mime
View raw message