spark-dev mailing list archives

From "Haopu Wang" <>
Subject RE: Should I avoid "state" in an Spark application?
Date Mon, 13 Jun 2016 01:11:45 GMT
Can someone look at my questions? Thanks again!



From: Haopu Wang 
Sent: June 12, 2016 16:40
Subject: Should I avoid "state" in an Spark application?


I have a Spark application whose structure is below:


    var ts: Long = 0L


        (x, time) => {

            ts = time





    process_data(dstream2, ts, ......)


I assume the foreachRDD call can update the "ts" variable, which is then used in the Spark
tasks of the "process_data" function.
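
A minimal, self-contained sketch of the pattern described above follows. It is
only an illustration under assumptions: the socket sources, the names
dstream1 and dstream2, the local master setting, and the body standing in for
"process_data" are all hypothetical and not taken from the real application.
It assumes the two-argument form of foreachRDD, whose function receives the
RDD and the batch Time.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}

object TsSketch {
  def main(args: Array[String]): Unit = {
    // Two socket receivers each occupy a core, so leave extra cores for processing.
    val conf = new SparkConf().setAppName("ts-sketch").setMaster("local[4]")
    val ssc  = new StreamingContext(conf, Seconds(5))

    // Hypothetical inputs; the original message does not say where the streams come from.
    val dstream1 = ssc.socketTextStream("localhost", 9998)
    val dstream2 = ssc.socketTextStream("localhost", 9999)

    // Mutable state that lives in the driver program.
    var ts: Long = 0L

    // The function passed to foreachRDD runs on the driver once per batch,
    // so this assignment updates the driver-side variable.
    dstream1.foreachRDD { (rdd: RDD[String], time: Time) =>
      ts = time.milliseconds
    }

    // Hypothetical stand-in for process_data: read the state on the driver,
    // copy it into a local val, and let the task closure capture that copy.
    dstream2.foreachRDD { (rdd: RDD[String]) =>
      val current = ts
      rdd.map(line => (current, line)).take(10).foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

In this sketch the tasks only ever see the value copied into "current" when the
batch's job is set up on the driver; they never write back to "ts".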


From my tests on a standalone Spark cluster, it works. But should I be concerned
if I switch to YARN?


I have also seen articles recommending that state be avoided in Scala programming. Without the
state variable, how could this be done?


Any comments or suggestions are appreciated.



