spark-user mailing list archives

From Ali Gouta <>
Subject Re: Spark Streaming - print accumulators value every period as logs
Date Fri, 25 Dec 2015 08:47:53 GMT
Something like:

    dstream.foreachRDD { rdd =>
      rdd.count() // force an action so the accumulator gets populated
      println(accum.value)
    }

should answer your question. This way you get the value printed at each batch interval.
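To make that concrete, here is a minimal hedged sketch of the foreachRDD pattern for printing and then resetting an accumulator once per batch. The names (ssc, lines, errorCount, the socket source, and the batch duration) are illustrative assumptions, not from the original thread; the pattern uses the Spark 1.x accumulator API that was current at the time:

    import org.apache.spark.SparkContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Assuming an existing SparkContext `sc`:
    val ssc = new StreamingContext(sc, Seconds(10)) // illustrative batch interval
    val errorCount = sc.accumulator(0L, "errors per batch")

    val lines = ssc.socketTextStream("localhost", 9999) // illustrative source

    lines.foreachRDD { rdd =>
      // The accumulator is updated on the executors during the action.
      rdd.foreach { line =>
        if (line.contains("ERROR")) errorCount += 1L
      }
      // This println runs on the driver once per batch, after the action
      // above completes, so the accumulator value is current here and can
      // go through log4j instead of stdout if preferred.
      println(s"[batch] errors this period: ${errorCount.value}")
      // Reset on the driver so the next batch reports only its own counts.
      errorCount.setValue(0L)
    }

    ssc.start()
    ssc.awaitTermination()

Note the ordering: the per-batch print and reset happen inside foreachRDD, not once at graph-construction time, which is why the value is no longer empty.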

Ali Gouta
On Dec 25, 2015 04:22, "Roberto Coluccio" <> wrote:

> Hello,
> I have a batch driver and a streaming driver using the same functions
> (Scala). I use accumulators (passed to the functions' constructors) to
> count stuff.
> In the batch driver, by doing so at the right point of the pipeline, I'm
> able to retrieve the accumulator value and print it as a log4j log.
> In the streaming driver, doing the same prints nothing. That's probably
> because accumulators in the streaming driver are created empty, and the
> code that prints them is executed only once on the driver (while they are
> still empty), when the StreamingContext is started and the DAG is created.
> I'm looking for a way to log the current value of my accumulators at every
> batch period of my Spark Streaming driver. Indeed, I wish to reset those
> accumulators at each period so that I only have the counts related to that
> period.
> Any advice would be really appreciated.
> Thanks,
> Roberto
