spark-user mailing list archives

From salemi <alireza.sal...@udo.edu>
Subject Spark Streaming - how to implement multiple calculations using the same data set
Date Tue, 02 Sep 2014 21:54:49 GMT
Hi,

I am planning to use an incoming DStream and calculate different measures from
the same stream.

I was able to calculate the individual measures separately, and now I have
to merge them, but Spark Streaming doesn't support outer joins yet.


handlingTimePerWorker: (workerId, handlingTime)
fileProcessedCountPerWorker: (workerId, filesProcessedCount)
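
For reference, here is roughly how I compute the two streams separately today
(a minimal sketch only; the Event case class, the socket source, and the
parsing are placeholders for my actual input):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.StreamingContext._

    // Placeholder record type; my real events are parsed from the incoming stream.
    case class Event(workerId: String, handlingTime: Long)

    val conf = new SparkConf().setAppName("worker-measures").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Placeholder source; the real DStream comes from our ingestion pipeline.
    val events = ssc.socketTextStream("localhost", 9999).map { line =>
      val Array(id, time) = line.split(",")
      Event(id, time.toLong)
    }

    // The two measures, each calculated on its own today:
    val handlingTimePerWorker = events
      .map(e => (e.workerId, e.handlingTime))
      .reduceByKey(_ + _)

    val fileProcessedCountPerWorker = events
      .map(e => (e.workerId, 1L))
      .reduceByKey(_ + _)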

Is there a design pattern that allows me to use each RDD in the DStream,
calculate the measures for each worker, and save the attributes in the same
object (Worker)?
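
To make the goal concrete, this is roughly what I would like to end up with,
reusing the events stream from the sketch above (again only a sketch; Worker
is just a holder class I would define myself). Both measures are combined in
a single pass per batch, so no outer join is needed:

    case class Worker(workerId: String, handlingTime: Long, filesProcessed: Long)

    // Aggregate both measures per worker in a single reduce instead of
    // joining the two separately computed streams.
    val workers = events
      .map(e => (e.workerId, (e.handlingTime, 1L)))
      .reduceByKey { case ((t1, c1), (t2, c2)) => (t1 + t2, c1 + c2) }
      .map { case (id, (time, count)) => Worker(id, time, count) }

    // Each batch then yields one Worker record per workerId with both attributes set.
    workers.foreachRDD { rdd =>
      rdd.foreach(println)  // or persist the combined records
    }

    ssc.start()
    ssc.awaitTermination()

Is this a reasonable way to do it, or is there a better pattern?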





--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-how-to-implement-multiple-calculation-using-the-same-data-set-tp13306.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

