spark-user mailing list archives
From salemi <>
Subject Spark Streaming - how to implement multiple calculation using the same data set
Date Tue, 02 Sep 2014 21:54:49 GMT

I am planning to consume an incoming DStream and calculate different measures from
the same stream.

I was able to calculate the individual measures separately; now I have
to merge them, and Spark Streaming doesn't support outer join yet.

handlingTimePerWorker: (workerId, handlingTime)
fileProcessedCountPerWorker: (workerId, filesProcessedCount)

Is there a design pattern that allows using each RDD in the DStream to
calculate the measures per worker and store the resulting attributes in the
same object (Worker)?
