spark-user mailing list archives

From aecc <alessandroa...@gmail.com>
Subject data within batchduration in RDD of a Dstream reliable?
Date Thu, 23 Jan 2014 20:12:19 GMT
Hi.

I know that every RDD received in a DStream is replicated to 2 nodes by
default. However, if I choose a large batchDuration (let's say 5 min), is
the data received in the stream during that interval also stored reliably?
If so, how? As far as I know, it is the RDDs that are stored reliably (once
an RDD has its complete data for the batchDuration).
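
To make the setup concrete, here is a minimal sketch of the kind of
configuration the question is about. It is an illustration only: the socket
source, host, port, master and application name are placeholders assumed for
the example, not details of an actual job. It does not answer the question; it
just shows where the batchDuration and the 2-node replication are specified.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Minutes, StreamingContext}

object LongBatchExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical application name and master, chosen for illustration.
    val conf = new SparkConf()
      .setAppName("LongBatchExample")
      .setMaster("local[2]")

    // batchDuration of 5 minutes, as in the question.
    val ssc = new StreamingContext(conf, Minutes(5))

    // Placeholder source (localhost:9999). The "_2" suffix of the storage
    // level requests that received data be replicated to two nodes.
    val lines = ssc.socketTextStream(
      "localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)

    // Count the records in each 5-minute batch and print the result.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The open point is what happens to the records that arrive during the 5 minutes
before a batch is complete.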



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/data-within-batchduration-in-RDD-of-a-Dstream-reliable-tp835.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
