spark-user mailing list archives

From Dominik Safaric <>
Subject Spark Streaming fault tolerance benchmark
Date Sat, 13 Aug 2016 14:50:12 GMT
A few months ago, I started an empirical research project investigating
several stream processing engines, including but not limited to Spark
Streaming.

As the benchmark should extend beyond performance metrics such as
throughput and latency, I have focused on fault tolerance as well, in
particular the rate of data items lost due to various faults. In this
context, I have the following questions.

If the driver fails, all executors and their in-memory blocks are lost.
State can, however, be persisted to HDFS, for example via checkpointing or
synchronous write-ahead logs. But what happens to blocks that have been
received by a receiver but not yet processed? Assume an unreliable input
source, i.e. one that cannot replay data on request.

Secondly, to simulate this scenario and determine whether, and how many,
data items were lost, how exactly can I trigger a driver failure after the
application has been running for a certain amount of time?
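For reference, this is roughly how I submit the application so that driver
recovery is possible at all; a sketch only, assuming a standalone cluster,
and the class name, jar name, and checkpoint path are placeholders:

```shell
# Sketch (assumptions: standalone cluster, HDFS checkpoint directory,
# placeholder app jar and main class). Checkpointing itself must also be
# enabled in the application code via ssc.checkpoint(<dir>).
spark-submit \
  --deploy-mode cluster \
  --supervise \
  --conf spark.streaming.receiver.writeAheadLog.enable=true \
  --class com.example.Benchmark \
  benchmark.jar hdfs://namenode:8020/checkpoints
```

With `--supervise`, the standalone master restarts the driver after a
failure, so losses during recovery become observable.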
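What I have considered so far for the failure itself is simply killing the
driver JVM from the outside; a sketch, assuming standalone cluster deploy
mode, where the driver runs as a DriverWrapper JVM on one of the workers:

```shell
# Sketch: fail the driver after the app has been running for 5 minutes.
# Assumes standalone cluster mode, where the driver JVM's main class is
# org.apache.spark.deploy.worker.DriverWrapper; run this on the driver host.
sleep 300
kill -9 "$(pgrep -f org.apache.spark.deploy.worker.DriverWrapper | head -n 1)"
```

Is an external `kill -9` like this a representative way to simulate driver
failure, or is there a recommended mechanism?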

Thirdly, what other fault tolerance scenarios might result in data items
being lost? I have also looked into back-pressure, but unlike Storm, which
uses a fail-fast back-pressure strategy, Spark Streaming handles
back-pressure gracefully.
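For completeness, the back-pressure behaviour I am referring to is the
rate-control mechanism that is disabled by default; the `maxRate` value
below is just an example figure:

```shell
# Back-pressure (dynamic receiver rate control, Spark 1.5+) is off by
# default and enabled via configuration; a static cap is also available.
--conf spark.streaming.backpressure.enabled=true
--conf spark.streaming.receiver.maxRate=10000   # records/sec, example value
```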

Thanks a lot in advance!
