spark-dev mailing list archives

From Emre Sevinc
Subject Which method do you think is better for making MIN_REMEMBER_DURATION configurable?
Date Wed, 08 Apr 2015 08:44:43 GMT

This is about SPARK-3276: I want to make MIN_REMEMBER_DURATION (currently a
constant) configurable, with a default value. Before spending effort on an
implementation and a pull request, I wanted to consult the core developers
to see which approach makes the most sense and has the higher probability
of being accepted.

The constant MIN_REMEMBER_DURATION can be seen at:

It is marked as a private member of a private[streaming] object.

Approach 1: Make MIN_REMEMBER_DURATION a variable, renamed to
minRememberDuration, and then add a new fileStream method to
JavaStreamingContext.scala:

such that the new fileStream method accepts a new parameter, e.g.
minRememberDuration: Int (in seconds), and then uses this value to set the
private minRememberDuration.
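
A rough sketch of the shape Approach 1 could take, in plain Scala without
any Spark dependency. FileStreamSketch and StreamingContextSketch are
hypothetical stand-ins, not Spark's real classes, and the 60-second default
is assumed from the value mentioned in this message:

```scala
// Hypothetical stand-in for the file input stream that holds the setting.
final case class FileStreamSketch(directory: String, minRememberDurationSeconds: Int)

// Hypothetical stand-in for the streaming context; not Spark's real class.
object StreamingContextSketch {
  // Assumed default of 60 seconds.
  val DefaultMinRememberDurationSeconds: Int = 60

  // Existing-style entry point: callers that pass nothing keep the default.
  def fileStream(directory: String): FileStreamSketch =
    fileStream(directory, DefaultMinRememberDurationSeconds)

  // New overload: the caller supplies minRememberDuration in seconds.
  def fileStream(directory: String, minRememberDuration: Int): FileStreamSketch =
    FileStreamSketch(directory, minRememberDuration)
}
```

The point of the sketch is only that the new parameter becomes part of the
public method signature, so every caller who wants a non-default value has
to pass it explicitly.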

Approach 2: Create a new, public Spark configuration property, e.g. named
spark.rememberDuration.min (with a default value of 60 seconds), and then
set the private variable minRememberDuration to the value of this Spark
property.
Approach 1 would mean adding a new method to the public API, Approach 2
would mean creating a new public Spark property. Right now, approach 2
seems more straightforward and simpler to me, but nevertheless I wanted to
have the opinions of other developers who know the internals of Spark
better than I do.

Kind regards,
Emre Sevinç
