spark-user mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: Batchdurationmillis seems "sticky" with direct Spark streaming
Date Wed, 09 Sep 2015 18:44:41 GMT
See inline.

On Tue, Sep 8, 2015 at 9:02 PM, Dmitry Goldenberg <dgoldenberg123@gmail.com>
wrote:

> What's wrong with creating a checkpointed context??  We WANT
> checkpointing, first of all.  We therefore WANT the checkpointed context.
>
> Second of all, it's not true that we're loading the checkpointed context
> independent of whether params.isCheckpointed() is true.  I'm quoting the
> code again:
>
> // This is NOT loading a checkpointed context if isCheckpointed() is false.
> JavaStreamingContext jssc = params.isCheckpointed()
>     ? createCheckpointedContext(sparkConf, params)
>     : createContext(sparkConf, params);
>
>   private JavaStreamingContext createCheckpointedContext(SparkConf sparkConf, Parameters params) {
>     JavaStreamingContextFactory factory = new JavaStreamingContextFactory() {
>       @Override
>       public JavaStreamingContext create() {
>         return createContext(sparkConf, params);
>       }
>     };
>     return *JavaStreamingContext.getOrCreate(params.getCheckpointDir(), factory);*
>
^^^^^   When you use getOrCreate and a valid checkpoint exists, it will
always return the context restored from the checkpoint and will not call
the factory. A simple way to see what's going on is to print something in
the factory and verify whether it is ever called.
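
To make the point concrete, here is a minimal sketch of that behavior. It is a toy model, not Spark's implementation: ToyContext, the in-memory checkpoints map, start(), and the /tmp/ckpt path are all hypothetical stand-ins, and the toy saves the "checkpoint" as soon as the context is created, which is a simplification. It only illustrates why a batch duration changed between runs appears to stick once a checkpoint exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class GetOrCreateSketch {
    // Hypothetical stand-in for a streaming context; only the batch interval matters here.
    static final class ToyContext {
        final long batchMillis;
        ToyContext(long batchMillis) { this.batchMillis = batchMillis; }
    }

    // Hypothetical stand-in for checkpoint storage, keyed by checkpoint directory.
    static final Map<String, ToyContext> checkpoints = new HashMap<>();

    // Counts how often the factory body actually runs.
    static int factoryCalls = 0;

    // Models the behavior described above: if a checkpoint exists for the
    // directory, return the restored context and skip the factory entirely.
    static ToyContext getOrCreate(String dir, Supplier<ToyContext> factory) {
        ToyContext restored = checkpoints.get(dir);
        if (restored != null) {
            return restored;
        }
        ToyContext created = factory.get();
        checkpoints.put(dir, created); // simplification: save immediately
        return created;
    }

    // Mirrors the shape of createCheckpointedContext, with a print in the factory.
    static ToyContext start(String dir, long batchMillis) {
        return getOrCreate(dir, () -> {
            factoryCalls++;
            System.out.println("factory called, batchMillis=" + batchMillis);
            return new ToyContext(batchMillis);
        });
    }

    public static void main(String[] args) {
        ToyContext first = start("/tmp/ckpt", 5000);   // no checkpoint yet: factory runs
        ToyContext second = start("/tmp/ckpt", 10000); // checkpoint exists: factory skipped
        System.out.println(first.batchMillis + " " + second.batchMillis + " " + factoryCalls);
    }
}
```

Running main prints the "factory called" line once, followed by "5000 5000 1": the second start() asks for 10000 ms but gets back the context that was saved with 5000 ms. In the real application, the same kind of print statement inside JavaStreamingContextFactory.create() would show whether the factory is being bypassed.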

>   }
>
>   private JavaStreamingContext createContext(SparkConf sparkConf, Parameters params) {
>     // Create context with the specified batch interval, in milliseconds.
>     JavaStreamingContext jssc = new JavaStreamingContext(sparkConf,
>         Durations.milliseconds(params.getBatchDurationMillis()));
>     // Set the checkpoint directory, if we're checkpointing
>     if (params.isCheckpointed()) {
>       jssc.checkpoint(params.getCheckpointDir());
>
>     }
> ...............
> Again, this is *only* calling context.checkpoint() if isCheckpointed() is
> true.  And we WANT it to be true.
>
> What am I missing here?
>
>
>
