spark-user mailing list archives

From: Sean Owen
Subject: Re: Streaming: which code is (not) executed at every batch interval?
Date: Tue, 04 Nov 2014 20:10:31 GMT

On Tue, Nov 4, 2014 at 8:02 PM, spr wrote:
> To state this another way, it seems like there's no way to straddle the
> streaming world and the non-streaming world;  to get input from both a
> (vanilla, Linux) file and a stream.  Is that true?
> If so, it seems I need to turn my (vanilla file) data into a second stream.

Hm, why do you say that? Nothing prevents that at all. You can do
anything you like in your local code, or in the functions you send to
remote workers. (Of course, if those functions depend on a local file,
that file has to exist locally on each worker.) You do have to think
about the distributed model here, but what executes locally versus
remotely isn't mysterious: only the functions you pass to Spark API
methods are executed remotely; everything else runs locally in your
driver program.
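
As a rough illustration of that split, here is a plain-Python analogy. It is not Spark code: FakeStream is a hypothetical stand-in for a stream, and the file contents, load_known_users, and tag are made up for the example. The sketch only assumes that a map-style call records a function now and applies it to each batch later.

```python
import os
import tempfile


class FakeStream:
    """Hypothetical stand-in for a stream; NOT a Spark class."""

    def __init__(self, func=None):
        self.func = func

    def map(self, func):
        # The function is only recorded here; it runs later, per batch.
        return FakeStream(func)

    def run_batch(self, records):
        # Stands in for what happens at every batch interval.
        return [self.func(r) for r in records]


# --- Ordinary local code: executed once, when the program starts ---
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("alice\nbob\n")  # made-up contents of the "vanilla" file
    path = fh.name

setup_runs = 0


def load_known_users(p):
    """Plain file read; nothing stream-specific about it."""
    global setup_runs
    setup_runs += 1
    with open(p) as f:
        return {line.strip() for line in f if line.strip()}


known_users = load_known_users(path)
os.remove(path)


# --- Function handed to the stream: executed for every record ---
def tag(record):
    # Uses the data loaded above, captured from the enclosing scope.
    return (record, record in known_users)


stream = FakeStream().map(tag)

batch1 = stream.run_batch(["alice", "carol"])
batch2 = stream.run_batch(["bob"])
print(batch1, batch2, setup_runs)
```

The file is read once, while tag runs for every record of every batch. In real Spark the function and the data it captures would be shipped to the workers, so the captured set would need to be serializable; for larger data, a broadcast variable (SparkContext.broadcast) is one option to consider.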
