spark-user mailing list archives

From Sidd S <>
Subject Re: Combine code for RDD and DStream
Date Mon, 03 Aug 2015 20:43:59 GMT
DStream's "transform" function helps me solve this issue elegantly. Thanks!
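
For anyone finding this thread later, here is a rough sketch of how the transform approach might be laid out. The object names (SharedTransformations, BatchJob, StreamingJob) and the filter/map steps are made up for illustration and are not the actual code; the only Spark API relied on is DStream.transform, which applies an RDD-to-RDD function to the RDD of each batch.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical "third file": every transformation is written once, against RDDs.
object SharedTransformations {
  // Placeholder steps for illustration only; the real filter/map chain goes here.
  def pipeline(rdd: RDD[Int]): RDD[Int] =
    rdd.filter(i => i % 2 == 0).map(i => i * 2)
}

// Hypothetical batch program: calls the shared function on its input RDD directly.
object BatchJob {
  def run(input: RDD[Int]): RDD[Int] =
    SharedTransformations.pipeline(input)
}

// Hypothetical streaming program: transform runs the same function on each batch's RDD.
object StreamingJob {
  def run(input: DStream[Int]): DStream[Int] =
    input.transform((rdd: RDD[Int]) => SharedTransformations.pipeline(rdd))
}
```

With a layout like this, a change to pipeline would presumably take effect in both programs, since neither one defines any transformations of its own.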

On Mon, Aug 3, 2015 at 1:42 PM, Sidd S <> wrote:

> Hello!
> I am developing a Spark program that uses both batch and streaming
> (separately). They are both pretty much the exact same programs, except the
> inputs come from different sources. Unfortunately, RDDs and DStreams
> define all of their transformations in their own files, and so I have two
> different files with pretty much the exact same code. If I make a change to
> a transformation in one program, I have to make the exact same change to
> the other program. It would be nice to be able to have a third file that
> has all of my transformations. The batch program and the streaming program
> can then both reference this third file to know what transformations to
> perform on the data.
> Anyone know a good way of doing this? I want to be able to keep the exact
> same syntax (......rdd.filter({i:Int=>i*2}.map(.......).....) in this third
> file. With this method, if I make any changes to the transformations, it
> will apply to both the batch AND streaming processes. I tried a couple of
> ideas to no avail.
> Thanks in advance,
> Sidd
