Hi,
  The current implementation of the map function in Spark Streaming looks as below:

def map[U: ClassTag](mapFunc: T => U): DStream[U] = {
  new MappedDStream(this, context.sparkContext.clean(mapFunc))
}
It creates an instance of MappedDStream, which is a subclass of DStream.
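For context, MappedDStream itself is a thin wrapper: its compute method just pulls the parent's RDD for the batch time and applies the function. A rough sketch of its shape (simplified from the Spark source; method bodies here are my paraphrase, not an exact copy):

def map[U: ClassTag](mapFunc: T => U): DStream[U] = {
  new MappedDStream(this, context.sparkContext.clean(mapFunc))
}

// Sketch of MappedDStream: for each batch interval, compute the
// parent's RDD (if any) and map the user function over it.
class MappedDStream[T: ClassTag, U: ClassTag](
    parent: DStream[T],
    mapFunc: T => U
  ) extends DStream[U](parent.ssc) {

  override def dependencies: List[DStream[_]] = List(parent)

  override def slideDuration: Duration = parent.slideDuration

  override def compute(validTime: Time): Option[RDD[U]] = {
    parent.getOrCompute(validTime).map(_.map[U](mapFunc))
  }
}

This is essentially the same per-batch logic that transform would run, which is what prompts the question below.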

The same function can also be implemented using the transform API:

def map[U: ClassTag](mapFunc: T => U): DStream[U] =
  this.transform(rdd => {
    rdd.map(mapFunc)
  })

Both implementations look the same. If they are, is there any advantage to having a subclass of DStream? Why can't we just use the transform API?


Regards,
Madhukara Phatak
http://datamantra.io/