spark-user mailing list archives

From madhu phatak <>
Subject MappedStream vs Transform API
Date Mon, 16 Mar 2015 08:31:55 GMT
The current implementation of the map function in Spark Streaming looks like this:

  def map[U: ClassTag](mapFunc: T => U): DStream[U] = {
    new MappedDStream(this, context.sparkContext.clean(mapFunc))
  }

It creates an instance of MappedDStream, which is a subclass of DStream.

The same function can also be implemented using the transform API:

  def map[U: ClassTag](mapFunc: T => U): DStream[U] =
    this.transform(rdd => {
      rdd.map(mapFunc)
    })

Both implementations look the same. If they are the same, is there any
advantage to having a subclass of DStream? Why can't we just use the
transform API?

Madhukara Phatak
