spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Snively <>
Subject Re: Roadblock with Spark 0.8.0 ActorStream
Date Thu, 03 Oct 2013 16:45:52 GMT
Hi Prahant,

First, thanks so much for taking the time to investigate this!

On Oct 3, 2013, at 3:59 AM, Prashant Sharma wrote:

> Trying to replicate it following change to ActorWordCount can reproduce it.
>  lines.flatMap(_.split("\\s+")).map(x => (x, 1)).reduceByKey(_ + _).foreach {
>       x => x foreach { x1 => con ! x1 }
>     }
> While this works
>  lines.flatMap(_.split("\\s+")).map(x => (x, 1)).reduceByKey(_ + _).foreach {
>       x => con ! x.collect()
>     }

That's very interesting. So this appears to be a general issue with ActorStream, then?

> There is another important thing, for now it is not possible to pass ActorRef while creating
a props instance for actorStream. The reason is, it serializes the actorRef and sends it to
runJob(which is dumb to not have actor system).

I can see why it wouldn't be obvious that runJob() needs to have an ActorSystem in scope,
but it does seem like something that needs fixing.

> So alternative approach like by passing path(as string) or ActorPath and then instantiate
the ActorRef, may be used. 

You'd hope so, but if you run the unit test in my updated project, still at <>,
you'll see that it still doesn't work. I suspect that an ActorRef that is part of the spray-io
implementation is being serialized, about which there's nothing I can realistically do.

Incidentally, spray-io, which has become Akka IO as of Akka 2.2, is essentially a HIGHLY scalable
presentation of Java NIO as Akka actors. It's the foundation of spray's own HTTP server and
client implementation, and Akka's networking as well. In my case, I'm just trying to create
a raw SSL/TLS-aware socket server that constructs a Spark Stream for each incoming connection.
Again, since spray-io presents socket connections as Akka Actors, ActorStream seems like an
obvious fit. And it is—we're just learning useful things about certain use cases.

> This should be fixable I suppose in future, but for that I have to learn how to hack
into our runjob mechanism. Will try to check it ASAP. For now may be we can create these issues
in jira ?

Very good idea. I will do that now.
> Please let me know if this fixes your problems. 

Unfortunately it doesn't. Please let me know if you are unable to reproduce the problem with
my project, or have any other questions or comments.

Thanks so much again for your time!

View raw message