You can use the pipe() function on RDDs to call external code. It passes each record to an external program through stdin and reads the results back from stdout. For Spark Streaming, you would do dstream.transform(rdd => rdd.pipe(...)) to apply it to each RDD in the stream.
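
For example, a minimal sketch (the socket source and the external program path are just placeholders for your own enrichment script):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("PipeEnrichment")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999)

    // pipe() launches the external process per partition, writes each record
    // to its stdin (one line per record), and returns its stdout lines as a new RDD.
    val enriched = lines.transform(rdd => rdd.pipe("/path/to/enrich"))
    enriched.print()

    ssc.start()
    ssc.awaitTermination()

Your C++ or Python program just needs to read lines from stdin and write enriched lines to stdout.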

Matei

On Jan 14, 2015, at 8:41 PM, Umanga Bista <bistaumanga@gmail.com> wrote:


This is a question I originally asked on Quora: http://qr.ae/6qjoI

We have some code written in C++ and Python that does data enrichment on our data streams. If I used Storm, I could reuse that code with some small modifications via ShellBolt and IRichBolt. Since the functionality is all about data enrichment, if the code had been written in Scala, I could use it with the rdd.map() function. So, is there any way to use existing non-Scala code in map with Spark Streaming in Scala, like Storm’s ShellBolt and IRichBolt?

Umanga Bista
Kathmandu, Nepal