spark-user mailing list archives

From Shashikant Kulkarni (शशिकांत कुलकर्णी) <shashikant.kulka...@gmail.com>
Subject Re: Apache Spark JavaRDD pipe() need help
Date Thu, 22 Sep 2016 09:10:51 GMT
Hello Jakob,

Thanks for replying. Here is a short example of what I am trying to do. I
am using a Product column family in Cassandra as an example, just to
explain my requirement.

In Driver.java
{
         JavaRDD<Product> productsRdd = Get Products from Cassandra;
         productsRdd.map(ProductHelper.processProduct());
}

In ProductHelper.java
{
    public static Function<Product, Boolean> processProduct() {
        return new Function<Product, Boolean>() {
            private static final long serialVersionUID = 1L;

            @Override
            public Boolean call(Product product) throws Exception {
                // STEP 1: Do some processing on the product object.
                // STEP 2: Using a few values of the product, create a string
                //         like "name id sku datetime".
                // STEP 3: Pass this string to my C binary to perform some
                //         complex calculations and return some data.
                // STEP 4: Get the returned data and store it back in Cassandra.
                return Boolean.TRUE; // placeholder result
            }
        };
    }
}

In ProductHelper I cannot, and do not want to, pass the SparkContext
object, because the application throws a "Task not serializable" error. If
there is a way to do that, let me know.

I am not able to achieve STEP 3 above. How can I pass a String to the C
binary and get its output back in my program? The C binary reads data from
STDIN and writes data to STDOUT. It already works when called from PHP in
another part of the application. I want to reuse the same C binary in my
Apache Spark application for some background processing and analysis, using
the JavaRDD.pipe() API. If there is any other way, let me know. This code
will be executed on all the nodes in a cluster.
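
For STEP 3, one possible approach that does not need the SparkContext inside
call() is to start the binary with plain java.lang.ProcessBuilder, write the
string to its STDIN and read its STDOUT. The following is only a minimal
sketch under assumptions: the class name BinaryRunner and the method run() are
hypothetical, and "cat" is used as a stand-in for the C binary because it also
reads STDIN and echoes it to STDOUT.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark or of the original application.
public class BinaryRunner {

    // Sends one line to the command's STDIN and returns the lines it
    // writes to STDOUT. Intended for small inputs: all input is written
    // before any output is read.
    public static List<String> run(String input, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the child's STDERR go to the parent's STDERR (the executor log).
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        // Write the input line, then close STDIN so the binary sees EOF.
        try (Writer stdin = new OutputStreamWriter(
                process.getOutputStream(), StandardCharsets.UTF_8)) {
            stdin.write(input);
            stdin.write('\n');
        }

        // Collect everything the binary prints on STDOUT, line by line.
        List<String> output = new ArrayList<>();
        try (BufferedReader stdout = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = stdout.readLine()) != null) {
                output.add(line);
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // "cat" is an assumption standing in for the real C binary.
        System.out.println(run("name id sku datetime", "cat"));
    }
}
```

Inside call(), the string built in STEP 2 could be passed to run() together
with the path of the binary, which would have to exist at that path on every
worker node. This starts one process per product. As far as I understand,
JavaRDD.pipe(String command) instead starts one process per partition, writes
each element as a line to its STDIN and returns the STDOUT lines as a
JavaRDD<String>; since it is a transformation on the RDD, it would be called
from the driver code on an RDD of strings, not from inside call().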

I hope my requirement is now clear. How can I do this?

Regards,
Shash

On Thu, Sep 22, 2016 at 4:13 AM, Jakob Odersky <jakob@odersky.com> wrote:

> Can you provide more details? It's unclear what you're asking
>
> On Wed, Sep 21, 2016 at 10:14 AM, shashikant.kulkarni@gmail.com
> <shashikant.kulkarni@gmail.com> wrote:
> > Hi All,
> >
> > I am trying to use the JavaRDD.pipe() API.
> >
> > I have one object with me from the JavaRDD
>
