spark-user mailing list archives

From suman bharadwaj <suman....@gmail.com>
Subject Re: PIG to SPARK
Date Thu, 06 Mar 2014 17:20:10 GMT
Thanks Mayur. I don't have a clear idea of how pipe works and wanted to
understand it better. When do we use pipe(), and how does it work? Can
you please share some sample code if you have any (even pseudo-code is fine)?
It would really help.

Regards,
Suman Bharadwaj S


On Thu, Mar 6, 2014 at 3:46 AM, Mayur Rustagi <mayur.rustagi@gmail.com> wrote:

> The real question is why you want to run a Pig script using Spark.
> Are you planning to use Spark as the underlying processing engine for Pig?
> That's not simple.
> Are you planning to feed Pig data to Spark for further processing? Then
> you can write it to HDFS & trigger your Spark script.
>
> rdd.pipe is basically similar to Hadoop streaming, allowing you to run a
> script on each partition of the RDD & get output as another RDD.
> Regards
> Mayur
>
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Wed, Mar 5, 2014 at 10:29 AM, suman bharadwaj <suman.dna@gmail.com> wrote:
>
>> Hi,
>>
>> How can I call a Pig script using Spark? Can I use rdd.pipe() here?
>>
>> And can anyone share a sample implementation of rdd.pipe()? If you can
>> explain how rdd.pipe() works, it would really help.
>>
>> Regards,
>> SB
>>
>
>
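
The rdd.pipe behaviour described in the quoted reply (run a script on each partition, get the output back as another RDD) can be modelled in plain Python. This is only a sketch of the semantics and not Spark's implementation. It assumes each element is written to the external command's stdin as one line and each line of its stdout becomes one output element. The names pipe_partition, pipe and UPPERCASE_CMD are hypothetical, and a list of lists stands in for an RDD's partitions.

```python
import subprocess
import sys


def pipe_partition(partition, command):
    """Model rdd.pipe for ONE partition (illustrative, not Spark code).

    Each element is written to the command's stdin as one line; each line
    the command prints on stdout becomes one element of the result.
    """
    stdin_text = "".join(f"{item}\n" for item in partition)
    result = subprocess.run(
        command, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def pipe(partitions, command):
    """Run the command once per partition, as the quoted reply describes."""
    return [pipe_partition(part, command) for part in partitions]


# Hypothetical stand-in for an external script: uppercases every input line.
UPPERCASE_CMD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    print(line.rstrip().upper())",
]

# Two "partitions" standing in for an RDD split across two workers.
partitions = [["pig", "hive"], ["spark"]]
piped = pipe(partitions, UPPERCASE_CMD)
print(piped)
```

In Spark itself the equivalent call would be along the lines of rdd.pipe("some-command"), with Spark launching the command once per partition on the workers. The script therefore has to be available on every worker node, and it only ever sees the lines of its own partition.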
