spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: PIG to SPARK
Date Wed, 05 Mar 2014 22:16:45 GMT
The real question is: why do you want to run a Pig script using Spark?
Are you planning to use Spark as the underlying processing engine for Pig?
That's not simple.
Or are you planning to feed Pig's output to Spark for further processing? In
that case, have Pig write its output to HDFS, then trigger your Spark script
to read it from there.

rdd.pipe is similar to Hadoop streaming: it runs a script on each partition
of the RDD, feeding the partition's elements to the script's stdin, and
returns the script's output lines as another RDD.
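
Since a sample was asked for, here is a minimal sketch of what pipe does, written in plain Python without Spark so it runs anywhere. It is a model of the behaviour, not Spark's implementation: one process is started per partition, each element is written to its stdin as a line, and each stdout line becomes an element of the result. The helper names (pipe_partition, pipe) and the uppercase command are hypothetical stand-ins for your own script.

```python
import subprocess
import sys


def pipe_partition(lines, command):
    """Feed one partition to `command` on stdin, one element per line,
    and return the command's stdout lines as the new partition."""
    payload = "".join(f"{line}\n" for line in lines)
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def pipe(partitions, command):
    """Model of rdd.pipe: run the command once per partition."""
    return [pipe_partition(part, command) for part in partitions]


# Hypothetical external script: uppercases every input line.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: print(line.strip().upper())",
]

# Two "partitions" stand in for an RDD split across two tasks.
partitions = [["apple", "banana"], ["cherry"]]
piped = pipe(partitions, UPPERCASE)
print(piped)
```

In PySpark itself the corresponding call would look like sc.parallelize(data).pipe("your_script.sh"), where the result is an RDD of strings. The script has to be available on every worker node, which is an assumption you need to arrange separately.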
Regards
Mayur


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Wed, Mar 5, 2014 at 10:29 AM, suman bharadwaj <suman.dna@gmail.com> wrote:

> Hi,
>
> How can I call a Pig script using Spark? Can I use rdd.pipe() here?
>
> Can anyone share a sample implementation of rdd.pipe()? If you can
> explain how rdd.pipe() works, it would really help.
>
> Regards,
> SB
>
