spark-user mailing list archives

From Iulian Dragoș <iulian.dra...@typesafe.com>
Subject Re: Spark Streaming: Doing operation in Receiver vs RDD
Date Thu, 08 Oct 2015 10:14:35 GMT
You can have a look at
http://spark.apache.org/docs/latest/streaming-programming-guide.html#receiver-reliability
for details on receiver reliability. If you go the receiver route, you'll
need to enable write-ahead logs to ensure no data loss. With the Kafka
direct approach you don't have this problem.

Regarding where to apply decryption, I'd lean towards doing it as RDD
transformations, for the reasons you mentioned. Also, if only some fields
are encrypted, this way you can delay decryption until it is really needed
(assuming some records would be filtered out first, etc.).
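
A minimal sketch of that idea in plain Java, with no Spark dependency so it
runs on its own. Everything named here (`LazyDecryptSketch`, `Message`,
`Decryptor`, the AES-GCM payload layout, the hard-coded key) is hypothetical
and only for illustration. It assumes each message carries a cleartext topic
next to an encrypted payload, filters on the cleartext field first, and
decrypts only the survivors. The function object keeps the non-serializable
`Cipher` in a transient field and rebuilds it lazily, so the object itself
survives Java serialization, which is what Spark uses for closures by default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class LazyDecryptSketch {
    // Counts decrypt calls, to show that filtered-out records are never decrypted.
    static final AtomicInteger DECRYPT_CALLS = new AtomicInteger();

    // Hypothetical message shape: cleartext topic + payload (12-byte IV, then ciphertext).
    static final class Message {
        final String topic;
        final byte[] payload;
        Message(String topic, byte[] payload) { this.topic = topic; this.payload = payload; }
    }

    // Serializable function object: only the key bytes travel with it.
    static final class Decryptor implements Function<byte[], String>, Serializable {
        private static final long serialVersionUID = 1L;
        private final byte[] key;
        private transient Cipher cipher; // not serializable; rebuilt on first use

        Decryptor(byte[] key) { this.key = key.clone(); }

        @Override
        public String apply(byte[] payload) {
            try {
                if (cipher == null) cipher = Cipher.getInstance("AES/GCM/NoPadding");
                cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(128, payload, 0, 12));
                DECRYPT_CALLS.incrementAndGet();
                byte[] plain = cipher.doFinal(payload, 12, payload.length - 12);
                return new String(plain, StandardCharsets.UTF_8);
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Test-data helper: encrypts with a fresh random IV and prepends the IV.
    static byte[] encrypt(byte[] key, String plain) throws GeneralSecurityException {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        byte[] out = Arrays.copyOf(iv, iv.length + ct.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    // Simulates shipping the function to another JVM via Java serialization.
    static Decryptor roundTrip(Decryptor d) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) { oos.writeObject(d); }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Decryptor) ois.readObject();
        }
    }

    static List<String> run() {
        DECRYPT_CALLS.set(0);
        try {
            byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8); // demo-only key
            List<Message> batch = Arrays.asList(
                    new Message("orders", encrypt(key, "order-1")),
                    new Message("debug", encrypt(key, "noise")),
                    new Message("orders", encrypt(key, "order-2")));
            Function<byte[], String> decrypt = roundTrip(new Decryptor(key));
            return batch.stream()
                    .filter(m -> m.topic.equals("orders")) // cheap filter on the cleartext field
                    .map(m -> decrypt.apply(m.payload))    // decrypt only what survived
                    .collect(Collectors.toList());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run() + " decrypt calls: " + DECRYPT_CALLS.get());
    }
}
```

In a Spark job the same kind of function object would be called from a `map`
(or `mapPartitions`) placed after the `filter` on the stream's RDDs; the
`java.util.stream` pipeline above only stands in for that.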

iulian

On Wed, Oct 7, 2015 at 9:55 PM, emiretsk <eugene.miretsky@gmail.com> wrote:

> Hi,
>
> I have a Spark Streaming program that is consuming messages from Kafka and
> has to decrypt and deserialize each message. I can implement this either as
> a Kafka deserializer (which will run in a receiver or in the new
> receiver-less Kafka consumer) or as RDD operations. What are the pros and
> cons of each?
>
> As I see it, doing the operations on RDDs has the following implications:
> - Better load balancing and fault tolerance (though I'm not quite sure what
>   happens when a receiver fails). Also, I'm not sure whether this still holds
>   with the new receiver-less Kafka consumer, as it creates one RDD partition
>   for each Kafka partition.
> - All functions that are applied to RDDs need to be either static or part of
>   serializable objects. This makes using standard/3rd-party Java libraries
>   harder.
> Cheers,
> Eugene


--
Iulian Dragos

------
Reactive Apps on the JVM
www.typesafe.com
