spark-user mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: Spark Kafka Direct Streaming
Date Tue, 07 Jul 2015 22:02:14 GMT
When you enable checkpointing by setting the checkpoint directory, you
enable metadata checkpointing. Data checkpointing kicks in only if you are
using a DStream operation that requires it, or you are enabling Write Ahead
Logs to prevent data loss on driver failure.

More discussion:
https://spark-summit.org/2015/events/recipes-for-running-spark-streaming-applications-in-production/
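
On the Kafka side, here is a minimal sketch of a trimmed-down parameter map for the direct stream. It rests on an assumption drawn from my reading of the Spark 1.3/1.4 Kafka integration guide: the direct API talks to the brokers itself and tracks offsets in the checkpoint, so it needs a broker list ("metadata.broker.list" or "bootstrap.servers") and optionally "auto.offset.reset", while settings such as "zookeeper.connect" and "auto.commit.enable" are not what provides fault tolerance here. The class name `DirectStreamConfig` and its methods are hypothetical helpers, not part of Spark or Kafka.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Spark or Kafka. */
public class DirectStreamConfig {

    /** Builds the Kafka parameters that the direct stream is assumed to need. */
    public static Map<String, String> kafkaParams(String brokers) {
        Map<String, String> params = new HashMap<String, String>();
        // The broker list is required by the direct API.
        params.put("bootstrap.servers", brokers);
        // With no checkpointed offsets, start from the smallest available offset
        // instead of the default (latest).
        params.put("auto.offset.reset", "smallest");
        return params;
    }

    /** Topics are passed to createDirectStream as a separate set, not as a map entry. */
    public static Set<String> topics(String... names) {
        return new HashSet<String>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // Broker address and topic name are taken from the quoted message below.
        Map<String, String> params = kafkaParams("192.168.50.124:9092");
        Set<String> topicSet = topics("IPDR31");
        System.out.println(params);
        System.out.println(topicSet);
    }
}
```

The map and the set would go in as the last two arguments of KafkaUtils.createDirectStream, exactly as in the quoted call. Setting the checkpoint directory on the streaming context then gives the metadata checkpointing described above.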

On Tue, Jul 7, 2015 at 7:42 AM, abi_pat <present.boiling2290@gmail.com>
wrote:

> Hi,
>
> I am using the new experimental Direct Stream API. Everything is working
> fine, but I am not sure how to achieve fault tolerance.
> Presently my Kafka config map looks like this:
>
>         configMap.put("zookeeper.connect","192.168.51.98:2181");
>         configMap.put("group.id", UUID.randomUUID().toString());
>         configMap.put("auto.offset.reset","smallest");
>         configMap.put("auto.commit.enable","true");
>         configMap.put("topics","IPDR31");
>         configMap.put("kafka.consumer.id","kafkasparkuser");
>         configMap.put("bootstrap.servers","192.168.50.124:9092");
>         Set<String> topic = new HashSet<String>();
>         topic.add("IPDR31");
>
>         JavaPairInputDStream<byte[], byte[]> kafkaData =
>             KafkaUtils.createDirectStream(js, byte[].class, byte[].class,
>                 DefaultDecoder.class, DefaultDecoder.class, configMap, topic);
>
> Questions -
>
> Q1- Is my Kafka configuration correct or should it be changed?
>
> Q2- I also looked into checkpointing. In my use case, data checkpointing
> is not required but metadata checkpointing is. Can I enable metadata
> checkpointing without data checkpointing?
>
>
>
> Thanks
> Abhishek Patel
