spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Spark-Streaming: output to cassandra
Date Fri, 05 Dec 2014 06:50:37 GMT
The batch is the batch duration that you specify when creating the
StreamingContext, so at the end of every batch's computation the data will
get flushed to Cassandra. Why are you stopping your program with Ctrl+C?
You can always specify a timeout in milliseconds with
jssc.awaitTermination(timeout).
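
To make the batch behaviour concrete, here is a toy model in plain Java. It is not Spark code and not how Spark is implemented; the class MicroBatchModel and all of its methods are hypothetical names made up for this illustration. It only sketches the idea that records are buffered for the current batch and written out when the batch interval ends, so a hard kill can lose whatever is still in the unfinished batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical class, NOT Spark): records are buffered per
// batch and written to the sink when the batch interval ends.
class MicroBatchModel {
    private final List<String> buffer = new ArrayList<>(); // current, unfinished batch
    private final List<String> sink = new ArrayList<>();   // stands in for the Cassandra table

    // A record arrives from the stream (for example, a Kafka message).
    public void receive(String record) {
        buffer.add(record);
    }

    // Called when a batch interval elapses: write the batch out, start a new one.
    public void onBatchEnd() {
        sink.addAll(buffer);
        buffer.clear();
    }

    // Graceful stop: finish the in-flight batch before exiting.
    public void stopGracefully() {
        onBatchEnd();
    }

    // Read-only view of what has been written so far.
    public List<String> sink() {
        return Collections.unmodifiableList(sink);
    }

    public static void main(String[] args) {
        MicroBatchModel app = new MicroBatchModel();
        app.receive("m1");
        app.receive("m2");
        app.onBatchEnd();                // first batch ends: m1 and m2 are written
        app.receive("m3");               // arrives during the second batch
        // A hard kill at this point would leave m3 unwritten.
        System.out.println(app.sink()); // prints [m1, m2]
        app.stopGracefully();
        System.out.println(app.sink()); // prints [m1, m2, m3]
    }
}
```

In Spark itself, the closest equivalent of stopGracefully() that I know of is JavaStreamingContext.stop(stopSparkContext, stopGracefully) with the second argument set to true; whether a Ctrl+C reaches that call depends on how your application handles shutdown.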

Thanks
Best Regards

On Fri, Dec 5, 2014 at 11:53 AM, <m.sarosh@accenture.com> wrote:

>  Hi Gerard/Akhil,
>
>
>  By "how do I specify a batch" I was trying to ask when the
> data in the JavaDStream gets flushed into the Cassandra table.
>
> I read somewhere that streaming data gets written to Cassandra in
> batches. A batch could cover some particular time interval, or one
> particular run.
>
> That is what I was trying to understand: how to set that "batch" in my
> program. If a batch means one full run of my streaming app, then note that
> I'm hitting Ctrl+C to kill the program. Since the program is terminating,
> would the data still get inserted successfully into my Cassandra table?
>
> For example,
>
> in Terminal-A I'm running Kafka-producer to stream-in messages.
>
> In Terminal-B I'm running my streaming app. In my app there is a line
> jssc.awaitTermination(); which will keep my app running till I kill it.
>
> Eventually I hit Ctrl+C in my app's terminal, i.e. Terminal-B, and
> kill it. So it's a kind of ungraceful termination. In this case, will
> the data in my app's DStream get written into Cassandra?
>
>
>
>
>    Thanks and Regards,
>
> *Md. Aiman Sarosh.*
> Accenture Services Pvt. Ltd.
> Mob #:  (+91) - 9836112841.
>    ------------------------------
> *From:* Gerard Maas <gerard.maas@gmail.com>
> *Sent:* Thursday, December 4, 2014 10:22 PM
> *To:* Akhil Das
> *Cc:* Sarosh, M.; user@spark.apache.org
> *Subject:* Re: Spark-Streaming: output to cassandra
>
>  I guess he's already doing so, given the 'saveToCassandra' usage.
> What I don't understand is the question "how do I specify a batch". That
> doesn't make much sense to me. Could you explain further?
>
>  -kr, Gerard.
>
> On Thu, Dec 4, 2014 at 5:36 PM, Akhil Das <akhil@sigmoidanalytics.com>
> wrote:
>
>>  You can use the datastax's Cassandra connector.
>> <https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md>
>>
>>  Thanks
>> Best Regards
>>
>> On Thu, Dec 4, 2014 at 8:21 PM, <m.sarosh@accenture.com> wrote:
>>
>>>  Hi,
>>>
>>>
>>>  I have written the code below, which streams data from Kafka and
>>> prints it to the console.
>>>
>>> I want to extend this so that my data goes into a Cassandra table
>>> instead.
>>>
>>>
>>> JavaStreamingContext jssc = new JavaStreamingContext("local[4]",
>>>     "SparkStream", new Duration(1000));
>>> JavaPairReceiverInputDStream<String, String> messages =
>>>     KafkaUtils.createStream(jssc, args[0], args[1], topicMap);
>>>
>>> System.out.println("Connection done!");
>>> JavaDStream<String> data = messages.map(
>>>     new Function<Tuple2<String, String>, String>() {
>>>         public String call(Tuple2<String, String> message) {
>>>             return message._2();
>>>         }
>>>     });
>>> // data.print();   --> output to console
>>> data.foreachRDD(saveToCassandra("mykeyspace","mytable"));
>>> jssc.start();
>>> jssc.awaitTermination();
>>>
>>>
>>>
>>>  How should I implement the line:
>>>
>>> data.foreachRDD(saveToCassandra("mykeyspace","mytable"));
>>>
>>> so that data goes into Cassandra in each batch? And how do I specify a
>>> batch? If I press Ctrl+C on the console of the streaming job jar, I assume
>>> nothing will be written to Cassandra, since the job is getting killed.
>>>
>>>
>>>  Please help.
>>>
>>>
>>>    Thanks and Regards,
>>>
>>> *Md. Aiman Sarosh.*
>>> Accenture Services Pvt. Ltd.
>>> Mob #:  (+91) - 9836112841.
>>>
>>>
>>
>>
>
