spark-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: Spark streaming alerting
Date Mon, 23 Mar 2015 16:43:15 GMT
I think I didn't explain myself properly :) What I meant to say was that
Spark workers generally run on either HDFS data nodes or Cassandra
nodes, which are typically in a private (protected) network. When a
condition is matched, it's difficult to send alerts directly from
the worker nodes because of security concerns. I was wondering if there
is a way to listen for the events as they occur on the sliding-window scale,
or is the best way to accomplish this to post back to a queue?
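
For reference, here is a minimal sketch of the "post back to a queue" idea on a sliding window. It relies on the fact that the function passed to foreachRDD runs on the driver: only rdd.count() is executed on the workers, so the alert itself leaves from the driver node. The object name, the publishAlert helper, and the 10 s / 60 s durations are assumptions for illustration; publishAlert just prints here and would be replaced by whatever queue client is available.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedAlerting {
  // Hypothetical helper: stands in for a real queue client (Kafka, JMS, HTTP, ...).
  def publishAlert(message: String): Unit = println(s"ALERT: $message")

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("WindowedAlerting")
    val ssc = new StreamingContext(conf, Seconds(10)) // assumed 10 s batch interval

    // Same source and filter as in the quoted example further down the thread.
    val errors = ssc.textFileStream("sigmoid/").filter(_.contains("ERROR"))

    // 60 s window sliding every 10 s; both must be multiples of the batch interval.
    errors.window(Seconds(60), Seconds(10)).foreachRDD { rdd =>
      // count() runs on the workers; its result comes back to the driver,
      // so publishAlert is called from the driver, not from a worker node.
      val n = rdd.count()
      if (n > 0) publishAlert(s"$n errors in the last 60 seconds")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this shape, only the driver host needs network access to the queue or alerting endpoint; the worker nodes can stay inside the private network.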

On Mon, Mar 23, 2015 at 2:22 AM, Khanderao Kand Gmail <
khanderao.kand@gmail.com> wrote:

> Akhil
>
> You are right in your answer to what Mohit wrote. However, what Mohit seems
> to be alluding to, but did not write clearly, might be different.
>
> Mohit
>
> You are wrong in saying that streaming "generally" works with HDFS and
> Cassandra. Streaming typically works with a streaming or queuing source like
> Kafka, Kinesis, Twitter, Flume, ZeroMQ, etc. (but can also read from HDFS and
> S3). However, the streaming context (the "receiver" within the streaming
> context) gets events/messages/records and forms a time-window-based batch (RDD).
>
> So there is a maximum gap of one window interval between when the alert
> message became available to Spark and when the processing happens. I think
> this is what you meant.
>
> As per the Spark programming model, the RDD is the right way to deal with data.
> If you are fine with a minimum delay of, say, a second (based on the minimum
> time window that a DStream can support), then what Rohit gave is the right model.
>
> Khanderao
>
> On Mar 22, 2015, at 11:39 PM, Akhil Das <akhil@sigmoidanalytics.com>
> wrote:
>
> What do you mean you can't send it directly from spark workers? Here's a
> simple approach which you could do:
>
>     val data = ssc.textFileStream("sigmoid/")
>     // foreachRDD returns Unit, so there is nothing to assign; its body runs on the driver
>     data.filter(_.contains("ERROR")).foreachRDD(rdd => alert("Errors: " + rdd.count()))
>
> And the alert() function could be anything triggering an email or sending
> an SMS alert.
>
> Thanks
> Best Regards
>
> On Sun, Mar 22, 2015 at 1:52 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> wrote:
>
>> Is there a module in Spark Streaming that lets you listen to
>> the alerts/conditions as they happen in the streaming module? Generally,
>> Spark Streaming components will execute on large clusters such as HDFS
>> or Cassandra; however, when it comes to alerting, you generally can't send
>> alerts directly from the Spark workers, which means you need a way to listen
>> to the alerts.
>>
>
>
