You are right in your answer to what Mohit wrote. However, what Mohit seems to be alluding to, but did not state clearly, might be something different.
You are wrong in saying that streaming "generally" works with HDFS and Cassandra. Streaming typically works with a streaming or queuing source such as Kafka, Kinesis, Twitter, Flume, ZeroMQ, etc. (though it can also read from HDFS and S3). However, the streaming context (more precisely, the "receiver" within the streaming context) gets events/messages/records and forms a batch (an RDD) based on a time window.
So there can be a gap of up to one window interval between the moment an alert message becomes available to Spark and the moment it gets processed. I think this is what you meant.
As per the Spark programming model, an RDD is the right way to deal with data. If you are fine with a minimum delay of, say, a second (based on the minimum time window that a DStream can support), then what Rohit gave is the right model.
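To make the time-window point concrete, here is a minimal sketch of a streaming context with a one-second batch interval. The socket source, host, port, app name and the object name BatchIntervalSketch are placeholders I made up for illustration; in practice you would plug in Kafka, Flume, or whatever source you actually use.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Duration, Seconds, StreamingContext}

object BatchIntervalSketch {
  // Batch interval: records received during each 1-second window form one RDD.
  val batchInterval: Duration = Seconds(1)

  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the receiver, one for processing (assumption: local test run).
    val conf = new SparkConf().setAppName("BatchIntervalSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, batchInterval)

    // Hypothetical source: a text socket on localhost:9999.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Called once per batch; a record that arrives just after a batch boundary
    // waits up to one full interval before its batch is handed to this function.
    lines.foreachRDD { (rdd, time) =>
      println(s"Batch at $time holds ${rdd.count()} records")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this setup a record is never processed on its own; it is processed when the batch it landed in is processed, which is where the delay of up to one interval comes from.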
What do you mean you can't send it directly from Spark workers? Here's a simple approach you could use:
val data = ssc.textFileStream("sigmoid/")  // ssc is an existing StreamingContext
// foreachRDD returns Unit, so its result is not assigned to a val
data.filter(_.contains("ERROR")).foreachRDD(rdd => alert("Errors: " + rdd.count()))
The alert() function could be anything that triggers an email or sends an SMS alert.
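One thing to keep in mind: the function passed to foreachRDD runs on the driver once per batch, so in the snippet above alert() is called from the driver, not from the workers. If you really want each worker to send its own alerts, one way is to push the call into foreachPartition. This is only a sketch: the object name WorkerSideAlerts, the isError helper and the println-based alert() are stand-ins I made up, not anything from the earlier messages.

```scala
import org.apache.spark.streaming.dstream.DStream

object WorkerSideAlerts {
  // Hypothetical alert sink; swap in an email or SMS client here.
  def alert(message: String): Unit = println(s"ALERT: $message")

  // Same filter condition as the snippet above.
  def isError(line: String): Boolean = line.contains("ERROR")

  def alertOnErrors(lines: DStream[String]): Unit =
    lines.filter(isError _).foreachRDD { rdd =>
      // This outer body runs on the driver, once per batch.
      rdd.foreachPartition { partition =>
        // This inner body runs on the executors, once per partition.
        partition.foreach(line => alert(line))
      }
    }
}
```

You would call WorkerSideAlerts.alertOnErrors(data) before ssc.start(). Note that this sends one alert per matching line rather than one count per batch, so for a noisy log the driver-side count may well be the better choice.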