spark-user mailing list archives

From Ashic Mahtab <as...@live.com>
Subject RE: Are these numbers abnormal for spark streaming?
Date Thu, 22 Jan 2015 15:40:17 GMT
Yup...looks like it. I can do some tricks to reduce setup costs further, but this is much better
than where I was yesterday. Thanks for your awesome input :)

-Ashic.

From: gerard.maas@gmail.com
Date: Thu, 22 Jan 2015 16:34:38 +0100
Subject: Re: Are these numbers abnormal for spark streaming?
To: asudipta.banerjee@gmail.com
CC: ashic@live.com; user@spark.apache.org; tathagata.das1565@gmail.com

Given that the processing cost, and in particular the setup of connections, is bound to the number
of partitions (in x.foreachPartition{ x => ??? }), I think it would be worth trying to reduce
them.
Increasing 'spark.streaming.blockInterval' will do the trick (you can read the tuning
details here: http://www.virdata.com/tuning-spark/#Partitions)
-kr, Gerard.
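
To make the link between block interval and partition count concrete, here is a minimal Python sketch. It assumes the receiver-based model in which each receiver emits at most one block per block interval and each block becomes one partition of the batch; 200 ms is the documented default for `spark.streaming.blockInterval`, the 5 s value is only an example, and the function name `max_partitions_per_batch` is made up for illustration.

```python
def max_partitions_per_batch(batch_interval_ms, block_interval_ms, receivers):
    """Upper bound on partitions per batch: one block per receiver per block interval."""
    if block_interval_ms <= 0:
        raise ValueError("block interval must be positive")
    blocks_per_receiver = batch_interval_ms // block_interval_ms
    return receivers * blocks_per_receiver


# 2 receivers, 15 s batches, default 200 ms block interval
default_case = max_partitions_per_batch(15_000, 200, 2)
# Same batches with an example 5 s block interval (assumed value, not from the thread)
larger_block_case = max_partitions_per_batch(15_000, 5_000, 2)

print(default_case, larger_block_case)  # 150 6
```

Under these assumptions, a per-partition connection setup could run up to 150 times per batch with the default, versus 6 times with the larger block interval.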
On Thu, Jan 22, 2015 at 4:28 PM, Gerard Maas <gerard.maas@gmail.com> wrote:
So the system has gone from 7 msgs in 4.961 secs (median) to 106 msgs in 4.761 secs. I think
there's evidence that setup costs are quite high in this case and increasing the batch interval
is helping.
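
One way to read these numbers is as the ratio of processing time to batch interval. The sketch below uses the median processing times reported in the thread; the name `load_factor` is an illustrative label, not a Spark metric. A ratio above 1 means batches arrive faster than they can be processed.

```python
def load_factor(processing_time_s, batch_interval_s):
    """Fraction of each batch interval spent processing; above 1.0 the backlog grows."""
    return processing_time_s / batch_interval_s


before = load_factor(4.961, 2)    # 2 s batches, median processing time 4 s 961 ms
after = load_factor(4.761, 15)    # 15 s batches, median processing time 4 s 761 ms

print(f"{before:.2f} {after:.2f}")  # 2.48 0.32
```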
On Thu, Jan 22, 2015 at 4:12 PM, Sudipta Banerjee <asudipta.banerjee@gmail.com> wrote:
Hi Ashic Mahtab,

Are Cassandra and ZooKeeper installed as part of the YARN architecture, or are they
installed in a separate layer with Apache Spark?

Thanks and Regards,
Sudipta

On Thu, Jan 22, 2015 at 8:13 PM, Ashic Mahtab <ashic@live.com> wrote:



Hi Guys,
So I changed the interval to 15 seconds. There are obviously a lot more messages per batch,
but (I think) it looks a lot healthier. Can you see any major warning signs? I think that
with 2 second intervals, the setup / teardown per partition was what was causing the delays.

Streaming
Started at: Thu Jan 22 13:23:12 GMT 2015
Time since start: 1 hour 17 minutes 16 seconds
Network receivers: 2
Batch interval: 15 seconds
Processed batches: 309
Waiting batches: 0

Statistics over last 100 processed batches

Receiver Statistics
Receiver | Status | Location | Records in last batch [2015/01/22 14:40:29] | Minimum rate [records/sec] | Median rate [records/sec] | Maximum rate [records/sec] | Last Error
RmqReceiver-0 | ACTIVE | VDCAPP53.foo.local | 2.6 K | 29 | 106 | 295 | -
RmqReceiver-1 | ACTIVE | VDCAPP50.bar.local | 2.6 K | 29 | 107 | 291 | -

Batch Processing Statistics
Metric | Last batch | Minimum | 25th percentile | Median | 75th percentile | Maximum
Processing Time | 4 seconds 812 ms | 4 seconds 698 ms | 4 seconds 738 ms | 4 seconds 761 ms | 4 seconds 788 ms | 5 seconds 802 ms
Scheduling Delay | 2 ms | 0 ms | 3 ms | 3 ms | 4 ms | 9 ms
Total Delay | 4 seconds 814 ms | 4 seconds 701 ms | 4 seconds 739 ms | 4 seconds 764 ms | 4 seconds 792 ms | 5 seconds 809 ms

Regards,
Ashic.
From: ashic@live.com
To: gerard.maas@gmail.com
CC: user@spark.apache.org
Subject: RE: Are these numbers abnormal for spark streaming?
Date: Thu, 22 Jan 2015 12:32:05 +0000




Hi Gerard,
Thanks for the response.

The messages get deserialised from msgpack format, and one of the strings is deserialised to
json. Certain fields are checked to decide if further processing is required. If so, it goes
through a series of in-memory filters to check if more processing is required. If so, only then
does the "heavy" work start. That consists of a few db queries, and potential updates to the
db + message on message queue. The majority of messages don't need processing. The messages
needing processing at peak are about three every other second.

One possible thing that might be happening is the session initialisation and prepared statement
initialisation for each partition. I can resort to some tricks, but I think I'll try increasing
batch interval to 15 seconds. I'll report back with findings.

Thanks,
Ashic.
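
The "tricks" are not spelled out in the thread. One common approach to the per-partition initialisation cost described above is to create the expensive resource lazily once per worker process and reuse it across partitions. The sketch below is plain Python and does not use Spark or Cassandra; `FakeSession`, `get_session` and the two handler functions are hypothetical stand-ins that only count how often setup runs.

```python
class FakeSession:
    """Stand-in for an expensive resource such as a database session (hypothetical)."""
    setups = 0

    def __init__(self):
        FakeSession.setups += 1    # count how often the expensive setup runs

    def write(self, record):
        return record


_session = None


def get_session():
    """Create the session on first use, then reuse it within this process."""
    global _session
    if _session is None:
        _session = FakeSession()
    return _session


def handle_partition_naive(records):
    session = FakeSession()        # setup cost paid for every partition
    return [session.write(r) for r in records]


def handle_partition_reusing(records):
    session = get_session()        # setup cost paid once per process
    return [session.write(r) for r in records]


partitions = [[1, 2], [3], [4, 5, 6]]

for p in partitions:
    handle_partition_naive(p)
naive_setups = FakeSession.setups  # one setup per partition

FakeSession.setups = 0
for p in partitions:
    handle_partition_reusing(p)
reused_setups = FakeSession.setups # a single setup in total

print(naive_setups, reused_setups) # 3 1
```

Whether this pattern applies depends on the client library being safe to share within a process, which is an assumption here.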

From: gerard.maas@gmail.com
Date: Thu, 22 Jan 2015 12:30:08 +0100
Subject: Re: Are these numbers abnormal for spark streaming?
To: tathagata.das1565@gmail.com
CC: ashic@live.com; tdas@databricks.com; user@spark.apache.org

and post the code (if possible).
In a nutshell, your processing time > batch interval,
resulting in an ever-increasing delay that will end up in a crash.
3 secs to process 14 messages looks like a lot. Curious what the job logic is.
-kr, Gerard.
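
The growing delay can be checked against the figures in the original report with a rough model. It assumes batches are processed strictly in order and that each processed batch consumes exactly one batch interval of stream time. The function names are illustrative.

```python
def estimated_backlog_s(elapsed_s, processed_batches, batch_interval_s):
    """Elapsed wall-clock time minus the stream time already consumed by processed batches."""
    return max(0, elapsed_s - processed_batches * batch_interval_s)


def hms(seconds):
    """Format a number of seconds as hours, minutes and seconds."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h}h {m}m {s}s"


elapsed = 18 * 3600 + 24 * 60 + 34    # "Time since start: 18 hours 24 minutes 34 seconds"
backlog = estimated_backlog_s(elapsed, 16482, 2)   # 16482 processed batches of 2 seconds

print(hms(backlog))  # 9h 15m 10s
```

This lands within a few seconds of the reported scheduling delay of 9 hours 15 minutes 4 seconds, which is consistent with a backlog that grows for as long as processing time exceeds the batch interval.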
On Thu, Jan 22, 2015 at 12:15 PM, Tathagata Das <tathagata.das1565@gmail.com> wrote:
This is not normal. It's a huge scheduling delay!! Can you tell me more about the application:
cluster setup, number of receivers, what the computation is, etc.?
On Thu, Jan 22, 2015 at 3:11 AM, Ashic Mahtab <ashic@live.com> wrote:



Hate to do this...but...erm...bump? Would really appreciate input from others using Streaming.
Or at least some docs that would tell me if these are expected or not.

From: ashic@live.com
To: user@spark.apache.org
Subject: Are these numbers abnormal for spark streaming?
Date: Wed, 21 Jan 2015 11:26:31 +0000




Hi Guys,
I've got Spark Streaming set up for a low data rate system (using spark's features for analysis,
rather than high throughput). Messages are coming in throughout the day, at around 1-20 per
second (finger in the air estimate...not analysed yet).  In the spark streaming UI for the
application, I'm getting the following after 17 hours.

Streaming
Started at: Tue Jan 20 16:58:43 GMT 2015
Time since start: 18 hours 24 minutes 34 seconds
Network receivers: 2
Batch interval: 2 seconds
Processed batches: 16482
Waiting batches: 1

Statistics over last 100 processed batches

Receiver Statistics
Receiver | Status | Location | Records in last batch [2015/01/21 11:23:18] | Minimum rate [records/sec] | Median rate [records/sec] | Maximum rate [records/sec] | Last Error
RmqReceiver-0 | ACTIVE | FOOOO | 14 | 4 | 7 | 27 | -
RmqReceiver-1 | ACTIVE | BAAAAR | 12 | 4 | 7 | 26 | -

Batch Processing Statistics
Metric | Last batch | Minimum | 25th percentile | Median | 75th percentile | Maximum
Processing Time | 3 seconds 994 ms | 157 ms | 4 seconds 16 ms | 4 seconds 961 ms | 5 seconds 3 ms | 5 seconds 171 ms
Scheduling Delay | 9 hours 15 minutes 4 seconds | 9 hours 10 minutes 54 seconds | 9 hours 11 minutes 56 seconds | 9 hours 12 minutes 57 seconds | 9 hours 14 minutes 5 seconds | 9 hours 15 minutes 4 seconds
Total Delay | 9 hours 15 minutes 8 seconds | 9 hours 10 minutes 58 seconds | 9 hours 12 minutes | 9 hours 13 minutes 2 seconds | 9 hours 14 minutes 10 seconds | 9 hours 15 minutes 8 seconds

Are these "normal"? I was wondering what the scheduling delay and total delay terms are, and
if it's normal for them to be 9 hours.

I've got a standalone spark master and 4 spark nodes. The streaming app has been given 4 cores,
and it's using 1 core per worker node. The streaming app is submitted from a 5th machine,
and that machine has nothing but the driver running. The worker nodes are running alongside
Cassandra (and reading and writing to it).

Any insights would be appreciated.

Regards,
Ashic.

-- 
Sudipta Banerjee
Consultant, Business Analytics and Cloud Based Architecture
Call me +919019578099