spark-user mailing list archives

From Gerard Maas <gerard.m...@gmail.com>
Subject Re: Are these numbers abnormal for spark streaming?
Date Thu, 22 Jan 2015 15:34:38 GMT
Given that the process, and in particular the setup of connections, is
bound to the number of partitions (in x.foreachPartition{ x => ??? }), I
think it would be worth trying to reduce them.
Increasing 'spark.streaming.blockInterval' will do the trick (you can
read the tuning details here:
http://www.virdata.com/tuning-spark/#Partitions)
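
As a rough illustration of why this helps: with receiver-based input, each receiver cuts its stream into at most one block per block interval, and each block becomes one partition of the batch. The partition count is therefore bounded by receivers x batch interval / block interval. The sketch below does that arithmetic for the two receivers and 15-second batches in this thread. The 200 ms figure is Spark's documented default for the block interval; the 1-second and 5-second values are only example settings, and `partitions_per_batch` is a hypothetical helper, not a Spark API.

```python
def partitions_per_batch(receivers: int, batch_interval_ms: int, block_interval_ms: int) -> int:
    """Upper bound on partitions in one batch: one block per receiver per block interval."""
    if block_interval_ms <= 0:
        raise ValueError("block interval must be positive")
    return receivers * (batch_interval_ms // block_interval_ms)


if __name__ == "__main__":
    # Two receivers and 15-second batches, as in this thread; block intervals are examples.
    for block_ms in (200, 1000, 5000):
        n = partitions_per_batch(receivers=2, batch_interval_ms=15_000, block_interval_ms=block_ms)
        print(f"blockInterval={block_ms} ms -> up to {n} partitions per batch")
```

Fewer partitions means fewer per-partition connection setups in each batch, at the cost of less parallelism within the batch.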

-kr, Gerard.

On Thu, Jan 22, 2015 at 4:28 PM, Gerard Maas <gerard.maas@gmail.com> wrote:

> So the system has gone from 7 msgs in 4.961 secs (median) to 106 msgs in
> 4.761 secs.
> I think there's evidence that setup costs are quite high in this case and
> increasing the batch interval is helping.
>
> On Thu, Jan 22, 2015 at 4:12 PM, Sudipta Banerjee <
> asudipta.banerjee@gmail.com> wrote:
>
>> Hi Ashic Mahtab,
>>
>> Are Cassandra and Zookeeper installed as part of the Yarn
>> architecture, or are they installed in a separate layer with Apache Spark?
>>
>> Thanks and Regards,
>> Sudipta
>>
>> On Thu, Jan 22, 2015 at 8:13 PM, Ashic Mahtab <ashic@live.com> wrote:
>>
>>> Hi Guys,
>>> So I changed the interval to 15 seconds. There's obviously a lot more
>>> messages per batch, but (I think) it looks a lot healthier. Can you see any
>>> major warning signs? I think that with 2 second intervals, the setup /
>>> teardown per partition was what was causing the delays.
>>>
>>> Streaming
>>>
>>>    - *Started at: *Thu Jan 22 13:23:12 GMT 2015
>>>    - *Time since start: *1 hour 17 minutes 16 seconds
>>>    - *Network receivers: *2
>>>    - *Batch interval: *15 seconds
>>>    - *Processed batches: *309
>>>    - *Waiting batches: *0
>>>
>>>
>>>
>>> Statistics over last 100 processed batches
>>>
>>> Receiver Statistics
>>>
>>>    Receiver      | Status | Location           | Records in last batch [2015/01/22 14:40:29] | Minimum rate [records/sec] | Median rate [records/sec] | Maximum rate [records/sec] | Last Error
>>>    RmqReceiver-0 | ACTIVE | VDCAPP53.foo.local | 2.6 K                                       | 29                         | 106                       | 295                        | -
>>>    RmqReceiver-1 | ACTIVE | VDCAPP50.bar.local | 2.6 K                                       | 29                         | 107                       | 291                        | -
>>>
>>> Batch Processing Statistics
>>>
>>>    Metric           | Last batch       | Minimum          | 25th percentile  | Median           | 75th percentile  | Maximum
>>>    Processing Time  | 4 seconds 812 ms | 4 seconds 698 ms | 4 seconds 738 ms | 4 seconds 761 ms | 4 seconds 788 ms | 5 seconds 802 ms
>>>    Scheduling Delay | 2 ms             | 0 ms             | 3 ms             | 3 ms             | 4 ms             | 9 ms
>>>    Total Delay      | 4 seconds 814 ms | 4 seconds 701 ms | 4 seconds 739 ms | 4 seconds 764 ms | 4 seconds 792 ms | 5 seconds 809 ms
>>>
>>>
>>> Regards,
>>> Ashic.
>>> ------------------------------
>>> From: ashic@live.com
>>> To: gerard.maas@gmail.com
>>> CC: user@spark.apache.org
>>> Subject: RE: Are these numbers abnormal for spark streaming?
>>> Date: Thu, 22 Jan 2015 12:32:05 +0000
>>>
>>>
>>> Hi Gerard,
>>> Thanks for the response.
>>>
>>> The messages get deserialised from msgpack format, and one of the strings
>>> is deserialised to json. Certain fields are checked to decide if further
>>> processing is required. If so, it goes through a series of in mem filters
>>> to check if more processing is required. If so, only then does the "heavy"
>>> work start. That consists of a few db queries, and potential updates to the
>>> db + message on message queue. The majority of messages don't need
>>> processing. The messages needing processing at peak are about three every
>>> other second.
>>>
>>> One possible thing that might be happening is the session
>>> initialisation and prepared statement initialisation for each partition. I
>>> can resort to some tricks, but I think I'll try increasing batch interval
>>> to 15 seconds. I'll report back with findings.
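
One of the "tricks" alluded to here can be sketched without Spark: keep a single lazily created session per worker process and reuse it across partitions, instead of opening a new one for every partition. This is only a minimal illustration of that pattern. `FakeSession`, `get_session` and `handle_partition` are hypothetical names, and the session class is a stand-in that counts how many times it is constructed rather than a real Cassandra client.

```python
class FakeSession:
    """Stand-in for an expensive DB session; counts how often one is created."""
    created = 0

    def __init__(self):
        FakeSession.created += 1

    def execute(self, record):
        return f"processed {record}"


_session = None  # one per worker process, created on first use


def get_session():
    """Return the shared session, creating it only on the first call."""
    global _session
    if _session is None:
        _session = FakeSession()
    return _session


def handle_partition(records):
    """What would run inside foreachPartition: reuse the shared session."""
    session = get_session()
    return [session.execute(r) for r in records]


if __name__ == "__main__":
    partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
    for p in partitions:
        print(handle_partition(p))
    print("sessions created:", FakeSession.created)
```

With this shape the setup cost is paid once per process rather than once per partition, so it stops scaling with the number of partitions in a batch.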
>>>
>>> Thanks,
>>> Ashic.
>>>
>>> ------------------------------
>>> From: gerard.maas@gmail.com
>>> Date: Thu, 22 Jan 2015 12:30:08 +0100
>>> Subject: Re: Are these numbers abnormal for spark streaming?
>>> To: tathagata.das1565@gmail.com
>>> CC: ashic@live.com; tdas@databricks.com; user@spark.apache.org
>>>
>>> and post the code (if possible).
>>> In a nutshell, your processing time > batch interval,  resulting in an
>>> ever-increasing delay that will end up in a crash.
>>> 3 secs to process 14 messages looks like a lot. Curious what the job
>>> logic is.
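
To make the "ever-increasing delay" concrete, here is a small model of it, assuming batches are processed strictly one after another and every batch takes the same time. The 5-second processing time is an illustrative round number close to the figures reported in this thread, and `scheduling_delays` is a hypothetical helper, not part of Spark.

```python
def scheduling_delays(batch_interval_s, processing_times_s):
    """Scheduling delay of each batch when batches run one at a time, in order."""
    delays = []
    free_at = 0.0  # time at which the previous batch finishes
    for i, proc in enumerate(processing_times_s):
        ready_at = (i + 1) * batch_interval_s  # batch i is formed at the end of its interval
        start_at = max(ready_at, free_at)      # it must wait for the previous batch
        delays.append(start_at - ready_at)
        free_at = start_at + proc
    return delays


if __name__ == "__main__":
    # Processing slower than the batch interval: delay grows by 3 s per batch.
    print(scheduling_delays(2, [5] * 5))
    # Processing faster than the batch interval: no backlog builds up.
    print(scheduling_delays(15, [5] * 5))
```

In this model the delay grows by (processing time - batch interval) with every batch whenever that difference is positive, which is why the backlog never recovers on its own.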
>>>
>>> -kr, Gerard.
>>>
>>> On Thu, Jan 22, 2015 at 12:15 PM, Tathagata Das <
>>> tathagata.das1565@gmail.com> wrote:
>>>
>>> This is not normal. It's a huge scheduling delay!! Can you tell me more
>>> about the application?
>>> - cluster setup, number of receivers, what's the computation, etc.
>>>
>>> On Thu, Jan 22, 2015 at 3:11 AM, Ashic Mahtab <ashic@live.com> wrote:
>>>
>>> Hate to do this...but...erm...bump? Would really appreciate input from
>>> others using Streaming. Or at least some docs that would tell me if these
>>> are expected or not.
>>>
>>> ------------------------------
>>> From: ashic@live.com
>>> To: user@spark.apache.org
>>> Subject: Are these numbers abnormal for spark streaming?
>>> Date: Wed, 21 Jan 2015 11:26:31 +0000
>>>
>>>
>>> Hi Guys,
>>> I've got Spark Streaming set up for a low data rate system (using
>>> spark's features for analysis, rather than high throughput). Messages are
>>> coming in throughout the day, at around 1-20 per second (finger in the air
>>> estimate...not analysed yet).  In the spark streaming UI for the
>>> application, I'm getting the following after 17 hours.
>>>
>>> Streaming
>>>
>>>    - *Started at: *Tue Jan 20 16:58:43 GMT 2015
>>>    - *Time since start: *18 hours 24 minutes 34 seconds
>>>    - *Network receivers: *2
>>>    - *Batch interval: *2 seconds
>>>    - *Processed batches: *16482
>>>    - *Waiting batches: *1
>>>
>>>
>>>
>>> Statistics over last 100 processed batches
>>>
>>> Receiver Statistics
>>>
>>>    Receiver      | Status | Location | Records in last batch [2015/01/21 11:23:18] | Minimum rate [records/sec] | Median rate [records/sec] | Maximum rate [records/sec] | Last Error
>>>    RmqReceiver-0 | ACTIVE | FOOOO    | 14                                          | 4                          | 7                         | 27                         | -
>>>    RmqReceiver-1 | ACTIVE | BAAAAR   | 12                                          | 4                          | 7                         | 26                         | -
>>>
>>> Batch Processing Statistics
>>>
>>>    Metric           | Last batch                   | Minimum                       | 25th percentile               | Median                        | 75th percentile               | Maximum
>>>    Processing Time  | 3 seconds 994 ms             | 157 ms                        | 4 seconds 16 ms               | 4 seconds 961 ms              | 5 seconds 3 ms                | 5 seconds 171 ms
>>>    Scheduling Delay | 9 hours 15 minutes 4 seconds | 9 hours 10 minutes 54 seconds | 9 hours 11 minutes 56 seconds | 9 hours 12 minutes 57 seconds | 9 hours 14 minutes 5 seconds  | 9 hours 15 minutes 4 seconds
>>>    Total Delay      | 9 hours 15 minutes 8 seconds | 9 hours 10 minutes 58 seconds | 9 hours 12 minutes            | 9 hours 13 minutes 2 seconds  | 9 hours 14 minutes 10 seconds | 9 hours 15 minutes 8 seconds
>>>
>>>
>>> Are these "normal"? I was wondering what the scheduling delay and total
>>> delay terms are, and if it's normal for them to be 9 hours.
>>>
>>> I've got a standalone spark master and 4 spark nodes. The streaming app
>>> has been given 4 cores, and it's using 1 core per worker node. The
>>> streaming app is submitted from a 5th machine, and that machine has nothing
>>> but the driver running. The worker nodes are running alongside Cassandra
>>> (and reading and writing to it).
>>>
>>> Any insights would be appreciated.
>>>
>>> Regards,
>>> Ashic.
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Sudipta Banerjee
>> Consultant, Business Analytics and Cloud Based Architecture
>> Call me +919019578099
>>
>
>
