spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From map reduced <k3t.gi...@gmail.com>
Subject Re: Spark Streaming backpressure weird behavior/bug
Date Wed, 09 Nov 2016 01:07:51 GMT
I did the math in PIDRateEstimator from one of the screenshots and filed a
bug https://issues.apache.org/jira/browse/SPARK-18371. Added
as much detail as I could, let me know if anything more is needed.

On Sat, Nov 5, 2016 at 12:48 PM, map reduced <k3t.git.1@gmail.com> wrote:

> Answers inline.
>
> Also, the observation is very obvious. What RateController seems to be
> doing is - tries as best it can to adjust the flow, then severly
> under-adjusts, and
> then something goes wrong (?) and as a backup plan/default case creates a
> giant batch to catch up to latest offsets. If I didn't use backpressure or
> maxRatePerPartition, it would have created the first batch with this giant
> size.
>
> P.S. We gave up on checkpointing since it wasn't really working out for us
> in 1.6.1, so we manage our offsets separately in C* - very convenient way
> to go back X minutes etc.
>
> On Sat, Nov 5, 2016 at 3:44 AM, Timur Shenkao <tsh@timshenkao.su> wrote:
>
>> Hi guys!
>>
>> map reduced, could tell us (if it's not a secret, of course):
>>
>> 1)  Which Kafka version do you use?
>>
> 0.8.2
>
>> 2) Are there any peculiar Kafka settings?
>>
> Nope, just broker list and auto.offset
>
>> 3) Are you sure that all these "sudden billions of records" are really
>> different?
>>
> Yes. They're different because I checked the offsets from the logs (pasted
> above too) and each record has a timestamp of when they were processed
> to be put in kafka.
>
>> 4) What is average size of your records?
>>
> Between 8-12Kb
>
>> 5) Turning off the backpressure, are you sure that records are not lost?
>> Absence of giant batches is perfect but it may also mean that records are
>> lost or not being read or handled sometimes.
>>
> Backpressure is an add-on, only supposed to dynamically adjust to whatever
> your executors can handle. Absence of it won't result in records not being
> processed, ever.
> If you have maxRatePerPartition, it'll keep on churning those many records
> at maximum, each batch. I use mainly because if my job is shutoff for
> couple of hours due to whatever reasons,
> and it has a LOT to catch up on, if not using maxRatePerPartition, it'll
> create first batch of billions of records (from wherever it left off to
> latest) - and then slowly process them while other batches
> start queueing for hours. Risk here is that if it created a 6Bi batch and
> it has processed say 2Billion msgs and 4 more to go and the cluster dies
> for any reason, it's a pain to reprocess the full 6Bi batch.
>
>> 6) What was the CPU load during these "sudden billions of records"?
>>
> This may sound dumb but because I was doing a blocking operation in my
> spark job (I know, get your pitchforks!), CPU usage was super low and it
> was taking forever to process them.
> I haven't recreated this scenario after going the async route.
>
> I had some kind of similar problems with Kafka 0.8 + Flume 1.6 + small
>> records (approximately 50 bytes).
>> Restarting Flume agents after quite long period of time, I had unpleasant
>> situations that Flume re-read continuously the same messages & loaded CPU.
>>
>> On Fri, Nov 4, 2016 at 3:40 AM, map reduced <k3t.git.1@gmail.com> wrote:
>>
>>> Forgot to add, I have turned off the backpressure (but kept
>>> maxRatePerPartition) since the last email and it's not giving any giant
>>> batches.
>>>
>>> On Thu, Nov 3, 2016 at 5:11 PM, map reduced <k3t.git.1@gmail.com> wrote:
>>>
>>>> I'll give it a try (may take some time, since this is production
>>>> traffic, and nothing less than ERROR in prod, but will get back with the
>>>> results).
>>>> Also, it's happening pretty regularly, and very much reproducible.
>>>>
>>>> On Thu, Nov 3, 2016 at 2:45 PM, Cody Koeninger <cody@koeninger.org>
>>>> wrote:
>>>>
>>>>> Yeah, that looks pretty bad.  Have you tried just setting max rate per
>>>>> partition without turning backpressure on?
>>>>>
>>>>> If you want to keep digging on this, can you add some debugging output
>>>>> related to the backpressure?
>>>>>
>>>>> if you add a line like this to your log4j.properties
>>>>>
>>>>> log4j.logger.org.apache.spark.streaming.scheduler.rate=TRACE
>>>>>
>>>>> you should start seeing log lines like
>>>>>
>>>>> 16/10/12 12:18:01 TRACE PIDRateEstimator:
>>>>> time = 1476292681092, # records = 20, processing time = 20949,
>>>>> scheduling delay = 6
>>>>> 16/10/12 12:18:01 TRACE PIDRateEstimator:
>>>>> latestRate = -1.0, error = -1.9546995083297531
>>>>> latestError = -1.0, historicalError = 0.001145639409995704
>>>>> delaySinceUpdate = 1.476292681093E9, dError = -6.466871512381435E-10
>>>>>
>>>>> and then once it updates, lines like
>>>>>
>>>>> 16/10/12 12:18:32 TRACE PIDRateEstimator: New rate = 1.0
>>>>>
>>>>> On Wed, Nov 2, 2016 at 9:43 PM, map reduced <k3t.git.1@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> It happened again (this time i've got the partitions too from the
>>>>>> logs) - 2 billion batch size all of a sudden!
>>>>>>
>>>>>> [image: Inline image 1]
>>>>>>
>>>>>>
>>>>>> topic: kafka_topic_A    partition: 51    offsets: 1020742738 to
>>>>>> 1029289633
>>>>>> topic: kafka_topic_A    partition: 101    offsets: 1020736302 to
>>>>>> 1029287024
>>>>>> topic: kafka_topic_A    partition: 58    offsets: 1020777070 to
>>>>>> 1029332079
>>>>>> topic: kafka_topic_B    partition: 4    offsets: 4803171900 to
>>>>>> 4813684863
>>>>>> topic: kafka_topic_A    partition: 181    offsets: 1020695323 to
>>>>>> 1029247077
>>>>>> topic: kafka_topic_A    partition: 120    offsets: 1020843047 to
>>>>>> 1029392933
>>>>>> topic: kafka_topic_A    partition: 21    offsets: 24723134979 to
>>>>>> 24731684016
>>>>>> topic: kafka_topic_A    partition: 232    offsets: 1020850783 to
>>>>>> 1029399540
>>>>>> topic: kafka_topic_A    partition: 140    offsets: 1020857031 to
>>>>>> 1029409063
>>>>>> topic: kafka_topic_A    partition: 24    offsets: 24727354514 to
>>>>>> 24735900600
>>>>>> topic: kafka_topic_A    partition: 27    offsets: 24707635520 to
>>>>>> 24716178579
>>>>>> topic: kafka_topic_A    partition: 108    offsets: 1020522661 to
>>>>>> 1029068390
>>>>>> topic: kafka_topic_A    partition: 67    offsets: 1020836326 to
>>>>>> 1029387310
>>>>>> topic: kafka_topic_A    partition: 243    offsets: 1020719277 to
>>>>>> 1029269108
>>>>>> topic: kafka_topic_A    partition: 222    offsets: 1020842498 to
>>>>>> 1029394654
>>>>>> topic: kafka_topic_A    partition: 42    offsets: 24717681095 to
>>>>>> 24726227066
>>>>>> topic: kafka_topic_A    partition: 23    offsets: 24729438206 to
>>>>>> 24737988239
>>>>>> topic: kafka_topic_A    partition: 119    offsets: 1020720387 to
>>>>>> 1029268682
>>>>>> topic: kafka_topic_B    partition: 37    offsets: 4801248272 to
>>>>>> 4811770427
>>>>>> topic: kafka_topic_B    partition: 38    offsets: 4802833315 to
>>>>>> 4813345630
>>>>>> topic: kafka_topic_A    partition: 244    offsets: 1021008217 to
>>>>>> 1029563278
>>>>>> topic: kafka_topic_A    partition: 203    offsets: 1020670345 to
>>>>>> 1029221218
>>>>>> topic: kafka_topic_A    partition: 66    offsets: 1020747290 to
>>>>>> 1029293991
>>>>>> topic: kafka_topic_A    partition: 165    offsets: 1020857985 to
>>>>>> 1029408487
>>>>>> topic: kafka_topic_A    partition: 110    offsets: 1020791425 to
>>>>>> 1029339894
>>>>>> topic: kafka_topic_A    partition: 150    offsets: 1020714886 to
>>>>>> 1029263887
>>>>>> topic: kafka_topic_A    partition: 85    offsets: 1020667473 to
>>>>>> 1029213323
>>>>>> topic: kafka_topic_A    partition: 105    offsets: 1020939489 to
>>>>>> 1029488428
>>>>>> topic: kafka_topic_A    partition: 72    offsets: 1020837820 to
>>>>>> 1029389538
>>>>>> topic: kafka_topic_A    partition: 146    offsets: 1020770790 to
>>>>>> 1029320327
>>>>>> topic: kafka_topic_A    partition: 90    offsets: 1020826980 to
>>>>>> 1029375310
>>>>>> topic: kafka_topic_A    partition: 138    offsets: 1020813165 to
>>>>>> 1029364755
>>>>>> topic: kafka_topic_B    partition: 18    offsets: 4801290926 to
>>>>>> 4811805578
>>>>>> topic: kafka_topic_B    partition: 1    offsets: 4802397679 to
>>>>>> 4812912703
>>>>>> topic: kafka_topic_A    partition: 182    offsets: 1020944719 to
>>>>>> 1029494237
>>>>>> topic: kafka_topic_B    partition: 5    offsets: 4808767497 to
>>>>>> 4819286328
>>>>>> topic: kafka_topic_A    partition: 199    offsets: 1020828483 to
>>>>>> 1029379310
>>>>>> topic: kafka_topic_B    partition: 19    offsets: 4814797257 to
>>>>>> 4825312689
>>>>>> topic: kafka_topic_B    partition: 7    offsets: 4804013760 to
>>>>>> 4814536974
>>>>>> topic: kafka_topic_B    partition: 42    offsets: 4803850389 to
>>>>>> 4814365291
>>>>>> topic: kafka_topic_A    partition: 235    offsets: 1020692000 to
>>>>>> 1029240754
>>>>>> topic: kafka_topic_A    partition: 195    offsets: 1020779755 to
>>>>>> 1029331674
>>>>>> topic: kafka_topic_A    partition: 248    offsets: 1020644404 to
>>>>>> 1029194743
>>>>>> topic: kafka_topic_B    partition: 27    offsets: 4803952312 to
>>>>>> 4814465967
>>>>>> topic: kafka_topic_A    partition: 136    offsets: 1020801813 to
>>>>>> 1029356188
>>>>>> topic: kafka_topic_B    partition: 16    offsets: 4800603225 to
>>>>>> 4811123659
>>>>>> topic: kafka_topic_A    partition: 48    offsets: 24733300757 to
>>>>>> 24741850194
>>>>>> topic: kafka_topic_A    partition: 172    offsets: 1020775005 to
>>>>>> 1029324739
>>>>>> topic: kafka_topic_B    partition: 49    offsets: 4800717219 to
>>>>>> 4811236254
>>>>>> topic: kafka_topic_A    partition: 93    offsets: 1020985565 to
>>>>>> 1029537168
>>>>>> topic: kafka_topic_B    partition: 24    offsets: 4799098477 to
>>>>>> 4809607456
>>>>>> topic: kafka_topic_A    partition: 154    offsets: 1020693541 to
>>>>>> 1029238078
>>>>>> topic: kafka_topic_A    partition: 233    offsets: 1020946888 to
>>>>>> 1029497894
>>>>>> topic: kafka_topic_A    partition: 189    offsets: 1020961477 to
>>>>>> 1029514103
>>>>>> topic: kafka_topic_A    partition: 1    offsets: 24740548920 to
>>>>>> 24749096350
>>>>>> topic: kafka_topic_A    partition: 38    offsets: 24723357288 to
>>>>>> 24731912319
>>>>>> topic: kafka_topic_A    partition: 22    offsets: 24724263711 to
>>>>>> 24732813058
>>>>>> topic: kafka_topic_A    partition: 40    offsets: 24731873161 to
>>>>>> 24740422207
>>>>>> topic: kafka_topic_A    partition: 116    offsets: 1020576557 to
>>>>>> 1029122423
>>>>>> topic: kafka_topic_B    partition: 8    offsets: 4799369592 to
>>>>>> 4809890388
>>>>>> topic: kafka_topic_A    partition: 36    offsets: 24726594785 to
>>>>>> 24735140031
>>>>>> topic: kafka_topic_A    partition: 211    offsets: 1020900478 to
>>>>>> 1029446732
>>>>>> topic: kafka_topic_A    partition: 153    offsets: 1020751649 to
>>>>>> 1029305015
>>>>>> topic: kafka_topic_A    partition: 168    offsets: 1020768581 to
>>>>>> 1029315536
>>>>>> topic: kafka_topic_A    partition: 117    offsets: 1020620278 to
>>>>>> 1029167248
>>>>>> topic: kafka_topic_B    partition: 35    offsets: 4806178047 to
>>>>>> 4816695731
>>>>>> topic: kafka_topic_A    partition: 220    offsets: 1020814844 to
>>>>>> 1029362554
>>>>>> topic: kafka_topic_A    partition: 196    offsets: 1020651090 to
>>>>>> 1029194969
>>>>>> topic: kafka_topic_A    partition: 236    offsets: 1020692222 to
>>>>>> 1029241847
>>>>>> topic: kafka_topic_A    partition: 6    offsets: 24722380773 to
>>>>>> 24730930570
>>>>>> topic: kafka_topic_A    partition: 59    offsets: 1020835730 to
>>>>>> 1029384973
>>>>>> topic: kafka_topic_A    partition: 30    offsets: 24726641150 to
>>>>>> 24735187702
>>>>>> topic: kafka_topic_A    partition: 209    offsets: 1020874558 to
>>>>>> 1029427895
>>>>>> topic: kafka_topic_A    partition: 163    offsets: 1020703633 to
>>>>>> 1029253408
>>>>>> topic: kafka_topic_B    partition: 47    offsets: 4800171361 to
>>>>>> 4810686521
>>>>>> topic: kafka_topic_A    partition: 97    offsets: 1020667468 to
>>>>>> 1029213541
>>>>>> topic: kafka_topic_A    partition: 226    offsets: 1020960455 to
>>>>>> 1029512858
>>>>>> topic: kafka_topic_A    partition: 208    offsets: 1020884227 to
>>>>>> 1029435364
>>>>>> topic: kafka_topic_A    partition: 194    offsets: 1020964717 to
>>>>>> 1029518958
>>>>>> topic: kafka_topic_A    partition: 178    offsets: 1020632536 to
>>>>>> 1029178618
>>>>>> topic: kafka_topic_A    partition: 52    offsets: 1020842987 to
>>>>>> 1029393669
>>>>>> topic: kafka_topic_A    partition: 5    offsets: 24719725869 to
>>>>>> 24728274543
>>>>>> topic: kafka_topic_A    partition: 63    offsets: 1020887251 to
>>>>>> 1029437144
>>>>>> topic: kafka_topic_B    partition: 36    offsets: 4800982281 to
>>>>>> 4811501000
>>>>>> topic: kafka_topic_A    partition: 11    offsets: 24729694196 to
>>>>>> 24738244559
>>>>>> topic: kafka_topic_A    partition: 69    offsets: 1020732826 to
>>>>>> 1029275514
>>>>>> topic: kafka_topic_A    partition: 89    offsets: 1020642269 to
>>>>>> 1029187888
>>>>>> topic: kafka_topic_B    partition: 11    offsets: 4808218495 to
>>>>>> 4818733612
>>>>>> topic: kafka_topic_B    partition: 25    offsets: 4798933350 to
>>>>>> 4809448450
>>>>>> topic: kafka_topic_A    partition: 96    offsets: 1020846117 to
>>>>>> 1029393750
>>>>>> topic: kafka_topic_B    partition: 10    offsets: 4803818779 to
>>>>>> 4814337498
>>>>>> topic: kafka_topic_A    partition: 37    offsets: 24739837165 to
>>>>>> 24748391468
>>>>>> topic: kafka_topic_B    partition: 32    offsets: 4810693793 to
>>>>>> 4821217501
>>>>>> topic: kafka_topic_A    partition: 134    offsets: 1020747722 to
>>>>>> 1029296407
>>>>>> topic: kafka_topic_A    partition: 13    offsets: 24734355357 to
>>>>>> 24742905825
>>>>>> topic: kafka_topic_A    partition: 19    offsets: 24732775735 to
>>>>>> 24741322331
>>>>>> topic: kafka_topic_A    partition: 229    offsets: 1020798266 to
>>>>>> 1029347927
>>>>>> topic: kafka_topic_A    partition: 91    offsets: 1020974276 to
>>>>>> 1029525120
>>>>>> topic: kafka_topic_A    partition: 64    offsets: 1020980318 to
>>>>>> 1029530189
>>>>>> topic: kafka_topic_A    partition: 34    offsets: 24723495628 to
>>>>>> 24732054835
>>>>>> topic: kafka_topic_A    partition: 4    offsets: 24727632125 to
>>>>>> 24736184191
>>>>>> topic: kafka_topic_A    partition: 175    offsets: 1020915534 to
>>>>>> 1029464464
>>>>>> topic: kafka_topic_A    partition: 53    offsets: 1020704573 to
>>>>>> 1029254608
>>>>>> topic: kafka_topic_A    partition: 143    offsets: 1020772985 to
>>>>>> 1029322428
>>>>>> topic: kafka_topic_A    partition: 118    offsets: 1020778666 to
>>>>>> 1029331391
>>>>>> topic: kafka_topic_A    partition: 249    offsets: 1020963635 to
>>>>>> 1029516291
>>>>>> topic: kafka_topic_A    partition: 3    offsets: 24721520599 to
>>>>>> 24730075720
>>>>>> topic: kafka_topic_A    partition: 184    offsets: 1020775444 to
>>>>>> 1029326031
>>>>>> topic: kafka_topic_A    partition: 225    offsets: 1020933583 to
>>>>>> 1029483635
>>>>>> topic: kafka_topic_A    partition: 188    offsets: 1020647943 to
>>>>>> 1029198446
>>>>>> topic: kafka_topic_A    partition: 94    offsets: 1020730941 to
>>>>>> 1029278716
>>>>>> topic: kafka_topic_A    partition: 213    offsets: 1020762226 to
>>>>>> 1029311435
>>>>>> topic: kafka_topic_A    partition: 151    offsets: 1020844374 to
>>>>>> 1029395379
>>>>>> topic: kafka_topic_A    partition: 125    offsets: 1020760525 to
>>>>>> 1029306817
>>>>>> topic: kafka_topic_A    partition: 139    offsets: 1020830596 to
>>>>>> 1029382287
>>>>>> topic: kafka_topic_A    partition: 223    offsets: 1020851931 to
>>>>>> 1029406373
>>>>>> topic: kafka_topic_A    partition: 79    offsets: 1020569596 to
>>>>>> 1029117673
>>>>>> topic: kafka_topic_B    partition: 41    offsets: 4802503055 to
>>>>>> 4813020137
>>>>>> topic: kafka_topic_A    partition: 157    offsets: 1020773259 to
>>>>>> 1029323214
>>>>>> topic: kafka_topic_B    partition: 43    offsets: 4807530119 to
>>>>>> 4818051823
>>>>>> topic: kafka_topic_B    partition: 9    offsets: 4801124375 to
>>>>>> 4811641360
>>>>>> topic: kafka_topic_A    partition: 121    offsets: 1020716814 to
>>>>>> 1029262616
>>>>>> topic: kafka_topic_A    partition: 78    offsets: 1020757202 to
>>>>>> 1029307937
>>>>>> topic: kafka_topic_A    partition: 43    offsets: 24728638290 to
>>>>>> 24737193015
>>>>>> topic: kafka_topic_A    partition: 113    offsets: 1020840637 to
>>>>>> 1029386523
>>>>>> topic: kafka_topic_A    partition: 219    offsets: 1020867425 to
>>>>>> 1029414624
>>>>>> topic: kafka_topic_A    partition: 17    offsets: 24719427351 to
>>>>>> 24727972412
>>>>>> topic: kafka_topic_A    partition: 156    offsets: 1020795237 to
>>>>>> 1029341015
>>>>>> topic: kafka_topic_A    partition: 70    offsets: 1020706495 to
>>>>>> 1029254472
>>>>>> topic: kafka_topic_A    partition: 61    offsets: 1021026951 to
>>>>>> 1029582817
>>>>>> topic: kafka_topic_A    partition: 190    offsets: 1020963590 to
>>>>>> 1029516326
>>>>>> topic: kafka_topic_A    partition: 29    offsets: 24722142896 to
>>>>>> 24730694155
>>>>>> topic: kafka_topic_A    partition: 207    offsets: 1020639874 to
>>>>>> 1029187494
>>>>>> topic: kafka_topic_A    partition: 177    offsets: 1020685282 to
>>>>>> 1029233121
>>>>>> topic: kafka_topic_A    partition: 160    offsets: 1020789969 to
>>>>>> 1029337510
>>>>>> topic: kafka_topic_A    partition: 102    offsets: 1020963819 to
>>>>>> 1029516283
>>>>>> topic: kafka_topic_B    partition: 20    offsets: 4801028715 to
>>>>>> 4811550727
>>>>>> topic: kafka_topic_B    partition: 13    offsets: 4797383641 to
>>>>>> 4807902682
>>>>>> topic: kafka_topic_A    partition: 128    offsets: 1020662803 to
>>>>>> 1029211499
>>>>>> topic: kafka_topic_A    partition: 215    offsets: 1020837321 to
>>>>>> 1029389104
>>>>>> topic: kafka_topic_A    partition: 240    offsets: 1021021049 to
>>>>>> 1029572788
>>>>>> topic: kafka_topic_A    partition: 56    offsets: 1020941937 to
>>>>>> 1029496916
>>>>>> topic: kafka_topic_A    partition: 147    offsets: 1020755896 to
>>>>>> 1029303241
>>>>>> topic: kafka_topic_A    partition: 112    offsets: 1020892430 to
>>>>>> 1029441614
>>>>>> topic: kafka_topic_A    partition: 45    offsets: 24716641715 to
>>>>>> 24725192614
>>>>>> topic: kafka_topic_A    partition: 68    offsets: 1020893444 to
>>>>>> 1029446558
>>>>>> topic: kafka_topic_A    partition: 77    offsets: 1020868499 to
>>>>>> 1029417133
>>>>>> topic: kafka_topic_B    partition: 28    offsets: 4805914153 to
>>>>>> 4816430998
>>>>>> topic: kafka_topic_A    partition: 161    offsets: 1020902852 to
>>>>>> 1029456951
>>>>>> topic: kafka_topic_A    partition: 186    offsets: 1020775276 to
>>>>>> 1029328133
>>>>>> topic: kafka_topic_B    partition: 14    offsets: 4796300859 to
>>>>>> 4806817229
>>>>>> topic: kafka_topic_A    partition: 44    offsets: 24731321741 to
>>>>>> 24739866858
>>>>>> topic: kafka_topic_A    partition: 47    offsets: 24726144390 to
>>>>>> 24734696944
>>>>>> topic: kafka_topic_A    partition: 86    offsets: 1020778038 to
>>>>>> 1029327512
>>>>>> topic: kafka_topic_A    partition: 46    offsets: 24721377928 to
>>>>>> 24729930715
>>>>>> topic: kafka_topic_A    partition: 200    offsets: 1020776353 to
>>>>>> 1029328471
>>>>>> topic: kafka_topic_A    partition: 132    offsets: 1020794282 to
>>>>>> 1029343725
>>>>>> topic: kafka_topic_A    partition: 100    offsets: 1020931503 to
>>>>>> 1029480173
>>>>>> topic: kafka_topic_A    partition: 212    offsets: 1020752903 to
>>>>>> 1029303842
>>>>>> topic: kafka_topic_A    partition: 193    offsets: 1020799750 to
>>>>>> 1029348032
>>>>>> topic: kafka_topic_A    partition: 239    offsets: 1020740938 to
>>>>>> 1029296021
>>>>>> topic: kafka_topic_A    partition: 242    offsets: 1021023598 to
>>>>>> 1029575545
>>>>>> topic: kafka_topic_B    partition: 40    offsets: 4801026818 to
>>>>>> 4811537565
>>>>>> topic: kafka_topic_B    partition: 12    offsets: 4798606447 to
>>>>>> 4809123173
>>>>>> topic: kafka_topic_A    partition: 18    offsets: 24725102864 to
>>>>>> 24733647562
>>>>>> topic: kafka_topic_A    partition: 33    offsets: 24729427865 to
>>>>>> 24737975446
>>>>>> topic: kafka_topic_A    partition: 16    offsets: 24725461165 to
>>>>>> 24734010070
>>>>>> topic: kafka_topic_A    partition: 234    offsets: 1020679052 to
>>>>>> 1029226903
>>>>>> topic: kafka_topic_A    partition: 127    offsets: 1020876420 to
>>>>>> 1029425258
>>>>>> topic: kafka_topic_A    partition: 173    offsets: 1020875774 to
>>>>>> 1029427802
>>>>>> topic: kafka_topic_A    partition: 174    offsets: 1020764367 to
>>>>>> 1029311197
>>>>>> topic: kafka_topic_A    partition: 60    offsets: 1020729422 to
>>>>>> 1029280479
>>>>>> topic: kafka_topic_A    partition: 164    offsets: 1020895388 to
>>>>>> 1029447072
>>>>>> topic: kafka_topic_B    partition: 3    offsets: 4801150811 to
>>>>>> 4811667621
>>>>>> topic: kafka_topic_A    partition: 76    offsets: 1020872633 to
>>>>>> 1029425200
>>>>>> topic: kafka_topic_A    partition: 2    offsets: 24720552836 to
>>>>>> 24729103435
>>>>>> topic: kafka_topic_A    partition: 31    offsets: 24724971328 to
>>>>>> 24733525699
>>>>>> topic: kafka_topic_A    partition: 180    offsets: 1020790913 to
>>>>>> 1029342607
>>>>>> topic: kafka_topic_A    partition: 7    offsets: 24722917305 to
>>>>>> 24731461090
>>>>>> topic: kafka_topic_A    partition: 0    offsets: 24715978894 to
>>>>>> 24724533838
>>>>>> topic: kafka_topic_B    partition: 6    offsets: 4801685031 to
>>>>>> 4812197203
>>>>>> topic: kafka_topic_A    partition: 111    offsets: 1020777248 to
>>>>>> 1029320002
>>>>>> topic: kafka_topic_A    partition: 214    offsets: 1020847267 to
>>>>>> 1029397260
>>>>>> topic: kafka_topic_A    partition: 183    offsets: 1020829424 to
>>>>>> 1029374366
>>>>>> topic: kafka_topic_A    partition: 247    offsets: 1020951407 to
>>>>>> 1029501748
>>>>>> topic: kafka_topic_A    partition: 35    offsets: 24724710806 to
>>>>>> 24733257282
>>>>>> topic: kafka_topic_B    partition: 2    offsets: 4799162386 to
>>>>>> 4809677022
>>>>>> topic: kafka_topic_B    partition: 23    offsets: 4806523148 to
>>>>>> 4817037826
>>>>>> topic: kafka_topic_A    partition: 84    offsets: 1021016106 to
>>>>>> 1029568619
>>>>>> topic: kafka_topic_B    partition: 31    offsets: 4807475059 to
>>>>>> 4817992907
>>>>>> topic: kafka_topic_A    partition: 15    offsets: 24722975566 to
>>>>>> 24731525636
>>>>>> topic: kafka_topic_A    partition: 238    offsets: 1020838617 to
>>>>>> 1029388674
>>>>>> topic: kafka_topic_A    partition: 217    offsets: 1020963813 to
>>>>>> 1029516908
>>>>>> topic: kafka_topic_A    partition: 141    offsets: 1020928927 to
>>>>>> 1029480391
>>>>>> topic: kafka_topic_B    partition: 21    offsets: 4799274035 to
>>>>>> 4809790430
>>>>>> topic: kafka_topic_A    partition: 142    offsets: 1020859803 to
>>>>>> 1029410671
>>>>>> topic: kafka_topic_A    partition: 26    offsets: 24716858647 to
>>>>>> 24725403869
>>>>>> topic: kafka_topic_A    partition: 75    offsets: 1020875615 to
>>>>>> 1029425108
>>>>>> topic: kafka_topic_A    partition: 88    offsets: 1020636598 to
>>>>>> 1029181677
>>>>>> topic: kafka_topic_A    partition: 55    offsets: 1020981245 to
>>>>>> 1029532042
>>>>>> topic: kafka_topic_B    partition: 26    offsets: 4802386319 to
>>>>>> 4812903171
>>>>>> topic: kafka_topic_A    partition: 176    offsets: 1020927564 to
>>>>>> 1029478273
>>>>>> topic: kafka_topic_A    partition: 246    offsets: 1020902960 to
>>>>>> 1029456226
>>>>>> topic: kafka_topic_A    partition: 237    offsets: 1020879351 to
>>>>>> 1029428560
>>>>>> topic: kafka_topic_A    partition: 124    offsets: 1020844750 to
>>>>>> 1029398619
>>>>>> topic: kafka_topic_A    partition: 216    offsets: 1020606507 to
>>>>>> 1029155109
>>>>>> topic: kafka_topic_A    partition: 32    offsets: 24727599739 to
>>>>>> 24736149128
>>>>>> topic: kafka_topic_A    partition: 25    offsets: 24740711757 to
>>>>>> 24749263320
>>>>>> topic: kafka_topic_A    partition: 197    offsets: 1021032158 to
>>>>>> 1029587829
>>>>>> topic: kafka_topic_B    partition: 44    offsets: 4810511791 to
>>>>>> 4821029704
>>>>>> topic: kafka_topic_A    partition: 95    offsets: 1020733833 to
>>>>>> 1029283829
>>>>>> topic: kafka_topic_A    partition: 12    offsets: 24723998129 to
>>>>>> 24732553534
>>>>>> topic: kafka_topic_A    partition: 109    offsets: 1020895980 to
>>>>>> 1029446212
>>>>>> topic: kafka_topic_B    partition: 22    offsets: 4801811942 to
>>>>>> 4812330157
>>>>>> topic: kafka_topic_A    partition: 135    offsets: 1020523998 to
>>>>>> 1029067367
>>>>>> topic: kafka_topic_B    partition: 48    offsets: 4805322090 to
>>>>>> 4815838865
>>>>>> topic: kafka_topic_A    partition: 74    offsets: 1020819147 to
>>>>>> 1029369936
>>>>>> topic: kafka_topic_A    partition: 230    offsets: 1020784136 to
>>>>>> 1029333313
>>>>>> topic: kafka_topic_A    partition: 103    offsets: 1020921485 to
>>>>>> 1029473542
>>>>>> topic: kafka_topic_B    partition: 34    offsets: 4801025503 to
>>>>>> 4811545042
>>>>>> topic: kafka_topic_A    partition: 115    offsets: 1020600722 to
>>>>>> 1029148541
>>>>>> topic: kafka_topic_A    partition: 152    offsets: 1020677041 to
>>>>>> 1029226178
>>>>>> topic: kafka_topic_A    partition: 158    offsets: 1020735842 to
>>>>>> 1029285162
>>>>>> topic: kafka_topic_A    partition: 210    offsets: 1020838912 to
>>>>>> 1029389328
>>>>>> topic: kafka_topic_A    partition: 123    offsets: 1020888750 to
>>>>>> 1029442669
>>>>>> topic: kafka_topic_A    partition: 49    offsets: 24733516034 to
>>>>>> 24742064144
>>>>>> topic: kafka_topic_B    partition: 39    offsets: 4806601961 to
>>>>>> 4817119869
>>>>>> topic: kafka_topic_A    partition: 114    offsets: 1020945219 to
>>>>>> 1029496002
>>>>>> topic: kafka_topic_A    partition: 65    offsets: 1020714711 to
>>>>>> 1029267579
>>>>>> topic: kafka_topic_A    partition: 98    offsets: 1020581086 to
>>>>>> 1029126420
>>>>>> topic: kafka_topic_B    partition: 33    offsets: 4802443872 to
>>>>>> 4812950776
>>>>>> topic: kafka_topic_A    partition: 73    offsets: 1020908814 to
>>>>>> 1029459329
>>>>>> topic: kafka_topic_A    partition: 14    offsets: 24720549899 to
>>>>>> 24729100604
>>>>>> topic: kafka_topic_A    partition: 106    offsets: 1020832194 to
>>>>>> 1029381879
>>>>>> topic: kafka_topic_B    partition: 46    offsets: 4805759222 to
>>>>>> 4816272314
>>>>>> topic: kafka_topic_A    partition: 130    offsets: 1020729244 to
>>>>>> 1029276701
>>>>>> topic: kafka_topic_A    partition: 166    offsets: 1020939071 to
>>>>>> 1029489456
>>>>>> topic: kafka_topic_A    partition: 104    offsets: 1020771720 to
>>>>>> 1029318470
>>>>>> topic: kafka_topic_A    partition: 224    offsets: 1021062976 to
>>>>>> 1029618193
>>>>>> topic: kafka_topic_B    partition: 0    offsets: 4805841603 to
>>>>>> 4816356537
>>>>>> topic: kafka_topic_A    partition: 39    offsets: 24733836602 to
>>>>>> 24742385677
>>>>>> topic: kafka_topic_A    partition: 202    offsets: 1020738496 to
>>>>>> 1029289191
>>>>>> topic: kafka_topic_A    partition: 62    offsets: 1020767369 to
>>>>>> 1029310260
>>>>>> topic: kafka_topic_A    partition: 54    offsets: 1020872832 to
>>>>>> 1029424418
>>>>>> topic: kafka_topic_A    partition: 155    offsets: 1020939790 to
>>>>>> 1029491266
>>>>>> topic: kafka_topic_A    partition: 57    offsets: 1020926473 to
>>>>>> 1029478170
>>>>>> topic: kafka_topic_A    partition: 10    offsets: 24722360402 to
>>>>>> 24730916736
>>>>>> topic: kafka_topic_A    partition: 227    offsets: 1020628274 to
>>>>>> 1029175330
>>>>>> topic: kafka_topic_A    partition: 205    offsets: 1020886863 to
>>>>>> 1029438420
>>>>>> topic: kafka_topic_A    partition: 9    offsets: 24730599499 to
>>>>>> 24739147248
>>>>>> topic: kafka_topic_A    partition: 218    offsets: 1020694139 to
>>>>>> 1029244205
>>>>>> topic: kafka_topic_A    partition: 81    offsets: 1020865158 to
>>>>>> 1029417909
>>>>>> topic: kafka_topic_A    partition: 99    offsets: 1020829095 to
>>>>>> 1029378716
>>>>>> topic: kafka_topic_A    partition: 144    offsets: 1020836880 to
>>>>>> 1029390098
>>>>>> topic: kafka_topic_A    partition: 80    offsets: 1020632760 to
>>>>>> 1029181116
>>>>>> topic: kafka_topic_A    partition: 185    offsets: 1020777167 to
>>>>>> 1029326135
>>>>>> topic: kafka_topic_A    partition: 137    offsets: 1020783286 to
>>>>>> 1029336240
>>>>>> topic: kafka_topic_A    partition: 145    offsets: 1020807427 to
>>>>>> 1029353122
>>>>>> topic: kafka_topic_A    partition: 122    offsets: 1020914744 to
>>>>>> 1029465920
>>>>>> topic: kafka_topic_A    partition: 133    offsets: 1020818950 to
>>>>>> 1029369827
>>>>>> topic: kafka_topic_A    partition: 71    offsets: 1020604295 to
>>>>>> 1029151699
>>>>>> topic: kafka_topic_A    partition: 82    offsets: 1020925125 to
>>>>>> 1029478280
>>>>>> topic: kafka_topic_A    partition: 87    offsets: 1020857237 to
>>>>>> 1029406722
>>>>>> topic: kafka_topic_A    partition: 201    offsets: 1020709307 to
>>>>>> 1029260228
>>>>>> topic: kafka_topic_A    partition: 28    offsets: 24728200955 to
>>>>>> 24736749015
>>>>>> topic: kafka_topic_A    partition: 41    offsets: 24729533353 to
>>>>>> 24738085917
>>>>>> topic: kafka_topic_A    partition: 170    offsets: 1020668802 to
>>>>>> 1029219950
>>>>>> topic: kafka_topic_A    partition: 187    offsets: 1020581810 to
>>>>>> 1029129601
>>>>>> topic: kafka_topic_B    partition: 29    offsets: 4803280139 to
>>>>>> 4813797539
>>>>>> topic: kafka_topic_A    partition: 92    offsets: 1020662671 to
>>>>>> 1029214523
>>>>>> topic: kafka_topic_A    partition: 231    offsets: 1020772888 to
>>>>>> 1029320782
>>>>>> topic: kafka_topic_A    partition: 241    offsets: 1020649136 to
>>>>>> 1029195109
>>>>>> topic: kafka_topic_A    partition: 192    offsets: 1020839092 to
>>>>>> 1029389989
>>>>>> topic: kafka_topic_A    partition: 8    offsets: 24732792451 to
>>>>>> 24741339710
>>>>>> topic: kafka_topic_A    partition: 131    offsets: 1020886007 to
>>>>>> 1029433501
>>>>>> topic: kafka_topic_A    partition: 162    offsets: 1020706400 to
>>>>>> 1029251727
>>>>>> topic: kafka_topic_A    partition: 126    offsets: 1020828002 to
>>>>>> 1029377579
>>>>>> topic: kafka_topic_A    partition: 228    offsets: 1020824139 to
>>>>>> 1029371645
>>>>>> topic: kafka_topic_A    partition: 167    offsets: 1020746310 to
>>>>>> 1029296452
>>>>>> topic: kafka_topic_B    partition: 30    offsets: 4795764234 to
>>>>>> 4806277616
>>>>>> topic: kafka_topic_A    partition: 221    offsets: 1020618597 to
>>>>>> 1029166130
>>>>>> topic: kafka_topic_A    partition: 206    offsets: 1020972294 to
>>>>>> 1029522361
>>>>>> topic: kafka_topic_A    partition: 245    offsets: 1020859155 to
>>>>>> 1029409690
>>>>>> topic: kafka_topic_A    partition: 148    offsets: 1020689094 to
>>>>>> 1029234764
>>>>>> topic: kafka_topic_A    partition: 171    offsets: 1020893286 to
>>>>>> 1029448085
>>>>>> topic: kafka_topic_A    partition: 20    offsets: 24727739340 to
>>>>>> 24736287861
>>>>>> topic: kafka_topic_A    partition: 159    offsets: 1020770845 to
>>>>>> 1029316911
>>>>>> topic: kafka_topic_A    partition: 169    offsets: 1020699633 to
>>>>>> 1029253155
>>>>>> topic: kafka_topic_A    partition: 83    offsets: 1020954835 to
>>>>>> 1029507004
>>>>>> topic: kafka_topic_A    partition: 149    offsets: 1020763182 to
>>>>>> 1029312029
>>>>>> topic: kafka_topic_B    partition: 17    offsets: 4798809279 to
>>>>>> 4809328520
>>>>>> topic: kafka_topic_A    partition: 191    offsets: 1020939618 to
>>>>>> 1029492433
>>>>>> topic: kafka_topic_A    partition: 50    offsets: 1020781205 to
>>>>>> 1029327065
>>>>>> topic: kafka_topic_A    partition: 107    offsets: 1020596042 to
>>>>>> 1029143966
>>>>>> topic: kafka_topic_A    partition: 179    offsets: 1020692875 to
>>>>>> 1029239892
>>>>>> topic: kafka_topic_A    partition: 204    offsets: 1020682012 to
>>>>>> 1029229892
>>>>>> topic: kafka_topic_B    partition: 15    offsets: 4797528038 to
>>>>>> 4808038327
>>>>>> topic: kafka_topic_A    partition: 198    offsets: 1020530213 to
>>>>>> 1029075405
>>>>>> topic: kafka_topic_B    partition: 45    offsets: 4803051802 to
>>>>>> 4813564524
>>>>>> topic: kafka_topic_A    partition: 129    offsets: 1020804825 to
>>>>>> 1029355767
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 2, 2016 at 11:21 AM, map reduced <k3t.git.1@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Yes it does, I checked in the logs. Infact, if you see the first
>>>>>>> screenshot, stream processing was 'stuck' processing those many
records for
>>>>>>> quite some time (~ 1hr).
>>>>>>> One thing I noticed is initial batches took (maybe far?) longer
than
>>>>>>> the configured batchDuration of 1.5mins, say in case screenshot
2, it took
>>>>>>> 5.8-7.1min and in case 1 it took 3-4 mins.
>>>>>>>
>>>>>>> On Wed, Nov 2, 2016 at 8:43 AM, Cody Koeninger <cody@koeninger.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Does that batch actually have that many records in it (you
should
>>>>>>>> be able to see beginning and ending offsets in the logs),
or is it an error
>>>>>>>> in the UI?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 1, 2016 at 11:59 PM, map reduced <k3t.git.1@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> I am using Spark 2.0.0 standalone cluster, regular streaming
job
>>>>>>>>> consuming from kafka and writing to http endpoint. I
have configuration:
>>>>>>>>> executors 7 cores/executor, maxCores = 84 (so 12 executors)
>>>>>>>>> batchsize - 90 seconds
>>>>>>>>> maxRatePerPartition - 2000
>>>>>>>>> backPressure enabled = true
>>>>>>>>>
>>>>>>>>> My kafka topics have total of 300 partitions, so I am
expecting to
>>>>>>>>> be max 54million records per batch (maxRatePerPartition
* batchsize *
>>>>>>>>> #partitions) - and that's what I am getting. But it turns
out that it can't
>>>>>>>>> process 54million records in 90sec batch, so I am expecting
backpressure to
>>>>>>>>> kick in, but I see something strange there. It reduces
batch size to lesser
>>>>>>>>> # of records, but then suddenly spits out a HUGE batch
size of 13 billion
>>>>>>>>> records.
>>>>>>>>>
>>>>>>>>> [image: Inline image 1]
>>>>>>>>> I changed some configuration to see if above was a one
off case
>>>>>>>>> but the same issue happened again. Check the below screenshot
(huge batch
>>>>>>>>> size of 14 billion records again!) :
>>>>>>>>>
>>>>>>>>> [image: Inline image 2]
>>>>>>>>>
>>>>>>>>> Is this a bug? Any reasoning you know for this to happen?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> KP
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message