spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Campbell <michael.campb...@gmail.com>
Subject Having trouble with streaming (updateStateByKey)
Date Wed, 11 Jun 2014 17:47:20 GMT
I'm having a little trouble getting an "updateStateByKey()" call to work;
was wondering if anyone could help.

In my chain of calls from getting Kafka messages out of the queue to
converting the message to a set of "things", then pulling out 2 attributes
of those things to a Tuple2, everything works.

So what I end up with is about a 1 second dump of things like this (this is
crufted up data, but it's basically 2 IPV6 addresses...)

-------------------------------------------
Time: 1402507839000 ms
-------------------------------------------
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a0a:25a7,::ffff:a14:b03)
(::ffff:a14:b03,::ffff:a0a:2483)
(::ffff:a14:b03,::ffff:a0a:2480)
(::ffff:a0a:2d96,::ffff:a14:b03)
(::ffff:a0a:abb5,::ffff:a14:100)
(::ffff:a0a:dcd7,::ffff:a14:28)
(::ffff:a14:28,::ffff:a0a:da94)
(::ffff:a14:b03,::ffff:a0a:2def)
...


This works ok.

The problem is when I add a call to updateStateByKey - the streaming app
runs and runs and runs and never outputs anything.  When I debug, I can't
confirm that my state update passed-in function is ever actually getting
called.

Indeed I have breakpoints and println statements in my updateFunc and it
LOOKS like it's never getting called.  I can confirm that the
updateStateByKey function IS getting called (via it stopping on a
breakpoint).

Is there something obvious I'm missing?

Mime
View raw message