kafka-users mailing list archives

From Alan Woodward <a...@flax.co.uk>
Subject KStreams - reading from the start of a stream
Date Thu, 14 Apr 2016 10:14:49 GMT
Hi all,

I spoke at the London Kafka meetup last night about searching streaming data (write-up here:
http://www.flax.co.uk/blog/2016/04/14/apache-kafka-london-meetup-real-time-search-insights/),
and as part of the preparation for the talk I tried porting some Samza code I have to the
KStreams library.

My first impressions are that KStreams is very nice, particularly when it comes to deploying
and testing (no need for YARN, yay!).  I have a couple of questions though:

- I couldn't work out how to open a stream and read it from the beginning.  My use case here is
a cached search engine, where stored queries are run over documents in a stream.  Both queries
and documents are stored in Kafka topics, and when the processor starts it needs to read the
query topic from the beginning to construct its cache.  KStreamBuilder.table() and KStreamBuilder.stream()
seem to create consumers that join the end of topics only.

- There doesn't seem to be a nice way of closing resources on shutdown.  Is there a plan
to add shutdown hooks, or maybe a KafkaStreams.join() method that waits for the internal threads
to be interrupted?
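
On the first question, one thing that may be worth checking is the consumer
setting auto.offset.reset, which controls where a consumer starts when its group
has no committed offset for a partition. The sketch below only builds the
properties, using the JDK alone, so it runs without Kafka on the classpath. The
class name, the application id and the broker address are hypothetical, and
whether the streams library passes this setting through to its internal
consumers in a given release is an assumption to verify.

```java
import java.util.Properties;

public class ReadFromStartConfig {

    // Builds the properties that would be handed to the streams configuration.
    // The key names are the standard Kafka configuration keys; the values
    // passed in by the caller are placeholders.
    static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        // With no committed offset for the group, start from the earliest
        // available record instead of joining at the end of the topic.
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical application id and broker address.
        Properties props = build("luwak-monitor", "localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Note that this setting only takes effect when no offsets have been committed, so
a restart under the same application id would resume from the committed position
rather than re-reading the query topic.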

My (broken!) code can be found here: https://github.com/romseygeek/luwak-kstream/blob/master/src/main/java/com/flaxsearch/luwak_kstreams/StreamMonitor.java
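
On the second question, a possible stopgap is a plain JVM shutdown hook that
closes resources when the process is asked to exit. The sketch below uses only
the JDK; the class and method names are hypothetical, and the AutoCloseable
stands in for whatever needs closing (for example a call to KafkaStreams.close()
followed by closing the search index). The latch is there so other code can wait
until cleanup has finished.

```java
import java.util.concurrent.CountDownLatch;

public class ShutdownHookSketch {

    // Returns a thread that closes the resource and then releases the latch.
    // Failures during close are logged so the latch is always released.
    static Thread closingHook(AutoCloseable resource, CountDownLatch done) {
        return new Thread(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                done.countDown();
            }
        }, "shutdown-closer");
    }

    public static void main(String[] args) {
        CountDownLatch done = new CountDownLatch(1);
        // Placeholder resource; a real application would close its streams
        // instance and any caches here.
        AutoCloseable resource = () -> System.out.println("closing resources");
        // The JVM starts this thread on normal exit or on SIGTERM / Ctrl-C.
        Runtime.getRuntime().addShutdownHook(closingHook(resource, done));
    }
}
```

A shutdown hook does not run if the JVM is killed forcibly (for example with
SIGKILL), so it is not a substitute for a proper lifecycle API in the library.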

Thanks,

Alan Woodward
www.flax.co.uk


