kafka-users mailing list archives

From Tom Szumowski <tszumow...@gmail.com>
Subject Re: Benchmarks for KTable Joins and Queries
Date Sat, 03 Nov 2018 23:53:40 GMT
Hi Matthias.

My apologies for my ambiguous descriptions. I responded in-line to your
questions below. Thank you for taking the time to understand my question.

On Fri, Nov 2, 2018 at 1:28 PM Matthias J. Sax <matthias@confluent.io>
wrote:

> >> At a high level, KTables provide a capability to query for data.
>
> Can you elaborate? Do you refer to the "Interactive Queries" feature?
>

*Yes, that's correct. I'm referring to Interactive Queries as described here
<https://kafka.apache.org/20/documentation/streams/developer-guide/interactive-queries.html>.
This description
<https://docs.confluent.io/current/streams/concepts.html#interactive-queries>
covers several use cases for Interactive Queries, including real-time threat
monitoring, video gaming, risk and fraud, and trend detection. Let's
consider the video gaming use case as an example. It states, "A mobile
companion app can then directly query the Kafka Streams application to show
the current location of a player to friends and family, and invite them to
come along". In that scenario, one may be very interested in knowing the
throughput and latency (independent of network delays) of executing that
Interactive Query from the mobile companion app in order to get locations.*


> >> And I imagine latency/throughput of KTable queries depend on the number
> >> of consumers the query would have to touch to complete.
>
> Similar here. I am not sure if I can follow.


The same article
<https://kafka.apache.org/20/documentation/streams/developer-guide/interactive-queries.html>
above refers to local state stores and remote state stores. I'm interested in
understanding latencies when querying a remote state store
<https://kafka.apache.org/20/documentation/streams/developer-guide/interactive-queries.html#querying-remote-state-stores-for-the-entire-app>.
The diagram in that link shows three instances. What if there are 100
instances? How do query latencies change as the number of instances
increases (if at all)? What kind of latencies would we expect in that
configuration? And how would it compare to an alternative
configuration that uses a GlobalKTable
<https://docs.confluent.io/current/streams/concepts.html#globalktable>?
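
To make the question concrete, here is a minimal sketch in Python of which
instances a query would have to touch. It is a toy model, not Kafka Streams
code: the hash function (CRC32 rather than Kafka's actual partitioner), the
round-robin partition assignment, the 100-partition count, and all function
names are assumptions for illustration. It only counts instances and says
nothing about actual latencies.

```python
import zlib


def partition_for(key, num_partitions):
    """Toy stand-in for a key partitioner (assumption: CRC32 modulo partitions)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, num_instances):
    """Hypothetical round-robin mapping of partition -> instance id."""
    return {p: p % num_instances for p in range(num_partitions)}


def instances_for_key_lookup(key, num_partitions, num_instances):
    """A single-key lookup only needs the instance hosting the key's partition."""
    assignment = assign_partitions(num_partitions, num_instances)
    return {assignment[partition_for(key, num_partitions)]}


def instances_for_full_scan(num_partitions, num_instances):
    """A scan over the whole store needs every instance that hosts a partition."""
    return set(assign_partitions(num_partitions, num_instances).values())


if __name__ == "__main__":
    for n in (3, 100):
        point = instances_for_key_lookup("player-42", 100, n)
        scan = instances_for_full_scan(100, n)
        print(f"instances={n} key_lookup_touches={len(point)} scan_touches={len(scan)}")
```

Under these assumptions, a single-key lookup is routed to one instance no
matter how many instances exist, while a scan over the whole store has to
contact every instance that hosts a partition. A GlobalKTable replicates the
full table to every instance, so in that configuration lookups would be local.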

*I hope that helps clarify.*

*-Tom*


>
> -Matthias
>
> On 11/2/18 4:27 AM, Tom Szumowski wrote:
> > Thank you for the clarification. I understand they are fundamentally
> > different underneath from a relational database, and that it may not be
> > fair to compare them directly.
> >
> > But how about benchmarks that aren't a comparison with other databases?
> >
> > At a high level, KTables provide a capability to query for data. Suppose I
> > have a requirement to fetch the data in less than X milliseconds. It seems
> > fair to want to understand if a KTable can meet that requirement under
> > some configuration, or if I need to seek alternate solutions.
> >
> > And I imagine latency/throughput of KTable queries depend on the number of
> > consumers the query would have to touch to complete. For example, in
> > complex joins or filters. Or perhaps a trade-off with GlobalKTables...
> >
> > I'd be interested in knowing whether that kind of benchmark information
> > exists or not.
> >
> > Thank you,
> >
> > Tom
> >
> > On Thu, Nov 1, 2018, 16:24 Matthias J. Sax <matthias@confluent.io> wrote:
> >
> >> I am not aware of any benchmarks, but want to point out that KTables work
> >> somewhat differently from relational database systems. Thus, you might
> >> want to evaluate based not on performance, but on the semantics KTables
> >> provide.
> >>
> >> Recall that Kafka Streams is a stream processing library while a
> >> database system is a "batch processing" system. They are two quite
> >> different types of systems, and benchmarking them against each other is
> >> questionable.
> >>
> >>
> >> -Matthias
> >>
> >> On 10/31/18 12:18 PM, Tom Szumowski wrote:
> >>> I was wondering if anyone had or knew of benchmark tests for KTable or
> >>> GlobalKTable queries/joins, as compared to alternatives such as
> >>> distributed databases.
> >>>
> >>
> >>
> >
>
>
