kafka-users mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: Kafka Streams: ReadOnlyKeyValueStore range behavior
Date Thu, 16 Mar 2017 18:11:37 GMT
I think what you are seeing is that the order is not guaranteed across
partitions. When you use Queryable State you are actually querying multiple
underlying stores, i.e., one per partition. The implementation iterates
over one store/partition at a time, so the overall ordering will appear
random. This could be improved.
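
One way such an improvement could look is a k-way merge over the per-partition iterators. The following is only a minimal sketch using plain java.util, not the Kafka Streams implementation: it assumes each underlying store already returns its own range in sorted order, and the names MergedRangeSketch, Cursor and mergeSorted are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergedRangeSketch {

    /** One source iterator plus the element it is currently positioned on. */
    private static final class Cursor<T> {
        final Iterator<T> source;
        T head;

        Cursor(Iterator<T> source) {
            this.source = source;
            this.head = source.next();
        }
    }

    /**
     * Merges several individually sorted iterators (one per partition in this
     * sketch) into a single list that is sorted overall.
     */
    public static <T> List<T> mergeSorted(List<Iterator<T>> sortedSources,
                                          Comparator<? super T> order) {
        // The queue always exposes the cursor with the smallest current head.
        PriorityQueue<Cursor<T>> queue =
            new PriorityQueue<>((a, b) -> order.compare(a.head, b.head));
        for (Iterator<T> source : sortedSources) {
            if (source.hasNext()) {
                queue.add(new Cursor<>(source));
            }
        }

        List<T> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor<T> smallest = queue.poll();
            merged.add(smallest.head);
            // Advance the cursor and re-insert it if it has more elements.
            if (smallest.source.hasNext()) {
                smallest.head = smallest.source.next();
                queue.add(smallest);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two hypothetical partitions, each sorted on its own.
        List<Iterator<String>> perPartition = Arrays.asList(
            Arrays.asList("user1|2017-03-01", "user1|2017-03-09").iterator(),
            Arrays.asList("user1|2017-03-04", "user2|2017-03-02").iterator());
        System.out.println(mergeSorted(perPartition, Comparator.<String>naturalOrder()));
    }
}
```

Iterating one store after another would instead yield the first partition's keys followed by the second's, which is the apparently random order described above.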

The tombstone records appearing in the results seem like a bug.
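
On the separate serializer question in the quoted reply below: byte-wise ordering can differ from the logical order of the keys. The sketch below assumes an unsigned, byte-by-byte lexicographic comparison, which is what the comparator name suggests. SerializedOrderSketch and its helper methods are hypothetical, and the encodings are illustrative rather than the ones used in the original application.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SerializedOrderSketch {

    /** Compares two byte arrays as unsigned bytes, left to right. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // If one array is a prefix of the other, the shorter one sorts first.
        return Integer.compare(a.length, b.length);
    }

    /** Serializes an int as 4 big-endian bytes (ByteBuffer's default order). */
    public static byte[] bigEndian(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Serializes a string as UTF-8 bytes. */
    public static byte[] utf8(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // -1 < 1 as integers, but 0xFFFFFFFF sorts after 0x00000001 as bytes.
        System.out.println(compareUnsigned(bigEndian(-1), bigEndian(1)) > 0);

        // 9 < 10 as numbers, but the string "10" sorts before "9" as bytes.
        System.out.println(compareUnsigned(utf8("10"), utf8("9")) < 0);

        // Fixed-width, zero-padded dates keep their logical order as bytes.
        System.out.println(compareUnsigned(utf8("2017-03-09"), utf8("2017-03-16")) < 0);
    }
}
```

All three lines print true. For composite keys such as user plus date, a fixed-width encoding of each component is one way to keep the byte order aligned with the intended scan order.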

Thanks,
Damian

On Thu, 16 Mar 2017 at 17:37 Matthias J. Sax <matthias@confluent.io> wrote:

> Can you check if the problem exists for 0.10.2, too? (0.10.2 is
> compatible with 0.10.1 brokers -- so you can upgrade your Streams code
> independently of the brokers).
>
> About the range: I double-checked this, and I guess my last answer was
> not correct: range() should return ordered data. But I have a follow-up
> question: what key type and serializer do you use? Internally, data
> is stored in serialized form and ordered according to
> `LexicographicByteArrayComparator` -- thus, if the serialized bytes
> don't reflect the order of the deserialized data, the returned range
> shows up unordered to you.
>
>
> -Matthias
>
>
>
>
> On 3/16/17 10:14 AM, Dmitry Minkovsky wrote:
> > Hi Matthias. Thank you for your response.
> >
> > Yes, I was able to reproduce the null issue reliably. I can't open a JIRA
> > at this time, but I can say I was using 0.10.1.0 and it was trivial to
> > reproduce. Just send records and the tombstones to a table topic. Then
> > scan the range. You'll see the tombstones.
> >
> > Indeed, ranges are returned with no specific order. I'm not sure what you
> > mean that default stores are hash-based, but this ordering thing is a
> > shame because it kind of kills the ability to use KS as a full-fledged DB
> > that lets you index things like HBase (composite keys for lists of
> > items). Is that how RocksDB works? Just returns range scans in random
> > order? I don't know C++ so the documentation is a bit opaque to me. But
> > what's the point of scanning a range if the data comes in some random
> > order? That being the case, the number of possible use cases seems to
> > become significantly limited.
> >
> >
> > Thank you!
> > Dmitry
> >
> > On Tue, Mar 14, 2017 at 1:12 PM, Matthias J. Sax <matthias@confluent.io>
> > wrote:
> >
> >>> However, for keys that have been tombstoned, it does return null for me.
> >>
> >> Sounds like a bug. Can you reliably reproduce this? Would you mind
> >> opening a JIRA?
> >>
> >> Can you check if this happens for both cases: caching enabled and
> >> disabled? Or only for one case?
> >>
> >>
> >>> "No ordering guarantees are provided."
> >>
> >> That is correct. Internally, default stores are hash-based -- thus, we
> >> don't give a sorted list/iterator back. You could replace RocksDB with a
> >> custom store though.
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 3/13/17 3:56 PM, Dmitry Minkovsky wrote:
> >>> I am using interactive streams to query tables:
> >>>
> >>>     ReadOnlyKeyValueStore<Messages.ByUserAndDate, Messages.UserLetter> store =
> >>>         streams.store("view-user-drafts", QueryableStoreTypes.keyValueStore());
> >>>
> >>> Documentation says that #range() should not return null values.
> >>> However, for keys that have been tombstoned, it does return null for me.
> >>>
> >>> Also, I noticed only just now that "No ordering guarantees are
> >>> provided." I haven't done enough testing or looked at the code carefully
> >>> enough yet and wonder if someone who knows could confirm: is this true?
> >>> Is this common to all store implementations? I was hoping to use
> >>> interactive streams like HBase to scan ranges. It appears this is not
> >>> possible.
> >>>
> >>> Thank you,
> >>> Dmitry
> >>>
> >>
> >>
> >
>
>
