samza-dev mailing list archives

From Alexander Filipchik <afilipc...@gmail.com>
Subject Re: Can't get all stored values via range iterator
Date Tue, 17 Nov 2015 21:41:58 GMT
Just want to update you on this one. After some time debugging, I found
that the actual problem was a piece of our code that was calling next()
on the range iterator twice :(.
After removing the duplicate call, everything works as expected.
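
A minimal sketch of this kind of bug, using a plain java.util.Iterator as a
stand-in for the store's range iterator. The class and method names are
hypothetical, and this is not the code from the job in question; it only
illustrates how a second next() call per loop pass silently drops entries:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; a plain Iterator stands in for the range iterator.
class DoubleNextSketch {

    // Buggy: next() is called twice per loop pass, so the entry consumed by
    // the first call never reaches the result list.
    static List<String> buggyCollect(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            it.next();                  // first call: consumes an entry and discards it
            if (it.hasNext()) {
                out.add(it.next());     // second call: consumes the following entry
            }
        }
        return out;
    }

    // Fixed: exactly one next() per hasNext() check; the value is kept in a
    // local variable and reused wherever it is needed.
    static List<String> collect(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            String entry = it.next();
            out.add(entry);
        }
        return out;
    }
}
```

With five entries in the range, buggyCollect returns only the second and
fourth, while collect returns all five.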

Thank you!

Alex

On Mon, Nov 16, 2015 at 10:45 PM, Yi Pan <nickpan47@gmail.com> wrote:

> Hi, Alexander,
>
> Sorry to reply late on this one. I embedded my questions and comments
> in-between the lines:
>
> On Sun, Nov 15, 2015 at 7:07 PM, Alexander Filipchik <afilipchik@gmail.com>
> wrote:
>
> >
> > nodeIterator = store.range(
> >         String.join(".", nodeId, String.valueOf(Character.MIN_VALUE)),
> >         String.join(".", nodeId, String.valueOf(Character.MAX_VALUE)));
> >
> >
> Theoretically, what you want is a prefix scan: the start key should be
> nodeId + '.' and the end key should be nodeId + '.' + maxId, where every
> character of maxId is Character.MAX_VALUE and its total length is equal to
> or greater than that of the longest possible nodeId.
>
> > I restreamed the RocksDB changelog topic and I can see all these edges
> > stored there, but the query still returns only 4.3M nodes.
> >
>
> Could you help to clarify what you did here to "see all these edges" and to
> "query still returns only 4.3M nodes"?
>
>
> > 1) Has anyone seen such behaviour before?
> >
>
> Not that I am aware of.
>
>
> > 2) What is the best way to debug it on a remote machine? Any particular
> > logs to look for? Any RocksDB config params that should be enabled?
> >
>
> You can try adding a JMX debug port option to task.opts. With Samza 0.10
> (latest from trunk), the JMX server port is reported from the AppMaster's
> web API. As for the state store config, you can try disabling the
> CachedStore to rule out any potential issues with cache management.
>
>
> > 3) Is it a good idea to store a graph in such a format?
> >
>
> As long as you can partition the data based on nodeId, it should be fine.
>
>
> >
> > Thank you,
> > Alex
> >
>
> Please let us know if you find any issues with your use case.
>
> -Yi
>
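
The prefix-scan bounds Yi describes above can be sketched as follows. This is
an illustration under stated assumptions, not Samza code: a java.util.TreeMap
stands in for the key-value store, subMap (start inclusive, end exclusive)
stands in for store.range, and the class name, method names, and the
MAX_SUFFIX_LENGTH value are all hypothetical. It also assumes the store orders
keys the same way Java orders strings, which depends on the serde in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; a TreeMap stands in for the key-value store.
class PrefixScanSketch {

    // Assumed upper bound on the length of the id that follows "nodeId.".
    static final int MAX_SUFFIX_LENGTH = 16;

    // Start key: the bare prefix "nodeId.", which sorts before every key
    // that extends it.
    static String startKey(String nodeId) {
        return nodeId + ".";
    }

    // End key: the prefix followed by MAX_SUFFIX_LENGTH copies of
    // Character.MAX_VALUE, so it sorts after any shorter-or-equal suffix.
    static String endKey(String nodeId) {
        StringBuilder sb = new StringBuilder(nodeId).append('.');
        for (int i = 0; i < MAX_SUFFIX_LENGTH; i++) {
            sb.append(Character.MAX_VALUE);
        }
        return sb.toString();
    }

    // Returns all keys of the form "nodeId.<suffix>" in sorted order.
    static List<String> edgeKeysOf(TreeMap<String, String> store, String nodeId) {
        SortedMap<String, String> range = store.subMap(startKey(nodeId), endKey(nodeId));
        return new ArrayList<>(range.keySet());
    }
}
```

Because the start key ends with the '.' separator, a scan for node "n1" does
not pick up keys belonging to "n10" or "n2".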
