aries-dev mailing list archives

From Timothee Maret <tma...@apache.org>
Subject Re: [DISCUSS] Goals, Requirements and API for journaled events
Date Wed, 02 Jan 2019 20:20:18 GMT
Hi Christian,

I'll try to catch up :-)

Regarding the abstraction for the offset: this is to be double checked, but
I think the main use case is to compute the relative order between
independent events (the happened-before relation). We could therefore
abstract the offset away from the API by having the Position interface
extend Comparable<Position> (which provides the happened-before semantics)
and removing the Position#getOffset signature.
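
Roughly, the interface could then look like this (just a sketch; the exact
names are not meant to be final):

    // Sketch: the offset stays an implementation detail; clients only
    // rely on the happened-before ordering provided by compareTo().
    public interface Position extends Comparable<Position> {
        // No getOffset() here: a Kafka offset, a MongoDB sequence
        // number or an in-memory counter can all back compareTo().
    }

    // A consumer that only needs the relative order would then write e.g.:
    //   boolean happenedBefore = positionA.compareTo(positionB) < 0;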

Regarding limiting to a single partition, one use case that would likely
benefit from supporting multiple partitions is sharing topics among tenants
in a cloud environment. Adding partitions, at least for Kafka, would be the
way to scale the consumers and spread the load in such an environment. As long
as the API can be extended without breaking backward compatibility, I think
it's ok to start without partition support in the API.

Regarding the changes to the API, I think Messaging#send returning a
Position will require a blocking read on some backends, which would be a
costly toll for use cases that don't require the producer to know the
actual position of sent messages. Our replication use case falls in this
category. The previous version (Messaging#send returning null, with the
position passed to the callback) seemed more versatile to me: a producer
that needs to know the position could still use the callback mechanism to
implement a blocking operation. Other than that, the implementation looks good!
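
For illustration, a producer that does need the position could bridge the
callback to a blocking call itself (a rough sketch; the exact send signature
is an assumption, not the current API):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    // messaging, topic and message are obtained elsewhere;
    // exception handling is omitted for brevity.
    CompletableFuture<Position> acked = new CompletableFuture<>();
    messaging.send(topic, message, acked::complete); // non-blocking send

    // Only producers that actually need the position pay the blocking cost.
    Position position = acked.get(30, TimeUnit.SECONDS);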

Regarding reusing the tests for all implementations, I very much like the
idea; the tests would serve as some sort of TCK. I'll apply them in the
Kafka implementation.
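
Something along these lines, maybe (class names are purely illustrative):

    import org.junit.Test;

    // Shared behavioural tests; each backend only supplies its Messaging.
    public abstract class MessagingTckTest {

        protected abstract Messaging createMessaging();

        @Test
        public void sendThenSubscribeDeliversMessage() {
            Messaging messaging = createMessaging();
            // common assertions shared by the in-memory and Kafka impls
        }
    }

    // In the Kafka module:
    public class KafkaMessagingTest extends MessagingTckTest {
        @Override
        protected Messaging createMessaging() {
            return createKafkaMessagingForTest(); // hypothetical test setup
        }
    }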

Regards,

Timothee





On Wed, 2 Jan 2019 at 17:13, Christian Schneider <chris@die-schneider.net>
wrote:

> I am a bit torn about using push streams. I think they would simplify a few
> things but also add some complexity and force people to use push streams.
>
> I think the resource leak is not a big problem, as the Subscription is
> Closeable. So you can close it in the @Deactivate method; it is similar
> to not forgetting to close an InputStream.
>
> You can see how it works in practice in the in-memory impl.
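>
> Roughly like this in a client component (just a sketch; the component and
> the exact subscribe signature are assumptions):
>
>     // assumes org.osgi.service.component.annotations.* and java.io.IOException
>     @Component
>     public class MyMessageHandler {
>
>         @Reference
>         Messaging messaging;
>
>         private Subscription subscription;
>
>         @Activate
>         void activate() {
>             // hypothetical subscribe signature
>             subscription = messaging.subscribe("my-topic", this::handle);
>         }
>
>         @Deactivate
>         void deactivate() {
>             try {
>                 subscription.close(); // Subscription is Closeable
>             } catch (IOException e) {
>                 // nothing useful to do on shutdown
>             }
>         }
>
>         private void handle(Object message) {
>             // process the message
>         }
>     }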
>
> I am a bit concerned about the threads needed to feed the subscriptions
> though. When we discussed the API, Alexei at some point thought about a
> purely polling-based API instead of handlers. Would that be an alternative?
>
> Christian
>
> On Wed, 2 Jan 2019 at 11:03, Timothy Ward <timothyjward@apache.org> wrote:
>
> > Hi all,
> >
> > I think this is an interesting area to look at, but one thing I see
> > immediately is that the API is being designed in a way that encourages
> > lifecycle issues. Specifically, the service interface “subscribe” method
> > receives a consumer function from the client. It would be *much* better
> > if the subscribe method did not take a consumer; instead, the
> > Subscription returned by the subscribe method should provide a
> > PushStream.
> >
> > Making this change avoids the provider implementation having to maintain
> > a registry of instances from client bundles (the listener pattern is
> > “considered harmful” in OSGi), which can leak memory and/or class loaders
> > as client bundles are started/stopped/updated. Allowing the client to
> > create PushStream instances on demand gives the client finer-grained
> > control over when the stream of data processing is closed (both from
> > within and outside the data stream), and provides easier fail-safe
> > defaults for late-registering clients.
> >
> > You obviously get the further advantages of PushStreams including
> > buffering, windowing and transformation pipelines. Using this would allow
> > for simpler optimisation of the fetch logic in the Kafka/Mongo/Memory
> > client when processing bulk messages from history.
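> >
> > A rough sketch of the shape I have in mind (subscribe/stream and the
> > Message type are illustrative, not an existing API):
> >
> >     // uses org.osgi.util.pushstream.PushStream
> >     // The Subscription hands out a PushStream on demand; the client
> >     // owns the pipeline and decides when it is closed.
> >     Subscription subscription = messaging.subscribe("my-topic");
> >     PushStream<Message> events = subscription.stream();
> >     events.filter(m -> isRelevant(m))   // pipeline transformation
> >           .forEach(m -> process(m));    // returns a Promise<Void>
> >     // ... later, on deactivation or from within the pipeline:
> >     events.close();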
> >
> > Best Regards,
> >
> > Tim
> >
> > On 2 Jan 2019, at 07:30, Christian Schneider <chris@die-schneider.net> wrote:
> >
> > On Wed, 2 Jan 2019 at 02:05, Timothee Maret <tmaret@apache.org> wrote:
> >
> > Hi,
> >
> > I looked at the API considering how we could use it for our replication
> > use case. I identified one key concept that seemed to be missing: the
> > indexing of messages with monotonically increasing offsets.
> >
> > For replication, we leverage those offsets extensively, for instance to
> > efficiently compute sub-ranges of messages, to skip ranges of messages,
> > to delay processing of messages, to clean up resources, etc. If we want
> > to leverage the journaled event API to guarantee portability, it seems
> > to me that we'd need to have the offset, or an equivalent construct, as
> > part of the API.
> >
> > How about adding a "getOffset" signature and documenting the offset
> > requirement in the Position interface?
> >
> >
> > I just started implementing the in-memory impl of the API and also used
> > an offset. For the cases I know of, an offset makes sense. Alexei and I
> > were just unsure if the offset is really a general abstraction. If we
> > all agree an offset makes sense, then I am in favour of adding it.
> > Actually, in the case of no partitions (which we currently assume), the
> > position is no more than an offset.
> >
> >
> > Another intention in the API that is unclear to me is the support for
> > partitions (similar to Kafka). The documentation indicates it is not a
> > goal; however, the API seems to contain some hints of multi-partition
> > support, such as the "TopicPosition" interface. How about supporting
> > multiple partitions in the API by allowing a key to be specified (with
> > semantics similar to Kafka's) in the "newMessage" signature?
> >
> >
> > I removed the TopicPosition interface again a few days ago. It was not
> > part of the API Alexei and I discussed and makes no sense when we limit
> > ourselves to no partitions (or one partition in the case of Kafka).
> > So the main question is whether limiting ourselves is a good idea. I
> > think it is, but I would be very interested in other opinions.
> >
> > Cheers
> > Christian
> >
> > --
> > --
> > Christian Schneider
> > http://www.liquid-reality.de
> >
> > Computer Scientist
> > http://www.adobe.com
> >
> >
>
> --
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>
