aries-dev mailing list archives

From Christian Schneider <ch...@die-schneider.net>
Subject Re: [DISCUSS] Goals, Requirements and API for journaled events
Date Wed, 02 Jan 2019 16:13:18 GMT
I am a bit torn about using push streams. I think they would simplify a few
things, but they also add some complexity and force people to use push
streams.

I think the resource leak is not a big problem, as the Subscription is
Closeable. You can close it in the @Deactivate method, so it is similar to
not forgetting to close an InputStream.
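
Roughly like this (just a sketch; Messaging and Received are placeholder
names for whatever the API ends up calling these types):

    import java.io.Closeable;
    import java.io.IOException;

    import org.osgi.service.component.annotations.Activate;
    import org.osgi.service.component.annotations.Component;
    import org.osgi.service.component.annotations.Deactivate;
    import org.osgi.service.component.annotations.Reference;

    @Component
    public class MySubscriber {

        @Reference
        private Messaging messaging; // the journaled events service

        private Closeable subscription;

        @Activate
        void activate() {
            // hypothetical signature: topic plus handler callback,
            // returning a Closeable Subscription
            subscription = messaging.subscribe("my-topic", this::handle);
        }

        @Deactivate
        void deactivate() {
            try {
                // close on deactivation, like not forgetting to close
                // an InputStream
                subscription.close();
            } catch (IOException e) {
                // ignore on shutdown
            }
        }

        private void handle(Received received) {
            // process the message
        }
    }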

You can see how it works in practice in the in-memory impl.

I am a bit concerned about the threads needed to feed the subscriptions,
though. When we discussed the API, Alexei at some point considered a purely
polling-based API instead of handlers. Would that be an alternative?
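
Purely as an illustration of that polling shape (hypothetical, not part of
the current API):

    public interface Subscription extends Closeable {
        /**
         * Block for up to the given timeout and return the next batch of
         * messages after the last returned position, or an empty list on
         * timeout.
         */
        List<Received> poll(Duration timeout);
    }

With polling the consumer drives the thread, so the provider would not need
a thread per subscription.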

Christian

On Wed, 2 Jan 2019 at 11:03, Timothy Ward <timothyjward@apache.org> wrote:

> Hi all,
>
> I think this is an interesting area to look at, but one thing I see
> immediately is that the API is being designed in a way that encourages
> lifecycle issues. Specifically, the service interface “subscribe” method
> receives a consumer function from the client. It would be *much* better if
> the subscribe method did not take a consumer; instead, the Subscription
> returned by the subscribe method should expose a PushStream.
>
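> As a sketch of that shape (names illustrative only):
>
>     public interface Subscription extends AutoCloseable {
>         PushStream<Received> stream();
>     }
>
>     // the client pulls a stream on demand; no callback is handed to
>     // the provider at subscribe time
>     Subscription sub = messaging.subscribe("my-topic", fromPosition);
>     sub.stream().forEach(received -> process(received));
>     // ... later, from inside or outside the pipeline:
>     sub.close();
>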
> Making this change spares the provider implementation from having to
> maintain a registry of instances from client bundles (the listener pattern
> is “considered harmful” in OSGi), which can leak memory and/or class
> loaders as client bundles are started/stopped/updated. Allowing the client
> to create PushStream instances on demand gives the client finer-grained
> control over when the stream of data processing is closed (both from
> within and from outside the data stream), and provides easier fail-safe
> defaults for late-registering clients.
>
> You obviously get the further advantages of PushStreams including
> buffering, windowing and transformation pipelines. Using this would allow
> for simpler optimisation of the fetch logic in the Kafka/Mongo/Memory
> client when processing bulk messages from history.
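>
> For example (a sketch, assuming the Subscription above exposes a
> PushStream<Received>):
>
>     sub.stream()
>        .window(Duration.ofSeconds(1), ArrayList::new) // batch one second of history
>        .forEach(batch -> store(batch));               // store() is a placeholder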
>
> Best Regards,
>
> Tim
>
> On 2 Jan 2019, at 07:30, Christian Schneider <chris@die-schneider.net>
> wrote:
>
> On Wed, 2 Jan 2019 at 02:05, Timothee Maret <tmaret@apache.org> wrote:
>
> Hi,
>
> I looked at the API considering how we could use it for our replication use
> case. I identified one key concept that seems to be missing: the indexing
> of messages with monotonically increasing offsets.
>
> For replication, we leverage those offsets extensively, for instance to
> efficiently compute sub-ranges of messages, to skip ranges of messages, to
> delay processing of messages, to clean up resources, etc. If we want to
> leverage the journaled event API to guarantee portability, it seems to me
> that we'd need the offset, or an equivalent construct, to be part of the
> API.
>
> How about adding a "getOffset" signature and documenting the offset
> requirement in the Position interface?
>
>
> I just started implementing the in-memory impl of the API and also used an
> offset. For the cases I know, an offset makes sense. Alexei and I were just
> unsure whether the offset is really a general abstraction. If we all agree
> that an offset makes sense, then I am in favour of adding it. Actually, in
> the case of no partitions (which we currently assume), the position is
> nothing more than an offset.
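>
> For example (sketch):
>
>     public interface Position {
>         /**
>          * Monotonically increasing offset of a message within the
>          * single partition of a topic.
>          */
>         long getOffset();
>     }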
>
>
> Another intention in the API that is unclear to me is the support for
> partitions (similar to Kafka). The documentation indicates it is not a
> goal; however, the API seems to contain some hints of multi-partition
> support, such as the "TopicPosition" interface. How about supporting
> multiple partitions in the API by allowing a key (with semantics similar to
> Kafka's) to be specified in the "newMessage" signature?
>
>
> I removed the TopicPosition interface again a few days ago. It was not part
> of the API Alexei and I discussed, and it makes no sense when we limit
> ourselves to no partitions (or one partition in the case of Kafka).
> So the main question is whether limiting ourselves is a good idea. I think
> it is, but I would be very interested in other opinions.
>
> Cheers
> Christian
>
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Computer Scientist
> http://www.adobe.com
>
>

--
Christian Schneider
http://www.liquid-reality.de

Computer Scientist
http://www.adobe.com
