samoa-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (SAMOA-65) Apache Kafka integration components for SAMOA
Date Thu, 15 Jun 2017 14:44:00 GMT


ASF GitHub Bot commented on SAMOA-65:

Github user nicolas-kourtellis commented on the issue:
    Thank you @pwawrzyniak for the contributions!
    Regarding the current code:
    - Can you change the copyright to the current year? (I wonder whether we should keep the year at all; it needs updating every time we cut a new release.)
    - I added some minor comments on leftovers from the AVRO combined integration. If you
can remove them it would be cleaner.
    - I noticed that there is redundancy/repetition between the three PRs (#59, #64, #65).
Is there a way to make them distinct from each other? Otherwise I think there will be conflicts
when trying to merge them. @gdfm what do you think?
    - After checking the code, I realized that this is dedicated to Kafka.
    A quick question: Can this JSON serializer/deserializer be extended/abstracted to be used
by other interfaces besides Kafka (e.g., even storing/retrieving JSON files from disk)? Do
you think it is feasible or needs a lot of work?
    - Can we do something similar for Avro?
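On the abstraction question above: one way to decouple a JSON serializer/deserializer from Kafka is a transport-agnostic interface along the following lines. This is purely an illustrative sketch; the interface name, method signatures, and the toy JSON handling are assumptions, not existing SAMOA code.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical transport-agnostic contract: nothing here mentions Kafka,
// so the same JSON (de)serializer could back a Kafka source, a file
// reader, or any other interface. Names are illustrative only.
interface InstanceSerializer<T> {
    byte[] serialize(T instance);
    T deserialize(byte[] payload);
}

public class SerializerSketch {
    public static void main(String[] args) throws Exception {
        // Toy "JSON-like" serializer for a single double value,
        // standing in for a real Instance serializer.
        InstanceSerializer<Double> json = new InstanceSerializer<Double>() {
            public byte[] serialize(Double d) {
                return ("{\"value\":" + d + "}").getBytes();
            }
            public Double deserialize(byte[] p) {
                String s = new String(p);
                return Double.parseDouble(
                    s.substring(s.indexOf(':') + 1, s.indexOf('}')));
            }
        };
        // The same serializer storing/retrieving JSON from disk,
        // as suggested in the comment -- no Kafka involved.
        Path tmp = Files.createTempFile("instance", ".json");
        Files.write(tmp, json.serialize(42.5));
        Double back = json.deserialize(Files.readAllBytes(tmp));
        System.out.println(back); // prints 42.5
    }
}
```

A Kafka-specific wrapper could then delegate to this interface, so the JSON (or Avro) logic is written once and reused by every transport.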

> Apache Kafka integration components for SAMOA
> ---------------------------------------------
>                 Key: SAMOA-65
>                 URL:
>             Project: SAMOA
>          Issue Type: New Feature
>          Components: SAMOA-API, SAMOA-Instances
>            Reporter: Piotr Wawrzyniak
>              Labels: kafka, sink, source, streaming
>   Original Estimate: 672h
>  Remaining Estimate: 672h
> As of now, Apache SAMOA includes no integration components for Apache Kafka; in particular,
there is no way to read data coming from Kafka or to write data with prediction results
back to Kafka.
> The key assumptions for the development of Kafka-related components are as follows:
> 1)	develop support for an input data stream arriving at Apache SAMOA via Apache Kafka;
> 2)	develop support for an output data stream produced by Apache SAMOA, including stream-mining
results, forwarded to Apache Kafka so that other modules consuming the stream can receive
them.
> The goal of this issue is therefore to create the following components:
> 1)	KafkaEntranceProcessor in samoa-api. This entrance processor will accept an incoming
Kafka stream. It will require an implementation of the KafkaDeserializer interface to be delivered,
whose role is to translate incoming Apache Kafka messages into an implementation of SAMOA's
Instance interface.
> 2)	KafkaDestinationProcessor in samoa-api. Similarly to the KafkaEntranceProcessor, this
processor would require an implementation of the KafkaSerializer interface to be delivered,
whose role is to create a Kafka message from the underlying Instance object.
> 3)	KafkaStream, as an extension of the existing streams (e.g. InstanceStream), would play
a role similar to the other streams, providing control over the flow of Instances in the entire
topology.
> Moreover, the following assumptions are considered:
> 1)	The components would be implemented against the most up-to-date version of Apache
Kafka, i.e. 0.10.
> 2)	Sample implementations of the aforementioned Serializer and Deserializer would be delivered,
supporting both Avro and JSON serialization of Instance objects.
> 3)	Sample test classes demonstrating reference use of the Kafka source and destination would
also be included in the project.

This message was sent by Atlassian JIRA
