samoa-dev mailing list archives

From Gianmarco De Francisci Morales
Subject Re: Avro Support for SAMOA
Date Tue, 20 Oct 2015 07:27:14 GMT
Hi Jayadeep,

I think it's pretty cool!
If we get both Avro and Kafka support right, we can connect to almost

The document looks very comprehensive, you seem to have given a lot of
thought to it.
I am not extremely familiar with Avro myself (I've only used it a couple of
times), but I'll try to provide some suggestions.

- The general idea of where and how to store data and meta-data seems right.
- In general, all attributes in a sparse instance are optional, and all
attributes in a dense instance are required. Maybe we want to be more
granular than this in the future, but it seems that Avro supports a
superset of these settings. We may want to have some default "prototypes"
in order to make mapping the current dense/sparse instances easy.
- Right now we are not making use of Date-type attributes in SAMOA (there
is no such thing in samoa-instances), so if it makes it easier we could
skip supporting it. Ideally we could have algorithms that respect
event-time as provided by timestamps in the instances (as opposed to
processing the event whenever it arrives), however we are not there yet :)
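
To make the "prototypes" idea above concrete, here is a rough sketch of how
the two defaults could look as Avro schemas. It relies only on the fact that
an Avro schema is plain JSON and that an optional field is normally written
as a union with "null" plus a null default. The helper names, record names
and attributes are all made up for illustration; none of them come from
SAMOA or from the Input Format document.

```python
import json


def field(name, avro_type, optional):
    """Build one Avro field definition as a plain dict."""
    if optional:
        # Avro expresses optionality as a union with "null". A null default
        # requires "null" to be the first branch of the union.
        return {"name": name, "type": ["null", avro_type], "default": None}
    return {"name": name, "type": avro_type}


def instance_schema(name, attributes, sparse):
    """Hypothetical prototype: sparse means every attribute is optional,
    dense means every attribute is required."""
    return {
        "type": "record",
        "name": name,
        "fields": [field(n, t, optional=sparse) for n, t in attributes],
    }


# Illustrative attributes only, not taken from any SAMOA dataset.
ATTRIBUTES = [("temperature", "double"), ("humidity", "double"), ("label", "string")]

dense_schema = instance_schema("DenseInstance", ATTRIBUTES, sparse=False)
sparse_schema = instance_schema("SparseInstance", ATTRIBUTES, sparse=True)

if __name__ == "__main__":
    # Print the sparse prototype as the JSON text an .avsc file would contain.
    print(json.dumps(sparse_schema, indent=2))
```

A finer-grained mapping (some attributes required, some optional) would only
need a per-attribute flag in place of the single sparse switch.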

All the rest seems pretty straightforward.

Moving to the more software-engineering oriented aspects, where would we
have dependencies for Avro? And how should they be deployed? Would they
simply go inside the deployable uber-jar of SAMOA?



On 19 October 2015 at 11:24, Jayadeep J wrote:

> Hi Gianmarco / All,
> I am working on an integration of SAMOA with Apache Avro. Basically, I want
> to use data stored in Avro files as input to SAMOA.
> As I understand it, the current SAMOA readers only support the ARFF format.
> Do you think such a feature would be useful to SAMOA in general? Avro allows
> two encodings for the data: binary and JSON. Hence Avro support may also
> allow users with JSON data to use SAMOA.
> Based on the input given by @gdfm to @ctippur, I have prepared an Input
> Format document in Google Docs.
> Would it be possible for you to have a look and provide your valuable
> suggestions?
> Thanks
> Jay
