samoa-dev mailing list archives

From Gianmarco De Francisci Morales
Subject Re: Samoa - Samza job execution
Date Sat, 11 Jul 2015 08:35:42 GMT
Hi Shekar,

At the moment we do not support JSON data.
The current readers support the ARFF format, which is essentially CSV with a header section.
Adding support for JSON is doable, but it should conform to a very specific
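
Since the existing readers only take ARFF, one possible workaround for historical JSON data is to convert it to ARFF offline before running the job. The following is a minimal sketch, not part of SAMOA: it assumes one JSON object per line, and the function name, the field names (temp, load, label) and the relation name are all hypothetical. It also assumes the class attribute goes last and that missing values are written as "?", which is the usual ARFF convention.

```python
import json


def json_lines_to_arff(lines, relation, numeric_fields, class_field, class_values):
    """Build ARFF text from an iterable of JSON-object strings (one per line).

    Hypothetical helper for illustration only; it is not a SAMOA API.
    """
    nominal = "{" + ",".join(class_values) + "}"
    out = ["@relation " + relation, ""]
    for name in numeric_fields:
        out.append("@attribute " + name + " numeric")
    # Assumption: the class attribute is declared last, as a nominal attribute.
    out.append("@attribute " + class_field + " " + nominal)
    out.append("")
    out.append("@data")
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        record = json.loads(line)
        # "?" marks a missing value in ARFF.
        row = [str(record.get(name, "?")) for name in numeric_fields]
        row.append(str(record.get(class_field, "?")))
        out.append(",".join(row))
    return "\n".join(out) + "\n"


# Example with two made-up events.
sample = [
    '{"temp": 21.5, "load": 3, "label": "ok"}',
    '{"temp": 40.1, "load": 9, "label": "alert"}',
]
arff = json_lines_to_arff(sample, "events", ["temp", "load"], "label", ["ok", "alert"])
print(arff)
```

The resulting text can be saved as a .arff file and read by the existing ARFF reader. Nested JSON objects and string-valued features are not handled by this sketch and would need to be flattened or encoded first.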

About Kafka, we support it as a transport via Samza, but we don't have a
reader for it right now.
Adding it would be very valuable. If you wanted to work on it I'd be happy
to help.
Have a look at org.apache.samoa.streams.fs.HDFSFileStreamSource
and org.apache.samoa.streams.ArffFileStream for some examples.



On 10 July 2015 at 01:18, Shekar Tippur wrote:

> Hello,
> I am trying to use Samoa/Samza combination to apply ML for a dataset I have
> in JSON format.
> This is the document I am following:
> Couple of questions:
> 1. How do I point the input event to a Stream/Topic in Kafka? The data is
> in JSON.
> 2. If I want to use historical data that is stored in a file, how do I
> point the job to read from a file and serialise as json?
> bin/samoa samza target/SAMOA-Samza-0.3.0-SNAPSHOT.jar
> "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (??)"
> - Shekar
