samoa-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SAMOA-47) Integrate Avro Streams with SAMOA
Date Fri, 30 Oct 2015 11:30:27 GMT

    [ https://issues.apache.org/jira/browse/SAMOA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982426#comment-14982426 ]

ASF GitHub Bot commented on SAMOA-47:
-------------------------------------

GitHub user jayadeepj opened a pull request:

    https://github.com/apache/incubator-samoa/pull/40

    SAMOA-47: Integrate Avro Streams with SAMOA

    Code changes to integrate Avro Streams with SAMOA.
    
    Commands to test the changes are below.
    
    Local - Avro JSON
    bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (AvroFileStream -f covtypeNorm_json.avro -e json) -f 100000"

    
    Local - Avro BINARY
    bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (AvroFileStream -f covtypeNorm_binary.avro -e binary) -f 100000"


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jayadeepj/incubator-samoa master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-samoa/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #40
    
----
commit e406b9231d1880a96888943776a6079e7e750892
Author: jayadeepj <jayadeepj@gmail.com>
Date:   2015-10-30T09:27:06Z

    SAMOA-47: Integrate Avro Streams with SAMOA

commit 7c8cac7c3f03bd68c80b17c483b9babdfaa37adc
Author: jayadeepj <jayadeepj@gmail.com>
Date:   2015-10-30T11:24:05Z

    SAMOA-47: Integrate Avro Streams with SAMOA

----


> Integrate Avro Streams with SAMOA
> ---------------------------------
>
>                 Key: SAMOA-47
>                 URL: https://issues.apache.org/jira/browse/SAMOA-47
>             Project: SAMOA
>          Issue Type: New Feature
>          Components: SAMOA-API, SAMOA-Instances
>            Reporter: jayadeepj
>            Priority: Minor
>              Labels: patch
>
> The current SAMOA readers support data streams only in ARFF format. SAMOA, as a
> distributed streaming machine learning framework, is therefore limited in scope, since end
> users may have to transform their data to ARFF. Apache Avro is a data serialization system
> that handles data streams in a compact binary format and is typically used in conjunction
> with Big Data ecosystem tools. Avro allows two encodings for the data: binary and JSON.
> Avro support may therefore also allow users with JSON data to use SAMOA seamlessly.
> The goal is to build support for Avro streams into SAMOA by adding an Avro file stream
> handler, an Avro loader that reads records and transforms them into instances, and a user
> option to switch between JSON and binary encodings. The input format, including the
> representation of metadata for both JSON and binary data, is to be finalized along with
> the build.
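
To illustrate the difference between the two encodings mentioned in the description, here is a minimal, standard-library-only sketch of how a single record body could be serialized in each. It is not the code from the pull request and does not use the Avro library. The field names (`elevation`, `slope`, `label`) and the helper functions are hypothetical, and the sketch covers only three primitive types. It also ignores the Avro container file layout (header, embedded schema, sync markers) that `.avro` files add around the record data.

```python
import json
import struct


def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer as Avro encodes int/long values:
    zig-zag mapping followed by a little-endian base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # small magnitudes map to small unsigned values
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_binary(record: dict, fields: list) -> bytes:
    """Binary encoding sketch: field values are concatenated in schema order,
    with no field names or separators in the output."""
    out = bytearray()
    for name, kind in fields:
        value = record[name]
        if kind == "long":
            out += zigzag_varint(value)
        elif kind == "double":
            out += struct.pack("<d", value)  # 8 bytes, little-endian
        elif kind == "string":
            data = value.encode("utf-8")
            out += zigzag_varint(len(data)) + data  # length prefix, then bytes
        else:
            raise ValueError(f"unsupported type in this sketch: {kind}")
    return bytes(out)


def encode_json(record: dict, fields: list) -> bytes:
    """JSON encoding sketch: the record becomes a JSON object keyed by field name."""
    return json.dumps({name: record[name] for name, _ in fields}).encode("utf-8")


# Hypothetical schema and record, for illustration only.
FIELDS = [("elevation", "double"), ("slope", "long"), ("label", "string")]
RECORD = {"elevation": 0.37, "slope": 3, "label": "class1"}

binary_bytes = encode_binary(RECORD, FIELDS)
json_bytes = encode_json(RECORD, FIELDS)
print(len(binary_bytes), len(json_bytes))
```

The binary form carries no field names, so a reader needs the schema to decode it, while the JSON form repeats the names in every record. That trade-off is the reason a user option to select the encoding is useful.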



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
