samoa-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (SAMOA-49) Add an Adapter for Apache Apex
Date Tue, 21 Jun 2016 08:59:57 GMT


ASF GitHub Bot commented on SAMOA-49:

Github user bhupeshchawda commented on the issue:
    @nicolas-kourtellis Please find my responses below:
    1) The slow execution is a deliberate (although temporary) configuration in samoa-apex,
which limits the number of tuples in an application window. This has to do with the way iteration
works in Apex, which is tightly coupled to windowing. If we don't limit the number of
tuples, the number of tuples in a given window keeps increasing because of the additional tuples
that are fed back on the iteration loop-back stream. If the number of iterations is large enough,
the time taken to process a window of data grows beyond normal behaviour and
the operator is killed by the Apex app master. I am working on a workaround
either to eliminate this limit or to set it optimally.
    2) Execution in the local mode of Apex is highly asynchronous, with all operators in the
topology running in different threads. The local mode of Samoa, on the other hand, seems to
be synchronous; i.e. the next tuple is processed only after the previous one has been processed
completely by all operators. I also tried running the local mode of Storm, which
also produces different results on every run for the same input file.
    3) I think this is due to the same reason as in (2).
    4) Yes, these changes are necessary for Apex to function correctly. Apex relies on Kryo
serialization (without any fallback to Java serialization), and hence it is necessary for classes
to have a default constructor. I think it would be better to have them as part of this PR.
Maybe I can split them into a separate commit if that helps?
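
    The default-constructor requirement in (4) can be illustrated with a minimal sketch. It
uses plain Java reflection rather than Kryo itself, on the assumption that Kryo's default
instantiation looks up a no-argument constructor reflectively. The class names WithoutDefault
and WithDefault are hypothetical and are not taken from the Samoa code base.

```java
import java.lang.reflect.Constructor;

public class DefaultConstructorCheck {

    // Hypothetical class with only a parameterized constructor.
    public static class WithoutDefault {
        private final int id;

        WithoutDefault(int id) {
            this.id = id;
        }
    }

    // Hypothetical class after adding a no-argument constructor.
    // The constructor can stay private; reflection can still reach it.
    public static class WithDefault {
        private int id;

        private WithDefault() {
        }

        WithDefault(int id) {
            this.id = id;
        }
    }

    // Returns true if the type can be created through a no-argument
    // constructor, which is the lookup assumed in this sketch.
    public static boolean canInstantiateWithoutArgs(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here when no such constructor exists.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiateWithoutArgs(WithoutDefault.class));
        System.out.println(canInstantiateWithoutArgs(WithDefault.class));
    }
}
```

    Under these assumptions, a class like the first one cannot be instantiated this way,
while the second can, which is why a no-argument constructor is added to the affected classes.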

> Add an Adapter for Apache Apex
> ------------------------------
>                 Key: SAMOA-49
>                 URL:
>             Project: SAMOA
>          Issue Type: New Feature
>            Reporter: Bhupesh Chawda
> Apache Apex is a new data-in-motion platform that unifies stream processing and
batch processing. An Apache Apex adapter for Samoa would allow users to run streaming machine
learning algorithms built on Apache Samoa on the Apache Apex platform. This adapter should be
able to translate Apache Samoa topologies into Apache Apex DAGs in order to run them on
the Apex platform.

This message was sent by Atlassian JIRA
