samoa-dev mailing list archives

From nicolas-kourtellis <...@git.apache.org>
Subject [GitHub] incubator-samoa issue #54: SAMOA-59: add an adapter for Apache Gearpump
Date Wed, 26 Oct 2016 09:45:08 GMT
Github user nicolas-kourtellis commented on the issue:

    https://github.com/apache/incubator-samoa/pull/54
  
    Hi @manuzhang,
    
    I managed to get the adapter working. Here are some notes that I would ask you to take into
consideration:
    - There are some inherent difficulties compiling gearpump from source. It would be good
to have a compiled version to use directly.
    - Assuming this is given (which was my case because @manuzhang provided a compiled version),
I managed to get samoa to compile/package with gearpump and run the package.
    - However, it would be good to upgrade the adapter to the new version of samoa in incubation,
which is 0.5.0. That should be fairly straightforward, and it will allow us to test the adapter
with some more generators and ML methods added in the recent past.
    
    - Feedback when executing VHT:
    => The engine seems to continue executing the topology long after it has been created,
used for the task and finished. Is there any way to pass a signal at the end of the execution
to shut it down? (note: not the engine itself, but the topology). It kept occupying resources
on my computer for no reason, at full CPU consumption. I found a manual way to kill it using
the command "gear kill -appid X", with X being the id of the task, but I wonder if there is
a more automatic way.
    => After I killed the jobs manually, the java processes that were created for the execution
(I will assume they are the containers of the topologies) were still alive, just not consuming
many resources. Shouldn't they have been terminated and removed? Is there a way to do that?
    => When I run new tasks, they just keep getting added to the engine (which is logical),
even though I had killed the other ones earlier.
    => Multiple executions of the same experiment with the same seed for the random generator
(set with the parameter -r, which should yield the same random tree) differ in accuracy.
Is that expected?
    => Using different seeds for the random tree generator (r=1,...,5), the accuracy of VHT
on local GearPump is fairly low (average over the 5 seeds: 65.39%) in comparison to running
the topology on local Storm (84.046% accuracy). Is there any explanation for such a large
drop in accuracy?
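
Until there is an automatic shutdown, the manual kill step above could be scripted. The sketch below is only an illustration: it wraps the "gear kill -appid X" command quoted above, and the function names, the `gear_bin` parameter and the `dry_run` switch are my own assumptions, not part of Gearpump or of the adapter. The application ids have to be supplied by the caller.

```python
import subprocess
from typing import Iterable, List


def build_kill_command(app_id: int, gear_bin: str = "gear") -> List[str]:
    """Return the argv for `gear kill -appid <app_id>` (the command quoted above)."""
    return [gear_bin, "kill", "-appid", str(app_id)]


def kill_apps(app_ids: Iterable[int], gear_bin: str = "gear",
              dry_run: bool = True) -> List[List[str]]:
    """Build one kill command per application id and, unless dry_run is set, run it.

    Hypothetical helper: it only shells out to the CLI and knows nothing
    about the state of the engine or of the topologies.
    """
    commands = [build_kill_command(app_id, gear_bin) for app_id in app_ids]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if the CLI exits with a non-zero status.
            subprocess.run(command, check=True)
    return commands
```

With `dry_run=True` (the default here) the function only returns the commands it would run, which makes it easy to check the ids before anything is actually killed.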

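To make the two accuracy observations above easier to reproduce and report, the runs could be summarised with a small helper like the one below. This is a sketch under assumptions: each run is represented as a hypothetical `(seed, accuracy)` pair collected by hand from the task output, and none of these function names come from samoa or Gearpump.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Run = Tuple[int, float]  # (seed passed via -r, accuracy reported by the run)


def group_by_seed(runs: Iterable[Run]) -> Dict[int, List[float]]:
    """Collect the accuracies of all runs that used the same seed."""
    grouped: Dict[int, List[float]] = defaultdict(list)
    for seed, accuracy in runs:
        grouped[seed].append(accuracy)
    return dict(grouped)


def non_reproducible_seeds(runs: Iterable[Run], tolerance: float = 1e-9) -> List[int]:
    """Return the seeds whose repeated runs differ in accuracy by more than tolerance."""
    grouped = group_by_seed(runs)
    return sorted(seed for seed, accs in grouped.items()
                  if max(accs) - min(accs) > tolerance)


def mean_accuracy(runs: Iterable[Run]) -> float:
    """Average accuracy over all given runs (for example, one run per seed)."""
    return mean(accuracy for _, accuracy in runs)
```

Running `non_reproducible_seeds` on the Gearpump runs and on the Storm runs separately would show whether the same-seed variation is specific to one engine, and `mean_accuracy` gives the per-engine average used in the comparison.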


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
