samoa-dev mailing list archives

From Gianmarco De Francisci Morales <g...@apache.org>
Subject Re: Question about SAMOA
Date Sat, 31 Oct 2015 10:50:09 GMT
We do not have an example ready, though we have spoken several times about
including one.
One of the main issues is how to define the features and the schema for the
data.
Given that every application is different, we never managed to settle on a
good one.
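
As a rough illustration of the schema question, a tweet stream could be flattened into an ARFF file, the format read by the ArffFileStream source used elsewhere in this thread. The sketch below is only an assumption, not an agreed SAMOA schema: the function name, the relation name, the attributes and the sample row are all hypothetical.

```shell
# Hypothetical sketch: write a tiny ARFF file describing tweets.
# Every attribute name and value below is an illustrative assumption.
write_tweet_arff() {
  out="$1"
  cat > "$out" <<'EOF'
@relation tweets
@attribute text_length numeric
@attribute hashtag_count numeric
@attribute has_url {0,1}
@attribute class {positive,negative}
@data
42,1,0,positive
EOF
}
```

Calling `write_tweet_arff /tmp/tweets.arff` produces a file that could then be passed to a task with `-s (ArffFileStream -f /tmp/tweets.arff)`, in the same way the prequential example uses covtypeNorm.arff. Which features are worth extracting remains application-specific.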

If you are interested in contributing such an example, we would be happy to
include it in SAMOA, and it would be helpful for other people.

Cheers,

--
Gianmarco

On 29 October 2015 at 02:20, John Calvo Martinez <
j.calvomartinez@student.unsw.edu.au> wrote:

> In addition,
>
> Do you have any example on consuming a twitter stream and using it in
> SAMOA?
>
> Thank you
>
>
>
>
> John Calvo M.Sc. B.Eng.
> PhD Student
> School of Computer Science and Engineering
> UNSW AUSTRALIA
>
> Building K17   Room 301-04
> SYDNEY 2052 NSW Australia
>
> W: www.computing.unsw.edu.au
> FB: https://www.facebook.com/UNSW.COMPUTING
> TW: @UNSWCOMPUTING
> G+: UNSW CSE
> E: jcalvo@cse.unsw.edu.au
> P: (+61) 2 9385 6916 (Internal: x56916)
> M: (+61) 04 5161 4230
>
> On 29 Oct 2015, at 10:53, John Calvo Martinez <
> j.calvomartinez@student.unsw.edu.au> wrote:
>
> Dear Nicolas,
>
> Thanks for your answer,
>
> Finally, I’ve sorted it out with Storm! As you mentioned, the key point is
> to configure Storm properly. My case was a bit different since I
> installed it on a Mac OS X machine.
> I’m sending you the details of my installation in case they are helpful.
>
> Best regards,
>
> <SAMOA Installation.rtf>
> <0.png>
>
>
>
>
> On 15 Oct 2015, at 18:27, Nicolas Kourtellis <nkourtellis@gmail.com>
> wrote:
>
> Hi John,
>
> I don't have experience with Samza, but it could be an issue with your
> classpath.
>
> In any case, I have played with Samoa + Storm and it works. It can be a
> bit involved to set up Storm itself but once you do, it should work fine.
>
> If you want to try it out, here is the list of steps that I followed and
> that worked for me.
> I will upload these steps to the SAMOA page as well for future reference.
>
> Hope they help,
>
> Nicolas
>
>
>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> Installation of storm cluster:
> https://storm.apache.org/documentation/Setting-up-a-Storm-cluster.html
> Download a stable distribution from
> https://storm.apache.org/downloads.html
>
> e.g.:
> wget
> http://ftp.cixug.es/apache/storm/apache-storm-0.9.3/apache-storm-0.9.3.tar.gz
>
> untar the file:
> tar -xvf apache-storm-0.9.3.tar.gz
>
> Set up the conf/storm.yaml file within the unpacked folder appropriately.
> An example is the following:
>
> storm.zookeeper.servers:
>    - "127.0.0.1"
> storm.local.dir: "/var/storm-logs"
> nimbus.host: "127.0.0.1"
> supervisor.slots.ports:
>    - 6700
> worker.childopts: "-Xmx2000m"
> supervisor.childopts: "-Xmx256m"
> nimbus.childopts: "-Xmx512m"
>
> Create folder ~/.storm
> Copy the file conf/storm.yaml into the folder ~/.storm/
>
> Set $STORM_HOME to point to the Storm folder.
>
> e.g.
> export STORM_HOME=/homedirectory/apache-storm-0.9.3
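
The two configuration steps above (creating ~/.storm and copying conf/storm.yaml into it) can be sketched as a small shell function. This is a minimal sketch under assumptions: the function name `install_storm_conf` is hypothetical, and the destination defaults to `$HOME/.storm` as described in the steps.

```shell
# Hypothetical helper: copy a storm.yaml into the per-user Storm config folder.
# $1 = path to the source storm.yaml
# $2 = destination folder (optional, defaults to ~/.storm)
install_storm_conf() {
  src="$1"
  dest_dir="${2:-$HOME/.storm}"
  if [ ! -f "$src" ]; then
    echo "missing config file: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/storm.yaml"
}
```

With STORM_HOME exported as shown, a call such as `install_storm_conf "$STORM_HOME/conf/storm.yaml"` would perform both steps in one go.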
>
> Installation of Zookeeper:
>
> http://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_InstallingSingleMode
>
> Download from:
> wget
> http://apache.rediris.es/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
>
> Untar:
> tar -xvf zookeeper-3.4.6.tar.gz
>
> Go into the unpacked folder, create the conf/zoo.cfg file, and set dataDir
> to the directory where ZooKeeper should store its data.
>
> Start zookeeper:
> bin/zkServer.sh start
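
For the zoo.cfg step, a minimal single-node configuration might look like the one written by the sketch below. The function name `write_zoo_cfg` and the choice of data directory are assumptions; the three settings follow the standalone example in the ZooKeeper getting-started guide, and port 2181 matches the client connections shown in the logs later in this thread.

```shell
# Hypothetical helper: write a minimal standalone zoo.cfg.
# $1 = unpacked ZooKeeper folder, $2 = data directory to use
write_zoo_cfg() {
  zk_home="$1"
  data_dir="$2"
  mkdir -p "$zk_home/conf" "$data_dir"
  cat > "$zk_home/conf/zoo.cfg" <<EOF
tickTime=2000
dataDir=$data_dir
clientPort=2181
EOF
}
```

For example, `write_zoo_cfg ./zookeeper-3.4.6 /var/zookeeper` would create the file before `bin/zkServer.sh start` is run; the data directory must be writable by the user starting ZooKeeper.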
>
> Start the nimbus process (from Storm's bin folder).
> (It is better to run this inside a screen session so that you can detach
> from it after starting.)
> ./storm nimbus
>
> Start the supervisor process (from Storm's bin folder).
> (It is better to run this inside a screen session so that you can detach
> from it after starting.)
> ./storm supervisor
>
> Start the Storm UI (from Storm's bin folder).
> (It is better to run this inside a screen session so that you can detach
> from it after starting.)
> ./storm ui
>
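
The three start-up steps above, each run in its own detached screen session, could be scripted roughly as follows. This is a sketch under assumptions: the function name, the session names (`storm-nimbus` and so on) and the `DRY_RUN` switch are hypothetical, and it relies on GNU screen's `-dmS` options to start a named session already detached.

```shell
# Hypothetical helper: start nimbus, supervisor and ui in detached screen sessions.
# Requires STORM_HOME to be set; with DRY_RUN=1 it only prints the commands.
start_storm_daemons() {
  storm_bin="${STORM_HOME:?set STORM_HOME first}/bin/storm"
  for daemon in nimbus supervisor ui; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "screen -dmS storm-$daemon $storm_bin $daemon"
    else
      screen -dmS "storm-$daemon" "$storm_bin" "$daemon"
    fi
  done
}
```

Running it once with `DRY_RUN=1` shows what would be launched; a session can later be reattached with `screen -r storm-nimbus`. ZooKeeper should already be running before the Storm daemons are started.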
> After you have downloaded, unpacked and mvn-ed the samoa package, you can
> execute the bin/samoa command with storm as the processing engine (and
> cross fingers!).
>
>
> On Thu, Oct 15, 2015 at 8:20 AM, John Calvo Martinez <
> j.calvomartinez@student.unsw.edu.au> wrote:
>
>> Dear Gianmarco, I hope you are well,
>>
>> I’m writing you because I was trying to use SAMOA on Samza, S4 and Storm
>> but none of those worked for me. Would you help me a bit with this? The
>> most likely to run was Samza. I did the Zookeeper and Kafka installation.
>> Those worked well.
>>
>> Following the tutorial on
>> http://samoa.incubator.apache.org/documentation/Executing-SAMOA-with-Apache-Samza.html
>> I was trying to build the Samza Maven package, but it seems that the git
>> folder is no longer available, so I decided to build this repo instead:
>> https://github.com/apache/samza
>> This built successfully, but when I try to use SAMOA I get an error.
>> First, it seems that the package was successfully built:
>>
>> [INFO]
>> [INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ samoa-test ---
>> [INFO] Building jar:
>> /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating.jar
>> [INFO]
>> [INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @
>> samoa-test ---
>> [INFO]
>> [INFO] --- maven-jar-plugin:2.4:test-jar (default) @ samoa-test ---
>> [INFO] Building jar:
>> /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating-tests.jar
>> [INFO]
>> [INFO] --- maven-assembly-plugin:2.4.1:single (default) @ samoa-test ---
>> [INFO] Reading assembly descriptor:
>> src/main/assembly/test-jar-with-dependencies.xml
>> [INFO] Building jar:
>> /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating-test-jar-with-dependencies.jar
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Reactor Summary:
>> [INFO]
>> [INFO] Apache SAMOA ....................................... SUCCESS [  4.328 s]
>> [INFO] samoa-instances .................................... SUCCESS [  1.936 s]
>> [INFO] samoa-api .......................................... SUCCESS [ 11.857 s]
>> [INFO] samoa-samza ........................................ SUCCESS [ 16.251 s]
>> [INFO] samoa-test ......................................... SUCCESS [  1.910 s]
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] BUILD SUCCESS
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Total time: 36.563 s
>> [INFO] Finished at: 2015-10-15T17:04:38+11:00
>> [INFO] Final Memory: 52M/1493M
>> [INFO]
>> ------------------------------------------------------------------------
>>
>> But when I try to use it, a class-loading error occurs:
>>
>> $ bin/samoa samza target/SAMOA-Samza-0.3.0-SNAPSHOT.jar
>> "PrequentialEvaluation -d /tmp/dump.csv -i 1000000 -f 100000 -l
>> (classifiers.trees.VerticalHoeffdingTree -p 4) -s
>> (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
>> bin/samoa
>> Deploying to SAMZA
>> Error: Could not find or load main class org.apache.samoa.SamzaDoTask
>>
>> Kafka Server is running:
>> [2015-10-15 16:07:51,534] INFO Initiating client connection,
>> connectString=localhost:2181 sessionTimeout=6000
>> watcher=org.I0Itec.zkclient.ZkClient@7364985f
>> (org.apache.zookeeper.ZooKeeper)
>> [2015-10-15 16:07:51,553] INFO Opening socket connection to server
>> localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL
>> (unknown error) (org.apache.zookeeper.ClientCnxn)
>> [2015-10-15 16:07:51,622] INFO Socket connection established to localhost/
>> 127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
>> [2015-10-15 16:07:51,709] INFO Session establishment complete on server
>> localhost/127.0.0.1:2181, sessionid = 0x15069dd8c1d0000, negotiated
>> timeout = 6000 (org.apache.zookeeper.ClientCnxn)
>> [2015-10-15 16:07:51,710] INFO zookeeper state changed (SyncConnected)
>> (org.I0Itec.zkclient.ZkClient)
>> [2015-10-15 16:07:51,795] INFO Log directory '/tmp/kafka-logs' not found,
>> creating it. (kafka.log.LogManager)
>> [2015-10-15 16:07:51,804] INFO Loading logs. (kafka.log.LogManager)
>> [2015-10-15 16:07:51,809] INFO Logs loading complete.
>> (kafka.log.LogManager)
>> [2015-10-15 16:07:51,809] INFO Starting log cleanup with a period of
>> 300000 ms. (kafka.log.LogManager)
>> [2015-10-15 16:07:51,813] INFO Starting log flusher with a default period
>> of 9223372036854775807 ms. (kafka.log.LogManager)
>> [2015-10-15 16:07:51,844] INFO Awaiting socket connections on
>> 0.0.0.0:9092. (kafka.network.Acceptor)
>> [2015-10-15 16:07:51,845] INFO [Socket Server on Broker 0], Started
>> (kafka.network.SocketServer)
>> [2015-10-15 16:07:51,903] INFO Will not load MX4J, mx4j-tools.jar is not
>> in the classpath (kafka.utils.Mx4jLoader$)
>> [2015-10-15 16:07:51,930] INFO 0 successfully elected as leader
>> (kafka.server.ZookeeperLeaderElector)
>> [2015-10-15 16:07:51,996] INFO Registered broker 0 at path /brokers/ids/0
>> with address 10.248.15.104:9092. (kafka.utils.ZkUtils$)
>> [2015-10-15 16:07:52,011] INFO [Kafka Server 0], started
>> (kafka.server.KafkaServer)
>> [2015-10-15 16:07:52,050] INFO New leader is 0
>> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
>>
>>
>> And Zookeeper as well:
>> 2015-10-15 17:17:34,334 [myid:] - INFO  [main:Environment@100] - Client
>> environment:os.name=Mac OS X
>> 2015-10-15 17:17:34,334 [myid:] - INFO  [main:Environment@100] - Client
>> environment:os.arch=x86_64
>> 2015-10-15 17:17:34,334 [myid:] - INFO  [main:Environment@100] - Client
>> environment:os.version=10.11
>> 2015-10-15 17:17:34,334 [myid:] - INFO  [main:Environment@100] - Client
>> environment:user.name=johncalvo
>> 2015-10-15 17:17:34,334 [myid:] - INFO  [main:Environment@100] - Client
>> environment:user.home=/Users/johncalvo
>> 2015-10-15 17:17:34,335 [myid:] - INFO  [main:Environment@100] - Client
>> environment:user.dir=/usr/local/zookeeper-3.4.6
>> 2015-10-15 17:17:34,336 [myid:] - INFO  [main:ZooKeeper@438] -
>> Initiating client connection, connectString=127.0.0.1:2181
>> sessionTimeout=30000
>> watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
>> Welcome to ZooKeeper!
>> 2015-10-15 17:17:34,372 [myid:] - INFO  [main-SendThread(127.0.0.1:2181
>> ):ClientCnxn$SendThread@975] - Opening socket connection to server
>> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> JLine support is enabled
>> 2015-10-15 17:17:34,463 [myid:] - INFO  [main-SendThread(127.0.0.1:2181
>> ):ClientCnxn$SendThread@852] - Socket connection established to
>> 127.0.0.1/127.0.0.1:2181, initiating session
>> [zk: 127.0.0.1:2181(CONNECTING) 0] 2015-10-15 17:17:34,546 [myid:] -
>> INFO  [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@1235] -
>> Session establishment complete on server 127.0.0.1/127.0.0.1:2181,
>> sessionid = 0x15069dd8c1d0001, negotiated timeout = 30000
>>
>> WATCHER::
>>
>> WatchedEvent state:SyncConnected type:None path:null
>>
>>
>> What can we do?
>>
>> Let me know your comments.
>>
>> PS: In addition, we would like to know if you are planning to implement
>> SAMOA on other SPEs.
>>
>> Thank you.
>>
>> All the best
>>
>>
>>
>>
>>
>> On 7 Aug 2015, at 18:20, Gianmarco De Francisci Morales <gdfm@apache.org>
>> wrote:
>>
>> Redirecting John's question to the mailing list.
>>
>>
>> John, seems the script cannot find the jar.
>> Have you compiled it from the current master?
>> If so, the jar should be "SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar".
>> Most of the examples and docs need to be updated given that we recently
>> made a new release.
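
Since the jar name changes between releases, one way to avoid hard-coding it is to look for whatever jar the build actually produced before calling bin/samoa. The sketch below is an assumption rather than part of SAMOA: the function name `find_samoa_jar` is hypothetical, and it presumes the jars follow the `SAMOA-<Engine>-<version>.jar` naming seen in this thread (for example SAMOA-Local or SAMOA-Samza).

```shell
# Hypothetical helper: print the first jar matching SAMOA-<Engine>-*.jar.
# $1 = engine name as it appears in the jar (e.g. Local, Samza)
# $2 = folder to search (optional, defaults to target)
find_samoa_jar() {
  engine="$1"
  dir="${2:-target}"
  for jar in "$dir"/SAMOA-"$engine"-*.jar; do
    if [ -f "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  echo "no SAMOA-$engine jar found in $dir; build with Maven first" >&2
  return 1
}
```

A call might then look like `bin/samoa local "$(find_samoa_jar Local)" "PrequentialEvaluation ..."`, with the task string left as in the original command. If the function reports that no jar was found, the build for that engine has probably not been run.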
>>
>> --
>> Gianmarco
>>
>> On 6 August 2015 at 11:20, John Calvo <john.calvo@gmail.com> wrote:
>>
>>> Hi Gianmarco, I hope you are well,
>>>
>>> I’m writing you because I was trying to explore SAMOA but I got an error
>>> running the prequential example:
>>>
>>> bin/samoa local target/SAMOA-Local-0.3.0-SNAPSHOT.jar
>>> "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (ArffFileStream
>>> -f covtypeNorm.arff) -f 100000"
>>> bin/samoa
>>> Deploying to LOCAL
>>> Error: Could not find or load main class org.apache.samoa.LocalDoTask
>>>
>>> Do you know what might be missing? I tried a local installation, without
>>> an SPE.
>>>
>>> Any help would be appreciated!
>>>
>>> Best regards,
>>>
>>> <0.png>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
> Nicolas Kourtellis
>
>
>
>
