This looks like a synchronization problem. Can you try it again after adding a Thread.sleep(100); line?
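
A minimal sketch of where the sleep could go, assuming the loop from your modified PiEstimator. The class name and runOriginalBspBody() are hypothetical placeholders for your existing bsp() code, not Hama APIs, and I have not confirmed that the pause removes the connection loss; it is only meant to test the sync theory.

```java
public class SleepBetweenSupersteps {

    // Hypothetical stand-in for the original bsp() body (not a Hama API).
    static void runOriginalBspBody(int iteration) {
        // original bsp() code would go here
    }

    // Runs the body `iterations` times, pausing after each pass.
    // Returns how many passes completed.
    static int runLoop(int iterations, long pauseMillis) throws InterruptedException {
        int completed = 0;
        for (int j = 0; j < iterations; j++) {
            runOriginalBspBody(j);
            // Give the peers and ZooKeeper time to settle before the next pass.
            Thread.sleep(pauseMillis);
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed passes: " + runLoop(100, 100L));
    }
}
```

If the error goes away with the pause in place, that points at a timing problem rather than at the cluster configuration.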
Sent from my iPhone
On 2011. 2. 16., at 3:24 PM, Paweł Brach <braszek@gmail.com> wrote:
> Yes, of course I have. My cluster is configured and both examples,
> PiEstimator and SerializePrinting, work (there is communication between 3
> nodes). I've modified your PiEstimator example (put everything in a
> loop) and it works for a few iterations (there is communication), and after
> that the connection is lost. The connection is then re-established, but some
> messages are missing. It looks like the Hama framework is very unstable
> under load, when many messages are being sent between nodes.
> On the same cluster I've configured Apache Hadoop and it's very stable.
> If you have your own cluster configured, could you run my example on it? Have
> you ever run anything more complicated than PiEstimator and
> SerializePrinting on it?
>
> Cheers,
> Pawel
>
> 2011/2/16 Chia-Hung Lin <clin4j@googlemail.com>
>
>> Have you configured ZooKeeper in hama-site.xml? Hama uses
>> ZooKeeper for node communication, IIRC.
>>
>> Opening socket connection to server cl5/127.0.1.1:2181
>>
>> suggests that only localhost is up. If this is the case, you
>> can change the hama.zookeeper.quorum property, setting its value to
>> e.g.
>>
>> <property>
>> <name>hama.zookeeper.quorum</name>
>> <value>node1,node2,node3,node4,node5</value>
>> </property>
>>
>> Hope it helps
>>
>> 2011/2/15 Paweł Brach <braszek@gmail.com>:
>>> Hello,
>>>
>>> During the last few days I've been testing Hama, and today I found a
>>> strange error in the Hama framework. If you run a simple job with more
>>> than a few supersteps, the following error occurs:
>>>
>>> 2011-02-15 15:13:55,934 ERROR org.apache.hama.bsp.BSPPeer:
>>> 2011-02-15 15:13:56,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server cl5/127.0.1.1:2181
>>> 2011-02-15 15:13:56,526 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
>>> 2011-02-15 15:13:56,626 ERROR org.apache.hama.bsp.BSPPeer:
>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>> KeeperErrorCode = ConnectionLoss for /bsp
>>>
>>> You can reproduce this by running PiEstimator (the newest source code
>>> from svn) with a small change: put the whole body of the bsp() method in
>>> a for loop. So wrap the existing code like this:
>>>
>>> for (int j = 0; j < 100; j++) {
>>>   // original bsp() code
>>> }
>>>
>>> When I try to run it, the framework hangs and the error mentioned above
>>> occurs.
>>>
>>> Your help will be appreciated.
>>>
>>> Cheers,
>>>
>>> --
>>> Pawel Brach
>>>
>>
>>
>>
>> --
>> ChiaHung Lin @ nuk, tw.
>>
>
>
>
> --
> Paweł Brach