All,

 

I was able to run the hello-samza application on a single-node machine.

Now I am trying to run the hello-samza application on a two-node setup.

 

Node1 runs the Resource Manager (RM)

Node2 runs the Node Manager (NM)

 

The NM registers with the RM successfully, as seen in rm.log on the RM node:

13/11/07 11:44:29 INFO service.AbstractService: Service:ResourceManager is started.

13/11/07 11:48:30 INFO util.RackResolver: Resolved IMPETUS-DSRV14.impetus.co.in to /default-rack

13/11/07 11:48:30 INFO resourcemanager.ResourceTrackerService: NodeManager from node IMPETUS-DSRV14.impetus.co.in(cmPort: 56093 httpPort: 8042) registered with capability: <memory:8192, vCores:16>, assigned nodeId IMPETUS-DSRV14.impetus.co.in:56093

13/11/07 11:48:30 INFO rmnode.RMNodeImpl: IMPETUS-DSRV14.impetus.co.in:56093 Node Transitioned from NEW to RUNNING

13/11/07 11:48:30 INFO capacity.CapacityScheduler: Added node IMPETUS-DSRV14.impetus.co.in:56093 clusterResource: <memory:8192, vCores:16>

 

I am submitting the job from the RM machine using the command line:

bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file:/home/bda/nirmal/hello-samza/deploy/samza/config/test-consumer.properties

 

However, I am getting the following exception after submitting the job to YARN:

 

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got container id: container_1383816757258_0001_01_000001

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got app attempt id: appattempt_1383816757258_0001_000001

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager host: IMPETUS-DSRV14

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager port: 59828

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got node manager http port: 8042

2013-11-07 15:05:57 SamzaAppMaster$ [INFO] got config: {task.inputs=kafka.storm-sentence, job.factory.class=org.apache.samza.job.yarn.YarnJobFactory, systems.kafka.samza.consumer.factory=samza.stream.kafka.KafkaConsumerFactory, job.name=test-Consumer, systems.kafka.consumer.zookeeper.connect=192.168.145.195:2181/, systems.kafka.consumer.auto.offset.reset=largest, systems.kafka.samza.msg.serde=json, serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory, systems.kafka.samza.partition.manager=samza.stream.kafka.KafkaPartitionManager, task.window.ms=10000, task.class=samza.examples.wikipedia.task.TestConsumer, yarn.package.path=file:/home/temptest/samza+storm/hello-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz, systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory, systems.kafka.producer.metadata.broker.list=192.168.145.195:9092,192.168.145.195:9093}

2013-11-07 15:05:57 ClientHelper [INFO] trying to connect to RM /0.0.0.0:8032

2013-11-07 15:05:57 JmxServer [INFO] According to InetAddress.getLocalHost.getHostName we are IMPETUS-DSRV14.impetus.co.in

2013-11-07 15:05:57 JmxServer [INFO] Started JmxServer port=47115 url=service:jmx:rmi:///jndi/rmi://IMPETUS-DSRV14.impetus.co.in:47115/jmxrmi

2013-11-07 15:05:57 SamzaAppMasterTaskManager [INFO] No yarn.container.count specified. Defaulting to one container.

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property metadata.broker.list is overridden to 192.168.145.195:9092,192.168.145.195:9093

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property auto.offset.reset is overridden to largest

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property group.id is overridden to undefined-samza-consumer-group-

2013-11-07 15:05:57 VerifiableProperties [INFO] Property zookeeper.connect is overridden to 192.168.145.195:2181/

2013-11-07 15:05:57 VerifiableProperties [INFO] Verifying properties

2013-11-07 15:05:57 VerifiableProperties [INFO] Property client.id is overridden to samza_admin-test_Consumer-1-1383816957797-0

2013-11-07 15:05:57 VerifiableProperties [INFO] Property metadata.broker.list is overridden to 192.168.145.195:9092,192.168.145.195:9093

2013-11-07 15:05:57 VerifiableProperties [INFO] Property request.timeout.ms is overridden to 6000

2013-11-07 15:05:57 ClientUtils$ [INFO] Fetching metadata from broker id:0,host:192.168.145.195,port:9092 with correlation id 0 for 1 topic(s) Set(storm-sentence)

2013-11-07 15:05:57 SyncProducer [INFO] Connected to 192.168.145.195:9092 for producing

2013-11-07 15:05:57 SyncProducer [INFO] Disconnecting from 192.168.145.195:9092

2013-11-07 15:05:57 SamzaAppMasterService [INFO] Starting webapp at rpc 39152, tracking port 26751

2013-11-07 15:05:57 log [INFO] Logging to org.slf4j.impl.Log4jLoggerAdapter(org.eclipse.jetty.util.log) via org.eclipse.jetty.util.log.Slf4jLog

2013-11-07 15:05:58 ClientHelper [INFO] trying to connect to RM /0.0.0.0:8032

2013-11-07 15:05:58 log [INFO] jetty-7.0.0.v20091005

2013-11-07 15:05:58 log [INFO] Extract jar:file:/tmp/hadoop-vuser/nm-local-dir/usercache/bda/appcache/application_1383816757258_0001/filecache/8004956396276725272/samza-job-package-0.7.0-dist.tar.gz/lib/samza-yarn_2.8.1-0.7.0-yarn-2.0.5-alpha.jar!/scalate/WEB-INF/ to /tmp/Jetty_0_0_0_0_39152_scalate____xveaws/webinf/WEB-INF

2013-11-07 15:05:58 ServletTemplateEngine [INFO] Scalate template engine using working directory: /tmp/scalate-5279562760844696556-workdir

2013-11-07 15:05:58 log [INFO] Started SelectChannelConnector@0.0.0.0:39152

2013-11-07 15:05:58 log [INFO] jetty-7.0.0.v20091005

2013-11-07 15:05:58 log [INFO] Extract jar:file:/tmp/hadoop-vuser/nm-local-dir/usercache/bda/appcache/application_1383816757258_0001/filecache/8004956396276725272/samza-job-package-0.7.0-dist.tar.gz/lib/samza-yarn_2.8.1-0.7.0-yarn-2.0.5-alpha.jar!/scalate/WEB-INF/ to /tmp/Jetty_0_0_0_0_26751_scalate____.dr19qj/webinf/WEB-INF

2013-11-07 15:05:58 ServletTemplateEngine [INFO] Scalate template engine using working directory: /tmp/scalate-5582747144249485577-workdir

2013-11-07 15:05:58 log [INFO] Started SelectChannelConnector@0.0.0.0:26751

2013-11-07 15:06:08 SamzaAppMasterLifecycle [INFO] Shutting down.

2013-11-07 15:06:18 YarnAppMaster [WARN] Listener org.apache.samza.job.yarn.SamzaAppMasterLifecycle@500c954e failed to shutdown.

java.lang.reflect.UndeclaredThrowableException

         at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)

         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:90)

         at org.apache.hadoop.yarn.client.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:244)

         at org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onShutdown(SamzaAppMasterLifecycle.scala:68)

         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:70)

         at org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$9.apply(YarnAppMaster.scala:69)

         at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)

         at scala.collection.immutable.List.foreach(List.scala:45)

         at org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:69)

         at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:78)

         at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)

Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)

         at $Proxy12.finishApplicationMaster(Unknown Source)

         at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.finishApplicationMaster(AMRMProtocolPBClientImpl.java:87)

         ... 9 more

Caused by: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

         at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:780)

         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:727)

         at org.apache.hadoop.ipc.Client.call(Client.java:1239)

         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)

         ... 11 more

Caused by: java.net.ConnectException: Connection refused

         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)

         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)

         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:526)

         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:490)

         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:508)

         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:603)

         at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:253)

         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1288)

         at org.apache.hadoop.ipc.Client.call(Client.java:1206)

         ... 12 more

 

 

I have changed the following properties in the hello-samza/deploy/yarn/etc/hadoop/yarn-site.xml on the Node Manager machine:

 

<property>

                <name>yarn.resourcemanager.scheduler.address</name>

                <value>192.168.145.37:8030</value>

</property>

<property>

                <name>yarn.resourcemanager.resource-tracker.address</name>

                <value>192.168.145.37:8031</value>

</property>

<property>

                <name>yarn.resourcemanager.address</name>

                <value>192.168.145.37:8032</value>

</property>

<property>

                <name>yarn.resourcemanager.admin.address</name>

                <value>192.168.145.37:8033</value>

</property>

<property>

                <name>yarn.resourcemanager.webapp.address</name>

                <value>192.168.145.37:8088</value>

</property>
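
The 0.0.0.0 in the exception is the wildcard address, which makes me suspect that the process making the call did not pick up the values above (this is my guess, not something I have confirmed). As a sanity check, a small Python sketch like the following could be run against whichever yarn-site.xml a process is believed to load. The function names and the SAMPLE document are hypothetical and written only for illustration; SAMPLE is not my real configuration.

```python
import xml.etree.ElementTree as ET

# Property names copied from the yarn-site.xml snippet in this message.
RM_ADDRESS_KEYS = [
    "yarn.resourcemanager.scheduler.address",
    "yarn.resourcemanager.resource-tracker.address",
    "yarn.resourcemanager.address",
    "yarn.resourcemanager.admin.address",
    "yarn.resourcemanager.webapp.address",
]


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def suspicious_rm_addresses(props):
    """List RM address keys that are missing or point at the 0.0.0.0 wildcard."""
    problems = []
    for key in RM_ADDRESS_KEYS:
        value = props.get(key)
        if value is None:
            problems.append((key, "not set"))
        elif value.split(":")[0] == "0.0.0.0":
            problems.append((key, value))
    return problems


# Hypothetical sample input, NOT the real file from either machine.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.145.37:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>0.0.0.0:8032</value>
  </property>
</configuration>"""
```

Calling suspicious_rm_addresses(read_properties(text)) on the contents of a real file would list every RM address that is unset or still bound to 0.0.0.0 in that particular file.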

 

 

 

These properties are reflected on the UI screen as well.

 

 

However, overriding yarn.resourcemanager.scheduler.address to 192.168.145.37:8030 does not fix the error. The application master still tries to reach 0.0.0.0:8030:

Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
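
To compare the address being dialled with the configured one across many log lines, the caller and target could be pulled out of the message with a regular expression. This is only a sketch: the pattern is written against the exact wording of the line above, and the names CALL_RE and parse_call are hypothetical.

```python
import re

# Matches "Call From <host>/<ip> to <host>:<port> failed", as worded in the log.
CALL_RE = re.compile(
    r"Call From (?P<src_host>[^/\s]+)/(?P<src_ip>[\d.]+)"
    r" to (?P<dst_host>[^:\s]+):(?P<dst_port>\d+) failed"
)


def parse_call(line):
    """Return source and destination of a failed Hadoop IPC call, or None."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["dst_port"] = int(info["dst_port"])
    return info


# The line is copied from the exception in this message.
LINE = (
    "Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: "
    "Call From IMPETUS-DSRV14.impetus.co.in/192.168.145.43 to 0.0.0.0:8030 "
    "failed on connection exception: java.net.ConnectException: Connection refused"
)
```

For the line above, parse_call(LINE) gives the NM host as the source and 0.0.0.0 with port 8030 as the destination, rather than 192.168.145.37.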

 

netstat on the RM machine shows:

tcp        0      0 ::ffff:192.168.145.37:8088  :::*                        LISTEN      14595/java

tcp        0      0 ::ffff:192.168.145.37:8030  :::*                        LISTEN      14595/java

tcp        0      0 ::ffff:192.168.145.37:8031  :::*                        LISTEN      14595/java

tcp        0      0 ::ffff:192.168.145.37:8032  :::*                        LISTEN      14595/java

tcp        0      0 ::ffff:192.168.145.37:8033  :::*                        LISTEN      14595/java

 

netstat on the NM machine shows:

tcp        0      0 :::8040                     :::*                        LISTEN      1331/java

tcp        0      0 :::8042                     :::*                        LISTEN      1331/java

tcp        0      0 :::56877                    :::*                        LISTEN      1331/java
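
netstat only shows what each machine is listening on. To rule out a network or firewall problem between the two nodes, a plain TCP connection test from the NM machine to the RM ports could be used. This is a minimal sketch using only the Python standard library; can_connect is a hypothetical helper name, and the 3 second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

Run on the NM machine, can_connect("192.168.145.37", 8030) would show whether the scheduler port on the RM is reachable at all, independently of which address the application master chooses to dial.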

 

Could you please help me fix this error?

 

Regards,

-Nirmal








