storm-user mailing list archives

From Bobby Evans <ev...@yahoo-inc.com>
Subject Re: Storm workers get killed in the middle and supervisor restarts
Date Mon, 07 Aug 2017 14:34:20 GMT
No, you should not need to configure the users ahead of time.  In Storm, if security is turned
off, the "user" or "owner" of the topology is stored internally as the unix user running nimbus.
The fact that it is null indicates that there is some kind of bug.  I am just trying to
gauge how serious a bug it is.  Does the supervisor recover after a little while, or is
it stuck in a bad state?
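To illustrate what I mean (this is a hypothetical sketch, not Storm's actual code — the method and class names are made up): with security off, the submitted owner is effectively ignored and the unix user running the nimbus JVM is used instead, which is why a null owner should never happen on a healthy cluster.

```java
public class TopologyOwnerSketch {
    // Hypothetical helper: resolve the topology "owner" the way an
    // insecure cluster would. With security on, the authenticated
    // principal is used; with security off, the JVM's unix user wins.
    static String resolveOwner(boolean securityEnabled, String submittedOwner) {
        if (securityEnabled) {
            return submittedOwner; // authenticated principal in a secure cluster
        }
        // With security off, fall back to the unix user running this JVM.
        // This is never null on a normal JVM, which is why seeing a null
        // owner points at a bug rather than a configuration problem.
        return System.getProperty("user.name");
    }

    public static void main(String[] args) {
        System.out.println(resolveOwner(false, null));
    }
}
```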


- Bobby


On Monday, August 7, 2017, 9:28:46 AM CDT, Martin Burian <martin.burianjr@gmail.com>
wrote:

I am seeing exactly the same exception thrown by supervisors in my topology after an
update from Storm 1.0.3 to 1.0.4. A quick search took me to http://storm.apache.org/releases/1.0.4/SECURITY.html, where
there are multiple references to some "users". Do I need to configure one of those? I didn't
have to do anything with users before.
Martin

On Mon, 7 Aug 2017 at 16:01, Bobby Evans <evans@yahoo-inc.com> wrote:

From the code at
https://github.com/apache/storm/blob/v1.1.1/storm-core/src/jvm/org/apache/storm/localizer/Localizer.java?utf8=%E2%9C%93#L332
it looks like your user is null when trying to update the resources for the topology.  Did
the supervisor and your topology recover when the supervisor was relaunched?  If not, please
file a bug in JIRA and we can look into it.


- Bobby


On Sunday, August 6, 2017, 11:34:19 PM CDT, Sahan Maldeniya <sahan@haulmatic.com> wrote:


Hi,

We are using Apache Storm to analyze a GPS data stream by subscribing to a RabbitMQ message
channel. We use Apache Storm 1.1.1.

We have deployed ZooKeeper, 1 nimbus, 1 UI, and a supervisor across 3 Amazon EC2 instances. We
also have a local Storm supervisor pointing to the remote nimbus in EC2 via the same
ZooKeeper.

When we run the topology as a local cluster, or submit it to Storm when only the local supervisor
is running (we stop the remote supervisor instance), everything works as expected.

The problem arises when we submit the topology to the remote supervisor (production-like env).
The supervisor keeps restarting with the logs below.

worker.log
2017-08-05 08:50:27.334 o.a.s.d.worker Thread-19 [INFO] Shutting down worker GpsDataAnalyticsTopology-6-1501922728 a383ada8-62a4-418c-9c7c-4e5a5e19f051 6700
2017-08-05 08:50:27.334 o.a.s.d.worker Thread-19 [INFO] Terminating messaging context
2017-08-05 08:50:27.334 o.a.s.d.worker Thread-19 [INFO] Shutting down executors
2017-08-05 08:50:27.334 o.a.s.d.executor Thread-19 [INFO] Shutting down executor fuel-data-analyzer:[2 2]
2017-08-05 08:50:27.335 o.a.s.util Thread-4-fuel-data-analyzer-executor[2 2] [INFO] Async loop interrupted!
2017-08-05 08:50:27.335 o.a.s.util Thread-3-disruptor-executor[2 2]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.341 o.a.s.d.executor Thread-19 [INFO] Shut down executor fuel-data-analyzer:[2 2]
2017-08-05 08:50:27.342 o.a.s.d.executor Thread-19 [INFO] Shutting down executor fuel-data-save-to-db:[3 3]
2017-08-05 08:50:27.342 o.a.s.util Thread-6-fuel-data-save-to-db-executor[3 3] [INFO] Async loop interrupted!
2017-08-05 08:50:27.342 o.a.s.util Thread-5-disruptor-executor[3 3]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.343 o.a.s.d.executor Thread-19 [INFO] Shut down executor fuel-data-save-to-db:[3 3]
2017-08-05 08:50:27.343 o.a.s.d.executor Thread-19 [INFO] Shutting down executor __acker:[1 1]
2017-08-05 08:50:27.343 o.a.s.util Thread-8-__acker-executor[1 1] [INFO] Async loop interrupted!
2017-08-05 08:50:27.343 o.a.s.util Thread-7-disruptor-executor[1 1]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.344 o.a.s.d.executor Thread-19 [INFO] Shut down executor __acker:[1 1]
2017-08-05 08:50:27.344 o.a.s.d.executor Thread-19 [INFO] Shutting down executor rabbit-mq-gps-reader-spout:[6 6]
2017-08-05 08:50:27.344 o.a.s.util Thread-9-disruptor-executor[6 6]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.344 o.a.s.util Thread-10-rabbit-mq-gps-reader-spout-executor[6 6] [INFO] Async loop interrupted!
2017-08-05 08:50:27.348 o.a.s.d.executor Thread-19 [INFO] Shut down executor rabbit-mq-gps-reader-spout:[6 6]
2017-08-05 08:50:27.348 o.a.s.d.executor Thread-19 [INFO] Shutting down executor __system:[-1 -1]
2017-08-05 08:50:27.348 o.a.s.util Thread-11-disruptor-executor[-1 -1]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.349 o.a.s.util Thread-12-__system-executor[-1 -1] [INFO] Async loop interrupted!
2017-08-05 08:50:27.349 o.a.s.d.executor Thread-19 [INFO] Shut down executor __system:[-1 -1]
2017-08-05 08:50:27.349 o.a.s.d.executor Thread-19 [INFO] Shutting down executor gps-data-logger:[5 5]
2017-08-05 08:50:27.349 o.a.s.util Thread-13-disruptor-executor[5 5]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.349 o.a.s.util Thread-14-gps-data-logger-executor[5 5] [INFO] Async loop interrupted!
2017-08-05 08:50:27.350 o.a.s.d.executor Thread-19 [INFO] Shut down executor gps-data-logger:[5 5]
2017-08-05 08:50:27.350 o.a.s.d.executor Thread-19 [INFO] Shutting down executor gps-data-devider:[4 4]
2017-08-05 08:50:27.350 o.a.s.util Thread-16-gps-data-devider-executor[4 4] [INFO] Async loop interrupted!
2017-08-05 08:50:27.350 o.a.s.util Thread-15-disruptor-executor[4 4]-send-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.350 o.a.s.d.executor Thread-19 [INFO] Shut down executor gps-data-devider:[4 4]
2017-08-05 08:50:27.351 o.a.s.d.worker Thread-19 [INFO] Shut down executors
2017-08-05 08:50:27.353 o.a.s.d.worker Thread-19 [INFO] Shutting down transfer thread
2017-08-05 08:50:27.353 o.a.s.util Thread-17-disruptor-worker-transfer-queue [INFO] Async loop interrupted!
2017-08-05 08:50:27.354 o.a.s.d.worker Thread-19 [INFO] Shut down transfer thread
2017-08-05 08:50:27.354 o.a.s.d.worker Thread-19 [INFO] Shut down backpressure thread
2017-08-05 08:50:27.355 o.a.s.d.worker Thread-19 [INFO] Shutting down default resources
2017-08-05 08:50:27.356 o.a.s.d.worker Thread-19 [INFO] Shut down default resources
2017-08-05 08:50:27.356 o.a.s.d.worker Thread-19 [INFO] Trigger any worker shutdown hooks
2017-08-05 08:50:27.363 o.a.s.d.worker Thread-19 [INFO] Disconnecting from storm cluster state context
2017-08-05 08:50:27.363 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl Curator-Framework-0 [INFO] backgroundOperationsLoop exiting
2017-08-05 08:50:27.365 o.a.s.s.o.a.z.ClientCnxn main-EventThread [INFO] EventThread shut down
2017-08-05 08:50:27.366 o.a.s.s.o.a.z.ZooKeeper Thread-19 [INFO] Session: 0x15d7d7e305c03a6 closed
2017-08-05 08:50:27.366 o.a.s.d.worker Thread-19 [INFO] Shut down worker GpsDataAnalyticsTopology-6-1501922728 a383ada8-62a4-418c-9c7c-4e5a5e19f051 6700


supervisor.log

2017-08-05 08:45:30.798 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE WAITING_FOR_BLOB_LOCALIZATION msInState: 31 -> WAITING_FOR_WORKER_START msInState: 0 topo:GpsDataAnalyticsTopology-6-1501922728 worker:5ecd5438-1e1f-465b-bb82-765881af690c
2017-08-05 08:45:30.798 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700: Changing current assignment from null to LocalAssignment(topology_id:GpsDataAnalyticsTopology-6-1501922728, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:3, task_end:3), ExecutorInfo(task_start:2, task_end:2), ExecutorInfo(task_start:6, task_end:6), ExecutorInfo(task_start:1, task_end:1), ExecutorInfo(task_start:5, task_end:5)], resources:WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0))
2017-08-05 08:45:34.801 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE WAITING_FOR_WORKER_START msInState: 4003 topo:GpsDataAnalyticsTopology-6-1501922728 worker:5ecd5438-1e1f-465b-bb82-765881af690c -> RUNNING msInState: 0 topo:GpsDataAnalyticsTopology-6-1501922728 worker:5ecd5438-1e1f-465b-bb82-765881af690c
2017-08-05 08:45:40.354 o.a.s.e.EventManagerImp Thread-4 [ERROR] {} Error when processing event
java.lang.NullPointerException: null
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:1.8.0_144]
    at org.apache.storm.localizer.Localizer.updateBlobs(Localizer.java:332) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.timer.UpdateBlobs.updateBlobsForTopology(UpdateBlobs.java:99) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.timer.UpdateBlobs.run(UpdateBlobs.java:72) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.event.EventManagerImp$1.run(EventManagerImp.java:54) ~[storm-core-1.1.1.jar:1.1.1]
2017-08-05 08:45:40.354 o.a.s.u.Utils Thread-4 [ERROR] Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
    at org.apache.storm.utils.Utils.exitProcess(Utils.java:1773) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.event.EventManagerImp$1.run(EventManagerImp.java:63) ~[storm-core-1.1.1.jar:1.1.1]
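The NullPointerException at ConcurrentHashMap.get (ConcurrentHashMap.java:936) is consistent with a null key being looked up: unlike HashMap, ConcurrentHashMap rejects null keys outright. A minimal demonstration of that difference (hypothetical class name, not Storm code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyDemo {
    // Returns true if get(null) on this map throws NullPointerException.
    // ConcurrentHashMap hashes the key before looking it up, so a null
    // key (here, a null topology user/owner) throws immediately.
    static boolean throwsOnNullKey(Map<String, String> map) {
        try {
            map.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnNullKey(new ConcurrentHashMap<>())); // true
        System.out.println(throwsOnNullKey(new HashMap<>()));           // false
    }
}
```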



Thank You

-- 
Thank You, Best Regards.

Sahan Maldeniya | Software Craftsman
HaulMatic Technologies (Pvt) Ltd | http://haulmatic.com
sahan@haulmatic.com | +94776306579 | +94114693330
120, Highlevel Road, Colombo 05, Sri Lanka

