trafodion-codereview mailing list archives

From DaveBirdsall <...@git.apache.org>
Subject [GitHub] trafodion pull request #1427: TRAFODION-2940 In HA env, one node lose networ...
Date Tue, 06 Feb 2018 20:39:44 GMT
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1427#discussion_r166434382
  
    --- Diff: dcs/src/main/java/org/trafodion/dcs/master/DcsMaster.java ---
    @@ -111,11 +104,59 @@ public DcsMaster(String[] args) {
             trafodionHome = System.getProperty(Constants.DCS_TRAFODION_HOME);
             jvmShutdownHook = new JVMShutdownHook();
             Runtime.getRuntime().addShutdownHook(jvmShutdownHook);
    -        thrd = new Thread(this);
    -        thrd.start();
    +
    +        ExecutorService executorService = Executors.newFixedThreadPool(1);
    +        CompletionService<Integer> completionService = new ExecutorCompletionService<Integer>(executorService);
    +
    +        while (true) {
    +            completionService.submit(this);
    +            Future<Integer> f = null;
    +            try {
    +                f = completionService.take();
    +                if (f != null) {
    +                    Integer status = f.get();
    +                    if (status <= 0) {
    +                        System.exit(status);
    +                    } else {
    +                        // 35000 * 15mins ~= 1 years
    +                        RetryCounter retryCounter = RetryCounterFactory.create(35000, 15, TimeUnit.MINUTES);
    +                        while (true) {
    +                            try {
    +                                ZkClient tmpZkc = new ZkClient();
    +                                tmpZkc.connect();
    +                                tmpZkc.close();
    +                                tmpZkc = null;
    +                                LOG.info("Connected to ZooKeeper successful, restart DCS Master.");
    +                                // reset lock
    +                                isLeader = new CountDownLatch(1);
    +                                break;
    --- End diff --
    
    I'm not sure I understand this logic. Do we sit inside one of these method calls during
    normal processing? Or do tmpZkc.connect() and tmpZkc.close() complete immediately? If so,
    it looks like we just go around the while loop and do it again, over and over.
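
    For reference, here is a minimal standalone sketch of the one piece of this that is
    standard java.util.concurrent behavior: CompletionService.take() blocks until a submitted
    task has finished, while poll() returns null in the meantime. The class name, method name,
    and the latch-driven task are hypothetical stand-ins, not code from DcsMaster. Whether
    DcsMaster.call() actually runs for the lifetime of the master is an assumption the diff
    does not confirm; if it does, the constructor thread would sit in take() during normal
    processing and only reach the ZooKeeper retry loop after call() returns a positive status.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CompletionServiceSketch {

    /**
     * Submits one task that stays "running" until a latch is released.
     * Returns true if no completed result was visible while the task was
     * still running and take() later delivered the task's status.
     */
    static boolean nothingCompletedWhileRunning(int status) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        CompletionService<Integer> cs = new ExecutorCompletionService<Integer>(executor);
        // Hypothetical stand-in for a long-running call(): it waits on this latch.
        final CountDownLatch finish = new CountDownLatch(1);
        try {
            cs.submit(() -> {
                finish.await();   // simulates normal processing
                return status;    // simulates call() returning a status code
            });

            // poll() does not block; it returns null because the task has not finished.
            boolean emptyWhileRunning = (cs.poll() == null);

            // Let the task finish; take() blocks until the result is available.
            finish.countDown();
            Future<Integer> f = cs.take();

            return emptyWhileRunning && f.get() == status;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no result visible while task was running: "
                + nothingCompletedWhileRunning(1));
    }
}
```

    The sketch says nothing about how long tmpZkc.connect() or tmpZkc.close() take; that
    part of the question still depends on the ZkClient implementation.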


---
