hawq-dev mailing list archives

From "Ming LI (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-901) hawq init failed: hawqstandbywatch.py:test5:gpadmin-[WARNING]:-syncmaster not running
Date Thu, 07 Jul 2016 02:31:11 GMT
Ming LI created HAWQ-901:

             Summary: hawq init failed: hawqstandbywatch.py:test5:gpadmin-[WARNING]:-syncmaster not running
                 Key: HAWQ-901
                 URL: https://issues.apache.org/jira/browse/HAWQ-901
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Command Line Tools
            Reporter: Ming LI
            Assignee: Lei Chang

Error message in ~/hawqAdminLogs/hawq_init_XXXXXXXX.log
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Start hawq with args: ['start',
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Gathering information and validating the environment...
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Start standby master service
20160706:06:46:02:006218 hawq_start:test1:gpadmin-[INFO]:-Checking standby master status
20160706:06:45:55:004418 hawqstandbywatch.py:test5:gpadmin-[INFO]:-Monitoring logs
20160706:06:46:00:004418 hawqstandbywatch.py:test5:gpadmin-[INFO]:-checking if syncmaster is running
20160706:06:46:02:004418 hawqstandbywatch.py:test5:gpadmin-[WARNING]:-syncmaster not running
20160706:06:46:02:006218 hawq_start:test1:gpadmin-[ERROR]:-Standby master start failed, exit
20160706:06:46:02:003999 hawqinit.sh:test5:gpadmin-[ERROR]:-Start HAWQ standby failed

(1) I suspect the root cause may be that we wait only 5 seconds before checking whether the
standby is running, and this interval is too short. Could you please first replace the fixed
5-second wait with a polling loop, like the recovery running status check on the master?
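
A minimal sketch of the kind of polling loop suggested in (1), not the actual HAWQ code.
The helper name wait_until, its parameters, and the default timeout and interval are all
hypothetical. The caller supplies a check() callable that returns True once syncmaster is
seen running; how that check is implemented is left out here.

```python
import time


def wait_until(check, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Hypothetical helper: names and defaults are illustrative only.
    Returns True if check() succeeded, False if the deadline passed.
    """
    deadline = clock() + timeout
    while True:
        # Check first, so an already-running standby returns immediately.
        if check():
            return True
        # Give up once the deadline has passed.
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a loop like this, a standby that comes up in 2 seconds is reported after about 2 seconds,
and one that needs 20 seconds is still detected instead of failing at a fixed 5-second mark.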

(2) If the error 'syncmaster not running' leads to init failure, we should change from

This message was sent by Atlassian JIRA
