2016-08-14 17:49:49: pid 1791: LOG: reading status file: 0 th backend is set to down status
2016-08-14 17:49:49: pid 1791: LOCATION: pgpool_main.c:3037
2016-08-14 17:49:49: pid 1791: LOG: waiting for watchdog to initialize
2016-08-14 17:49:49: pid 1791: LOCATION: pgpool_main.c:259
2016-08-14 17:49:49: pid 1801: LOG: setting the local watchdog node name to "Linux_mgrdb85_9999"
2016-08-14 17:49:49: pid 1801: LOCATION: watchdog.c:575
2016-08-14 17:49:49: pid 1801: LOG: watchdog cluster configured with 1 remote nodes
2016-08-14 17:49:49: pid 1801: LOCATION: watchdog.c:583
2016-08-14 17:49:49: pid 1801: LOG: watchdog remote node:0 on 1.1.1.84:9000
2016-08-14 17:49:49: pid 1801: LOCATION: watchdog.c:594
2016-08-14 17:49:49: pid 1801: LOG: IPC socket path: "/tmp/.s.PGPOOLWD_CMD.9000"
2016-08-14 17:49:49: pid 1801: LOCATION: watchdog.c:1063
2016-08-14 17:49:49: pid 1801: LOG: new outbond connection to 1.1.1.84:9000
2016-08-14 17:49:49: pid 1801: LOCATION: watchdog.c:2482
2016-08-14 17:49:54: pid 1801: LOG: watchdog node state changed from [LOADING] to [JOINING]
2016-08-14 17:49:54: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:49:56: pid 1801: LOG: new watchdog node connection is received from "1.1.1.84:17543"
2016-08-14 17:49:56: pid 1801: LOCATION: watchdog.c:2409
2016-08-14 17:49:59: pid 1801: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-14 17:49:59: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:00: pid 1801: LOG: watchdog node state changed from [INITIALIZING] to [STANDING FOR MASTER]
2016-08-14 17:50:00: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:00: pid 1801: LOG: our stand for coordinator request is rejected by node "Linux_mgrdb84_9999"
2016-08-14 17:50:00: pid 1801: LOCATION: watchdog.c:3987
2016-08-14 17:50:00: pid 1801: LOG: watchdog node state changed from [STANDING FOR MASTER] to [PARTICIPATING IN ELECTION]
2016-08-14 17:50:00: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:06: pid 1801: LOG: watchdog node state changed from [PARTICIPATING IN ELECTION] to [JOINING]
2016-08-14 17:50:06: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:06: pid 1801: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-14 17:50:06: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:07: pid 1801: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
2016-08-14 17:50:07: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:50:07: pid 1801: LOG: successfully joined the watchdog cluster as standby node
2016-08-14 17:50:07: pid 1801: DETAIL: our join coordinator request is accepted by cluster leader node "Linux_mgrdb84_9999"
2016-08-14 17:50:07: pid 1801: LOCATION: watchdog.c:4583
2016-08-14 17:50:07: pid 1791: LOG: watchdog process is initialized
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:273
2016-08-14 17:50:07: pid 1801: LOG: new IPC connection received
2016-08-14 17:50:07: pid 1801: LOCATION: watchdog.c:2441
2016-08-14 17:50:07: pid 1791: LOG: Setting up socket for 0.0.0.0:9999
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:797
2016-08-14 17:50:07: pid 1791: LOG: Setting up socket for :::9999
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:797
2016-08-14 17:50:07: pid 2164: LOG: 2 watchdog nodes are configured for lifecheck
2016-08-14 17:50:07: pid 2164: LOCATION: wd_lifecheck.c:475
2016-08-14 17:50:07: pid 2164: LOG: watchdog nodes ID:0 Name:"Linux_mgrdb85_9999"
2016-08-14 17:50:07: pid 2164: DETAIL: Host:"1.1.1.85" WD Port:9000 pgpool-II port:9999
2016-08-14 17:50:07: pid 2164: LOCATION: wd_lifecheck.c:483
2016-08-14 17:50:07: pid 2164: LOG: watchdog nodes ID:1 Name:"Linux_mgrdb84_9999"
2016-08-14 17:50:07: pid 2164: DETAIL: Host:"1.1.1.84" WD Port:9000 pgpool-II port:9999
2016-08-14 17:50:07: pid 2164: LOCATION: wd_lifecheck.c:483
2016-08-14 17:50:07: pid 1791: LOG: pgpool-II successfully started. version 3.5.3 (ekieboshi)
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:370
2016-08-14 17:50:07: pid 1791: LOG: find_primary_node: checking backend no 0
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:2693
2016-08-14 17:50:07: pid 1791: LOG: find_primary_node: checking backend no 1
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:2693
2016-08-14 17:50:07: pid 1791: LOG: find_primary_node: primary node id is 1
2016-08-14 17:50:07: pid 1791: LOCATION: pgpool_main.c:2723
2016-08-14 17:50:08: pid 2168: LOG: createing watchdog heartbeat receive socket.
2016-08-14 17:50:08: pid 2168: DETAIL: bind receive socket to device: "eth1"
2016-08-14 17:50:08: pid 2168: LOCATION: wd_heartbeat.c:206
2016-08-14 17:50:08: pid 2168: LOG: set SO_REUSEPORT option to the socket
2016-08-14 17:50:08: pid 2168: LOCATION: wd_heartbeat.c:679
2016-08-14 17:50:08: pid 2168: LOG: creating watchdog heartbeat receive socket.
2016-08-14 17:50:08: pid 2168: DETAIL: set SO_REUSEPORT
2016-08-14 17:50:08: pid 2168: LOCATION: wd_heartbeat.c:225
2016-08-14 17:50:08: pid 2169: LOG: creating socket for sending heartbeat
2016-08-14 17:50:08: pid 2169: DETAIL: bind send socket to device: eth1
2016-08-14 17:50:08: pid 2169: LOCATION: wd_heartbeat.c:126
2016-08-14 17:50:08: pid 2169: LOG: set SO_REUSEPORT option to the socket
2016-08-14 17:50:08: pid 2169: LOCATION: wd_heartbeat.c:679
2016-08-14 17:50:08: pid 2169: LOG: creating socket for sending heartbeat
2016-08-14 17:50:08: pid 2169: DETAIL: set SO_REUSEPORT
2016-08-14 17:50:08: pid 2169: LOCATION: wd_heartbeat.c:143
2016-08-14 17:51:47: pid 2164: LOG: watchdog: lifecheck started
2016-08-14 17:51:47: pid 2164: LOCATION: wd_lifecheck.c:417
2016-08-14 17:51:47: pid 2164: LOG: informing the node status change to watchdog
2016-08-14 17:51:47: pid 2164: DETAIL: node id :1 status = "NODE DEAD" message:"No heartbeat signal from node"
2016-08-14 17:51:47: pid 2164: LOCATION: wd_lifecheck.c:509
2016-08-14 17:51:47: pid 1801: LOG: new IPC connection received
2016-08-14 17:51:47: pid 1801: LOCATION: watchdog.c:2441
2016-08-14 17:51:47: pid 1801: LOG: received node status change ipc message
2016-08-14 17:51:47: pid 1801: DETAIL: No heartbeat signal from node
2016-08-14 17:51:47: pid 1801: LOCATION: watchdog.c:1672
2016-08-14 17:51:47: pid 1801: LOG: remote node "Linux_mgrdb84_9999" is lost
2016-08-14 17:51:47: pid 1801: LOCATION: watchdog.c:3570
2016-08-14 17:51:47: pid 1801: LOG: watchdog cluster has lost the coordinator node
2016-08-14 17:51:47: pid 1801: LOCATION: watchdog.c:3575
2016-08-14 17:51:47: pid 1801: LOG: watchdog node state changed from [STANDBY] to [JOINING]
2016-08-14 17:51:47: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:51:52: pid 1801: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-14 17:51:52: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:51:53: pid 1801: LOG: I am the only alive node in the watchdog cluster
2016-08-14 17:51:53: pid 1801: HINT: skiping stand for coordinator state
2016-08-14 17:51:53: pid 1801: LOCATION: watchdog.c:3902
2016-08-14 17:51:53: pid 1801: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
2016-08-14 17:51:53: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 17:51:53: pid 1801: LOG: I am announcing my self as master/coordinator watchdog node
2016-08-14 17:51:53: pid 1801: LOCATION: watchdog.c:4083
2016-08-14 17:51:58: pid 1801: LOG: I am the cluster leader node
2016-08-14 17:51:58: pid 1801: DETAIL: our declare coordinator message is accepted by all nodes
2016-08-14 17:51:58: pid 1801: LOCATION: watchdog.c:4116
2016-08-14 17:51:58: pid 1801: LOG: I am the cluster leader node. Starting escalation process
2016-08-14 17:51:58: pid 1801: LOCATION: watchdog.c:4135
2016-08-14 17:51:58: pid 1801: LOG: escalation process started with PID:2632
2016-08-14 17:51:58: pid 1801: LOCATION: watchdog.c:4450
2016-08-14 17:51:58: pid 2632: LOG: watchdog: escalation started
2016-08-14 17:51:58: pid 2632: LOCATION: wd_escalation.c:92
2016-08-14 17:52:09: pid 2632: WARNING: watchdog failed to bring up delegate IP, 'if_up_cmd' failed
2016-08-14 17:52:09: pid 2632: LOCATION: wd_if.c:142
2016-08-14 17:52:09: pid 2632: WARNING: watchdog de-escalation failed to bring down delegate IP
2016-08-14 17:52:09: pid 2632: LOCATION: wd_escalation.c:140
2016-08-14 17:52:09: pid 1801: LOG: watchdog escalation process with pid: 2632 exit with SUCCESS.
2016-08-14 17:52:09: pid 1801: LOCATION: watchdog.c:2299
2016-08-14 17:53:45: pid 1801: LOG: read from socket failed with error :"Connection reset by peer"
2016-08-14 17:53:45: pid 1801: LOCATION: pool_stream.c:1191
2016-08-14 17:53:46: pid 1801: LOG: new watchdog node connection is received from "1.1.1.84:43466"
2016-08-14 17:53:46: pid 1801: LOCATION: watchdog.c:2409
2016-08-14 17:53:56: pid 1801: LOG: new outbond connection to 1.1.1.84:9000
2016-08-14 17:53:56: pid 1801: LOCATION: watchdog.c:2482
2016-08-14 17:53:57: pid 2164: LOG: informing the node status change to watchdog
2016-08-14 17:53:57: pid 2164: DETAIL: node id :1 status = "NODE ALIVE" message:"Heartbeat signal found"
2016-08-14 17:53:57: pid 2164: LOCATION: wd_lifecheck.c:509
2016-08-14 17:53:57: pid 1801: LOG: new IPC connection received
2016-08-14 17:53:57: pid 1801: LOCATION: watchdog.c:2441
2016-08-14 17:53:57: pid 1801: LOG: received node status change ipc message
2016-08-14 17:53:57: pid 1801: DETAIL: Heartbeat signal found
2016-08-14 17:53:57: pid 1801: LOCATION: watchdog.c:1672
2016-08-14 17:58:57: pid 2164: LOG: informing the node status change to watchdog
2016-08-14 17:58:57: pid 2164: DETAIL: node id :1 status = "NODE DEAD" message:"No heartbeat signal from node"
2016-08-14 17:58:57: pid 2164: LOCATION: wd_lifecheck.c:509
2016-08-14 17:58:57: pid 1801: LOG: new IPC connection received
2016-08-14 17:58:57: pid 1801: LOCATION: watchdog.c:2441
2016-08-14 17:58:57: pid 1801: LOG: received node status change ipc message
2016-08-14 17:58:57: pid 1801: DETAIL: No heartbeat signal from node
2016-08-14 17:58:57: pid 1801: LOCATION: watchdog.c:1672
2016-08-14 17:58:57: pid 1801: LOG: remote node "Linux_mgrdb84_9999" is lost
2016-08-14 17:58:57: pid 1801: LOCATION: watchdog.c:3570
2016-08-14 18:01:12: pid 1801: LOG: new watchdog node connection is received from "1.1.1.84:60049"
2016-08-14 18:01:12: pid 1801: LOCATION: watchdog.c:2409
2016-08-14 18:01:23: pid 1801: WARNING: "Linux_mgrdb85_9999" is the coordinator as per our record but "Linux_mgrdb84_9999" is also announcing as a coordinator
2016-08-14 18:01:23: pid 1801: DETAIL: re-initializing the cluster
2016-08-14 18:01:23: pid 1801: LOCATION: watchdog.c:2862
2016-08-14 18:01:23: pid 1801: LOG: watchdog node state changed from [MASTER] to [JOINING]
2016-08-14 18:01:23: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:28: pid 1801: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-14 18:01:28: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:29: pid 1801: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
2016-08-14 18:01:29: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:34: pid 1801: LOG: successfully joined the watchdog cluster as standby node
2016-08-14 18:01:34: pid 1801: DETAIL: our join coordinator request is accepted by cluster leader node "Linux_mgrdb84_9999"
2016-08-14 18:01:34: pid 1801: LOCATION: watchdog.c:4583
2016-08-14 18:01:34: pid 1801: WARNING: we have not received a beacon message from master node "Linux_mgrdb84_9999"
2016-08-14 18:01:34: pid 1801: DETAIL: requesting info message from master node
2016-08-14 18:01:34: pid 1801: LOCATION: watchdog.c:4715
2016-08-14 18:01:37: pid 2164: LOG: informing the node status change to watchdog
2016-08-14 18:01:37: pid 2164: DETAIL: node id :1 status = "NODE ALIVE" message:"Heartbeat signal found"
2016-08-14 18:01:37: pid 2164: LOCATION: wd_lifecheck.c:509
2016-08-14 18:01:37: pid 1801: LOG: new IPC connection received
2016-08-14 18:01:37: pid 1801: LOCATION: watchdog.c:2441
2016-08-14 18:01:37: pid 1801: LOG: received node status change ipc message
2016-08-14 18:01:37: pid 1801: DETAIL: Heartbeat signal found
2016-08-14 18:01:37: pid 1801: LOCATION: watchdog.c:1672
2016-08-14 18:01:37: pid 1801: WARNING: we have not received a beacon message from master node "Linux_mgrdb84_9999"
2016-08-14 18:01:37: pid 1801: DETAIL: requesting info message from master node
2016-08-14 18:01:37: pid 1801: LOCATION: watchdog.c:4715
2016-08-14 18:01:50: pid 1801: LOG: read from socket failed with error :"Connection reset by peer"
2016-08-14 18:01:50: pid 1801: LOCATION: pool_stream.c:1191
2016-08-14 18:01:50: pid 1801: LOG: new outbond connection to 1.1.1.84:9000
2016-08-14 18:01:50: pid 1801: LOCATION: watchdog.c:2482
2016-08-14 18:01:50: pid 1801: WARNING: we have not received a beacon message from master node "Linux_mgrdb84_9999" and it has not replied to our info request
2016-08-14 18:01:50: pid 1801: DETAIL: re-initializing the cluster
2016-08-14 18:01:50: pid 1801: LOCATION: watchdog.c:4703
2016-08-14 18:01:50: pid 1801: LOG: watchdog node state changed from [STANDBY] to [JOINING]
2016-08-14 18:01:50: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:50: pid 1801: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-14 18:01:50: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:51: pid 1801: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
2016-08-14 18:01:51: pid 1801: LOCATION: watchdog.c:4778
2016-08-14 18:01:51: pid 1801: LOG: successfully joined the watchdog cluster as standby node
2016-08-14 18:01:51: pid 1801: DETAIL: our join coordinator request is accepted by cluster leader node "Linux_mgrdb84_9999"
2016-08-14 18:01:51: pid 1801: LOCATION: watchdog.c:4583