[pgpool-hackers: 2531] Re: New Feature with patch: Quorum and Consensus for backend failover

Muhammad Usama m.usama at gmail.com
Tue Sep 12 23:46:29 JST 2017


Hi Ishii-San


On Fri, Sep 8, 2017 at 5:40 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> I have tested the patch a little bit using 004 watchdog regression
> test.  After the test ends, I manually started master and standby
> Pgpool-II.
>
> 1) Stop master PostgreSQL. Since only one PostgreSQL is configured, I
> expected:
>
> psql: ERROR:  pgpool is not accepting any new connections
> DETAIL:  all backend nodes are down, pgpool requires at least one valid
> node
> HINT:  repair the backend nodes and restart pgpool
>
> but master Pgpool-II replies:
>
> psql: FATAL:  failed to create a backend connection
> DETAIL:  executing failover on backend
>
> Is this normal?
>

Finally I am able to reproduce this behaviour, and I don't think it is
normal. However, it is not caused by this patch: I get the same response
without the patch, and even without the watchdog.

This is an existing problem in the degenerate_backend_set_ex() function,
which does not take RAW mode into account while checking for valid backends.

I will take care of this issue separately.


> 2) I shut down the master node to see if the standby escalates.
>
> After shutting down the master, I see this using pcp_watchdog_info:
>
> pcp_watchdog_info -p 11105
> localhost:11100 Linux tishii-CF-SX3HE4BP localhost 11100 21104 4 MASTER
> localhost:11000 Linux tishii-CF-SX3HE4BP localhost 11000 21004 10 SHUTDOWN
>
> Seems ok but I want to confirm.
>

Yes, this seems perfect :-)

Thanks
Best regards
Muhammad Usama


> master and standby pgpool logs attached.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > On Fri, Aug 25, 2017 at 5:05 PM, Tatsuo Ishii <ishii at sraoss.co.jp>
> wrote:
> >
> >> > On Fri, Aug 25, 2017 at 12:53 PM, Tatsuo Ishii <ishii at sraoss.co.jp>
> >> wrote:
> >> >
> >> >> Usama,
> >> >>
> >> >> With the new patch, the regression tests all passed.
> >> >>
> >> >
> >> > Glad to hear that :-)
> >> > Did you have a chance to look at the node quarantine state I added?
> >> > What are your thoughts on that?
> >>
> >> I'm going to look into the patch this weekend.
> >>
> >
> > Many thanks
> >
> > Best Regards
> > Muhammad Usama
> >
> >>
> >> Best regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
> >> >> > Hi Ishii-San
> >> >> >
> >> >> > Please find the updated patch. It fixes the regression issue you were
> >> >> > facing and also another bug which I encountered during my testing.
> >> >> >
> >> >> > -- Adding Yugo to the thread,
> >> >> > Hi Yugo,
> >> >> >
> >> >> > Since you are an expert on the watchdog feature, I thought you might
> >> >> > have something to say, especially regarding the discussion points
> >> >> > mentioned in the initial mail.
> >> >> >
> >> >> >
> >> >> > Thanks
> >> >> > Best Regards
> >> >> > Muhammad Usama
> >> >> >
> >> >> >
> >> >> > On Thu, Aug 24, 2017 at 11:25 AM, Muhammad Usama <
> m.usama at gmail.com>
> >> >> wrote:
> >> >> >
> >> >> >>
> >> >> >>
> >> >> >> On Thu, Aug 24, 2017 at 4:34 AM, Tatsuo Ishii <ishii at sraoss.co.jp
> >
> >> >> wrote:
> >> >> >>
> >> >> >>> After applying the patch, many of the regression tests fail. It seems
> >> >> >>> pgpool.conf.sample has a bogus comment which causes the pgpool.conf
> >> >> >>> parser to report a parse error.
> >> >> >>>
> >> >> >>> 2017-08-24 08:22:36: pid 6017: FATAL:  syntex error in
> configuration
> >> >> file
> >> >> >>> "/home/t-ishii/work/pgpool-II/current/pgpool2/src/test/regre
> >> >> >>> ssion/tests/004.watchdog/standby/etc/pgpool.conf"
> >> >> >>> 2017-08-24 08:22:36: pid 6017: DETAIL:  parse error at line 568
> '*'
> >> >> token
> >> >> >>> = 8
> >> >> >>>
> >> >> >>
> >> >> >> Really sorry. Somehow I overlooked the sample config file changes I
> >> >> >> made at the last minute.
> >> >> >> I will send you the updated version.
> >> >> >>
> >> >> >> Thanks
> >> >> >> Best Regards
> >> >> >> Muhammad Usama
> >> >> >>
> >> >> >>>
> >> >> >>> Best regards,
> >> >> >>> --
> >> >> >>> Tatsuo Ishii
> >> >> >>> SRA OSS, Inc. Japan
> >> >> >>> English: http://www.sraoss.co.jp/index_en.php
> >> >> >>> Japanese:http://www.sraoss.co.jp
> >> >> >>>
> >> >> >>> > Usama,
> >> >> >>> >
> >> >> >>> > Thanks for the patch. I am going to review it.
> >> >> >>> >
> >> >> >>> > In the meantime, when I applied your patch, I got some trailing
> >> >> >>> > whitespace errors. Can you please fix them?
> >> >> >>> >
> >> >> >>> > /home/t-ishii/quorum_aware_failover.diff:470: trailing
> >> whitespace.
> >> >> >>> >
> >> >> >>> > /home/t-ishii/quorum_aware_failover.diff:485: trailing
> >> whitespace.
> >> >> >>> >
> >> >> >>> > /home/t-ishii/quorum_aware_failover.diff:564: trailing
> >> whitespace.
> >> >> >>> >
> >> >> >>> > /home/t-ishii/quorum_aware_failover.diff:1428: trailing
> >> whitespace.
> >> >> >>> >
> >> >> >>> > /home/t-ishii/quorum_aware_failover.diff:1450: trailing
> >> whitespace.
> >> >> >>> >
> >> >> >>> > warning: squelched 3 whitespace errors
> >> >> >>> > warning: 8 lines add whitespace errors.
> >> >> >>> >
> >> >> >>> > Best regards,
> >> >> >>> > --
> >> >> >>> > Tatsuo Ishii
> >> >> >>> > SRA OSS, Inc. Japan
> >> >> >>> > English: http://www.sraoss.co.jp/index_en.php
> >> >> >>> > Japanese:http://www.sraoss.co.jp
> >> >> >>> >
> >> >> >>> >> Hi
> >> >> >>> >>
> >> >> >>> >> I was working on a new feature to make the backend node failover
> >> >> >>> >> quorum aware, and halfway through the implementation I also added
> >> >> >>> >> the majority consensus feature for the same.
> >> >> >>> >>
> >> >> >>> >> So please find the first version of the patch for review. It makes
> >> >> >>> >> the backend node failover consider the watchdog cluster quorum
> >> >> >>> >> status and seek majority consensus before performing a failover.
> >> >> >>> >>
> >> >> >>> >> *Changes in the failover mechanism with watchdog.*
> >> >> >>> >> For this new feature I have modified Pgpool-II's existing failover
> >> >> >>> >> mechanism with watchdog.
> >> >> >>> >> Previously, as you know, when Pgpool-II needed to perform a node
> >> >> >>> >> operation (failover, failback, promote-node) with the watchdog,
> >> >> >>> >> the watchdog propagated the failover request to all the Pgpool-II
> >> >> >>> >> nodes in the watchdog cluster, and as soon as the request was
> >> >> >>> >> received by a node, that node initiated the local failover, which
> >> >> >>> >> was synchronised on all nodes using the distributed locks.
> >> >> >>> >>
> >> >> >>> >> *Now only the master node performs the failover.*
> >> >> >>> >> The attached patch changes this mechanism of synchronised failover:
> >> >> >>> >> now only the Pgpool-II of the master watchdog node performs the
> >> >> >>> >> failover, and all other standby nodes sync the backend statuses
> >> >> >>> >> after the master Pgpool-II has finished the failover.
> >> >> >>> >>
> >> >> >>> >> *Overview of the new failover mechanism.*
> >> >> >>> >> -- If a failover request is received by a standby watchdog node
> >> >> >>> >> (from the local Pgpool-II), that request is forwarded to the master
> >> >> >>> >> watchdog and the Pgpool-II main process gets the
> >> >> >>> >> FAILOVER_RES_WILL_BE_DONE return code. Upon receiving
> >> >> >>> >> FAILOVER_RES_WILL_BE_DONE from the watchdog for the failover
> >> >> >>> >> request, the requesting Pgpool-II moves forward without doing
> >> >> >>> >> anything further for that particular failover command.
> >> >> >>> >>
> >> >> >>> >> -- When the failover request from a standby node is received by the
> >> >> >>> >> master watchdog, it performs the validation, applies the consensus
> >> >> >>> >> rules, and then triggers the failover request on its local
> >> >> >>> >> Pgpool-II.
> >> >> >>> >>
> >> >> >>> >> -- When the failover request is received by the master watchdog
> >> >> >>> >> node from the local Pgpool-II (on the IPC channel), the watchdog
> >> >> >>> >> process informs the requesting Pgpool-II process to proceed with
> >> >> >>> >> the failover (provided all failover rules are satisfied).
> >> >> >>> >>
> >> >> >>> >> -- After the failover is finished on the master Pgpool-II, the
> >> >> >>> >> failover function calls *wd_failover_end*(), which sends the
> >> >> >>> >> "backend sync required" message to all standby watchdogs.
> >> >> >>> >>
> >> >> >>> >> -- Upon receiving the sync required message from the master
> >> >> >>> >> watchdog node, each Pgpool-II syncs the new status of every backend
> >> >> >>> >> node from the master watchdog.
> >> >> >>> >>
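To make the division of labour concrete, here is a very rough sketch of what
a watchdog node does with an incoming failover request. Everything in it is
illustrative placeholder code of mine, except FAILOVER_RES_WILL_BE_DONE and
wd_failover_end(), which are the names used in the description above:

    /* Placeholder sketch -- not the actual watchdog code. */
    typedef enum
    {
        FAILOVER_RES_WILL_BE_DONE,  /* master will handle it; caller does nothing more */
        FAILOVER_RES_PROCEED        /* illustrative: local Pgpool-II may proceed */
    } failover_result;

    static failover_result
    on_failover_request(int i_am_master)
    {
        if (!i_am_master)
        {
            /* forward the request to the master watchdog (placeholder step) */
            return FAILOVER_RES_WILL_BE_DONE;
        }

        /* Master node: validate the request, apply the quorum/consensus
         * rules, and trigger the failover on the local Pgpool-II.  After the
         * failover finishes, wd_failover_end() tells the standbys to sync
         * their backend statuses. */
        return FAILOVER_RES_PROCEED;
    }
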
> >> >> >>> >> *No More Failover locks*
> >> >> >>> >> Since with this new failover mechanism we no longer need any
> >> >> >>> >> synchronisation or guards against the execution of
> >> >> >>> >> failover_commands by multiple Pgpool-II nodes, the patch removes
> >> >> >>> >> all the distributed locks from the failover function. This makes
> >> >> >>> >> the failover simpler and faster.
> >> >> >>> >>
> >> >> >>> >> *New kind of failover operation: NODE_QUARANTINE_REQUEST*
> >> >> >>> >> The patch adds a new kind of backend node operation,
> >> >> >>> >> NODE_QUARANTINE, which is effectively the same as NODE_DOWN,
> >> >> >>> >> except that with node_quarantine the failover_command is not
> >> >> >>> >> triggered.
> >> >> >>> >> A NODE_DOWN_REQUEST is automatically converted to a
> >> >> >>> >> NODE_QUARANTINE_REQUEST when the failover is requested on the
> >> >> >>> >> backend node but the watchdog cluster does not hold the quorum.
> >> >> >>> >> This means that in the absence of quorum the failed backend nodes
> >> >> >>> >> are quarantined, and when the quorum becomes available again
> >> >> >>> >> Pgpool-II performs the failback operation on all quarantined nodes.
> >> >> >>> >> Likewise, when the failback is performed on a quarantined backend
> >> >> >>> >> node, the failover function does not trigger the failback_command.
> >> >> >>> >>
> >> >> >>> >> *Controlling the failover behaviour.*
> >> >> >>> >> The patch adds three new configuration parameters to configure the
> >> >> >>> >> failover behaviour from the user side.
> >> >> >>> >>
> >> >> >>> >> *failover_when_quorum_exists*
> >> >> >>> >> When enabled, the failover command will only be executed when the
> >> >> >>> >> watchdog cluster holds the quorum. When the quorum is absent and
> >> >> >>> >> failover_when_quorum_exists is enabled, the failed backend nodes
> >> >> >>> >> get quarantined until the quorum becomes available again.
> >> >> >>> >> Disabling it restores the old behaviour of failover commands.
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> *failover_require_consensus*
> >> >> >>> >> This new configuration parameter can be used to make sure we get a
> >> >> >>> >> majority vote before performing the failover on a node. When
> >> >> >>> >> *failover_require_consensus* is enabled, the failover is only
> >> >> >>> >> performed after receiving the failover request from the majority
> >> >> >>> >> of Pgpool-II nodes.
> >> >> >>> >> For example, in a three-node cluster the failover will not be
> >> >> >>> >> performed until at least two nodes ask for performing the failover
> >> >> >>> >> on the particular backend node.
> >> >> >>> >>
> >> >> >>> >> It is also worthwhile to mention here that
> >> >> >>> >> *failover_require_consensus* only works when
> >> >> >>> >> failover_when_quorum_exists is enabled.
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> *enable_multiple_failover_requests_from_node*
> >> >> >>> >> This parameter works in connection with the
> >> >> >>> >> *failover_require_consensus* configuration. When enabled, a single
> >> >> >>> >> Pgpool-II node can vote for a failover multiple times.
> >> >> >>> >> For example, in the three-node cluster, if one Pgpool-II node
> >> >> >>> >> sends the failover request for a particular node twice, that is
> >> >> >>> >> counted as two votes in favour of the failover, and the failover
> >> >> >>> >> will be performed even if we do not get a vote from the other two
> >> >> >>> >> nodes.
> >> >> >>> >>
> >> >> >>> >> When *enable_multiple_failover_requests_from_node* is disabled,
> >> >> >>> >> only the first vote from each Pgpool-II is accepted and all
> >> >> >>> >> subsequent votes are marked duplicate and rejected.
> >> >> >>> >> So in that case we require majority votes from distinct nodes to
> >> >> >>> >> execute the failover.
> >> >> >>> >> Again, *enable_multiple_failover_requests_from_node* only becomes
> >> >> >>> >> effective when both *failover_when_quorum_exists* and
> >> >> >>> >> *failover_require_consensus* are enabled.
> >> >> >>> >>
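For reference, with the patch applied the three parameters above would sit in
pgpool.conf along these lines. The values here are only an example of a
"quorum plus strict consensus" setup, not the defaults (which are still an
open question, see the discussion points below):

    failover_when_quorum_exists = on
    failover_require_consensus = on
    enable_multiple_failover_requests_from_node = off

And the resulting decision for a node-down request can be summarised with
the following rough, self-contained C sketch. The struct, enum and helper
names are placeholders of mine, not the actual patch code; only the
parameter semantics follow the description above:

    /* Illustrative sketch only -- not the actual pgpool-II code. */
    typedef struct
    {
        int quorum_exists;      /* does the watchdog cluster hold the quorum? */
        int votes_for_failover; /* distinct votes received for this backend */
        int majority;           /* e.g. (number_of_watchdog_nodes / 2) + 1 */
    } wd_cluster_view;

    typedef enum
    {
        DO_FAILOVER,            /* execute failover_command on the master node */
        QUARANTINE_NODE,        /* mark the node down, but skip failover_command */
        WAIT_FOR_MORE_VOTES     /* keep the request pending until consensus */
    } wd_failover_decision;

    static wd_failover_decision
    decide_node_down(const wd_cluster_view *v,
                     int failover_when_quorum_exists,
                     int failover_require_consensus)
    {
        if (!failover_when_quorum_exists)
            return DO_FAILOVER;         /* old behaviour: no quorum check at all */

        if (!v->quorum_exists)
            return QUARANTINE_NODE;     /* NODE_DOWN becomes NODE_QUARANTINE */

        if (failover_require_consensus && v->votes_for_failover < v->majority)
            return WAIT_FOR_MORE_VOTES; /* not enough votes yet */

        return DO_FAILOVER;             /* quorum held and consensus reached */
    }
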
> >> >> >>> >>
> >> >> >>> >> *Controlling the failover: the coding perspective.*
> >> >> >>> >> Although the failover functions are made quorum and consensus
> >> >> >>> >> aware, there is still a way to bypass the quorum conditions and
> >> >> >>> >> the requirement of consensus.
> >> >> >>> >>
> >> >> >>> >> For this the patch uses the existing request_details flags in
> >> >> >>> >> POOL_REQUEST_NODE to control the behaviour of the failover.
> >> >> >>> >>
> >> >> >>> >> Here are the newly added flag values.
> >> >> >>> >>
> >> >> >>> >> *REQ_DETAIL_WATCHDOG*:
> >> >> >>> >> Setting this flag while issuing the failover command will not send
> >> >> >>> >> the failover request to the watchdog. This flag may not be useful
> >> >> >>> >> in any place other than where it is already used: mostly it keeps
> >> >> >>> >> a failover command that already originated from the watchdog from
> >> >> >>> >> going back to the watchdog, otherwise we could end up in an
> >> >> >>> >> infinite loop.
> >> >> >>> >>
> >> >> >>> >> *REQ_DETAIL_CONFIRMED*:
> >> >> >>> >> Setting this flag will bypass the *failover_require_consensus*
> >> >> >>> >> configuration and immediately perform the failover if quorum is
> >> >> >>> >> present. This flag can be used for failover requests originated
> >> >> >>> >> from PCP commands.
> >> >> >>> >>
> >> >> >>> >> *REQ_DETAIL_UPDATE*:
> >> >> >>> >> This flag is used for the command where we are failing back the
> >> >> >>> >> quarantined nodes. Setting this flag will not trigger the
> >> >> >>> >> failback_command.
> >> >> >>> >>
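The intent of these flags, expressed as code, is roughly the following. This
is an illustration only: the numeric bit values and the helper function are
made up here; only the flag names and their meanings come from the
description above:

    /* Illustrative bit values -- the real ones live in the patch. */
    #define REQ_DETAIL_WATCHDOG   0x01  /* don't forward this request to the watchdog */
    #define REQ_DETAIL_CONFIRMED  0x02  /* skip failover_require_consensus voting */
    #define REQ_DETAIL_UPDATE     0x04  /* failback of a quarantined node; no failback_command */

    /* e.g. a PCP-originated "degenerate backend" request can be marked as
     * confirmed so it does not wait for votes from other Pgpool-II nodes. */
    static unsigned char
    pcp_degenerate_request_details(void)
    {
        return REQ_DETAIL_CONFIRMED;
    }
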
> >> >> >>> >> *Some conditional flags used:*
> >> >> >>> >> I was not sure about the configuration of each type of failover
> >> >> >>> >> operation. We have three main failover operations:
> >> >> >>> >> NODE_UP_REQUEST, NODE_DOWN_REQUEST, and PROMOTE_NODE_REQUEST.
> >> >> >>> >> So I was wondering whether we need to give users a configuration
> >> >> >>> >> option to enable/disable quorum checking and consensus for each
> >> >> >>> >> individual failover operation type.
> >> >> >>> >> For example: is it a practical configuration where a user would
> >> >> >>> >> want to ensure quorum while performing a NODE_DOWN operation but
> >> >> >>> >> does not want it for NODE_UP?
> >> >> >>> >> So in this patch I use three compile-time defines to enable or
> >> >> >>> >> disable this for each individual failover operation, until we
> >> >> >>> >> decide on the best solution.
> >> >> >>> >>
> >> >> >>> >> NODE_UP_REQUIRE_CONSENSUS: defining it enables the quorum checking
> >> >> >>> >> feature for NODE_UP_REQUESTs
> >> >> >>> >>
> >> >> >>> >> NODE_DOWN_REQUIRE_CONSENSUS: defining it enables the quorum
> >> >> >>> >> checking feature for NODE_DOWN_REQUESTs
> >> >> >>> >>
> >> >> >>> >> NODE_PROMOTE_REQUIRE_CONSENSUS: defining it enables the quorum
> >> >> >>> >> checking feature for PROMOTE_NODE_REQUESTs
> >> >> >>> >>
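In code, each request type is simply gated by a compile-time define along
these lines (an illustration of the approach rather than the literal patch
hunk):

    /* Illustration only: how a compile-time define can gate the consensus path. */
    static int
    node_down_needs_consensus(void)
    {
    #ifdef NODE_DOWN_REQUIRE_CONSENSUS
        return 1;   /* NODE_DOWN_REQUESTs go through the quorum/consensus checks */
    #else
        return 0;   /* old behaviour: the failover is triggered unconditionally */
    #endif
    }
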
> >> >> >>> >> *Some points for discussion:*
> >> >> >>> >>
> >> >> >>> >> *Do we really need to check the Req_info->switching flag before
> >> >> >>> >> enqueuing a failover request?*
> >> >> >>> >> While working on the patch I was wondering why we disallow
> >> >> >>> >> enqueuing the failover command when a failover is already in
> >> >> >>> >> progress. For example, in the *pcp_process_command*() function, if
> >> >> >>> >> we see the *Req_info->switching* flag set we bail out with an
> >> >> >>> >> error instead of enqueuing the command. Is that really necessary?
> >> >> >>> >>
> >> >> >>> >> *Do we need more granular control over each failover operation?*
> >> >> >>> >> As described in the section "Some conditional flags used", I would
> >> >> >>> >> like opinions on whether we need configuration parameters in
> >> >> >>> >> pgpool.conf to enable/disable quorum and consensus checking for
> >> >> >>> >> individual failover types.
> >> >> >>> >>
> >> >> >>> >> *Which failovers should be marked as confirmed?*
> >> >> >>> >> As defined in the REQ_DETAIL_CONFIRMED section above, we can mark
> >> >> >>> >> a failover request as not needing consensus; currently the
> >> >> >>> >> requests from the PCP commands are fired with this flag. But I was
> >> >> >>> >> wondering whether there are more places where we may need to use
> >> >> >>> >> the flag.
> >> >> >>> >> For example, I currently use the same confirmed flag when failover
> >> >> >>> >> is triggered because of *replication_stop_on_mismatch*.
> >> >> >>> >>
> >> >> >>> >> I think we should consider this flag for each place a failover is
> >> >> >>> >> triggered, e.g. when it is triggered
> >> >> >>> >> because of a health_check failure,
> >> >> >>> >> because of a replication mismatch,
> >> >> >>> >> because of a backend error,
> >> >> >>> >> etc.
> >> >> >>> >>
> >> >> >>> >> *Node quarantine behaviour.*
> >> >> >>> >> What do you think about the node quarantine used by this patch?
> >> >> >>> >> Can you think of some problem that could be caused by it?
> >> >> >>> >>
> >> >> >>> >> *What should be the default values for each newly added config
> >> >> >>> >> parameter?*
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> *TODOs*
> >> >> >>> >>
> >> >> >>> >> -- Updating the documentation is still to do. I will do that once
> >> >> >>> >> every aspect of the feature is finalised.
> >> >> >>> >> -- Some code warnings and cleanups are still not done.
> >> >> >>> >> -- I am still a little short on testing.
> >> >> >>> >> -- Regression test cases for the feature are still needed.
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> Thoughts and suggestions are most welcome.
> >> >> >>> >>
> >> >> >>> >> Thanks
> >> >> >>> >> Best regards
> >> >> >>> >> Muhammad Usama
> >> >> >>> > _______________________________________________
> >> >> >>> > pgpool-hackers mailing list
> >> >> >>> > pgpool-hackers at pgpool.net
> >> >> >>> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >>
> >>
>
> 2017-09-08 09:29:29: pid 4434: LOG:  Backend status file
> /home/t-ishii/work/pgpool-II/current/pgpool2/src/test/
> regression/tests/004.watchdog/master/log/pgpool_status discarded
> 2017-09-08 09:29:29: pid 4434: LOG:  waiting for watchdog to initialize
> 2017-09-08 09:29:29: pid 4436: LOG:  setting the local watchdog node name
> to "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:29: pid 4436: LOG:  watchdog cluster is configured with 1
> remote nodes
> 2017-09-08 09:29:29: pid 4436: LOG:  watchdog remote node:0 on
> localhost:21104
> 2017-09-08 09:29:29: pid 4436: LOG:  interface monitoring is disabled in
> watchdog
> 2017-09-08 09:29:29: pid 4436: LOG:  watchdog node state changed from
> [DEAD] to [LOADING]
> 2017-09-08 09:29:34: pid 4436: LOG:  watchdog node state changed from
> [LOADING] to [JOINING]
> 2017-09-08 09:29:38: pid 4436: LOG:  watchdog node state changed from
> [JOINING] to [INITIALIZING]
> 2017-09-08 09:29:39: pid 4436: LOG:  I am the only alive node in the
> watchdog cluster
> 2017-09-08 09:29:39: pid 4436: HINT:  skiping stand for coordinator state
> 2017-09-08 09:29:39: pid 4436: LOG:  watchdog node state changed from
> [INITIALIZING] to [MASTER]
> 2017-09-08 09:29:39: pid 4436: LOG:  I am announcing my self as
> master/coordinator watchdog node
> 2017-09-08 09:29:41: pid 4436: LOG:  new watchdog node connection is
> received from "127.0.0.1:33967"
> 2017-09-08 09:29:41: pid 4436: LOG:  new node joined the cluster
> hostname:"localhost" port:21104 pgpool_port:11100
> 2017-09-08 09:29:42: pid 4436: LOG:  adding watchdog node "localhost:11100
> Linux tishii-CF-SX3HE4BP" to the standby list
> 2017-09-08 09:29:42: pid 4436: LOG:  I am the cluster leader node
> 2017-09-08 09:29:42: pid 4436: DETAIL:  our declare coordinator message is
> accepted by all nodes
> 2017-09-08 09:29:42: pid 4436: LOG:  setting the local node
> "localhost:11000 Linux tishii-CF-SX3HE4BP" as watchdog cluster master
> 2017-09-08 09:29:42: pid 4436: LOG:  I am the cluster leader node.
> Starting escalation process
> 2017-09-08 09:29:42: pid 4436: LOG:  escalation process started with
> PID:4444
> 2017-09-08 09:29:42: pid 4434: LOG:  watchdog process is initialized
> 2017-09-08 09:29:42: pid 4444: LOG:  watchdog: escalation started
> 2017-09-08 09:29:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:29:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:29:42: pid 4446: LOG:  2 watchdog nodes are configured for
> lifecheck
> 2017-09-08 09:29:42: pid 4446: LOG:  watchdog nodes ID:0
> Name:"localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4446: DETAIL:  Host:"localhost" WD Port:21004
> pgpool-II port:11000
> 2017-09-08 09:29:42: pid 4446: LOG:  watchdog nodes ID:1
> Name:"localhost:11100 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4446: DETAIL:  Host:"localhost" WD Port:21104
> pgpool-II port:11100
> 2017-09-08 09:29:42: pid 4434: LOG:  Setting up socket for 127.0.0.1:11000
> 2017-09-08 09:29:42: pid 4436: LOG:  watchdog escalation process with pid:
> 4444 exit with SUCCESS.
> 2017-09-08 09:29:42: pid 4434: LOG:  pgpool-II successfully started.
> version 3.7devel (amefuriboshi)
> 2017-09-08 09:29:43: pid 4448: LOG:  set SO_REUSEPORT option to the socket
> 2017-09-08 09:29:43: pid 4448: LOG:  creating socket for sending heartbeat
> 2017-09-08 09:29:43: pid 4448: DETAIL:  set SO_REUSEPORT
> 2017-09-08 09:29:43: pid 4447: LOG:  set SO_REUSEPORT option to the socket
> 2017-09-08 09:29:43: pid 4447: LOG:  creating watchdog heartbeat receive
> socket.
> 2017-09-08 09:29:43: pid 4447: DETAIL:  set SO_REUSEPORT
> 2017-09-08 09:29:45: pid 4436: LOG:  new outbond connection to
> localhost:21104
> 2017-09-08 09:29:52: pid 4446: LOG:  watchdog: lifecheck started
> 2017-09-08 09:31:42: pid 4498: LOG:  failed to connect to PostgreSQL
> server by unix domain socket
> 2017-09-08 09:31:42: pid 4498: DETAIL:  connect to "/tmp/.s.PGSQL.11002"
> failed with error "No such file or directory"
> 2017-09-08 09:31:42: pid 4498: ERROR:  failed to make persistent db
> connection
> 2017-09-08 09:31:42: pid 4498: DETAIL:  connection to host:"/tmp:11002"
> failed
> 2017-09-08 09:31:42: pid 4498: LOG:  health check failed on node 0
> (timeout:0)
> 2017-09-08 09:31:42: pid 4498: LOG:  received degenerate backend request
> for node_id: 0 from pid [4498]
> 2017-09-08 09:31:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog received the failover
> command from local pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is processing the failover
> command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC
> interface
> 2017-09-08 09:31:42: pid 4436: LOG:  we do not require majority votes to
> proceed with failover
> 2017-09-08 09:31:42: pid 4436: DETAIL:  proceeding with the failover
> 2017-09-08 09:31:42: pid 4436: HINT:  failover_require_consensus is set to
> false
> 2017-09-08 09:31:42: pid 4434: LOG:  Pgpool-II parent process has received
> failover request
> 2017-09-08 09:31:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is informed of failover end
> 2017-09-08 09:31:42: pid 4434: LOG:  starting degeneration. shutdown host
> /tmp(11002)
> 2017-09-08 09:31:42: pid 4434: WARNING:  All the DB nodes are in down
> status and skip writing status file.
> 2017-09-08 09:31:42: pid 4434: LOG:  failover: no valid backends node found
> 2017-09-08 09:31:42: pid 4434: LOG:  Restart all children
> 2017-09-08 09:31:42: pid 4434: LOG:  failover: set new primary node: -1
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog received the failover
> command from remote pgpool-II node "localhost:11100 Linux
> tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is processing the failover
> command [DEGENERATE_BACKEND_REQUEST] received from localhost:11100 Linux
> tishii-CF-SX3HE4BP
> 2017-09-08 09:31:42: pid 4436: LOG:  we do not require majority votes to
> proceed with failover
> 2017-09-08 09:31:42: pid 4436: DETAIL:  proceeding with the failover
> 2017-09-08 09:31:42: pid 4436: HINT:  failover_require_consensus is set to
> false
> 2017-09-08 09:31:42: pid 4436: LOG:  received degenerate backend request
> for node_id: 0 from pid [4436]
> 2017-09-08 09:31:42: pid 4497: LOG:  worker process received restart
> request
> 2017-09-08 09:31:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is informed of failover start
> failover done. shutdown host /tmp(11002)2017-09-08 09:31:42: pid 4434:
> LOG:  failover done. shutdown host /tmp(11002)
> 2017-09-08 09:31:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is informed of failover end
> 2017-09-08 09:31:42: pid 4434: LOG:  starting degeneration. shutdown host
> /tmp(11002)
> 2017-09-08 09:31:42: pid 4434: WARNING:  All the DB nodes are in down
> status and skip writing status file.
> 2017-09-08 09:31:42: pid 4434: LOG:  failover: no valid backends node found
> 2017-09-08 09:31:42: pid 4434: LOG:  Restart all children
> 2017-09-08 09:31:42: pid 4434: LOG:  failover: set new primary node: -1
> 2017-09-08 09:31:42: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4436: LOG:  watchdog is informed of failover start
> failover done. shutdown host /tmp(11002)2017-09-08 09:31:42: pid 4434:
> LOG:  failover done. shutdown host /tmp(11002)
> 2017-09-08 09:31:43: pid 4670: LOG:  failed to connect to PostgreSQL
> server by unix domain socket
> 2017-09-08 09:31:43: pid 4670: DETAIL:  connect to "/tmp/.s.PGSQL.11002"
> failed with error "No such file or directory"
> 2017-09-08 09:31:43: pid 4670: LOG:  received degenerate backend request
> for node_id: 0 from pid [4670]
> 2017-09-08 09:31:43: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:43: pid 4436: LOG:  watchdog received the failover
> command from local pgpool-II on IPC interface
> 2017-09-08 09:31:43: pid 4436: LOG:  watchdog is processing the failover
> command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC
> interface
> 2017-09-08 09:31:43: pid 4436: LOG:  Duplicate failover request from
> "localhost:11000 Linux tishii-CF-SX3HE4BP" node
> 2017-09-08 09:31:43: pid 4436: DETAIL:  request ignored
> 2017-09-08 09:31:43: pid 4436: LOG:  we do not require majority votes to
> proceed with failover
> 2017-09-08 09:31:43: pid 4436: DETAIL:  proceeding with the failover
> 2017-09-08 09:31:43: pid 4436: HINT:  failover_require_consensus is set to
> false
> 2017-09-08 09:31:43: pid 4670: FATAL:  failed to create a backend
> connection
> 2017-09-08 09:31:43: pid 4670: DETAIL:  executing failover on backend
> 2017-09-08 09:31:43: pid 4496: LOG:  restart request received in pcp child
> process
> 2017-09-08 09:31:43: pid 4434: LOG:  PCP child 4496 exits with status 0 in
> failover()
> 2017-09-08 09:31:43: pid 4434: LOG:  fork a new PCP child pid 4676 in
> failover()
> 2017-09-08 09:31:43: pid 4434: LOG:  Pgpool-II parent process has received
> failover request
> 2017-09-08 09:31:43: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:43: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:43: pid 4436: LOG:  watchdog is informed of failover end
> 2017-09-08 09:31:43: pid 4434: LOG:  starting degeneration. shutdown host
> /tmp(11002)
> 2017-09-08 09:31:43: pid 4434: WARNING:  All the DB nodes are in down
> status and skip writing status file.
> 2017-09-08 09:31:43: pid 4434: LOG:  failover: no valid backends node found
> 2017-09-08 09:31:43: pid 4434: LOG:  Restart all children
> 2017-09-08 09:31:43: pid 4434: LOG:  failover: set new primary node: -1
> 2017-09-08 09:31:43: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:43: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:43: pid 4436: LOG:  watchdog is informed of failover start
> failover done. shutdown host /tmp(11002)2017-09-08 09:31:43: pid 4434:
> LOG:  failover done. shutdown host /tmp(11002)
> 2017-09-08 09:31:44: pid 4676: LOG:  restart request received in pcp child
> process
> 2017-09-08 09:31:44: pid 4434: LOG:  PCP child 4676 exits with status 0 in
> failover()
> 2017-09-08 09:31:44: pid 4434: LOG:  fork a new PCP child pid 4709 in
> failover()
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4449 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4449 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4450 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4450 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4452 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4452 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4454 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4454 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4456 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4456 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4459 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4459 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4460 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4460 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4462 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4462 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4465 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4465 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4467 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4467 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4469 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4469 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4471 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4471 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4473 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4473 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4475 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4475 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4476 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4476 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4477 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4477 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4479 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4479 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4481 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4481 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4482 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4482 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4483 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4483 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4484 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4484 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4485 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4485 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4486 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4486 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4487 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4487 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4488 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4488 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4489 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4489 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4490 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4490 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4491 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4491 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4492 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4492 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4493 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4493 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4494 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4494 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4495 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4495 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  worker child process with pid: 4497
> exits with status 256
> 2017-09-08 09:31:44: pid 4434: LOG:  fork a new worker child process with
> pid: 4710
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4578 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4578 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4579 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4579 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4580 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4580 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4581 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4581 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4582 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4582 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4583 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4583 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4584 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4584 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4585 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4585 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4586 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4586 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4587 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4587 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4588 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4588 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4589 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4589 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4590 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4590 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4591 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4591 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4592 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4592 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4593 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4593 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4594 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4594 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4595 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4595 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4596 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4596 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4597 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4597 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4598 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4598 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4599 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4599 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4600 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4600 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4601 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4601 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4614 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4614 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4616 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4616 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4618 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4618 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4619 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4619 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4620 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4620 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4621 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4621 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4623 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4623 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4624 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4624 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4642 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4642 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4643 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4643 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4644 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4644 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4645 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4645 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4646 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4646 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4647 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4647 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4648 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4648 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4649 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4649 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4650 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4650 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4651 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4651 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4652 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4652 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4653 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4653 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4654 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4654 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4655 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4655 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4656 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4656 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4657 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4657 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4658 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4658 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4659 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4659 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4660 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4660 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4661 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4661 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4662 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4662 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4663 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4663 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4664 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4664 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4665 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4665 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4666 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4666 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4668 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4668 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4669 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4669 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4670 exits
> with status 256
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4670 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4671 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4671 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4672 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4672 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4673 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4673 exited
> with success and will not be restarted
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4674 exits
> with status 0
> 2017-09-08 09:31:44: pid 4434: LOG:  child process with pid: 4674 exited
> with success and will not be restarted
> 2017-09-08 09:31:46: pid 4706: LOG:  failed to connect to PostgreSQL
> server by unix domain socket
> 2017-09-08 09:31:46: pid 4706: DETAIL:  connect to "/tmp/.s.PGSQL.11002"
> failed with error "No such file or directory"
> 2017-09-08 09:31:46: pid 4706: LOG:  received degenerate backend request
> for node_id: 0 from pid [4706]
> 2017-09-08 09:31:46: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:46: pid 4436: LOG:  watchdog received the failover
> command from local pgpool-II on IPC interface
> 2017-09-08 09:31:46: pid 4436: LOG:  watchdog is processing the failover
> command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC
> interface
> 2017-09-08 09:31:46: pid 4436: LOG:  Duplicate failover request from
> "localhost:11000 Linux tishii-CF-SX3HE4BP" node
> 2017-09-08 09:31:46: pid 4436: DETAIL:  request ignored
> 2017-09-08 09:31:46: pid 4436: LOG:  we do not require majority votes to
> proceed with failover
> 2017-09-08 09:31:46: pid 4436: DETAIL:  proceeding with the failover
> 2017-09-08 09:31:46: pid 4436: HINT:  failover_require_consensus is set to
> false
> 2017-09-08 09:31:46: pid 4706: FATAL:  failed to create a backend
> connection
> 2017-09-08 09:31:46: pid 4706: DETAIL:  executing failover on backend
> 2017-09-08 09:31:46: pid 4434: LOG:  Pgpool-II parent process has received
> failover request
> 2017-09-08 09:31:46: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:46: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:46: pid 4436: LOG:  watchdog is informed of failover end
> 2017-09-08 09:31:46: pid 4434: LOG:  starting degeneration. shutdown host
> /tmp(11002)
> 2017-09-08 09:31:46: pid 4434: WARNING:  All the DB nodes are in down
> status and skip writing status file.
> 2017-09-08 09:31:46: pid 4434: LOG:  failover: no valid backends node found
> 2017-09-08 09:31:46: pid 4434: LOG:  Restart all children
> 2017-09-08 09:31:46: pid 4434: LOG:  failover: set new primary node: -1
> 2017-09-08 09:31:46: pid 4710: LOG:  worker process received restart
> request
> 2017-09-08 09:31:46: pid 4436: LOG:  new IPC connection received
> 2017-09-08 09:31:46: pid 4436: LOG:  received the failover indication from
> Pgpool-II on IPC interface
> 2017-09-08 09:31:46: pid 4436: LOG:  watchdog is informed of failover start
> failover done. shutdown host /tmp(11002)2017-09-08 09:31:46: pid 4434:
> LOG:  failover done. shutdown host /tmp(11002)
> 2017-09-08 09:31:47: pid 4709: LOG:  restart request received in pcp child
> process
> 2017-09-08 09:31:47: pid 4434: LOG:  PCP child 4709 exits with status 0 in
> failover()
> 2017-09-08 09:31:47: pid 4434: LOG:  fork a new PCP child pid 4744 in
> failover()
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4677 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4677 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4678 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4678 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4679 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4679 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4680 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4680 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4681 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4681 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4682 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4682 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4683 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4683 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4684 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4684 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4685 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4685 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4686 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4686 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4687 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4687 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4688 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4688 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4689 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4689 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4690 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4690 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4691 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4691 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4692 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4692 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4693 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4693 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4694 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4694 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4695 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4695 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4696 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4696 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4697 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4697 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4698 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4698 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4699 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4699 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4700 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4700 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4701 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4701 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4702 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4702 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4703 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4703 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4704 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4704 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4705 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4705 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4706 exits
> with status 256
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4706 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4707 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4707 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4708 exits
> with status 0
> 2017-09-08 09:31:47: pid 4434: LOG:  child process with pid: 4708 exited
> with success and will not be restarted
> 2017-09-08 09:31:47: pid 4434: LOG:  worker child process with pid: 4710
> exits with status 256
> 2017-09-08 09:31:47: pid 4434: LOG:  fork a new worker child process with
> pid: 4745
> 2017-09-08 09:33:10: pid 4434: LOG:  received fast shutdown request
> 2017-09-08 09:33:10: pid 4434: LOG:  shutdown request. closing listen
> socket
> 2017-09-08 09:33:10: pid 4436: LOG:  Watchdog is shutting down
> 2017-09-08 09:33:10: pid 4766: LOG:  watchdog: de-escalation started
> 2017-09-08 09:33:10: pid 4434: WARNING:  All the DB nodes are in down
> status and skip writing status file.
>
> 2017-09-08 09:29:41: pid 4441: LOG:  Backend status file
> /home/t-ishii/work/pgpool-II/current/pgpool2/src/test/
> regression/tests/004.watchdog/master/log/pgpool_status does not exist
> 2017-09-08 09:29:41: pid 4441: LOG:  waiting for watchdog to initialize
> 2017-09-08 09:29:41: pid 4443: LOG:  setting the local watchdog node name
> to "localhost:11100 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:41: pid 4443: LOG:  watchdog cluster is configured with 1
> remote nodes
> 2017-09-08 09:29:41: pid 4443: LOG:  watchdog remote node:0 on
> localhost:21004
> 2017-09-08 09:29:41: pid 4443: LOG:  interface monitoring is disabled in
> watchdog
> 2017-09-08 09:29:41: pid 4443: LOG:  watchdog node state changed from
> [DEAD] to [LOADING]
> 2017-09-08 09:29:41: pid 4443: LOG:  new outbond connection to
> localhost:21004
> 2017-09-08 09:29:41: pid 4443: LOG:  setting the remote node
> "localhost:11000 Linux tishii-CF-SX3HE4BP" as watchdog cluster master
> 2017-09-08 09:29:41: pid 4443: LOG:  watchdog node state changed from
> [LOADING] to [INITIALIZING]
> 2017-09-08 09:29:42: pid 4443: LOG:  watchdog node state changed from
> [INITIALIZING] to [STANDBY]
> 2017-09-08 09:29:42: pid 4443: LOG:  successfully joined the watchdog
> cluster as standby node
> 2017-09-08 09:29:42: pid 4443: DETAIL:  our join coordinator request is
> accepted by cluster leader node "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4441: LOG:  watchdog process is initialized
> 2017-09-08 09:29:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:29:42: pid 4441: LOG:  we have joined the watchdog cluster
> as STANDBY node
> 2017-09-08 09:29:42: pid 4441: DETAIL:  syncing the backend states from
> the MASTER watchdog node
> 2017-09-08 09:29:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:29:42: pid 4443: LOG:  received the get data request from
> local pgpool-II on IPC interface
> 2017-09-08 09:29:42: pid 4443: LOG:  get data request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:29:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:29:42: pid 4441: LOG:  master watchdog node "localhost:11000
> Linux tishii-CF-SX3HE4BP" returned status for 1 backend nodes
> 2017-09-08 09:29:42: pid 4441: LOG:  Setting up socket for 127.0.0.1:11100
> 2017-09-08 09:29:42: pid 4445: LOG:  2 watchdog nodes are configured for
> lifecheck
> 2017-09-08 09:29:42: pid 4445: LOG:  watchdog nodes ID:0
> Name:"localhost:11100 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4445: DETAIL:  Host:"localhost" WD Port:21104
> pgpool-II port:11100
> 2017-09-08 09:29:42: pid 4445: LOG:  watchdog nodes ID:1
> Name:"localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:29:42: pid 4445: DETAIL:  Host:"localhost" WD Port:21004
> pgpool-II port:11000
> 2017-09-08 09:29:42: pid 4441: LOG:  pgpool-II successfully started.
> version 3.7devel (amefuriboshi)
> 2017-09-08 09:29:43: pid 4480: LOG:  set SO_REUSEPORT option to the socket
> 2017-09-08 09:29:43: pid 4478: LOG:  set SO_REUSEPORT option to the socket
> 2017-09-08 09:29:43: pid 4478: LOG:  creating watchdog heartbeat receive
> socket.
> 2017-09-08 09:29:43: pid 4478: DETAIL:  set SO_REUSEPORT
> 2017-09-08 09:29:43: pid 4480: LOG:  creating socket for sending heartbeat
> 2017-09-08 09:29:43: pid 4480: DETAIL:  set SO_REUSEPORT
> 2017-09-08 09:29:45: pid 4443: LOG:  new watchdog node connection is
> received from "127.0.0.1:21121"
> 2017-09-08 09:29:45: pid 4443: LOG:  new node joined the cluster
> hostname:"localhost" port:21004 pgpool_port:11000
> 2017-09-08 09:29:52: pid 4445: LOG:  watchdog: lifecheck started
> 2017-09-08 09:31:42: pid 4521: LOG:  failed to connect to PostgreSQL
> server by unix domain socket
> 2017-09-08 09:31:42: pid 4521: DETAIL:  connect to "/tmp/.s.PGSQL.11002"
> failed with error "No such file or directory"
> 2017-09-08 09:31:42: pid 4521: ERROR:  failed to make persistent db
> connection
> 2017-09-08 09:31:42: pid 4521: DETAIL:  connection to host:"/tmp:11002"
> failed
> 2017-09-08 09:31:42: pid 4521: LOG:  health check failed on node 0
> (timeout:0)
> 2017-09-08 09:31:42: pid 4521: LOG:  received degenerate backend request
> for node_id: 0 from pid [4521]
> 2017-09-08 09:31:42: pid 4441: LOG:  Pgpool-II parent process received
> sync backend signal from watchdog
> 2017-09-08 09:31:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4443: LOG:  failover request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:31:42: pid 4441: LOG:  master watchdog has performed failover
> 2017-09-08 09:31:42: pid 4441: DETAIL:  syncing the backend states from
> the MASTER watchdog node
> 2017-09-08 09:31:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4443: LOG:  received the get data request from
> local pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4443: LOG:  get data request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:31:42: pid 4521: LOG:  degenerate backend request for 1
> node(s) from pid [4521], will be handled by watchdog
> 2017-09-08 09:31:42: pid 4441: LOG:  master watchdog node "localhost:11000
> Linux tishii-CF-SX3HE4BP" returned status for 1 backend nodes
> 2017-09-08 09:31:42: pid 4441: LOG:  backend:0 is set to down status
> 2017-09-08 09:31:42: pid 4441: DETAIL:  backend:0 is DOWN on cluster
> master "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4441: LOG:  node status was chenged after the
> sync from "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4441: DETAIL:  all children needs to be restarted
> as we are not in streaming replication mode
> 2017-09-08 09:31:42: pid 4441: LOG:  Pgpool-II parent process received
> sync backend signal from watchdog
> 2017-09-08 09:31:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4441: LOG:  master watchdog has performed failover
> 2017-09-08 09:31:42: pid 4441: DETAIL:  syncing the backend states from
> the MASTER watchdog node
> 2017-09-08 09:31:42: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:42: pid 4443: LOG:  received the get data request from
> local pgpool-II on IPC interface
> 2017-09-08 09:31:42: pid 4443: LOG:  get data request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:31:42: pid 4441: LOG:  master watchdog node "localhost:11000
> Linux tishii-CF-SX3HE4BP" returned status for 1 backend nodes
> 2017-09-08 09:31:42: pid 4441: LOG:  backend nodes status remains same
> after the sync from "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4451 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4451 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4453 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4453 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4455 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4455 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4457 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4457 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4458 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4458 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4461 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4461 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4463 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4463 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4464 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4464 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4466 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4466 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4468 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4468 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4470 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4470 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4472 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4472 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4474 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4474 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4499 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4499 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4500 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4500 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4501 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4501 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4502 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4502 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4503 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4503 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4504 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4504 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4505 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4505 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4506 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4506 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4507 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4507 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4508 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4508 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4509 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4509 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4510 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4510 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4512 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4512 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4514 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4514 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4516 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4516 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4517 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4517 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4518 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4518 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4513 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4513 exited
> with success and will not be restarted
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4511 exits
> with status 0
> 2017-09-08 09:31:42: pid 4441: LOG:  child process with pid: 4511 exited
> with success and will not be restarted
> 2017-09-08 09:31:43: pid 4441: LOG:  Pgpool-II parent process received
> sync backend signal from watchdog
> 2017-09-08 09:31:43: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:43: pid 4441: LOG:  master watchdog has performed failover
> 2017-09-08 09:31:43: pid 4441: DETAIL:  syncing the backend states from
> the MASTER watchdog node
> 2017-09-08 09:31:43: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:43: pid 4443: LOG:  received the get data request from
> local pgpool-II on IPC interface
> 2017-09-08 09:31:43: pid 4443: LOG:  get data request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:43: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:31:43: pid 4441: LOG:  master watchdog node "localhost:11000
> Linux tishii-CF-SX3HE4BP" returned status for 1 backend nodes
> 2017-09-08 09:31:43: pid 4441: LOG:  backend nodes status remains same
> after the sync from "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:46: pid 4441: LOG:  Pgpool-II parent process received
> sync backend signal from watchdog
> 2017-09-08 09:31:46: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:46: pid 4441: LOG:  master watchdog has performed failover
> 2017-09-08 09:31:46: pid 4441: DETAIL:  syncing the backend states from
> the MASTER watchdog node
> 2017-09-08 09:31:46: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:31:46: pid 4443: LOG:  received the get data request from
> local pgpool-II on IPC interface
> 2017-09-08 09:31:46: pid 4443: LOG:  get data request from local pgpool-II
> node received on IPC interface is forwarded to master watchdog node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:46: pid 4443: DETAIL:  waiting for the reply...
> 2017-09-08 09:31:46: pid 4441: LOG:  master watchdog node "localhost:11000
> Linux tishii-CF-SX3HE4BP" returned status for 1 backend nodes
> 2017-09-08 09:31:46: pid 4441: LOG:  backend nodes status remains same
> after the sync from "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:31:56: pid 4622: FATAL:  pgpool is not accepting any new
> connections
> 2017-09-08 09:31:56: pid 4622: DETAIL:  all backend nodes are down, pgpool
> requires at least one valid node
> 2017-09-08 09:31:56: pid 4622: HINT:  repair the backend nodes and restart
> pgpool
> 2017-09-08 09:31:56: pid 4441: LOG:  child process with pid: 4622 exits
> with status 256
> 2017-09-08 09:31:56: pid 4441: LOG:  fork a new child process with pid:
> 4748
> 2017-09-08 09:32:10: pid 4637: FATAL:  pgpool is not accepting any new
> connections
> 2017-09-08 09:32:10: pid 4637: DETAIL:  all backend nodes are down, pgpool
> requires at least one valid node
> 2017-09-08 09:32:10: pid 4637: HINT:  repair the backend nodes and restart
> pgpool
> 2017-09-08 09:32:10: pid 4441: LOG:  child process with pid: 4637 exits
> with status 256
> 2017-09-08 09:32:10: pid 4441: LOG:  fork a new child process with pid:
> 4753
> 2017-09-08 09:33:10: pid 4443: LOG:  remote node "localhost:11000 Linux
> tishii-CF-SX3HE4BP" is shutting down
> 2017-09-08 09:33:10: pid 4443: LOG:  watchdog cluster has lost the
> coordinator node
> 2017-09-08 09:33:10: pid 4443: LOG:  unassigning the remote node
> "localhost:11000 Linux tishii-CF-SX3HE4BP" from watchdog cluster master
> 2017-09-08 09:33:10: pid 4443: LOG:  We have lost the cluster master node
> "localhost:11000 Linux tishii-CF-SX3HE4BP"
> 2017-09-08 09:33:10: pid 4443: LOG:  watchdog node state changed from
> [STANDBY] to [JOINING]
> 2017-09-08 09:33:14: pid 4443: LOG:  watchdog node state changed from
> [JOINING] to [INITIALIZING]
> 2017-09-08 09:33:15: pid 4443: LOG:  I am the only alive node in the
> watchdog cluster
> 2017-09-08 09:33:15: pid 4443: HINT:  skiping stand for coordinator state
> 2017-09-08 09:33:15: pid 4443: LOG:  watchdog node state changed from
> [INITIALIZING] to [MASTER]
> 2017-09-08 09:33:15: pid 4443: LOG:  I am announcing my self as
> master/coordinator watchdog node
> 2017-09-08 09:33:19: pid 4443: LOG:  I am the cluster leader node
> 2017-09-08 09:33:19: pid 4443: DETAIL:  our declare coordinator message is
> accepted by all nodes
> 2017-09-08 09:33:19: pid 4443: LOG:  setting the local node
> "localhost:11100 Linux tishii-CF-SX3HE4BP" as watchdog cluster master
> 2017-09-08 09:33:19: pid 4443: LOG:  I am the cluster leader node.
> Starting escalation process
> 2017-09-08 09:33:19: pid 4443: LOG:  escalation process started with
> PID:4769
> 2017-09-08 09:33:19: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:33:19: pid 4769: LOG:  watchdog: escalation started
> 2017-09-08 09:33:19: pid 4443: LOG:  watchdog escalation process with pid:
> 4769 exit with SUCCESS.
> 2017-09-08 09:33:40: pid 4445: LOG:  informing the node status change to
> watchdog
> 2017-09-08 09:33:40: pid 4445: DETAIL:  node id :1 status = "NODE DEAD"
> message:"No heartbeat signal from node"
> 2017-09-08 09:33:40: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:33:40: pid 4443: LOG:  received node status change ipc
> message
> 2017-09-08 09:33:40: pid 4443: DETAIL:  No heartbeat signal from node
> 2017-09-08 09:33:40: pid 4443: LOG:  remote node "localhost:11000 Linux
> tishii-CF-SX3HE4BP" is shutting down
> 2017-09-08 09:36:57: pid 4519: LOG:  forked new pcp worker, pid=4813
> socket=6
> 2017-09-08 09:36:57: pid 4443: LOG:  new IPC connection received
> 2017-09-08 09:36:57: pid 4519: LOG:  PCP process with pid: 4813 exit with
> SUCCESS.
> 2017-09-08 09:36:57: pid 4519: LOG:  PCP process with pid: 4813 exits with
> status 0
>
>
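For anyone who wants to poke at this outside the 004.watchdog regression test, the standby log above implies a watchdog setup roughly along the following lines. This is only a minimal sketch reconstructed from the ports printed in the log (11100/21104 for the local node, 11000/21004 for the remote one); it is not the actual regression-test configuration, and the heartbeat destination port is a placeholder:

    # pgpool.conf on the standby instance ("localhost:11100" in the log)
    port = 11100                          # pgpool-II port of this node
    use_watchdog = on
    wd_hostname = 'localhost'
    wd_port = 21104                       # watchdog port of this node
    other_pgpool_hostname0 = 'localhost'
    other_pgpool_port0 = 11000            # pgpool-II port of the remote node
    other_wd_port0 = 21004                # watchdog port of the remote node
    wd_lifecheck_method = 'heartbeat'
    heartbeat_destination0 = 'localhost'
    heartbeat_destination_port0 = 11101   # placeholder; not shown in the log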