[pgpool-hackers: 2562] Watchdog fix commit to 3.4?

Tatsuo Ishii ishii at sraoss.co.jp
Sun Oct 8 08:36:41 JST 2017


Usama,

You did not back-patch the following commit to 3.4:

commit bda946e718fe6f3605eb7e82ada8754bd84a279c
Author: Muhammad Usama <m.usama at gmail.com>
Date:   Tue Aug 2 17:31:22 2016 +0500

    Fix for 218: Inconsistent status of Postgresql nodes in pgPool instances
    after restart. Watchdog not syncing status.

In this commit you added:

	/* initialize Req_info */
	Req_info->master_node_id = get_next_master_node();
	Req_info->conn_counter = 0;
	Req_info->switching = false;
	Req_info->request_queue_head = Req_info->request_queue_tail = -1;
	Req_info->primary_node_id = -2; <--- added

So this was not added to 3.4.

After that, I back-patched the following commit to 3.4:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=e4ce880bd36b8f249bf693c086a1313148f3449a

commit e4ce880bd36b8f249bf693c086a1313148f3449a
Author: Tatsuo Ishii <ishii at postgresql.org>
Date:   Wed May 10 08:30:17 2017 +0900

    Fix corner case bug in Pgpool-II starting up.
    
    It is possible that a failover request is accepted before primary node
    is searched.  This leads Pgpool-II to a strange state: there's no
    primary node if the failed node was a primary node (even if new
    primary node exists as a result of promotion of existing standby).
    
    See [pgpool-hackers: 2321] for more details.

In the commit I did this:

+       /*
+        * if the primary node id is not loaded by watchdog, search for it
+        */
+       if (Req_info->primary_node_id < 0)
+       {
+               /* Save primary node id */
+               Req_info->primary_node_id = find_primary_node_repeatedly();
+       }
+

Since Req_info->primary_node_id is initialized to 0 in 3.4, the
condition above is never true, so find_primary_node_repeatedly() is not
called at startup. This makes 3.4 unusable when the primary node is
anything other than node 0.
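
To illustrate the problem, here is a minimal standalone sketch of the
guard. It is not Pgpool-II code: ReqInfoSketch, find_primary_node_stub()
and startup_primary() are hypothetical names made up for this example,
and the stub simply assumes the real primary is node 1.

```c
/* Hypothetical stand-in for Req_info; only the field discussed here. */
typedef struct
{
	int			primary_node_id;
} ReqInfoSketch;

/* Counts how often the (stubbed) primary search runs. */
static int	search_calls = 0;

/*
 * Hypothetical stub standing in for find_primary_node_repeatedly().
 * Assumption for this sketch: the actual primary is node 1.
 */
static int
find_primary_node_stub(void)
{
	search_calls++;
	return 1;
}

/*
 * Mirrors the guard from commit e4ce880b: search for the primary only
 * if the id is still negative, i.e. not yet loaded by watchdog.
 * Returns the primary node id as seen after startup.
 */
static int
startup_primary(int initial_id)
{
	ReqInfoSketch req;

	req.primary_node_id = initial_id;
	if (req.primary_node_id < 0)
		req.primary_node_id = find_primary_node_stub();
	return req.primary_node_id;
}
```

With an initial value of 0 the search is skipped and node 0 is kept as
the primary, although the stub would have reported node 1. With an
initial value of -2 the search runs and node 1 is found.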

I think we need to add the following to 3.4:

	Req_info->primary_node_id = -2;

But I am not sure I am right, because there may be a reason why you
didn't back-patch bda946e718fe6f3605eb7e82ada8754bd84a279c to 3.4.

Can you please explain why you didn't back patch it?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

