[pgpool-hackers: 2321] Corner case bug in primary failover

Tue May 9 10:16:53 JST 2017

Usama,

While adding a new test case to 003.failover regression test, I found
a corner case bug in primary failover.

Suppose Pgpool-II starts but is yet finding primary node. If primary
failover happens, it skips finding primary node and let the initial
value of it (Req_info->primary_node_id == -1) to be used as the new
primary node id. As a result, no primary node id exists until next
failover happens.

Initialy I thought The problem is in the code of
pgpool_main.c:failover() which tries to optimize finding primary node
process.

		/*
		 * If the down node was a standby node in streaming replication
		 * mode, we can avoid calling find_primary_node_repeatedly() and
		 * recognize the former primary as the new primary node, which
		 * will reduce the time to process standby down.
		 */
		else if (MASTER_SLAVE && pool_config->master_slave_sub_mode == STREAM_MODE &&
				 reqkind == NODE_DOWN_REQUEST)
		{
			if (Req_info->primary_node_id != node_id)
				new_primary = Req_info->primary_node_id;
			else
				new_primary =  find_primary_node_repeatedly();

I was attempting to fix it by checking Req_info->primary_node_id to
see if it's initial value (-1) or not. If it's -1,
find_primary_node_repeatedly() need to be called.

But looking into pgpool_main() closely, I suspect there's a
fundamental problem:

1) It processes failover in CHECK_REQUEST *before* setting
   Req_info->primary_node_id.

	/*
	 * check for child signals to ensure child startup before reporting successfull start
	 */
	CHECK_REQUEST;

	ereport(LOG,
			(errmsg("%s successfully started. version %s (%s)", PACKAGE, VERSION, PGPOOLVERSION)));

	/*
	 * if the primary node id is not loaded by watchdog, search for it
	 */
	if (Req_info->primary_node_id < 0)
	{
		/* Save primary node id */
		Req_info->primary_node_id = find_primary_node();
	}

2) It uses find_primary_node(), rather than
   find_primary_node_repeatedly(). So if by some reasons (for example
   the backend does not come up yet), find_primary_node() will fail
   and Req_info->primary_node_id is set to -1.

I think proper fix will be moving the CHECK_REQUEST call above inside
main loop, and change the find_primary_node() call to
find_primary_node_repeatedly().

Attached is the patch to do that (plus change the
search_primary_node_timeout to smaller value in 055.backend_all_down
test. Otherwise, regression timeout is triggered) against master
branch.

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: failover.patch
Type: text/x-patch
Size: 2343 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20170509/3a548be8/attachment.bin>