<div dir="ltr">Hi Ishii-San<div><br></div><div>I have tested the patch, It successfully takes care of the very annoying problem and it is working as expected.</div><div><br></div><div>Best regards</div><div>Muhammad Usama</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 3, 2016 at 5:25 PM, Tatsuo Ishii <span dir="ltr"><<a href="mailto:ishii@postgresql.org" target="_blank">ishii@postgresql.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Currently any attempt to connect to pgpool fails if pgpool is doing<br>
health check against failed node even if fail_over_on_backend_error is<br>
off because pgpool child first tries to connect to all backend<br>
including the failed one and exits if it fails to connect to a backend<br>
(of course it fails). This is a temporary situation and will be<br>
resolved before pgpool executes failover. However if the health check<br>
is retrying, the temporary situation keeps longer depending on the<br>
setting of health_check_max_retries and health_check_retry_delay. This<br>
is not good. Attached patch tries to mitigate the problem:<br>
<br>
- When an attempt to connect to backend fails, give up connecting to<br>
the failed node and skip to other node, rather than exiting the<br>
process if operating in streaming replication mode and the node is<br>
not primary node.<br>
<br>
- Mark the local status of the failed node to "down".<br>
<br>
- This will let the primary node be selected as a load balance node<br>
and every queries will be sent to the primary node. If there's other<br>
healthy standby nodes, one of them will be chosen as the load<br>
balance node.<br>
<br>
- After the session is over, the child process will suicide to not<br>
retain the local status.<br>
<br>
Comments?<br>
--<br>
Tatsuo Ishii<br>
SRA OSS, Inc. Japan<br>
English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>
Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
<br>diff --git a/src/include/pool.h b/src/include/pool.h<br>
index 4c6e82f..1f43efd 100644<br>
--- a/src/include/pool.h<br>
+++ b/src/include/pool.h<br>
@@ -323,6 +323,7 @@ extern int my_master_node_id;<br>
*/<br>
#define PRIMARY_NODE_ID (Req_info->primary_node_id >=0?\<br>
Req_info->primary_node_id:REAL_MASTER_NODE_ID)<br>
+#define IS_PRIMARY_NODE_ID(node_id) (node_id == PRIMARY_NODE_ID)<br>
<br>
/*<br>
* Real primary node id. If not in the mode or there's no primary<br>
diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c<br>
index b7cc946..7c33366 100644<br>
--- a/src/protocol/pool_connection_pool.c<br>
+++ b/src/protocol/pool_connection_pool.c<br>
@@ -812,8 +812,8 @@ static POOL_CONNECTION_POOL_SLOT *create_cp(POOL_CONNECTION_POOL_SLOT *cp, int s<br>
}<br>
<br>
/*<br>
- * create actual connections to backends<br>
- * new connection resides in TopMemoryContext<br>
+ * Create actual connections to backends.<br>
+ * New connection resides in TopMemoryContext.<br>
*/<br>
static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)<br>
{<br>
@@ -851,12 +851,34 @@ static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)<br>
ereport(FATAL,<br>
(errmsg("failed to create a backend connection"),<br>
errdetail("executing failover on backend")));<br>
- }<br>
+ }<br>
else<br>
{<br>
- ereport(FATAL,<br>
- (errmsg("failed to create a backend connection"),<br>
- errdetail("not executing failover because fail_over_on_backend_error is off")));<br>
+ /*<br>
+ * If we are in streaming replication mode and the node is a<br>
+ * standby node, then we skip this node to avoid fail over.<br>
+ */<br>
+ if (STREAM && !IS_PRIMARY_NODE_ID(i))<br>
+ {<br>
+ ereport(LOG,<br>
+ (errmsg("failed to create a backend %d connection", i),<br>
+ errdetail("skip this backend because because fail_over_on_backend_error is off and we are in streaming replication mode and node is standby node")));<br>
+<br>
+ /* set down status to local status area */<br>
+ *(my_backend_status[i]) = CON_DOWN;<br>
+<br>
+ /* make sure that we need to restart the process after<br>
+ * finishing this session<br>
+ */<br>
+ pool_get_my_process_info()->need_to_restart = 1;<br>
+ continue;<br>
+ }<br>
+ else<br>
+ {<br>
+ ereport(FATAL,<br>
+ (errmsg("failed to create a backend %d connection", i),<br>
+ errdetail("not executing failover because fail_over_on_backend_error is off")));<br>
+ }<br>
}<br>
child_exit(POOL_EXIT_AND_RESTART);<br>
}<br>
diff --git a/src/utils/pool_process_reporting.c b/src/utils/pool_process_reporting.c<br>
index 9b190c7..6cfd860 100644<br>
--- a/src/utils/pool_process_reporting.c<br>
+++ b/src/utils/pool_process_reporting.c<br>
@@ -5,7 +5,7 @@<br>
* pgpool: a language independent connection pool server for PostgreSQL<br>
* written by Tatsuo Ishii<br>
*<br>
- * Copyright (c) 2003-2015 PgPool Global Development Group<br>
+ * Copyright (c) 2003-2016 PgPool Global Development Group<br>
*<br>
* Permission to use, copy, modify, and distribute this software and<br>
* its documentation for any purpose and without fee is hereby<br>
<br>_______________________________________________<br>
pgpool-hackers mailing list<br>
<a href="mailto:pgpool-hackers@pgpool.net">pgpool-hackers@pgpool.net</a><br>
<a href="http://www.pgpool.net/mailman/listinfo/pgpool-hackers" rel="noreferrer" target="_blank">http://www.pgpool.net/mailman/listinfo/pgpool-hackers</a><br>
<br></blockquote></div><br></div></div></div>