<div dir="ltr">Hi Ishii-San<div><br></div><div>I have tested the patch, It successfully takes care of the very annoying problem and it is working as expected.</div><div><br></div><div>Best regards</div><div>Muhammad Usama</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 3, 2016 at 5:25 PM, Tatsuo Ishii <span dir="ltr">&lt;<a href="mailto:ishii@postgresql.org" target="_blank">ishii@postgresql.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Currently any attempt to connect to pgpool fails if pgpool is doing<br>
health check against failed node even if fail_over_on_backend_error is<br>
off because pgpool child first tries to connect to all backend<br>
including the failed one and exits if it fails to connect to a backend<br>
(of course it fails). This is a temporary situation and will be<br>
resolved before pgpool executes failover. However if the health check<br>
is retrying, the temporary situation keeps longer depending on the<br>
setting of health_check_max_retries and health_check_retry_delay. This<br>
is not good. Attached patch tries to mitigate the problem:<br>
<br>
- When an attempt to connect to backend fails, give up connecting to<br>
  the failed node and skip to other node, rather than exiting the<br>
  process if operating in streaming replication mode and the node is<br>
  not primary node.<br>
<br>
- Mark the local status of the failed node to &quot;down&quot;.<br>
<br>
- This will let the primary node be selected as a load balance node<br>
  and every queries will be sent to the primary node. If there&#39;s other<br>
  healthy standby nodes, one of them will be chosen as the load<br>
  balance node.<br>
<br>
- After the session is over, the child process will suicide to not<br>
  retain the local status.<br>
<br>
Comments?<br>
--<br>
Tatsuo Ishii<br>
SRA OSS, Inc. Japan<br>
English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>
Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
<br>diff --git a/src/include/pool.h b/src/include/pool.h<br>
index 4c6e82f..1f43efd 100644<br>
--- a/src/include/pool.h<br>
+++ b/src/include/pool.h<br>
@@ -323,6 +323,7 @@ extern int my_master_node_id;<br>
  */<br>
 #define PRIMARY_NODE_ID (Req_info-&gt;primary_node_id &gt;=0?\<br>
                                                 Req_info-&gt;primary_node_id:REAL_MASTER_NODE_ID)<br>
+#define IS_PRIMARY_NODE_ID(node_id)    (node_id == PRIMARY_NODE_ID)<br>
<br>
 /*<br>
  * Real primary node id. If not in the mode or there&#39;s no primary<br>
diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c<br>
index b7cc946..7c33366 100644<br>
--- a/src/protocol/pool_connection_pool.c<br>
+++ b/src/protocol/pool_connection_pool.c<br>
@@ -812,8 +812,8 @@ static POOL_CONNECTION_POOL_SLOT *create_cp(POOL_CONNECTION_POOL_SLOT *cp, int s<br>
 }<br>
<br>
 /*<br>
- * create actual connections to backends<br>
- * new connection resides in TopMemoryContext<br>
+ * Create actual connections to backends.<br>
+ * New connection resides in TopMemoryContext.<br>
  */<br>
 static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)<br>
 {<br>
@@ -851,12 +851,34 @@ static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)<br>
                                ereport(FATAL,<br>
                                        (errmsg(&quot;failed to create a backend connection&quot;),<br>
                                                 errdetail(&quot;executing failover on backend&quot;)));<br>
-                       }<br>
+                       }<br>
                        else<br>
                        {<br>
-                               ereport(FATAL,<br>
-                                       (errmsg(&quot;failed to create a backend connection&quot;),<br>
-                                                errdetail(&quot;not executing failover because fail_over_on_backend_error is off&quot;)));<br>
+                               /*<br>
+                                * If we are in streaming replication mode and the node is a<br>
+                                * standby node, then we skip this node to avoid fail over.<br>
+                                */<br>
+                               if (STREAM &amp;&amp; !IS_PRIMARY_NODE_ID(i))<br>
+                               {<br>
+                                       ereport(LOG,<br>
+                                                       (errmsg(&quot;failed to create a backend %d connection&quot;, i),<br>
+                                                        errdetail(&quot;skip this backend because because fail_over_on_backend_error is off and we are in streaming replication mode and node is standby node&quot;)));<br>
+<br>
+                                       /* set down status to local status area */<br>
+                                       *(my_backend_status[i]) = CON_DOWN;<br>
+<br>
+                                       /* make sure that we need to restart the process after<br>
+                                        * finishing this session<br>
+                                        */<br>
+                                       pool_get_my_process_info()-&gt;need_to_restart = 1;<br>
+                                       continue;<br>
+                               }<br>
+                               else<br>
+                               {<br>
+                                       ereport(FATAL,<br>
+                                                       (errmsg(&quot;failed to create a backend %d connection&quot;, i),<br>
+                                                        errdetail(&quot;not executing failover because fail_over_on_backend_error is off&quot;)));<br>
+                               }<br>
                        }<br>
                        child_exit(POOL_EXIT_AND_RESTART);<br>
                }<br>
diff --git a/src/utils/pool_process_reporting.c b/src/utils/pool_process_reporting.c<br>
index 9b190c7..6cfd860 100644<br>
--- a/src/utils/pool_process_reporting.c<br>
+++ b/src/utils/pool_process_reporting.c<br>
@@ -5,7 +5,7 @@<br>
  * pgpool: a language independent connection pool server for PostgreSQL<br>
  * written by Tatsuo Ishii<br>
  *<br>
- * Copyright (c) 2003-2015     PgPool Global Development Group<br>
+ * Copyright (c) 2003-2016     PgPool Global Development Group<br>
  *<br>
  * Permission to use, copy, modify, and distribute this software and<br>
  * its documentation for any purpose and without fee is hereby<br>
<br>_______________________________________________<br>
pgpool-hackers mailing list<br>
<a href="mailto:pgpool-hackers@pgpool.net">pgpool-hackers@pgpool.net</a><br>
<a href="http://www.pgpool.net/mailman/listinfo/pgpool-hackers" rel="noreferrer" target="_blank">http://www.pgpool.net/mailman/listinfo/pgpool-hackers</a><br>
<br></blockquote></div><br></div></div></div>