[pgpool-hackers: 1538] Re: Allow to access to pgpool while doing health checking

Wed May 4 20:17:53 JST 2016

> On Wed, May 4, 2016 at 5:53 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> 
>> Usama,
>>
>> Thank you for testing the patch. Shall I commit/push it after you fix
>> the regression test issue? Or shall I commit/push it now? Either is
>> fine for me.
>>
> 
> I have already pushed the fix for the regression failure, and was waiting
> for the buildfarm results for confirmation.  Today's results verified the
> fix, and you can go ahead with committing the patch.

Ok, will do.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Kind regards
> Muhammad Usama
> 
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > Hi Ishii-San
>> >
>> > I have tested the patch. It successfully takes care of the very annoying
>> > problem and it is working as expected.
>> >
>> > Best regards
>> > Muhammad Usama
>> >
>> >
>> > On Tue, May 3, 2016 at 5:25 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> >
>> >> Currently any attempt to connect to pgpool fails while pgpool is
>> >> running a health check against a failed node, even if
>> >> fail_over_on_backend_error is off. This is because a pgpool child
>> >> first tries to connect to all backends, including the failed one, and
>> >> exits if it fails to connect to any backend (which of course it
>> >> does). This is a temporary situation and will be resolved once
>> >> pgpool executes failover. However, if the health check is retrying,
>> >> the temporary situation lasts longer, depending on the settings of
>> >> health_check_max_retries and health_check_retry_delay. This is not
>> >> good. The attached patch tries to mitigate the problem:
>> >>
>> >> - When an attempt to connect to a backend fails, give up on the
>> >>   failed node and skip to the next node rather than exiting the
>> >>   process, provided that pgpool is operating in streaming
>> >>   replication mode and the node is not the primary node.
>> >>
>> >> - Mark the local status of the failed node as "down".
>> >>
>> >> - This will let the primary node be selected as the load balance node
>> >>   and all queries will be sent to the primary node. If there are other
>> >>   healthy standby nodes, one of them will be chosen as the load
>> >>   balance node.
>> >>
>> >> - After the session is over, the child process will exit so that the
>> >>   local status is not retained.
>> >>
>> >> Comments?
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese:http://www.sraoss.co.jp
>> >>
>> >> diff --git a/src/include/pool.h b/src/include/pool.h
>> >> index 4c6e82f..1f43efd 100644
>> >> --- a/src/include/pool.h
>> >> +++ b/src/include/pool.h
>> >> @@ -323,6 +323,7 @@ extern int my_master_node_id;
>> >>   */
>> >>  #define PRIMARY_NODE_ID (Req_info->primary_node_id >=0?\
>> >>                           Req_info->primary_node_id:REAL_MASTER_NODE_ID)
>> >> +#define IS_PRIMARY_NODE_ID(node_id)    (node_id == PRIMARY_NODE_ID)
>> >>
>> >>  /*
>> >>   * Real primary node id. If not in the mode or there's no primary
>> >> diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c
>> >> index b7cc946..7c33366 100644
>> >> --- a/src/protocol/pool_connection_pool.c
>> >> +++ b/src/protocol/pool_connection_pool.c
>> >> @@ -812,8 +812,8 @@ static POOL_CONNECTION_POOL_SLOT *create_cp(POOL_CONNECTION_POOL_SLOT *cp, int s
>> >>  }
>> >>
>> >>  /*
>> >> - * create actual connections to backends
>> >> - * new connection resides in TopMemoryContext
>> >> + * Create actual connections to backends.
>> >> + * New connection resides in TopMemoryContext.
>> >>   */
>> >>  static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)
>> >>  {
>> >> @@ -851,12 +851,34 @@ static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)
>> >>                                 ereport(FATAL,
>> >>                                         (errmsg("failed to create a backend connection"),
>> >>                                                  errdetail("executing failover on backend")));
>> >> -                       }
>> >> +                       }
>> >>                         else
>> >>                         {
>> >> -                               ereport(FATAL,
>> >> -                                       (errmsg("failed to create a backend connection"),
>> >> -                                                errdetail("not executing failover because fail_over_on_backend_error is off")));
>> >> +                               /*
>> >> +                                * If we are in streaming replication mode and the node is a
>> >> +                                * standby node, then we skip this node to avoid fail over.
>> >> +                                */
>> >> +                               if (STREAM && !IS_PRIMARY_NODE_ID(i))
>> >> +                               {
>> >> +                                       ereport(LOG,
>> >> +                                                       (errmsg("failed to create a backend %d connection", i),
>> >> +                                                        errdetail("skip this backend because because fail_over_on_backend_error is off and we are in streaming replication mode and node is standby node")));
>> >> +
>> >> +                                       /* set down status to local status area */
>> >> +                                       *(my_backend_status[i]) = CON_DOWN;
>> >> +
>> >> +                                       /* make sure that we need to restart the process after
>> >> +                                        * finishing this session
>> >> +                                        */
>> >> +                                       pool_get_my_process_info()->need_to_restart = 1;
>> >> +                                       continue;
>> >> +                               }
>> >> +                               else
>> >> +                               {
>> >> +                                       ereport(FATAL,
>> >> +                                                       (errmsg("failed to create a backend %d connection", i),
>> >> +                                                        errdetail("not executing failover because fail_over_on_backend_error is off")));
>> >> +                               }
>> >>                         }
>> >>                         child_exit(POOL_EXIT_AND_RESTART);
>> >>                 }
>> >> diff --git a/src/utils/pool_process_reporting.c b/src/utils/pool_process_reporting.c
>> >> index 9b190c7..6cfd860 100644
>> >> --- a/src/utils/pool_process_reporting.c
>> >> +++ b/src/utils/pool_process_reporting.c
>> >> @@ -5,7 +5,7 @@
>> >>   * pgpool: a language independent connection pool server for PostgreSQL
>> >>   * written by Tatsuo Ishii
>> >>   *
>> >> - * Copyright (c) 2003-2015     PgPool Global Development Group
>> >> + * Copyright (c) 2003-2016     PgPool Global Development Group
>> >>   *
>> >>   * Permission to use, copy, modify, and distribute this software and
>> >>   * its documentation for any purpose and without fee is hereby
>> >>
>>
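
For readers following the thread, the behaviour proposed in the quoted patch
(with fail_over_on_backend_error off) can be illustrated with a small
standalone sketch. This is not pgpool code: the types child_state,
node_status and connect_result, the functions connect_backends() and
pick_load_balance_node(), and the reachable[] array standing in for a real
connection attempt are all hypothetical names made up for illustration. The
load balance selection here simply takes the first healthy standby, which is
a simplification and not a claim about how pgpool actually chooses.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical child-local view of backend state (not pgpool's real types). */
typedef enum { NODE_UP, NODE_DOWN } node_status;

typedef struct {
    node_status local_status[MAX_NODES]; /* per-process copy of node status */
    bool need_to_restart;                /* exit after the session is over */
} child_state;

typedef enum { CONNECT_OK, CONNECT_FATAL } connect_result;

/*
 * Try every backend in turn. reachable[i] stands in for the outcome of a
 * real connection attempt to node i (an assumption of this sketch).
 *
 * A failed standby in streaming replication mode is marked down locally and
 * skipped; any other failure is fatal for the child, as before the patch.
 */
connect_result connect_backends(child_state *st, const bool *reachable,
                                int n, int primary_id, bool streaming_mode)
{
    assert(n > 0 && n <= MAX_NODES);

    for (int i = 0; i < n; i++) {
        if (reachable[i]) {
            st->local_status[i] = NODE_UP;
            continue;
        }
        if (streaming_mode && i != primary_id) {
            /* Skip the failed standby and remember to restart later. */
            st->local_status[i] = NODE_DOWN;
            st->need_to_restart = true;
            continue;
        }
        return CONNECT_FATAL;
    }
    return CONNECT_OK;
}

/*
 * Simplified load balance choice: the first standby that is up locally,
 * otherwise the primary.
 */
int pick_load_balance_node(const child_state *st, int n, int primary_id)
{
    for (int i = 0; i < n; i++) {
        if (i != primary_id && st->local_status[i] == NODE_UP)
            return i;
    }
    return primary_id;
}
```

With three nodes where node 0 is the primary and node 1 is unreachable,
connect_backends() succeeds, node 1 is marked down in the child's local
status, need_to_restart is set, and node 2 is picked for load balancing. If
the primary itself is unreachable, or the mode is not streaming replication,
the child still gives up as it did before.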