[pgpool-hackers: 1539] Re: Allow to access to pgpool while doing health checking

Tatsuo Ishii ishii at postgresql.org
Thu May 5 14:54:39 JST 2016


> It will be great to get this fix committed.
> 
> Are you planning to back port this fix?

Yes, I have committed it to the stable branches back through 3.2. (I did
not back patch it to 3.1 because 3.1 behaves differently: it always fails
over when it fails to connect to a backend, even if
fail_over_on_backend_error = on.)
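
For readers who do not want to walk through the wrapped diff quoted below,
here is a minimal standalone sketch of the per-child connect loop the patch
aims for. It is a simplified model, not the pgpool source: child_state,
try_connect, connect_backends and the reachable[] array are hypothetical
names made up for illustration, and the -1 return stands in for the place
where the real code raises FATAL.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BACKENDS 3

/* Simplified stand-in for pgpool's backend status values. */
typedef enum { CON_UP, CON_DOWN } backend_status;

/* Hypothetical per-child state: a private view of node status. */
typedef struct {
    backend_status local_status[NUM_BACKENDS];
    bool need_to_restart;   /* child should exit once the session ends */
} child_state;

/* Hypothetical connect attempt; reachable[] replaces a real connect(). */
static bool try_connect(const bool *reachable, int node)
{
    return reachable[node];
}

/*
 * Try to connect to every backend that is locally marked up.
 * Returns the number of backends connected, or -1 when the child has to
 * give up (the primary failed, or we are not in streaming replication).
 */
static int connect_backends(child_state *st, const bool *reachable,
                            int primary, bool streaming_replication)
{
    int connected = 0;

    assert(primary >= 0 && primary < NUM_BACKENDS);

    for (int i = 0; i < NUM_BACKENDS; i++) {
        if (st->local_status[i] == CON_DOWN)
            continue;               /* already known to be down */

        if (try_connect(reachable, i)) {
            connected++;
            continue;
        }

        if (streaming_replication && i != primary) {
            /* Failed standby: mark it down locally and skip it. */
            st->local_status[i] = CON_DOWN;
            /* Do not keep the local status beyond this session. */
            st->need_to_restart = true;
            continue;
        }

        return -1;                  /* cannot serve this session */
    }
    return connected;
}
```

With node 0 as primary and node 1 unreachable, connect_backends() returns 2,
marks node 1 as CON_DOWN in the child's local status and sets
need_to_restart. If the primary itself is unreachable, it returns -1 and
leaves the state untouched.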

> On Wed, May 4, 2016 at 12:27 PM, Muhammad Usama <m.usama at gmail.com> wrote:
> 
>>
>>
>> On Wed, May 4, 2016 at 5:53 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>
>>> Usama,
>>>
>>> Thank you for testing the patch. Shall I commit/push it after you fix
>>> the regression test issue? Or shall I commit/push it now? Either is
>>> fine for me.
>>>
>>
>>
>> I have already pushed the fix for the regression failure and was waiting
>> for the buildfarm results for confirmation. Today's results verified the
>> fix, so you can go ahead and commit the patch.
>>
>> Kind regards
>> Muhammad Usama
>>
>>>
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>>
>>> > Hi Ishii-San
>>> >
>>> > I have tested the patch. It successfully takes care of this very
>>> > annoying problem and works as expected.
>>> >
>>> > Best regards
>>> > Muhammad Usama
>>> >
>>> >
>>> > On Tue, May 3, 2016 at 5:25 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>> >
>>> >> Currently any attempt to connect to pgpool fails while pgpool is
>>> >> running a health check against a failed node, even if
>>> >> fail_over_on_backend_error is off. This is because a pgpool child
>>> >> first tries to connect to all backends, including the failed one,
>>> >> and exits if it fails to connect to any of them (which of course it
>>> >> does). This is a temporary situation and will be resolved by the
>>> >> time pgpool executes failover. However, if the health check is
>>> >> retrying, the situation lasts longer, depending on the settings of
>>> >> health_check_max_retries and health_check_retry_delay. This is not
>>> >> good. The attached patch tries to mitigate the problem:
>>> >>
>>> >> - When an attempt to connect to a backend fails, give up on the
>>> >>   failed node and skip to the next node rather than exiting the
>>> >>   process, provided pgpool is operating in streaming replication
>>> >>   mode and the node is not the primary node.
>>> >>
>>> >> - Mark the local status of the failed node as "down".
>>> >>
>>> >> - This lets the primary node be selected as the load balance node,
>>> >>   so all queries are sent to the primary node. If there are other
>>> >>   healthy standby nodes, one of them will be chosen as the load
>>> >>   balance node.
>>> >>
>>> >> - After the session is over, the child process exits so that it
>>> >>   does not retain the local status.
>>> >>
>>> >> Comments?
>>> >> --
>>> >> Tatsuo Ishii
>>> >> SRA OSS, Inc. Japan
>>> >> English: http://www.sraoss.co.jp/index_en.php
>>> >> Japanese:http://www.sraoss.co.jp
>>> >>
>>> >> diff --git a/src/include/pool.h b/src/include/pool.h
>>> >> index 4c6e82f..1f43efd 100644
>>> >> --- a/src/include/pool.h
>>> >> +++ b/src/include/pool.h
>>> >> @@ -323,6 +323,7 @@ extern int my_master_node_id;
>>> >>   */
>>> >>  #define PRIMARY_NODE_ID (Req_info->primary_node_id >=0?\
>>> >>                           Req_info->primary_node_id:REAL_MASTER_NODE_ID)
>>> >> +#define IS_PRIMARY_NODE_ID(node_id)    (node_id == PRIMARY_NODE_ID)
>>> >>
>>> >>  /*
>>> >>   * Real primary node id. If not in the mode or there's no primary
>>> >> diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c
>>> >> index b7cc946..7c33366 100644
>>> >> --- a/src/protocol/pool_connection_pool.c
>>> >> +++ b/src/protocol/pool_connection_pool.c
>>> >> @@ -812,8 +812,8 @@ static POOL_CONNECTION_POOL_SLOT *create_cp(POOL_CONNECTION_POOL_SLOT *cp, int s
>>> >>  }
>>> >>
>>> >>  /*
>>> >> - * create actual connections to backends
>>> >> - * new connection resides in TopMemoryContext
>>> >> + * Create actual connections to backends.
>>> >> + * New connection resides in TopMemoryContext.
>>> >>   */
>>> >>  static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)
>>> >>  {
>>> >> @@ -851,12 +851,34 @@ static POOL_CONNECTION_POOL *new_connection(POOL_CONNECTION_POOL *p)
>>> >>                                 ereport(FATAL,
>>> >>                                         (errmsg("failed to create a backend connection"),
>>> >>                                                  errdetail("executing failover on backend")));
>>> >> -                       }
>>> >> +                       }
>>> >>                         else
>>> >>                         {
>>> >> -                               ereport(FATAL,
>>> >> -                                       (errmsg("failed to create a backend connection"),
>>> >> -                                                errdetail("not executing failover because fail_over_on_backend_error is off")));
>>> >> +                               /*
>>> >> +                                * If we are in streaming replication mode and the node is a
>>> >> +                                * standby node, then we skip this node to avoid fail over.
>>> >> +                                */
>>> >> +                               if (STREAM && !IS_PRIMARY_NODE_ID(i))
>>> >> +                               {
>>> >> +                                       ereport(LOG,
>>> >> +                                                       (errmsg("failed to create a backend %d connection", i),
>>> >> +                                                        errdetail("skip this backend because because fail_over_on_backend_error is off and we are in streaming replication mode and node is standby node")));
>>> >> +
>>> >> +                                       /* set down status to local status area */
>>> >> +                                       *(my_backend_status[i]) = CON_DOWN;
>>> >> +
>>> >> +                                       /* make sure that we need to restart the process after
>>> >> +                                        * finishing this session
>>> >> +                                        */
>>> >> +                                       pool_get_my_process_info()->need_to_restart = 1;
>>> >> +                                       continue;
>>> >> +                               }
>>> >> +                               else
>>> >> +                               {
>>> >> +                                       ereport(FATAL,
>>> >> +                                                       (errmsg("failed to create a backend %d connection", i),
>>> >> +                                                        errdetail("not executing failover because fail_over_on_backend_error is off")));
>>> >> +                               }
>>> >>                         }
>>> >>                         child_exit(POOL_EXIT_AND_RESTART);
>>> >>                 }
>>> >> diff --git a/src/utils/pool_process_reporting.c b/src/utils/pool_process_reporting.c
>>> >> index 9b190c7..6cfd860 100644
>>> >> --- a/src/utils/pool_process_reporting.c
>>> >> +++ b/src/utils/pool_process_reporting.c
>>> >> @@ -5,7 +5,7 @@
>>> >>   * pgpool: a language independent connection pool server for PostgreSQL
>>> >>   * written by Tatsuo Ishii
>>> >>   *
>>> >> - * Copyright (c) 2003-2015     PgPool Global Development Group
>>> >> + * Copyright (c) 2003-2016     PgPool Global Development Group
>>> >>   *
>>> >>   * Permission to use, copy, modify, and distribute this software and
>>> >>   * its documentation for any purpose and without fee is hereby
>>> >>
>>> >> _______________________________________________
>>> >> pgpool-hackers mailing list
>>> >> pgpool-hackers at pgpool.net
>>> >> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>>> >>
>>> >>
>>>
>>
>>
> 
> 
> -- 
> Ahsan Hadi
> Snr Director Product Development
> EnterpriseDB Corporation
> The Enterprise Postgres Company
> 

