[pgpool-general: 901] Re: How can I automate actions when synchronous standby fails?

Sat Aug 18 14:27:09 JST 2012

> I'm thinking of using pgpool-II 3.2 to automate failover in
> synchronous streaming replication.  Please let me ask some questions
> for which I couldn't find clear answers in the pgpool-II manual.
> 
> The system consists of the following nodes:
> 
> dbnode0: DB server (initially primary server)
> dbnode1: DB server (initially standby server)
> appnode1...appnodeN: application servers; Java EE servers and pgpool
> run on all of them
> 
> [relevant settings in pgpool.conf]
> backend_hostname0 = dbnode0
> backend_port0 = 5432
> backend_hostname1 = dbnode1
> backend_port1 = 5432
> 
> According to the current specification of synchronous streaming
> replication, the applications hang when the standby goes down.  To
> resume those hung applications, I want to set
> synchronous_standby_names to '' and reload postgresql.conf
> automatically when the standby stops for any reason.  The related
> description in the manual is:
> 
> http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION
> 
> [excerpt]
> If you really do lose your last standby server then you should disable
> synchronous_standby_names and reload the configuration file on the
> primary server.
> 
> 
> Q1
> How can I achieve this with pgpool?  Is failover_command invoked when
> the standby stops working?

Yes. failover_command is invoked whenever a backend node is detached,
and that includes the standby.
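
As a rough sketch of what such a failover_command script might do on
the primary, the Python below rewrites the text of postgresql.conf so
that synchronous_standby_names is empty. The helper name
disable_sync_standby is hypothetical, and the sketch assumes the script
is able to read and write the file on the primary; pgpool itself
provides none of this.

```python
import re

# Matches an active (uncommented) synchronous_standby_names line.
# [ \t]* is used instead of \s* so the match never spans blank lines.
_SETTING = re.compile(r"^[ \t]*synchronous_standby_names[ \t]*=.*$", re.MULTILINE)


def disable_sync_standby(conf_text: str) -> str:
    """Return postgresql.conf text with synchronous_standby_names set to ''.

    Hypothetical helper for illustration: every active line for the
    setting is replaced; if there is none, one is appended at the end.
    """
    new_line = "synchronous_standby_names = ''"
    if _SETTING.search(conf_text):
        return _SETTING.sub(new_line, conf_text)
    # Add a newline separator only when the file does not already end with one.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + sep + new_line + "\n"
```

After writing the result back, the script would still have to make the
primary reload its configuration, for example with "pg_ctl reload" or
"SELECT pg_reload_conf()". Applying the rewrite a second time yields
the same text, so a repeated invocation does not change the file again.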

> Q2
> What do the following special characters mean in failover_command
> description?  How does "master" differ from "primary"?  In my
> configuration, what values do they provide when the standby (dbnode1)
> goes down?
> 
> %M Old master node ID.
> %P Old primary node ID.

Usually they are the same. They can differ when there is no primary
node at all, for example when promotion to primary has failed.
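
To tell a standby failure from a primary failure inside the script,
one option is to compare the failed node ID (%d) with the old primary
node ID (%P). The following is only a minimal sketch: the function
name is hypothetical, it assumes the script receives both values as
integers, and treating a negative %P as "no primary" is an assumption
made for illustration.

```python
def classify_failure(failed_node_id: int, old_primary_id: int) -> str:
    """Classify a failover event from the %d and %P values.

    Assumption (not confirmed here): a negative old_primary_id means
    pgpool knew of no primary node when the event happened.
    """
    if old_primary_id < 0:
        return "no-primary"
    if failed_node_id == old_primary_id:
        return "primary-failed"
    # Any other node going down is treated as a standby failure.
    return "standby-failed"
```

In the configuration above, dbnode1 (node 1) going down while dbnode0
(node 0) is the primary would correspond to classify_failure(1, 0),
which returns "standby-failed". That is the case in which
synchronous_standby_names should be cleared on the primary.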

> Q3
> What kind of problems could occur when many pgpool instances on the
> application servers invoke failover_command simultaneously and
> independently of one another?  What should I do to avoid those
> potential problems?

If you turn on the watchdog, the second failover attempt will fail.

> Q4
> I found the below sentence in pgpool manual.  Does this apply even
> when the standby fails?  If yes, I would like to know any workaround
> or reason, because I believe standby failure should not affect
> application processing which is performed on the normal primary.
> 
> "When a failover is performed, pgpool kills all its child processes,
> which will in turn terminate all active sessions to pgpool."

Yes, it applies to a standby failure as well. The reason is given in
this comment, excerpted from main.c:
/*
 * Before we tried to minimize restarting pgpool to protect existing
 * connections from clients to pgpool children. What we did here was,
 * if children other than master went down, we did not fail over.
 * This is wrong. Think about following scenario. If someone
 * accidentally plugs out the network cable, the TCP/IP stack keeps
 * retrying for long time (typically 2 hours). The only way to stop
 * the retry is restarting the process.  Bottom line is, we need to
 * restart all children in any case.  See pgpool-general list posting
 * "TCP connections are *not* closed when a backend timeout" on Jul 13
 * 2008 for more details.
 */
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp