[pgpool-general: 2086] Re: Fwd: node failover

Tatsuo Ishii ishii at postgresql.org
Fri Aug 30 00:30:14 JST 2013


The health check was OK, but one of the child processes (pid 15210)
could not read from or write to its socket to PostgreSQL. That is the
reason for the failover. I cannot tell the cause of the error from this
log. Maybe a PostgreSQL problem or a network error? (In this case you
should see an error in the PostgreSQL log.)
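
To find which process requested the failover and what it logged, a log
excerpt like the one quoted below can be scanned mechanically. The
following is only a minimal Python sketch, assuming the log line format
shown in this thread; unwrap and failover_triggers are hypothetical
helper names, not part of pgpool.

```python
import re

# Matches log lines in the format quoted in this thread, e.g.
# "2013-08-28 05:15:29 ERROR: pid 15210: pool_read: read failed (...)"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+):\s+pid (?P<pid>\d+): (?P<msg>.*)$"
)
REQUEST_RE = re.compile(r"degenerate_backend_set: (\d+) fail over request")


def unwrap(raw_lines):
    """Strip '> ' mail quoting and rejoin lines wrapped by the mailer.

    Assumes the input holds only log lines: any non-blank line that does
    not start a new log entry is treated as a continuation.
    """
    joined = []
    for raw in raw_lines:
        line = re.sub(r"^>\s?", "", raw.rstrip("\n"))
        if LINE_RE.match(line) or not joined:
            joined.append(line)
        elif line.strip():
            joined[-1] += " " + line.strip()
    return joined


def failover_triggers(raw_lines):
    """Return (pid, node_id, errors) for each failover request found.

    errors lists every ERROR message logged by the requesting pid.
    """
    errors = {}
    requests = []
    for line in unwrap(raw_lines):
        m = LINE_RE.match(line)
        if not m:
            continue
        pid = m.group("pid")
        if m.group("level") == "ERROR":
            errors.setdefault(pid, []).append(m.group("msg"))
        req = REQUEST_RE.match(m.group("msg"))
        if req:
            requests.append((pid, int(req.group(1))))
    return [(pid, node, errors.get(pid, [])) for pid, node in requests]
```

Called with the lines of the quoted pgpool.log, it should point at the
child process that asked for the failover together with its read/write
errors, which is the same reading of the log as given above.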

In the meantime, you should consider turning this off:

fail_over_on_backend_error = on
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.
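
If you prefer to check or flip the setting from a script rather than by
hand, something like the following could work. It is a rough Python
sketch, assuming simple "name = value" lines as in the fragment above;
get_setting and set_setting are hypothetical helpers, and the naive '#'
handling would break on a value that itself contains '#'.

```python
import re


def get_setting(conf_text, name):
    """Return the last uncommented value assigned to name, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        m = re.match(rf"{re.escape(name)}\s*=\s*(.+)$", line)
        if m:
            value = m.group(1).strip().strip("'\"")
    return value


def set_setting(conf_text, name, new_value):
    """Return conf_text with every uncommented 'name = ...' value replaced."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]*=[ \t]*)\S+", re.MULTILINE
    )
    return pattern.sub(lambda m: m.group(1) + new_value, conf_text)


if __name__ == "__main__":
    conf = (
        "fail_over_on_backend_error = on\n"
        "        # Initiates failover when reading/writing to the\n"
        "        # backend communication socket fails\n"
    )
    updated = set_setting(conf, "fail_over_on_backend_error", "off")
    print(get_setting(updated, "fail_over_on_backend_error"))
```

The edited text still has to be written back to pgpool.conf, and pgpool
has to re-read its configuration before the change takes effect.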
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hello,
> 
> I'm having a problem with failover being triggered every night for a
> reason unknown to me. I have the latest pgpool-3.3.0 version with
> PostgreSQL 9.0 nodes.
> 
> I have streaming replication and load balancing, without watchdog.
> I have enabled health checks with these parameters:
> 
> health_check_period = 5
> health_check_timeout = 20
> health_check_user = 'postgres'
> health_check_max_retries = 1
> health_check_retry_delay = 1
> 
> 
> Here is my pgpool.log with the debug option enabled:
> 
> 2013-08-28 05:15:23 DEBUG: pid 10079: s_do_auth: auth kind: 0
> 2013-08-28 05:15:23 DEBUG: pid 10079: s_do_auth: backend key data received
> 2013-08-28 05:15:23 DEBUG: pid 10079: s_do_auth: transaction state: I
> 2013-08-28 05:15:23 DEBUG: pid 10079: health check: clearing alarm
> 2013-08-28 05:15:23 DEBUG: pid 10079: health check: clearing alarm
> 2013-08-28 05:15:28 DEBUG: pid 10079: starting health checking
> 2013-08-28 05:15:28 DEBUG: pid 10079: health check: clearing alarm
> 2013-08-28 05:15:28 DEBUG: pid 10079: health_check: 0 th DB node status: 2
> 2013-08-28 05:15:28 DEBUG: pid 10079: pool_ssl: SSL requested but SSL support is not available
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: auth kind: 0
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: backend key data received
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: transaction state: I
> 2013-08-28 05:15:28 DEBUG: pid 10079: health_check: 1 th DB node status: 2
> 2013-08-28 05:15:28 DEBUG: pid 10079: pool_ssl: SSL requested but SSL support is not available
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: auth kind: 0
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: backend key data received
> 2013-08-28 05:15:28 DEBUG: pid 10079: s_do_auth: transaction state: I
> 2013-08-28 05:15:28 DEBUG: pid 10079: health check: clearing alarm
> 2013-08-28 05:15:28 DEBUG: pid 10079: health check: clearing alarm
> 2013-08-28 05:15:29 ERROR: pid 15210: pool_read: read failed (Connection timed out)
> 2013-08-28 05:15:29 LOG:   pid 15210: degenerate_backend_set: 0 fail over request from pid 15210
> 2013-08-28 05:15:29 ERROR: pid 15210: pool_flush_it: write failed to backend (0). reason: Broken pipe offset: 0 wlen: 5
> 2013-08-28 05:15:29 DEBUG: pid 10079: failover_handler called
> 2013-08-28 05:15:29 DEBUG: pid 10079: failover_handler: starting to select new master node
> 2013-08-28 05:15:29 LOG:   pid 10079: starting degeneration. shutdown host intrix-c1(5432)
> 2013-08-28 05:15:29 LOG:   pid 10079: Restart all children
> 2013-08-28 05:15:29 DEBUG: pid 10079: failover_handler: kill 18940
> 2013-08-28 05:15:29 DEBUG: pid 10079: failover_handler: kill 10507
> 2013-08-28 05:15:29 DEBUG: pid 10079: failover_handler: kill 2139
> 
> -- 
> Armin

