[pgpool-general: 1894] Re: Pgpool is unable to connect backend PostgreSQL

Tatsuo Ishii ishii at postgresql.org
Mon Jul 15 11:00:57 JST 2013


> Hi
> I am hitting the same issue as described in the mail [pgpool-general: 1815]
> Pgpool is unable to connect backend PostgreSQL.

I guess [pgpool-general: 1815] is different from your case (my guess is
that one is somewhat related to an Amazon EC2 environment problem).
Moreover, your case seems very unique and strange.

> While connected to a single postgres node, after a while pgpool loses
> connection to a running postgres db, restarts all children processes and
> stays in running state unable to connect to db.
> 
> Pgpool version 3.2.3
> Postgres version 9.2.4
> 
> Part of the log:
> --------------------
> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: connection received:
> host=10.4.225.120 port=41090

Process 11465 is a pgpool child process, which handles client
connections and does the actual pgpool work.

> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: Protocol Major: 3 Minor: 0
> database: hpadb user: hpauser
> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: new_connection: connecting
> 0 backend
> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]:
> connect_inet_domain_socket_by_port: health check timer expired

This seems very strange. The error comes from this code:

		if (health_check_timer_expired)		/* has health check timer expired */
		{
			pool_log("connect_inet_domain_socket_by_port: health check timer expired");
			close(fd);
			return -1;
		}

"health_check_timer_expired" is a global variable used in pgpool main
process, which is responsible for management of pgpool, including:
health check, failover etc. The variable is only meaningful in the
main process and should never be set to non-zero in a pgpool child. Moreover,
the only place that sets the variable to non-zero is the signal handler, which
is installed by the main process.

Once the error occurs, pgpool starts failover, as you can see in the log.
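
With a single backend, the failover has nowhere to go, which matches
the "no valid DB node found" message in your log. A rough sketch of the
idea, assuming a hypothetical select_new_master function and node_status
enum (these are illustrative names, not pgpool's real definitions):

```c
#include <stddef.h>

/* Hypothetical node states for this sketch. */
enum node_status { NODE_DOWN = 0, NODE_UP = 1 };

/* Return the index of the first node that is still up, or -1 when
 * no valid node remains. */
int select_new_master(const enum node_status *nodes, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
	{
		if (nodes[i] == NODE_UP)
			return (int) i;
	}
	return -1;
}
```

Once the only node (node 0) has been marked down, a search like this
finds no candidate and returns -1.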

I've never seen this kind of report before. What kind of pgpool
installation are you using (compiled from source code or installed
from packages)? What platform are you running on? What does your
pgpool.conf look like?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: connection to
> purple1_node1_ps(5432) failed
> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: new_connection: create_cp()
> failed
> Jul 12 09:32:14 purple1-node1-ps pgpool[11465]: degenerate_backend_set: 0
> fail over request from pid 11465
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler called
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: starting
> to select new master node
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: starting degeneration.
> shutdown host purple1_node1_ps(5432)
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: no valid
> DB node found
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: Restart all children
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 4388
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 9597
> Jul 12 09:32:14 purple1-node1-ps pgpool[18648]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[4388]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps rsyslogd-2177: imuxsock lost 85 messages
> from pid 9597 due to rate-limiting
> Jul 12 09:32:14 purple1-node1-ps pgpool[9597]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 18648
> Jul 12 09:32:14 purple1-node1-ps pgpool[29409]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 29409
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 11454
> Jul 12 09:32:14 purple1-node1-ps pgpool[14323]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[11454]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 14323
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 22349
> Jul 12 09:32:14 purple1-node1-ps pgpool[22349]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 23617
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 29410
> Jul 12 09:32:14 purple1-node1-ps pgpool[31511]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[29410]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps pgpool[14317]: failover_handler: kill 31511
> Jul 12 09:32:14 purple1-node1-ps pgpool[4385]: child received shutdown
> request signal 3
> Jul 12 09:32:14 purple1-node1-ps rsyslogd-2177: imuxsock lost 757 messages
> from pid 23617 due to rate-limiting
> 
> Could you please explain?
> Thanks
> Larisa.

