[pgpool-general: 2190] connection on node 1 was terminated due to conflict with recovery

Sergey Arlashin sergeyarl.maillist at gmail.com
Sat Oct 12 18:43:29 JST 2013


Hi!
We are using pgpool-II 3.3.0 with two PostgreSQL 9.2 backends in streaming replication.

Lately the slave PostgreSQL instance has been dropped from the cluster a couple of times.
Here is the relevant part of the pgpool log:

Oct 11 06:45:32 lb-node1 pgpool[5546]: connection on node 1 was terminated due to conflict with recovery
Oct 11 06:45:32 lb-node1 pgpool[5546]: do_child: exits with status 1 due to error
Oct 11 06:45:32 lb-node1 pgpool[20843]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
Oct 11 06:45:32 lb-node1 pgpool[7024]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
Oct 11 06:45:32 lb-node1 pgpool[813]: connection on node 1 was terminated due to conflict with recovery
Oct 11 06:45:32 lb-node1 pgpool[813]: do_child: exits with status 1 due to error
Oct 11 06:45:32 lb-node1 pgpool[18666]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_read: read failed (Connection reset by peer)
Oct 11 06:45:32 lb-node1 pgpool[20843]: degenerate_backend_set: 1 fail over request from pid 20843
Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_flush_it: write failed to backend (1). reason: Broken pipe offset: 0 wlen: 5
Oct 11 06:45:32 lb-node1 pgpool[18609]: starting degeneration. shutdown host db-node2.site(5432)
Oct 11 06:45:32 lb-node1 pgpool[18609]: Restart all children
Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_flush_it: write failed to backend (1). reason: Broken pipe offset: 0 wlen: 5
Oct 11 06:45:32 lb-node1 pgpool[18609]: execute command: /etc/pgpool2/scripts/failover.sh 1 0 db-node1.site /var/lib/postgresql/9.2/main/switch_master
Oct 11 06:45:32 lb-node1 pgpool[18609]: find_primary_node_repeatedly: waiting for finding a primary node
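
To make the sequence easier to follow, here is a minimal Python sketch that groups the pgpool messages by process ID and picks out which process asked for the failover. It assumes the syslog-style line format shown above, and it assumes that the number after "degenerate_backend_set:" is the backend node ID. The function names messages_by_pid and failover_requesters are hypothetical helpers, not part of pgpool.

```python
import re

# "pgpool[<pid>]: <message>" as it appears in the syslog lines above.
PGPOOL_RE = re.compile(r"pgpool\[(\d+)\]: (.*)$")
# Assumption: the first number is the backend node ID, the second the requesting pid.
FAILOVER_RE = re.compile(
    r"degenerate_backend_set: (\d+) fail over request from pid (\d+)"
)


def messages_by_pid(lines):
    """Map each pgpool pid to the list of messages it logged, in order."""
    by_pid = {}
    for line in lines:
        m = PGPOOL_RE.search(line)
        if m:
            by_pid.setdefault(int(m.group(1)), []).append(m.group(2))
    return by_pid


def failover_requesters(lines):
    """Return (node_id, pid) pairs for every failover request found."""
    found = []
    for line in lines:
        m = FAILOVER_RE.search(line)
        if m:
            found.append((int(m.group(1)), int(m.group(2))))
    return found


# Three lines copied from the log excerpt above.
sample = [
    "Oct 11 06:45:32 lb-node1 pgpool[5546]: connection on node 1 was terminated due to conflict with recovery",
    "Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_read: read failed (Connection reset by peer)",
    "Oct 11 06:45:32 lb-node1 pgpool[20843]: degenerate_backend_set: 1 fail over request from pid 20843",
]

for node_id, pid in failover_requesters(sample):
    print("node", node_id, "failover requested by pid", pid)
    for message in messages_by_pid(sample)[pid]:
        print("   ", message)
```

In the excerpt above, the only failover request comes from pid 20843, whose other messages are read and write failures rather than the "conflict with recovery" message logged by pids 5546 and 813.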

Here is the corresponding part of postgresql.log from the slave instance:

2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
	SQL function "get_visited_urls" statement 1
2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
	SQL function "get_visited_urls" statement 1
2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
	SQL function "get_visited_urls" statement 1
2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
	SQL function "get_visited_urls" statement 1
2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
	SQL function "get_visited_urls" statement 1
2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-10-11 06:45:32 UTC LOG:  unexpected EOF on client connection with an open transaction
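
To separate canceled statements (ERROR) from terminated connections (FATAL), a small Python sketch like the one below can tally the recovery-conflict entries. It assumes the log line prefix shown above (timestamp, time zone, severity); the function name tally_conflicts is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
# Assumption: the log_line_prefix is "<date> <time> <tz> " as in the excerpt above.
CONFLICT_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \w+ (ERROR|FATAL):\s+"
    r".*conflict with recovery\s*$"
)


def tally_conflicts(lines):
    """Count recovery-conflict log entries per severity (ERROR or FATAL)."""
    counts = Counter()
    for line in lines:
        m = CONFLICT_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Four lines copied from the postgresql.log excerpt above.
sample = [
    "2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery",
    "2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.",
    "2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery",
    "2013-10-11 06:45:32 UTC LOG:  unexpected EOF on client connection with an open transaction",
]

counts = tally_conflicts(sample)
print("canceled statements:", counts["ERROR"])
print("terminated connections:", counts["FATAL"])
```

Run against the full excerpt above, this counts five canceled statements and three terminated connections.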

So I assume "terminating connection due to conflict with recovery" was the reason why the slave instance was marked as faulty. Is that correct?
Could you please tell me the proper way to handle this type of error?

Thanks in advance.


--
Best regards,
Sergey Arlashin


