[pgpool-general: 2191] Re: connection on node 1 was terminated due to conflict with recovery

Tatsuo Ishii ishii at postgresql.org
Sat Oct 12 23:11:50 JST 2013


FAQ #1.17 might help you.

http://www.pgpool.net/mediawiki/index.php/FAQ#I_see_standby_servers_go_down_status_in_steaming_replication_mode_and_see_PostgreSQL_messages_.22terminating_connection_due_to_conflict.22_Why.3F
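
On the client side, the HINT in the PostgreSQL log quoted below ("you should be able to reconnect to the database and repeat your command") can be sketched as a small retry wrapper. This is only an illustration in Python, not part of pgpool-II or the FAQ. The names run_with_retry and run_query and the back-off values are hypothetical, and run_query is assumed to open a fresh connection on every call. The sketch matches on the message text seen in the logs rather than on any driver-specific error class.

```python
import time

# Message fragment taken from the PostgreSQL log lines quoted in this thread.
CONFLICT_MARKER = "conflict with recovery"


def run_with_retry(run_query, attempts=3, delay=0.5, sleep=time.sleep):
    """Call run_query(); retry only when it fails with a recovery conflict.

    run_query is a hypothetical zero-argument callable that is assumed to
    reconnect and re-issue the statement each time it is called.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except Exception as exc:
            # Re-raise unrelated errors immediately, and give up after
            # the final attempt.
            if CONFLICT_MARKER not in str(exc) or attempt == attempts:
                raise
            # Linear back-off before the next attempt.
            sleep(delay * attempt)
```

A retry like this only hides the error from the application. It does not stop pgpool-II from detaching the standby, which is what the FAQ entry addresses.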
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hi! 
> We are using pgpool-II 3.3.0 with two PostgreSQL 9.2 backends in streaming replication.
> 
> Lately the slave PostgreSQL instance has dropped out of the cluster a couple of times.
> Here is the relevant part of the pgpool log:
> 
> Oct 11 06:45:32 lb-node1 pgpool[5546]: connection on node 1 was terminated due to conflict with recovery
> Oct 11 06:45:32 lb-node1 pgpool[5546]: do_child: exits with status 1 due to error
> Oct 11 06:45:32 lb-node1 pgpool[20843]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
> Oct 11 06:45:32 lb-node1 pgpool[7024]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
> Oct 11 06:45:32 lb-node1 pgpool[813]: connection on node 1 was terminated due to conflict with recovery
> Oct 11 06:45:32 lb-node1 pgpool[813]: do_child: exits with status 1 due to error
> Oct 11 06:45:32 lb-node1 pgpool[18666]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
> Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_read: read failed (Connection reset by peer)
> Oct 11 06:45:32 lb-node1 pgpool[20843]: degenerate_backend_set: 1 fail over request from pid 20843
> Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_flush_it: write failed to backend (1). reason: Broken pipe offset: 0 wlen: 5
> Oct 11 06:45:32 lb-node1 pgpool[18609]: starting degeneration. shutdown host db-node2.site(5432)
> Oct 11 06:45:32 lb-node1 pgpool[18609]: Restart all children
> Oct 11 06:45:32 lb-node1 pgpool[20843]: pool_flush_it: write failed to backend (1). reason: Broken pipe offset: 0 wlen: 5
> Oct 11 06:45:32 lb-node1 pgpool[18609]: execute command: /etc/pgpool2/scripts/failover.sh 1 0 db-node1.site /var/lib/postgresql/9.2/main/switch_master
> Oct 11 06:45:32 lb-node1 pgpool[18609]: find_primary_node_repeatedly: waiting for finding a primary node
> 
> Here is the relevant part of postgresql.log from the slave instance:
> 
> 2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
> 	SQL function "get_visited_urls" statement 1
> 2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
> 2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
> 	SQL function "get_visited_urls" statement 1
> 2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
> 2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
> 	SQL function "get_visited_urls" statement 1
> 2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
> 2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
> 	SQL function "get_visited_urls" statement 1
> 2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
> 2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2013-10-11 06:45:32 UTC ERROR:  canceling statement due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC CONTEXT:  SQL function "get_string_value" statement 1
> 	SQL function "get_visited_urls" statement 1
> 2013-10-11 06:45:32 UTC STATEMENT:  select vac.get_visited_urls((4))
> 2013-10-11 06:45:32 UTC FATAL:  terminating connection due to conflict with recovery
> 2013-10-11 06:45:32 UTC DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-10-11 06:45:32 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2013-10-11 06:45:32 UTC LOG:  unexpected EOF on client connection with an open transaction
> 
> So I assume "terminating connection due to conflict with recovery" is the reason the slave instance was marked as faulty. Is that correct?
> Could you please tell me the proper way to handle this type of error?
> 
> Thanks in advance.
> 
> 
> --
> Best regards,
> Sergey Arlashin