[pgpool-general: 2386] Help error code e1012 on pgpool II 3.3.0 while clicking Recovery button

Syed Irfan syedirfan_77 at yahoo.com
Thu Jan 9 20:01:35 JST 2014


This is the second time i am asking the question and still have not got any reply. Any kind of help would be appreciated.


I have configured as per "Simple Streaming replication setting with pgpool-II(multiple servers version)" http://www.pgpool.net/pgpool-web/contrib_docs/simple_sr_setting2_3.0/ by Tatsuo Ishii. I tested the pgpool for some test case scenario and found error's while using "Recovery" button.

Scenario:
           3 Linux systems with 1st system(172.16.80.47) pgpool II 3.3.0 other two systems are in master/slave SR mode that is
system Two (172.16.80.49) has primary database of postgresql 9.1.11 and
Third system (172.16.80.50) has standby database with postgresql 9.1.11.
When i login to pgpool it shows 172.16.80.49 is Running as primary and 172.16.80.50 as standby.
When i shutdown manually standby i.e 172.16.80.50 pgpool updates the status that standby is down and Recover button allows me to recover the standby database successfully in 5 seconds.
If i shutdown 172.16.80.49 primary database the pgpool status updates that primary is down and standby takes over as new primary so now 172.16.80.50 shows Running as Primary and 172.16.80.49 shows down with Recovery button enabled now
If i click the recovery button of old primary i.e 172.16.80.49 pgpool shows status as Running in standby.
Now if i shut down the new primary i.e 172.16.80.50 the pgpool status shows 172.16.80.50 as down and 172.16.80.49 takes the role of primary from standby and Recovery button is enabled for 172.16.80.50.
Now the problem starts when i click on Recovery button of 172.16.80.50 it immediately pops Error
Error code e1012
Error Message pcp_recovery_node command error occurred. 

My question is why does the Recovery for 172.16.80.50 fails the second time while first it ran successfully. The logs are shown below for pgpool and postgres db of 172.16.80.49 & 50 kindly bear with me cause the logs are quite lenghty

For pgpool log is as follows:-


2014-01-09 20:54:19 DEBUG: pid 30976: key: pid_file_name

2014-01-09 20:54:19 DEBUG: pid 30976: value: '/var/run/pgpool/pgpool.pid' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: logdir
2014-01-09 20:54:19 DEBUG: pid 30976: value: '/var/log/pgpool' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: connection_cache
2014-01-09 20:54:19 DEBUG: pid 30976: value: on kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: reset_query_list
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'ABORT; DISCARD ALL' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token: ABORT
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token:  DISCARD ALL
2014-01-09 20:54:19 DEBUG: pid 30976: key: replication_mode
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: replicate_select
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: replicate_select: 0
2014-01-09 20:54:19 DEBUG: pid 30976: key: insert_lock
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: lobj_lock_table
2014-01-09 20:54:19 DEBUG: pid 30976: value: '' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: replication_stop_on_mismatch
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: replication_stop_on_mismatch: 0
2014-01-09 20:54:19 DEBUG: pid 30976: key: failover_if_affected_tuples_mismatch
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: failover_if_affected_tuples_mismatch: 0
2014-01-09 20:54:19 DEBUG: pid 30976: key: load_balance_mode
2014-01-09 20:54:19 DEBUG: pid 30976: value: on kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: ignore_leading_white_space
2014-01-09 20:54:19 DEBUG: pid 30976: value: on kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: white_function_list
2014-01-09 20:54:19 DEBUG: pid 30976: value: '' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: black_function_list
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'currval,lastval,nextval,setval' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token: currval
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token: lastval
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token: nextval
2014-01-09 20:54:19 DEBUG: pid 30976: extract_string_tokens: token: setval
2014-01-09 20:54:19 DEBUG: pid 30976: add_to_patterns: regex pattern: ^currval$
2014-01-09 20:54:19 DEBUG: pid 30976: add_to_patterns: regex pattern: ^lastval$
2014-01-09 20:54:19 DEBUG: pid 30976: add_to_patterns: regex pattern: ^nextval$
2014-01-09 20:54:19 DEBUG: pid 30976: add_to_patterns: regex pattern: ^setval$
2014-01-09 20:54:19 DEBUG: pid 30976: key: master_slave_mode
2014-01-09 20:54:19 DEBUG: pid 30976: value: on kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: master_slave_sub_mode
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'stream' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: sr_check_period
2014-01-09 20:54:19 DEBUG: pid 30976: value: 10 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: sr_check_user
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: sr_check_password
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: delay_threshold
2014-01-09 20:54:19 DEBUG: pid 30976: value: 10000000 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: follow_master_command
2014-01-09 20:54:19 DEBUG: pid 30976: value: '' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: parallel_mode
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: enable_query_cache
2014-01-09 20:54:19 DEBUG: pid 30976: value: off kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: pgpool2_hostname
2014-01-09 20:54:19 DEBUG: pid 30976: value: '' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: health_check_period
2014-01-09 20:54:19 DEBUG: pid 30976: value: 10 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: health_check_timeout
2014-01-09 20:54:19 DEBUG: pid 30976: value: 20 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: health_check_user
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: health_check_password
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: failover_command
2014-01-09 20:54:19 DEBUG: pid 30976: value: '/usr/local/etc/failover.sh %d "%h" %p %D %m %M "%H" %P' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: failback_command
2014-01-09 20:54:19 DEBUG: pid 30976: value: '/bin/rm -f /var/log/pgpool/trigger/trigger_file1' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: fail_over_on_backend_error
2014-01-09 20:54:19 DEBUG: pid 30976: value: on kind: 1
2014-01-09 20:54:19 DEBUG: pid 30976: key: recovery_user
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: recovery_password
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'postgres' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: recovery_1st_stage_command
2014-01-09 20:54:19 DEBUG: pid 30976: value: 'basebackup.sh' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: recovery_2nd_stage_command
2014-01-09 20:54:19 DEBUG: pid 30976: value: '' kind: 4
2014-01-09 20:54:19 DEBUG: pid 30976: key: recovery_timeout
2014-01-09 20:54:19 DEBUG: pid 30976: value: 90 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: client_idle_limit_in_recovery
2014-01-09 20:54:19 DEBUG: pid 30976: value: 0 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: key: relcache_expire
2014-01-09 20:54:19 DEBUG: pid 30976: value: 0 kind: 2
2014-01-09 20:54:19 DEBUG: pid 30976: num_backends: 2 total_weight: 2.000000
2014-01-09 20:54:19 DEBUG: pid 30976: backend 0 weight: 1073741823.500000
2014-01-09 20:54:19 DEBUG: pid 30976: backend 0 flag: 0000
2014-01-09 20:54:19 DEBUG: pid 30976: backend 1 weight: 1073741823.500000
2014-01-09 20:54:19 DEBUG: pid 30976: backend 1 flag: 0000
2014-01-09 20:54:19 LOG:   pid 30976: Backend status file /var/log/pgpool/pgpool_status discarded
2014-01-09 20:54:19 DEBUG: pid 30977: I am 30977
2014-01-09 20:54:19 DEBUG: pid 30977: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30978: I am 30978
2014-01-09 20:54:19 DEBUG: pid 30979: I am 30979
2014-01-09 20:54:19 DEBUG: pid 30978: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30979: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30980: I am 30980
2014-01-09 20:54:19 DEBUG: pid 30980: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30982: I am 30982
2014-01-09 20:54:19 DEBUG: pid 30981: I am 30981
2014-01-09 20:54:19 DEBUG: pid 30982: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30983: I am 30983
2014-01-09 20:54:19 DEBUG: pid 30983: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30984: I am 30984
2014-01-09 20:54:19 DEBUG: pid 30984: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30985: I am 30985
2014-01-09 20:54:19 DEBUG: pid 30985: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30987: I am 30987
2014-01-09 20:54:19 DEBUG: pid 30988: I am 30988
2014-01-09 20:54:19 DEBUG: pid 30988: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30987: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30986: I am 30986
2014-01-09 20:54:19 DEBUG: pid 30989: I am 30989
2014-01-09 20:54:19 DEBUG: pid 30986: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30989: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30990: I am 30990
2014-01-09 20:54:19 DEBUG: pid 30990: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30992: I am 30992
2014-01-09 20:54:19 DEBUG: pid 30991: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30992: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30994: I am 30994
2014-01-09 20:54:19 DEBUG: pid 30995: I am 30995
2014-01-09 20:54:19 DEBUG: pid 30996: I am 30996
2014-01-09 20:54:19 DEBUG: pid 30994: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30995: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30996: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30997: I am 30997
2014-01-09 20:54:19 DEBUG: pid 30997: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30998: I am 30998
2014-01-09 20:54:19 DEBUG: pid 30999: I am 30999
2014-01-09 20:54:19 DEBUG: pid 30998: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30999: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31001: I am 31001
2014-01-09 20:54:19 DEBUG: pid 31001: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31000: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30993: I am 30993
2014-01-09 20:54:19 DEBUG: pid 30993: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31002: I am 31002
2014-01-09 20:54:19 DEBUG: pid 31002: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31003: I am 31003
2014-01-09 20:54:19 DEBUG: pid 31003: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31005: I am 31005
2014-01-09 20:54:19 DEBUG: pid 31004: I am 31004
2014-01-09 20:54:19 DEBUG: pid 31005: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31006: I am 31006
2014-01-09 20:54:19 LOG:   pid 30976: pgpool-II successfully started. version 3.3.0 (tokakiboshi)
2014-01-09 20:54:19 DEBUG: pid 31006: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31007: I am 31007
2014-01-09 20:54:19 DEBUG: pid 31007: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31008: I am 31008
2014-01-09 20:54:19 DEBUG: pid 31008: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31010: I am 31010
2014-01-09 20:54:19 DEBUG: pid 31010: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 30976: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:19 DEBUG: pid 31010: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:19 DEBUG: pid 31010: s_do_auth: auth kind: 0
2014-01-09 20:54:19 DEBUG: pid 31010: s_do_auth: backend key data received
2014-01-09 20:54:19 DEBUG: pid 31010: s_do_auth: transaction state: I
2014-01-09 20:54:19 ERROR: pid 31010: connect_inet_domain_socket: getsockopt() detected error: Connection refused
2014-01-09 20:54:19 ERROR: pid 31010: make_persistent_db_connection: connection to 172.16.80.50(5432) failed
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: extended:0 query:SELECT pg_current_xlog_location()
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: auth kind: 0
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: backend key data received
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: transaction state: I
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: extended:0 query:SELECT pg_is_in_recovery()
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: T
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: row description received
2014-01-09 20:54:19 DEBUG: pid 30976: num_fileds: 1
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: D
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: data row received
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: row description received
2014-01-09 20:54:19 DEBUG: pid 31010: num_fileds: 1
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: kind: D
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: data row received
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: Ready for query
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: kind: C
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: Command complete received
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: kind: Z
2014-01-09 20:54:19 DEBUG: pid 31010: do_query: Ready for query
2014-01-09 20:54:19 DEBUG: pid 31010: check_replication_time_lag: DB node is valid but no persistent connection
2014-01-09 20:54:19 ERROR: pid 31010: check_replication_time_lag: could not connect to DB node 1, check sr_check_user and sr_check_password
2014-01-09 20:54:19 DEBUG: pid 30976: starting health checking
2014-01-09 20:54:19 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:19 DEBUG: pid 30976: health_check: 0 th DB node status: 1
2014-01-09 20:54:19 DEBUG: pid 30976: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: auth kind: 0
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: backend key data received
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: transaction state: I
2014-01-09 20:54:19 DEBUG: pid 30976: health_check: 1 th DB node status: 1
2014-01-09 20:54:19 ERROR: pid 30976: connect_inet_domain_socket: getsockopt() detected error: Connection refused
2014-01-09 20:54:19 ERROR: pid 30976: make_persistent_db_connection: connection to 172.16.80.50(5432) failed
2014-01-09 20:54:19 DEBUG: pid 30976: health_check: 0 th DB node status: 1
2014-01-09 20:54:19 DEBUG: pid 30976: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: auth kind: 0
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: backend key data received
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: transaction state: I
2014-01-09 20:54:19 DEBUG: pid 30976: health_check: 1 th DB node status: 1
2014-01-09 20:54:19 ERROR: pid 30976: connect_inet_domain_socket: getsockopt() detected error: Connection refused
2014-01-09 20:54:19 ERROR: pid 30976: make_persistent_db_connection: connection to 172.16.80.50(5432) failed
2014-01-09 20:54:19 ERROR: pid 30976: health check failed. 1 th host 172.16.80.50 at port 5432 is down
2014-01-09 20:54:19 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:19 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:19 LOG:   pid 30976: set 1 th backend down status
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler called
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: starting to select new master node
2014-01-09 20:54:19 LOG:   pid 30976: starting degeneration. shutdown host 172.16.80.50(5432)
2014-01-09 20:54:19 LOG:   pid 30976: Restart all children
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30977
2014-01-09 20:54:19 DEBUG: pid 30977: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30978
2014-01-09 20:54:19 DEBUG: pid 30978: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30979
2014-01-09 20:54:19 DEBUG: pid 30980: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30980
2014-01-09 20:54:19 DEBUG: pid 30981: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30981
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30982
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30983
2014-01-09 20:54:19 DEBUG: pid 30982: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30984: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30984
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30985
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30986
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30987
2014-01-09 20:54:19 DEBUG: pid 30986: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30988: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30988
2014-01-09 20:54:19 DEBUG: pid 30987: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30989
2014-01-09 20:54:19 DEBUG: pid 30990: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30990
2014-01-09 20:54:19 DEBUG: pid 30991: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30991
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30992
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30993
2014-01-09 20:54:19 DEBUG: pid 30992: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30994: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30994
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30995
2014-01-09 20:54:19 DEBUG: pid 30995: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30996: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30996
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30997
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30998
2014-01-09 20:54:19 DEBUG: pid 30997: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30998: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 30999
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31000
2014-01-09 20:54:19 DEBUG: pid 31000: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31001
2014-01-09 20:54:19 DEBUG: pid 31001: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 31003: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31003
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31004
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31005
2014-01-09 20:54:19 DEBUG: pid 31004: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 31006: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 31005: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31006
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31007
2014-01-09 20:54:19 DEBUG: pid 30976: failover_handler: kill 31008
2014-01-09 20:54:19 LOG:   pid 30976: execute command: /usr/local/etc/failover.sh 1 "172.16.80.50" 5432 /usr/local/pgsql/data 0 0 "172.16.80.49" 0
2014-01-09 20:54:19 DEBUG: pid 31008: child received shutdown request signal 3
2014-01-09 20:54:19 DEBUG: pid 31007: child received shutdown request signal 3
+ failed_node_id=1
+ failed_host_name=172.16.80.50
+ failed_port=5432
+ failed_db_cluster=/usr/local/pgsql/data
+ new_master_id=0
+ old_master_id=0
+ new_master_host_name=172.16.80.49
+ old_primary_node_id=0
+ trigger=/var/log/pgpool/trigger/trigger_file1
+ '[' 1 = 0 ']'
2014-01-09 20:54:19 LOG:   pid 30976: find_primary_node_repeatedly: waiting for finding a primary node
2014-01-09 20:54:19 DEBUG: pid 30976: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: auth kind: 0
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: backend key data received
2014-01-09 20:54:19 DEBUG: pid 30976: s_do_auth: transaction state: I
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: extended:0 query:SELECT pg_is_in_recovery()
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: T
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: row description received
2014-01-09 20:54:19 DEBUG: pid 30976: num_fileds: 1
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: D
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: data row received
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: C
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: Command complete received
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: kind: Z
2014-01-09 20:54:19 DEBUG: pid 30976: do_query: Ready for query
2014-01-09 20:54:19 LOG:   pid 30976: find_primary_node: primary node id is 0
2014-01-09 20:54:19 LOG:   pid 30976: failover: set new primary node: 0
2014-01-09 20:54:19 LOG:   pid 30976: failover: set new master node: 0
2014-01-09 20:54:19 DEBUG: pid 31012: I am 31012
2014-01-09 20:54:19 DEBUG: pid 31012: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31014: I am 31014
2014-01-09 20:54:19 DEBUG: pid 31013: I am 31013
2014-01-09 20:54:19 DEBUG: pid 31014: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31013: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31015: I am 31015
2014-01-09 20:54:19 DEBUG: pid 31015: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31016: I am 31016
2014-01-09 20:54:19 DEBUG: pid 31016: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31017: I am 31017
2014-01-09 20:54:19 DEBUG: pid 31017: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31018: I am 31018
2014-01-09 20:54:19 DEBUG: pid 31018: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31019: I am 31019
2014-01-09 20:54:19 DEBUG: pid 31020: I am 31020
2014-01-09 20:54:19 DEBUG: pid 31019: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31020: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31021: I am 31021
2014-01-09 20:54:19 DEBUG: pid 31021: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31022: I am 31022
2014-01-09 20:54:19 DEBUG: pid 31022: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31024: I am 31024
2014-01-09 20:54:19 DEBUG: pid 31023: I am 31023
2014-01-09 20:54:19 DEBUG: pid 31024: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31023: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31025: I am 31025
2014-01-09 20:54:19 DEBUG: pid 31025: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31026: I am 31026
2014-01-09 20:54:19 DEBUG: pid 31027: I am 31027
2014-01-09 20:54:19 DEBUG: pid 31027: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31028: I am 31028
2014-01-09 20:54:19 DEBUG: pid 31029: I am 31029
2014-01-09 20:54:19 DEBUG: pid 31028: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31029: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31030: I am 31030
2014-01-09 20:54:19 DEBUG: pid 31030: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31031: I am 31031
2014-01-09 20:54:19 DEBUG: pid 31032: I am 31032
2014-01-09 20:54:19 DEBUG: pid 31031: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31032: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31033: I am 31033
2014-01-09 20:54:19 DEBUG: pid 31033: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31034: I am 31034
2014-01-09 20:54:19 DEBUG: pid 31036: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31034: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31037: I am 31037
2014-01-09 20:54:19 DEBUG: pid 31037: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31039: I am 31039
2014-01-09 20:54:19 DEBUG: pid 31038: I am 31038
2014-01-09 20:54:19 DEBUG: pid 31038: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 LOG:   pid 31010: worker process received restart request
2014-01-09 20:54:19 DEBUG: pid 31042: I am 31042
2014-01-09 20:54:19 DEBUG: pid 31041: I am 31041
2014-01-09 20:54:19 DEBUG: pid 31042: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31043: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31041: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 LOG:   pid 30976: failover done. shutdown host 172.16.80.50(5432)
2014-01-09 20:54:19 DEBUG: pid 31040: I am 31040
2014-01-09 20:54:19 DEBUG: pid 31035: I am 31035
2014-01-09 20:54:19 DEBUG: pid 31040: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:19 DEBUG: pid 31035: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:20 LOG:   pid 31009: pcp child process received restart request
2014-01-09 20:54:20 LOG:   pid 30976: PCP child 31009 exits with status 256 in failover()
2014-01-09 20:54:20 LOG:   pid 30976: fork a new PCP child pid 31046 in failover()
2014-01-09 20:54:20 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:20 DEBUG: pid 30976: reap_handler called
2014-01-09 20:54:20 DEBUG: pid 30976: reap_handler: call wait3
2014-01-09 20:54:20 DEBUG: pid 30976: child 30977 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30978 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30979 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30980 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30981 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30982 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30983 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30984 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30985 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30986 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30987 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30988 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30989 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30990 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30991 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30992 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30993 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30994 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30995 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30996 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30997 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30998 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 30999 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31000 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31001 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31002 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31003 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31004 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31005 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31006 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31007 exits with status 0
2014-01-09 20:54:20 DEBUG: pid 30976: child 31008 exits with status 0
2014-01-09 20:54:20 LOG:   pid 30976: worker child 31010 exits with status 256
2014-01-09 20:54:20 LOG:   pid 30976: fork a new worker child pid 31047
2014-01-09 20:54:20 DEBUG: pid 30976: reap_handler: normally exited
2014-01-09 20:54:20 DEBUG: pid 31047: I am 31047
2014-01-09 20:54:20 DEBUG: pid 31047: pool_initialize_private_backend_status: initialize backend status
2014-01-09 20:54:20 DEBUG: pid 31047: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:20 DEBUG: pid 31047: s_do_auth: auth kind: 0
2014-01-09 20:54:20 DEBUG: pid 31047: s_do_auth: backend key data received
2014-01-09 20:54:20 DEBUG: pid 31047: s_do_auth: transaction state: I
2014-01-09 20:54:28 LOG:   pid 31046: starting recovering node 1
2014-01-09 20:54:28 LOG:   pid 31046: starting recovery command: "SELECT pgpool_recovery('basebackup.sh', '172.16.80.50', '/usr/local/pgsql/data')"
2014-01-09 20:54:29 LOG:   pid 31046: 1st stage is done
2014-01-09 20:54:29 LOG:   pid 31046: check_postmaster_started: try to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres (retry 0 times)
2014-01-09 20:54:29 LOG:   pid 31046: check_postmaster_started: failed to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres
2014-01-09 20:54:30 DEBUG: pid 31047: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:30 DEBUG: pid 31047: s_do_auth: auth kind: 0
2014-01-09 20:54:30 DEBUG: pid 31047: s_do_auth: backend key data received
2014-01-09 20:54:30 DEBUG: pid 31047: s_do_auth: transaction state: I
2014-01-09 20:54:30 DEBUG: pid 30976: retrying 1 th health checking
2014-01-09 20:54:30 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:30 DEBUG: pid 30976: health_check: 0 th DB node status: 1
2014-01-09 20:54:30 DEBUG: pid 30976: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:30 DEBUG: pid 30976: s_do_auth: auth kind: 0
2014-01-09 20:54:30 DEBUG: pid 30976: s_do_auth: backend key data received
2014-01-09 20:54:30 DEBUG: pid 30976: s_do_auth: transaction state: I
2014-01-09 20:54:30 DEBUG: pid 30976: health_check: 1 th DB node status: 3
2014-01-09 20:54:30 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:30 DEBUG: pid 30976: health check: clearing alarm
2014-01-09 20:54:32 LOG:   pid 31046: check_postmaster_started: try to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres (retry 1 times)
2014-01-09 20:54:32 LOG:   pid 31046: check_postmaster_started: failed to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres
2014-01-09 20:54:35 LOG:   pid 31046: check_postmaster_started: try to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres (retry 2 times)
2014-01-09 20:54:35 LOG:   pid 31046: check_postmaster_started: failed to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres
2014-01-09 20:54:38 LOG:   pid 31046: check_postmaster_started: try to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres (retry 3 times)
2014-01-09 20:54:38 LOG:   pid 31046: check_postmaster_started: failed to connect to postmaster on hostname:172.16.80.50 database:postgres user:postgres
2014-01-09 20:54:40 DEBUG: pid 31047: pool_ssl: SSL requested but SSL support is not available
2014-01-09 20:54:40 DEBUG: pid 31047: s_do_auth: auth kind: 0
2014-01-09 20:54:40 DEBUG: pid 31047: s_do_auth: backend key data received
2014-01-09 20:54:40 DEBUG: pid 31047: s_do_auth: transaction state: I
2014-01-09 20:54:40 DEBUG: pid 30976: starting health checking
2014-01-09 20:54:40 DEBUG: pid 30976: health check: clearing alarm


********************************************************
The postgres log of 172.16.80.49 shows as follows

21183 2014-01-09 15:27:31 IST LOG:  database system was shut down at 2014-01-09 15:26:49 IST

21181 2014-01-09 15:27:31 IST LOG:  database system is ready to accept connections
21186 2014-01-09 15:27:31 IST LOG:  autovacuum launcher started
21194 2014-01-09 15:28:07 IST LOG:  statement: SELECT pg_current_xlog_location()
21193 2014-01-09 15:28:07 IST LOG:  statement: SELECT pg_is_in_recovery()
21197 2014-01-09 15:28:07 IST LOG:  statement: SELECT pg_is_in_recovery()
21201 2014-01-09 15:28:08 IST LOG:  statement: SELECT pg_is_in_recovery()
21203 2014-01-09 15:28:11 IST LOG:  statement: SELECT pg_is_in_recovery()
21206 2014-01-09 15:28:14 IST LOG:  statement: SELECT pg_is_in_recovery()
21207 2014-01-09 15:28:16 IST LOG:  statement: SELECT pgpool_recovery('basebackup.sh', '172.16.80.50', '/usr/local/pgsql/data')
21211 2014-01-09 15:28:16 IST LOG:  statement: SELECT pg_start_backup('Streaming Replication', true)
 pg_start_backup 
-----------------
 0/B1000020
(1 row)


mkdir: cannot create directory `/usr/local/pgsql/data/pg_xlog': File exists
21221 2014-01-09 15:28:17 IST LOG:  statement: SELECT pg_stop_backup()
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
 pg_stop_backup 
----------------
 0/B10000D8
(1 row)


21207 2014-01-09 15:28:17 IST LOG:  statement: SELECT pgpool_remote_start('172.16.80.50', '/usr/local/pgsql/data')
21279 2014-01-09 15:30:02 IST LOG:  statement: SELECT pg_is_in_recovery()


****************************************** 
The postgres log of 172.16.80.50 shows as follows


28028 2014-01-09 21:28:03 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28029 2014-01-09 21:28:04 BDT FATAL:  the database system is starting up
28030 2014-01-09 21:28:07 BDT FATAL:  the database system is starting up
28031 2014-01-09 21:28:07 BDT FATAL:  the database system is starting up
28032 2014-01-09 21:28:08 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28033 2014-01-09 21:28:13 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28034 2014-01-09 21:28:18 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28036 2014-01-09 21:28:23 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28037 2014-01-09 21:28:28 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28038 2014-01-09 21:28:33 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36
28039 2014-01-09 21:28:38 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36

Thanks and Regards,
Syed Irfan.

Sr. Developer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20140109/6d27d902/attachment-0001.html>


More information about the pgpool-general mailing list