[pgpool-hackers: 2596] Manual failover with pgpool and repmgr
Juliano
jplinux at protonmail.com
Wed Nov 15 19:35:12 JST 2017
Hi guys
After performing a manual failover, I recovered the repmgr replication between s1 (master, read/write) and s2 (standby, read-only):
repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr
The problem is that after swapping the active nodes using repmgr (1. stop postgres on standby, 2. promote the master, 3. clone the standby), pgpool doesn't recognize the nodes correctly and shows the master node as down:
show pool_nodes;
 node_id | hostname    | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay
---------+-------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.0.1 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.0.2 | 5432 | up     | 0.500000  | standby | 0          | true              | 0
Replication itself is working fine, and repmgr reports the correct roles:
repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr
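To make the mismatch explicit, the two views can be compared mechanically. This is only a minimal sketch in Python: the sample rows are copied from the output in this message, and the helper names (parse_rows, repmgr_roles, pgpool_nodes, find_mismatches) are made up for illustration. It assumes pgpool reports the writable node with role "primary" in streaming replication mode.

```python
# Sketch: compare the repmgr view with the pgpool view of the same cluster.
# Sample rows are copied from this message; helper names are hypothetical.

REPMGR_OUTPUT = """\
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr
"""

POOL_NODES_OUTPUT = """\
0 | 192.168.0.1 | 5432 | down | 0.500000 | standby | 0 | false | 0
1 | 192.168.0.2 | 5432 | up | 0.500000 | standby | 0 | true | 0
"""


def parse_rows(text):
    """Split each non-empty line on '|' and strip whitespace from every cell."""
    return [[cell.strip() for cell in line.split("|")]
            for line in text.splitlines() if line.strip()]


def repmgr_roles(text):
    """Map host -> role ('master' or 'standby') from `repmgr cluster show` rows."""
    roles = {}
    for role, _name, _upstream, conninfo in parse_rows(text):
        # The connection string is a list of space-separated key=value pairs.
        params = dict(pair.split("=", 1) for pair in conninfo.split())
        # Drop the leading '* ' marker that repmgr puts in front of the master.
        roles[params["host"]] = role.lstrip("* ")
    return roles


def pgpool_nodes(text):
    """Map host -> (status, role) from `show pool_nodes` rows."""
    nodes = {}
    for row in parse_rows(text):
        _node_id, host, _port, status, _weight, role = row[:6]
        nodes[host] = (status, role)
    return nodes


def find_mismatches(repmgr_text, pool_text):
    """Return one message per disagreement between repmgr and pgpool."""
    problems = []
    nodes = pgpool_nodes(pool_text)
    for host, role in repmgr_roles(repmgr_text).items():
        status, pool_role = nodes[host]
        if status != "up":
            problems.append(f"{host}: repmgr role {role}, but pgpool status is {status}")
        # Assumption: pgpool labels the writable node 'primary'.
        if role == "master" and pool_role != "primary":
            problems.append(f"{host}: repmgr master, but pgpool role is {pool_role}")
    return problems


if __name__ == "__main__":
    for problem in find_mismatches(REPMGR_OUTPUT, POOL_NODES_OUTPUT):
        print(problem)
```

With the rows above, both reported problems concern 192.168.0.1: pgpool marks it as down and does not treat it as the primary, while repmgr lists it as the master.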
I tried to fix pgpool using pcp commands and by restarting the pgpool service, without success.
The detach command is rejected:
pcp_detach_node 0 -h localhost -U postgres
ERROR: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
I can promote node 0 (down), but nothing changes:
pcp_promote_node 0 -U postgres -h localhost
pcp_promote_node -- Command Successful
show pool_nodes
 node_id | hostname    | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay
---------+-------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.0.1 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.0.2 | 5432 | up     | 0.500000  | standby | 3          | true              | 0
(2 rows)
And I can't recover node 1 (standby):
pcp_recovery_node 1 -U postgres -h localhost
ERROR: process recovery request failed
DETAIL: primary server cannot be recovered by online recovery.
Here is the main configuration in pgpool.conf:
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_flag1 = 'ALLOW_TO_FAILOVER'
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
failover_command = ''
recovery_1st_stage_command = ''
Please help me. I don't know what I am doing wrong.