[pgpool-hackers: 2601] Re: Manual failover with pgpool and repmgr

Tatsuo Ishii ishii at sraoss.co.jp
Thu Nov 16 06:03:06 JST 2017


Assuming that node 0 is already up and running, and node 1 is a
standby node connecting to node 0, then you can use pcp_promote_node
to make node 0 a primary again.
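
For example, something like this (untested here; it assumes the Pgpool-II 3.5 or
later pcp syntax, the default pcp port 9898 and a pcp user named "postgres",
so adjust for your setup):

pcp_promote_node -h localhost -p 9898 -U postgres -n 0

After that, "show pool_nodes" should hopefully report node 0 as primary again.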

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Thanks for your suggestion; I have raised this question on the repmgr group as well.
> 
> So, a different question could be: how do I change the node 0 status from DOWN to UP in pool_nodes?
> 
> show pool_nodes;
> -[ RECORD 1 ]-----+---------------
> node_id           | 0
> hostname          | 192.168.0.1
> port              | 5432
> status            | down
> lb_weight         | 0.500000
> role              | standby
> select_cnt        | 0
> load_balance_node | false
> replication_delay | 0
> 
> Then I might be able to promote this node to primary again:
> pcp_promote_node 0 -U postgres -h localhost
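> 
> Or would pcp_attach_node be the right tool to bring the status back to UP first? Just a guess on my side, assuming the PostgreSQL instance on node 0 is actually running:
> pcp_attach_node 0 -U postgres -h localhost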
> 
>> -------- Original Message --------
>> Subject: Re: [pgpool-hackers: 2596] Manual failover with pgpool and repmgr
>> Local Time: November 15, 2017 11:11 AM
>> UTC Time: November 15, 2017 11:11 AM
>> From: ishii at sraoss.co.jp
>> To: jplinux at protonmail.com
>> pgpool-hackers at pgpool.net
>>
>> The Pgpool-II development team does not guarantee that Pgpool-II works with repmgr.
>> Probably you'd better ask someone else who is familiar with repmgr.
>>
>> Best regards,
>>
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> Hi guys
>>> After executing a manual failover, I have recovered the repmgr replication between s1 (master - read/write) and s2 (standby - read only):
>>> repmgr cluster show
>>> Role      | Name | Upstream | Connection String
>>> ----------+------+----------+----------------------------------------------
>>> * master  | s1   |          | host=192.168.0.1 dbname=repmgr user=repmgr
>>>   standby | s2   | s1       | host=192.168.0.2 dbname=repmgr user=repmgr
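>>>
>>> For reference, the recovery followed the usual repmgr sequence, roughly like this (repmgr 3.x syntax; the config file and data directory paths below are only examples, not my exact ones):
>>> # on the node that should be the master:
>>> repmgr -f /etc/repmgr.conf standby promote
>>> # on the node being rebuilt as standby, with PostgreSQL stopped there:
>>> repmgr -h 192.168.0.1 -U repmgr -d repmgr -D /var/lib/pgsql/data -f /etc/repmgr.conf standby clone --force
>>> # start PostgreSQL on the standby again, then:
>>> repmgr -f /etc/repmgr.conf standby register --force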
>>>
>>> So, the problem is that after swapping the active nodes using repmgr (1. stop postgres on the standby, 2. promote the master, 3. clone the standby), pgpool does not recognize the nodes correctly and shows the master node as down:
>>> show pool_nodes;
>>>  node_id | hostname    | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay
>>> ---------+-------------+------+--------+-----------+---------+------------+-------------------+-------------------
>>>  0       | 192.168.0.1 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
>>>  1       | 192.168.0.2 | 5432 | up     | 0.500000  | standby | 0          | true              | 0
>>> The replication is working fine and repmgr shows me everything is correct:
>>> repmgr cluster show
>>> Role      | Name | Upstream | Connection String
>>> ----------+------+----------+----------------------------------------------
>>> * master  | s1   |          | host=192.168.0.1 dbname=repmgr user=repmgr
>>>   standby | s2   | s1       | host=192.168.0.2 dbname=repmgr user=repmgr
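>>>
>>> The actual roles can also be confirmed directly on the backends, bypassing pgpool (assuming a superuser connection is possible from this host):
>>> psql -h 192.168.0.1 -p 5432 -U postgres -c "SELECT pg_is_in_recovery();"   # expect f on s1 (master)
>>> psql -h 192.168.0.2 -p 5432 -U postgres -c "SELECT pg_is_in_recovery();"   # expect t on s2 (standby)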
>>>
>>> So, I have tried to fix pgpool using pcp commands, without success, and I have also restarted the pgpool service.
>>> The detach command is not accepted:
>>> pcp_detach_node 0 -h localhost -U postgres
>>> ERROR: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
>>> I can promote node 0 (down), but nothing happens:
>>> pcp_promote_node 0 -U postgres -h localhost
>>> pcp_promote_node -- Command Successful
>>> show pool_nodes
>>>  node_id | hostname    | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay
>>> ---------+-------------+------+--------+-----------+---------+------------+-------------------+-------------------
>>>  0       | 192.168.0.1 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
>>>  1       | 192.168.0.2 | 5432 | up     | 0.500000  | standby | 3          | true              | 0
>>> (2 rows)
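>>> I could also inspect what pgpool records for node 0 with pcp_node_info (same connection options as above):
>>> pcp_node_info 0 -U postgres -h localhost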
>>> And I can't recover node 1 (standby):
>>> pcp_recovery_node 1 -U postgres -h localhost
>>> ERROR: process recovery request failed
>>> DETAIL: primary server cannot be recovered by online recovery.
>>> Here is the relevant config from pgpool.conf:
>>> backend_flag0 = 'ALLOW_TO_FAILOVER'
>>> backend_flag1 = 'ALLOW_TO_FAILOVER'
>>> load_balance_mode = on
>>> master_slave_mode = on
>>> master_slave_sub_mode = 'stream'
>>> failover_command = ''
>>> recovery_1st_stage_command = ''
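>>> I understand a fuller setup would normally define a failover script and an online recovery script, roughly like this (the script path and name below are only placeholders, not files I actually have):
>>> failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P'
>>> recovery_1st_stage_command = 'recovery_1st_stage'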
>>> Please help me; I don't know what I am doing wrong.

