[pgpool-hackers: 226] Re: Unexpected behavior of pgpool-II

Tatsuo Ishii ishii at postgresql.org
Fri May 3 21:28:44 JST 2013


Sorry for delay. 

> According to our analysis, it looked like follow_master_command was issued twice on this environment.
> So what we want confirm at this time is how the follow_master_command work?
> Our recognition so far is that the failover_command is issued once when the pgpool detect a fail node,
> and then follow_master_command is issued once for new master node.
> However according to the log mentioned below showed us was that the follow_master_command was issued twice.

No, follow_master_command is called for each down node, not once.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hi Tatsuo,
> 
> Do you have any feedback on this issue? The user is asking us for an update
> on this issue, we weren't able to reproduce the problem locally hence we
> asked for the pgpool logs that were shared with you.
> 
> What would be the best way to progress this further?
> 
> -- Ahsan
> 
> 
> 
> On Mon, Apr 29, 2013 at 2:05 PM, Asif Naeem <anaeem.it at gmail.com> wrote:
> 
>> Please find attached file, It contains more feedback from customer. Thanks.
>>
>> Best Regards,
>> Asif Naeem
>>
>>
>> On Sun, Apr 28, 2013 at 3:25 PM, Asif Naeem <anaeem.it at gmail.com> wrote:
>>
>>> Hi Tatsuo,
>>>
>>> Please find the attach file, it contains pgpool log information with
>>> details. Thanks.
>>>
>>> Best Regards,
>>> Asif Naeem
>>>
>>> On Mon, Apr 22, 2013 at 3:34 AM, Tatsuo Ishii <ishii at postgresql.org>wrote:
>>>
>>>> Can you please show pgpool log?
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>> > Hi,
>>>> >
>>>> > There is unexpected behavior in pgpool2 that seems to be a bug. Please
>>>> find
>>>> > the details as following i.e.
>>>> >
>>>> > Phenomenon
>>>> >
>>>> >> pgpool2 configuration option "follow_master_command" special
>>>> character %H
>>>> >> (Hostname of the new master node) is not working as expected
>>>> >> In master slave mode after failover, 2nd node become new master but
>>>> got
>>>> >> unexpected hostname (via %H). Please find attached pgpool.conf for
>>>> details.
>>>> >
>>>> >
>>>> > Versions
>>>> >
>>>> >> OS version: RHEL - 64
>>>> >> pgpool version: pgpool-II version 3.2.1 (namameboshi)
>>>> >
>>>> >
>>>> > *Details/Observations* :-
>>>> >
>>>> >> At Start
>>>> >
>>>> >>  node_id | hostname | port | status | lb_weight |  role
>>>> >> ---------+----------+------+--------+-----------+---------
>>>> >>  0       | dmp02    | 5444 | 2      | 0.333333  | primary
>>>> >>  1       | dmp03    | 5444 | 2      | 0.333333  | standby
>>>> >>  2       | dmp04    | 5444 | 2      | 0.333333  | standby
>>>> >>
>>>> >
>>>> >
>>>> >> Failover arguments when the user stopped DMP02.
>>>> >
>>>> >> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:12 JST
>>>> >
>>>> > FAILED_NODE_ID =0
>>>> >> FAILED_NODE_NAME=dmp02
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =dmp03
>>>> >> OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >> 0
>>>> >>
>>>> >
>>>> >
>>>> >> Then, these arguments told user that the master node has been changed
>>>> >
>>>> >> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:13 JST
>>>> >> FAILED_NODE_ID =0
>>>> >> FAILED_NODE_NAME=dmp02
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =dmp03
>>>> >> OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:17 JST
>>>> >> FAILED_NODE_ID =2
>>>> >> FAILED_NODE_NAME=dmp04
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =dmp02  *<--- *** The master name is unexpected ****
>>>> >> OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >>
>>>> >
>>>> >
>>>> >>  node_id | hostname | port | status | lb_weight |  role
>>>> >> ---------+----------+------+--------+-----------+---------
>>>> >>  0       | dmp02    | 5444 | 2      | 0.333333  | standby
>>>> >>  1       | dmp03    | 5444 | 2      | 0.333333  | primary
>>>> >>  2       | dmp04    | 5444 | 2      | 0.333333  | standby
>>>> >>
>>>> >
>>>> >
>>>> >>   pid  | usesysid |   usename    | application_name |  client_addr  |
>>>> >> client_hostname | client_port |    backend_start                 |
>>>> state
>>>> >>   | sent_location | write_location | flush_location | replay_location
>>>> |
>>>> >> sync_priority | sync_state
>>>> >>
>>>> >>
>>>> -------+----------+--------------+------------------+---------------+-----------------+-------------+----------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
>>>> >>  20641 |       10 | enterprisedb | dmp02            | 172.24.217.23 |
>>>> >>             |       16948 | 08-FEB-13 17:39:44.078173 +09:00 |
>>>> streaming |
>>>> >> 14/C50000E0   | 14/C50000E0    | 14/C50000E0    | 14/C50000E0     |
>>>> >>     0 | async
>>>> >> (1 çs)
>>>> >>
>>>> >
>>>> > There is one pg_stat_replication record because dmp04 is connecting to
>>>> dmp02
>>>> >
>>>> > This phenomenon is reproduced by user. What we need to do is just make
>>>> the
>>>> > scripts like follow_master.sh and launch pgpool process with -F [path
>>>> to
>>>> > pcp.conf].
>>>> >
>>>> > 1. Before failover
>>>> >
>>>> >> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
>>>> >> node_id | hostname | port | status | lb_weight | role
>>>> >> ---------+----------+------+--------+-----------+---------
>>>> >> 0 | slave1 | 5444 | 2 | 0.333333 | primary
>>>> >> 1 | slave2 | 5444 | 2 | 0.333333 | standby
>>>> >> 2 | slave3 | 5444 | 2 | 0.333333 | standby
>>>> >> (3 行)
>>>> >
>>>> >
>>>> > 2. We stopped slave 1 node with "pg_ctl -D /data -m immediate stop".
>>>> >
>>>> > 3. After failover
>>>> >
>>>> >> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
>>>> >> node_id | hostname | port | status | lb_weight | role
>>>> >> ---------+----------+------+--------+-----------+---------
>>>> >> 0 | slave1 | 5444 | 2 | 0.333333 | standby
>>>> >> 1 | slave2 | 5444 | 2 | 0.333333 | primary
>>>> >> 2 | slave3 | 5444 | 2 | 0.333333 | standby
>>>> >> (3 行)
>>>> >
>>>> >
>>>> > 4. Then user got output like below after failover that is not expected
>>>> that
>>>> > is present at the end of follow_master.sh.log i.e.
>>>> >
>>>> >> NEW_MASTER_NAME =slave1
>>>> >
>>>> >
>>>> > [root at master1 ~]# cat /tmp/failover.sh.log
>>>> >
>>>> >> 2013年 3月 13日 水曜日 11:36:28 JST
>>>> >> FAILED_NODE_ID =0
>>>> >> FAILED_NODE_NAME=slave1
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =slave2
>>>> >> OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >> 0
>>>> >>
>>>> > [root at master1 ~]# cat /tmp/follow_master.sh.log
>>>> >
>>>> >> 2013年 3月 13日 水曜日 11:36:30 JST
>>>> >> FAILED_NODE_ID =0
>>>> >> FAILED_NODE_NAME=slave1
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =slave2
>>>> >> OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >> 2013年 3月 13日 水曜日 11:36:32 JST
>>>> >> FAILED_NODE_ID =2
>>>> >> FAILED_NODE_NAME=slave3
>>>> >> FAILED_NODE_PORT=5444
>>>> >> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
>>>> >> NEW_MASTER_ID =1
>>>> >> OLD_MASTER_ID =0
>>>> >> NEW_MASTER_NAME =*slave1
>>>> >> *OLD_PRIMARY_ID =0
>>>> >> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>>>> >
>>>> >
>>>> > Please guide. Thanks.
>>>> >
>>>> > Best Regards,
>>>> > Asif Naeem
>>>>
>>>
>>>
>>
>> _______________________________________________
>> pgpool-hackers mailing list
>> pgpool-hackers at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>>
>>
> 
> 
> -- 
> Ahsan Hadi
> Snr Director Product Development
> EnterpriseDB Corporation
> The Enterprise Postgres Company
> 
> Phone: +92-51-8358874
> Mobile: +92-333-5162114
> 
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
> 
> This e-mail message (and any attachment) is intended for the use of the
> individual or entity to whom it is addressed. This message contains
> information from EnterpriseDB Corporation that may be privileged,
> confidential, or exempt from disclosure under applicable law. If you are
> not the intended recipient or authorized to receive this for the intended
> recipient, any use, dissemination, distribution, retention, archiving, or
> copying of this communication is strictly prohibited. If you have received
> this e-mail in error, please notify the sender immediately by reply e-mail
> and delete this message.


More information about the pgpool-hackers mailing list