[pgpool-hackers: 234] Re: Unexpected behavior of pgpool-II

Ahsan Hadi ahsan.hadi at enterprisedb.com
Wed May 8 18:41:29 JST 2013


On Wed, May 8, 2013 at 2:32 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Hi Ahsan,
>
> > Hi Tatsuo,
> >
> > Good to see that you found the problem. We have a customer who has been
> waiting for a fix for this issue for sometime.
> >
> > Which branches are you planning to commit this to and when?
>
> The fix has been already committed to master, 3.2-stable and 3.1-stable.
>

Thanks Tatsuo.

Were you able to reproduce the issue? As i understand it wasn't straight
forward to reproduce.



> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
> > -- Ahsan
> >
> > -----Original Message-----
> > From: Tatsuo Ishii <ishii at postgresql.org>
> > Sender: pgpool-hackers-bounces at pgpool.net
> > Date: Wed, 08 May 2013 18:02:03
> > To: <anaeem.it at gmail.com>
> > Cc: <pgpool-hackers at pgpool.net>
> > Subject: [pgpool-hackers: 231] Re: Unexpected behavior of pgpool-II
> >
> > Sorry for delay. I think I found a problem regarding %H for
> > follow_master_command. Pgpool-II assigns it "new master node", which
> > is defined as "the first live node". On the other hand %H value
> > expected in follow_master_command is "new primary node", which is not
> > neccesary same as "new master node".
> >
> > Attached patch should fix the problem(against 3.2-stable).
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese: http://www.sraoss.co.jp
> >
> >> Hi,
> >>
> >> There is unexpected behavior in pgpool2 that seems to be a bug. Please
> find
> >> the details as following i.e.
> >>
> >> Phenomenon
> >>
> >>> pgpool2 configuration option "follow_master_command" special character
> %H
> >>> (Hostname of the new master node) is not working as expected
> >>> In master slave mode after failover, 2nd node become new master but got
> >>> unexpected hostname (via %H). Please find attached pgpool.conf for
> details.
> >>
> >>
> >> Versions
> >>
> >>> OS version: RHEL - 64
> >>> pgpool version: pgpool-II version 3.2.1 (namameboshi)
> >>
> >>
> >> *Details/Observations* :-
> >>
> >>> At Start
> >>
> >>>  node_id | hostname | port | status | lb_weight |  role
> >>> ---------+----------+------+--------+-----------+---------
> >>>  0       | dmp02    | 5444 | 2      | 0.333333  | primary
> >>>  1       | dmp03    | 5444 | 2      | 0.333333  | standby
> >>>  2       | dmp04    | 5444 | 2      | 0.333333  | standby
> >>>
> >>
> >>
> >>> Failover arguments when the user stopped DMP02.
> >>
> >>> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:12 JST
> >>
> >> FAILED_NODE_ID =0
> >>> FAILED_NODE_NAME=dmp02
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =dmp03
> >>> OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>> 0
> >>>
> >>
> >>
> >>> Then, these arguments told user that the master node has been changed
> >>
> >>> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:13 JST
> >>> FAILED_NODE_ID =0
> >>> FAILED_NODE_NAME=dmp02
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =dmp03
> >>> OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>> 2013îN 2åé 8ì˙ ã‡ójì˙ 12:44:17 JST
> >>> FAILED_NODE_ID =2
> >>> FAILED_NODE_NAME=dmp04
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =dmp02  *<--- *** The master name is unexpected ****
> >>> OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>>
> >>
> >>
> >>>  node_id | hostname | port | status | lb_weight |  role
> >>> ---------+----------+------+--------+-----------+---------
> >>>  0       | dmp02    | 5444 | 2      | 0.333333  | standby
> >>>  1       | dmp03    | 5444 | 2      | 0.333333  | primary
> >>>  2       | dmp04    | 5444 | 2      | 0.333333  | standby
> >>>
> >>
> >>
> >>>   pid  | usesysid |   usename    | application_name |  client_addr  |
> >>> client_hostname | client_port |    backend_start                 |
> state
> >>>   | sent_location | write_location | flush_location | replay_location |
> >>> sync_priority | sync_state
> >>>
> >>>
> -------+----------+--------------+------------------+---------------+-----------------+-------------+----------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
> >>>  20641 |       10 | enterprisedb | dmp02            | 172.24.217.23 |
> >>>             |       16948 | 08-FEB-13 17:39:44.078173 +09:00 |
> streaming |
> >>> 14/C50000E0   | 14/C50000E0    | 14/C50000E0    | 14/C50000E0     |
> >>>     0 | async
> >>> (1 çs)
> >>>
> >>
> >> There is one pg_stat_replication record because dmp04 is connecting to
> dmp02
> >>
> >> This phenomenon is reproduced by user. What we need to do is just make
> the
> >> scripts like follow_master.sh and launch pgpool process with -F [path to
> >> pcp.conf].
> >>
> >> 1. Before failover
> >>
> >>> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
> >>> node_id | hostname | port | status | lb_weight | role
> >>> ---------+----------+------+--------+-----------+---------
> >>> 0 | slave1 | 5444 | 2 | 0.333333 | primary
> >>> 1 | slave2 | 5444 | 2 | 0.333333 | standby
> >>> 2 | slave3 | 5444 | 2 | 0.333333 | standby
> >>> (3 行)
> >>
> >>
> >> 2. We stopped slave 1 node with "pg_ctl -D /data -m immediate stop".
> >>
> >> 3. After failover
> >>
> >>> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
> >>> node_id | hostname | port | status | lb_weight | role
> >>> ---------+----------+------+--------+-----------+---------
> >>> 0 | slave1 | 5444 | 2 | 0.333333 | standby
> >>> 1 | slave2 | 5444 | 2 | 0.333333 | primary
> >>> 2 | slave3 | 5444 | 2 | 0.333333 | standby
> >>> (3 行)
> >>
> >>
> >> 4. Then user got output like below after failover that is not expected
> that
> >> is present at the end of follow_master.sh.log i.e.
> >>
> >>> NEW_MASTER_NAME =slave1
> >>
> >>
> >> [root at master1 ~]# cat /tmp/failover.sh.log
> >>
> >>> 2013年 3月 13日 水曜日 11:36:28 JST
> >>> FAILED_NODE_ID =0
> >>> FAILED_NODE_NAME=slave1
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =slave2
> >>> OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>> 0
> >>>
> >> [root at master1 ~]# cat /tmp/follow_master.sh.log
> >>
> >>> 2013年 3月 13日 水曜日 11:36:30 JST
> >>> FAILED_NODE_ID =0
> >>> FAILED_NODE_NAME=slave1
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =slave2
> >>> OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>> 2013年 3月 13日 水曜日 11:36:32 JST
> >>> FAILED_NODE_ID =2
> >>> FAILED_NODE_NAME=slave3
> >>> FAILED_NODE_PORT=5444
> >>> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> >>> NEW_MASTER_ID =1
> >>> OLD_MASTER_ID =0
> >>> NEW_MASTER_NAME =*slave1
> >>> *OLD_PRIMARY_ID =0
> >>> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> >>
> >>
> >> Please guide. Thanks.
> >>
> >> Best Regards,
> >> Asif Naeem
>



-- 
Ahsan Hadi
Snr Director Product Development
EnterpriseDB Corporation
The Enterprise Postgres Company

Phone: +92-51-8358874
Mobile: +92-333-5162114

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20130508/7ab1e502/attachment.html>


More information about the pgpool-hackers mailing list