[pgpool-hackers: 2003] Re: Proposal to make backend node failover mechanism quorum aware

Tatsuo Ishii ishii at sraoss.co.jp
Wed Jan 25 13:05:14 JST 2017


Usama,

> This is correct. If Pgpool-II is used in master-standby mode (with an
> elastic or virtual IP, so that clients connect to only one Pgpool-II
> server), then an interruption of the link between AZ1 and AZ2 as you
> defined above does not cause many issues.
> 
> But the issue arises when Pgpool-II is used in master-master mode
> (clients connect to all available Pgpool-II nodes). Consider the
> following scenario.
> 
> a) The link between AZ1 and AZ2 breaks; at that time B1 is the master
> while B2 is the standby.
> 
> b) Pgpool-C in AZ2 promotes B2 to master, since Pgpool-C is not able to
> connect to the old master (B1).

I thought Pgpool-C commits suicide because it cannot get a quorum in this case, no?

> c) A client connects to Pgpool-C and issues a write statement. It lands
> on the B2 PostgreSQL server, which was promoted to master in step b.
> 
> c-1) Another client connects to Pgpool-A and also issues a write
> statement, which lands on the B1 PostgreSQL server, as it is still the
> master node in AZ1.
> 
> d) The link between AZ1 and AZ2 is restored, but now the PostgreSQL
> servers B1 and B2 hold different sets of data, with no easy way to get
> both sets of changes into one place and restore the cluster to its
> original state.
> 
> The above scenario becomes even more complicated if both availability
> zones AZ1 and AZ2 have multiple Pgpool-II nodes, since the logic for
> retiring Pgpool-II nodes grows more complex when the link between AZ1
> and AZ2 is disrupted.
> 
> So the proposal tries to solve this by ensuring that we always have
> only one master PostgreSQL node in the cluster and never end up in a
> situation where different PostgreSQL nodes hold different sets of data.
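
As a rough illustration of that invariant, here is a minimal Python sketch
under my own assumptions (hypothetical names, not actual Pgpool-II code):
promotion of a standby is gated on the local partition still holding a
quorum of the watchdog cluster, so the minority side of a network split can
never create a second master.

    def may_promote_standby(reachable_watchdog_nodes: int,
                            total_watchdog_nodes: int) -> bool:
        # Only the partition that still holds a majority of the watchdog
        # cluster is allowed to promote a new master.
        majority = total_watchdog_nodes // 2 + 1
        return reachable_watchdog_nodes >= majority

    # If Pgpool-C ends up alone on its side of the broken link
    # (1 node reachable out of 3), it must not promote B2:
    assert may_promote_standby(1, 3) is False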
> 
> 
> 
>> > There is also a question ("[pgpool-general: 5179] Architecture Questions
>> > <http://www.sraoss.jp/pipermail/pgpool-general/2016-December/005237.html>")
>> > posted by a user on the pgpool-general mailing list who wants a similar
>> > type of network spanning two AWS availability zones, and Pgpool-II has
>> > no good answer for avoiding split-brain of the backend nodes if the
>> > corporate link between the two zones suffers a glitch.
>>
>> That seems like a totally different story to me, because there are two
>> independent streaming replication primary servers in the east and west
>> regions.
>>
>>
> I think the original question statement was a little bit confusing. As I
> understand the user's requirements from later in the thread:
> The user has a couple of PostgreSQL nodes in each of two availability
> zones (four PG nodes in total), and all four nodes are part of a single
> streaming replication setup.
> Both zones have two Pgpool-II nodes each (four Pgpool-II nodes in the
> cluster in total).
> Each availability zone has one application server that connects to one
> of the two Pgpool-II nodes in that availability zone (which makes it a
> master-master Pgpool-II setup). The user is concerned about split-brain
> of the PostgreSQL servers when the corporate link between the zones
> becomes unavailable.
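
A back-of-the-envelope check on that topology (my own observation, not
from the thread): with four Pgpool-II nodes split two and two by a broken
inter-zone link, neither side can reach the majority of three, so a
quorum-aware failover would refuse to promote on either side rather than
create a second master.

    total_nodes = 4                     # Pgpool-II watchdog members
    majority = total_nodes // 2 + 1     # 3 votes needed
    for zone, reachable in (("AZ1", 2), ("AZ2", 2)):
        # Neither partition reaches 3 votes, so neither may fail over.
        print(zone, "may fail over:", reachable >= majority)

The trade-off is that an even split freezes failover on both sides until
the link is restored (or an odd number of voters is available).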
> 
> Thanks
> Best regards
> Muhammad Usama
> 
> 
> 
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> > Thanks
>> > Best regards
>> > Muhammad Usama
>> >
>> >
>> >
>> >>
>> >> Best regards,
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese: http://www.sraoss.co.jp
>> >>
>> >> >> > Hi Hackers,
>> >> >> >
>> >> >> > This is a proposal to make the failover of backend PostgreSQL
>> >> >> > nodes quorum aware, to make it more robust and fault tolerant.
>> >> >> >
>> >> >> > Currently Pgpool-II proceeds to fail over a backend node as soon
>> >> >> > as the health check detects a failure, or when an error occurs
>> >> >> > on the backend connection (if fail_over_on_backend_error is set).
>> >> >> > This is good enough for a standalone Pgpool-II server.
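
To make the current (non-quorum) behavior concrete, a minimal Python
sketch follows; the names are hypothetical stand-ins, not the real
Pgpool-II C functions. Each node decides on failover purely from its own
view of the backend:

    import socket

    def backend_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
        # Crude TCP probe standing in for the real health check.
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def health_check_once(host: str, port: int) -> None:
        if not backend_reachable(host, port):
            # No consultation with the other Pgpool-II nodes: a purely
            # local network glitch is enough to detach the backend.
            print("detaching backend %s:%d (local decision)" % (host, port))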
>> >> >> >
>> >> >> > But consider the scenario where we have more than one Pgpool-II
>> >> >> > node in the cluster (say Pgpool-A, Pgpool-B and Pgpool-C),
>> >> >> > connected through the watchdog, and each Pgpool-II node is
>> >> >> > configured with two PostgreSQL backends (B1 and B2).
>> >> >> >
>> >> >> > Now if, due to some network glitch or other issue, Pgpool-A
>> >> >> > fails or loses its network connection to backend B1, Pgpool-A
>> >> >> > will detect the failure, detach (fail over) the B1 backend, and
>> >> >> > also pass this information to the other Pgpool-II nodes
>> >> >> > (Pgpool-B and Pgpool-C). Although backend B1 was perfectly
>> >> >> > healthy and still reachable from the Pgpool-B and Pgpool-C
>> >> >> > nodes, it will get detached from the cluster because of a
>> >> >> > network glitch between Pgpool-A and backend B1. The worst part
>> >> >> > is that if B1 was the master PostgreSQL node (in a
>> >> >> > master-standby configuration), the Pgpool-II failover would also
>> >> >> > promote the B2 PostgreSQL node as the new master, hence opening
>> >> >> > the way for split-brain and/or data corruption.
>> >> >> >
>> >> >> > So my proposal is that when the watchdog is configured in
>> >> >> > Pgpool-II, the backend health check should consult the other
>> >> >> > attached Pgpool-II nodes over the watchdog to decide whether the
>> >> >> > backend node has actually failed or whether it is just a
>> >> >> > localized glitch/false alarm. Failover of the node should only
>> >> >> > be performed when a majority of the cluster members agree on the
>> >> >> > failure of the node.
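
For illustration only, the proposed decision could look like the following
Python sketch; vote gathering over the watchdog channel is abstracted away
as a list of peer answers, and none of this is actual Pgpool-II code:

    def quorum_aware_failover(local_sees_failure, peer_sees_failure):
        # Fail over only when a majority of all watchdog members
        # (this node plus its peers) agree that the backend is down.
        cluster_size = len(peer_sees_failure) + 1
        majority = cluster_size // 2 + 1
        votes = int(local_sees_failure) + sum(peer_sees_failure)
        return votes >= majority

    # Pgpool-A alone sees B1 as down while Pgpool-B and Pgpool-C still
    # reach it: 1 vote out of 3, majority is 2 -> no failover.
    assert quorum_aware_failover(True, [False, False]) is False
    # All three agree, so the failure is genuine -> fail over.
    assert quorum_aware_failover(True, [True, True]) is True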
>> >> >> >
>> >> >> > This quorum-aware failover architecture will prevent false
>> >> >> > failovers and split-brain scenarios in the backend nodes.
>> >> >> >
>> >> >> > What are your thoughts and suggestions on this?
>> >> >> >
>> >> >> > Thanks
>> >> >> > Best regards
>> >> >> > Muhammad Usama
>> >> >>
>> >>
>>

