[pgpool-hackers: 3387] Re: Failover consensus on even number of nodes

Tatsuo Ishii ishii at sraoss.co.jp
Sat Aug 17 17:00:08 JST 2019


> Hi Ishii-San
> 
> 
> On Thu, Aug 15, 2019 at 11:42 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> Hi Usama,
>>
>> When the number of Pgpool-II nodes is even, it seems consensus based
>> failover occurs if n/2 Pgpool-II nodes agree on the failure. For
>> example, if there are 4 nodes of Pgpool-II and 2 nodes agree on the
>> failure, failover occurs. Is there any reason behind this? I am asking
>> because it could easily lead to split brain: 2 nodes could agree on
>> the failover while the other 2 nodes disagree. Actually other HA
>> software, for example etcd, requires n/2+1 votes to gain consensus.
>>
>>
>> https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-is-failure-tolerance
>>
>> With n/2+1 vote requirements, there's no possibility of split brain.
>>
>>
> Yes, your observation is spot on. The original motivation to consider
> exactly n/2 votes for consensus rather than (n/2 + 1) was to ensure
> the working of 2 node Pgpool-II clusters.
> My understanding was that most users use 2 Pgpool-II nodes in their
> setup, so I wanted to make sure that when one of the Pgpool-II nodes
> goes down (in a 2 node cluster) consensus should still be possible.
> But your point is also valid: that makes the system prone to
> split-brain. So what are your suggestions on that?
> I think we can introduce a new configuration parameter to
> enable/disable n/2 node consensus.

If my understanding is correct, with the current behavior there's no
difference for 2 node Pgpool-II clusters whether
failover_when_quorum_exists is on or off. That means that even if we
change n/2 node consensus to n/2+1 consensus, users of 2 node
Pgpool-II clusters could keep the existing behavior by turning off
failover_when_quorum_exists. If this is correct, we don't need to
introduce a new switch for 4.1; we can just change n/2 node consensus
to n/2+1 consensus. What do you think?

The only concern is 4 node Pgpool-II clusters. I doubt there are 4
node users in the field, though.
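
To make the arithmetic concrete, here is a minimal sketch in Python
comparing the two vote thresholds for even node counts. It is not
Pgpool-II code: the function names are hypothetical and it only models
the counting argument, assuming two disjoint groups of nodes can each
reach consensus whenever twice the threshold fits within n.

```python
def half_threshold(n: int) -> int:
    """Votes needed under the n/2 rule discussed in this thread (even n only)."""
    if n % 2 != 0:
        raise ValueError("the n/2 rule discussed here concerns even node counts")
    return n // 2


def majority_threshold(n: int) -> int:
    """Votes needed under the n/2+1 rule (integer division)."""
    return n // 2 + 1


def split_brain_possible(n: int, threshold: int) -> bool:
    """True if two disjoint groups of nodes can both reach the threshold."""
    return 2 * threshold <= n


if __name__ == "__main__":
    for n in (2, 4):
        for name, rule in (("n/2", half_threshold), ("n/2+1", majority_threshold)):
            t = rule(n)
            print(f"nodes={n} rule={name} votes_needed={t} "
                  f"split_brain_possible={split_brain_possible(n, t)}")
```

With 4 nodes the n/2 rule needs 2 votes, so two pairs can decide
independently; the n/2+1 rule needs 3 votes, which rules that out. With
2 nodes the n/2+1 rule needs both votes, so a single surviving node
cannot reach consensus by itself.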

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

