<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 16, 2017 at 4:14 AM, Tatsuo Ishii <span dir="ltr"><<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> On Fri, Mar 10, 2017 at 11:05 AM, Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br>
><br>
>> Usama,<br>
>><br>
>> I have a question regarding Zone partitioning case described in<br>
>> section 2 in your proposal. In my understanding after the network<br>
>> partitioning happens, Pgpool-II/watchdog in zone 2 will suicide<br>
>> because they cannot acquire quorum. So split-brain or data<br>
>> inconsistency due to two master nodes will not happen even in<br>
>> Pgpool-II 3.6. Am I missing something?<br>
>><br>
><br>
> With the current design of the watchdog, the Pgpool-II/watchdog commits suicide<br>
> in only two cases.<br>
><br>
> 1- When all network interfaces on the machine become unavailable (the machine<br>
> has lost all its IP addresses).<br>
> 2- When the upstream trusted server becomes unreachable (if<br>
> trusted_servers are configured).<br>
><br>
> So in the zone-partitioning scenario described in section 2, the Pgpool-II nodes<br>
> in zone 2 will not commit suicide, because neither<br>
> of the above two conditions for node suicide holds.<br>
><br>
> Also, committing suicide as soon as the cluster loses the quorum doesn't<br>
> feel like a good option, because if we implemented that we would end up with<br>
> all the Pgpool-II nodes committing suicide as soon as the quorum is lost in<br>
> the cluster, and eventually the Pgpool-II service would become unavailable;<br>
> the administrator would then have to manually restart the Pgpool-II nodes.<br>
> The current implementation makes sure that split-brain does not happen when<br>
> the quorum is not available<br>
<br>
</span>How do you prevent split-brain without a quorum?<br></blockquote><div><br></div><div>In a watchdog cluster, a split-brain scenario means that more than one node becomes the delegate-IP holder.</div><div>To prevent that, a watchdog node never acquires the VIP while the quorum is not present in the cluster, and if the cluster loses the quorum at any point in time, the master node performs de-escalation and releases the VIP.</div><div>This technique makes sure that at most one delegate-IP-holding watchdog exists in the cluster, so split-brain never happens.</div><div><br></div><div>Thanks</div><div>Best Regards</div><div>Muhammad Usama</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
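To illustrate, the rule described above can be sketched as follows; this is a minimal sketch assuming a simple strict-majority quorum, with illustrative function names rather than Pgpool-II's actual internals:

```python
def has_quorum(alive_nodes: int, total_nodes: int) -> bool:
    # Quorum requires a strict majority of all configured watchdog nodes.
    return alive_nodes > total_nodes // 2

def should_hold_vip(is_master: bool, alive_nodes: int, total_nodes: int) -> bool:
    # Only the master/coordinator node may hold the delegate IP (VIP), and only
    # while the quorum is present; on quorum loss the master de-escalates and
    # releases the VIP, so at most one VIP holder can ever exist.
    return is_master and has_quorum(alive_nodes, total_nodes)

# A 5-node cluster partitioned 3/2: only a master on the majority side keeps the VIP.
assert should_hold_vip(True, 3, 5) is True    # master in the majority partition
assert should_hold_vip(True, 2, 5) is False   # master in the minority partition de-escalates
```

Because two disjoint partitions can never both hold a strict majority at the same time, this rule rules out two simultaneous VIP holders by construction.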
<div class="HOEnZb"><div class="h5"><br>
> and at the same time keeps looking for new or previously lost<br>
> nodes to rejoin the cluster, so that service disruption is kept to a<br>
> minimum and the cluster recovers automatically without any manual<br>
> intervention.<br>
><br>
><br>
> Thanks<br>
> Best regards<br>
> Muhammad Usama<br>
><br>
><br>
><br>
>> Best regards,<br>
>> --<br>
>> Tatsuo Ishii<br>
>> SRA OSS, Inc. Japan<br>
>> English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_<wbr>en.php</a><br>
>> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.<wbr>jp</a><br>
>><br>
>> From: Muhammad Usama <<a href="mailto:m.usama@gmail.com">m.usama@gmail.com</a>><br>
>> Subject: Re: Proposal to make backend node failover mechanism quorum aware<br>
>> Date: Thu, 9 Mar 2017 00:57:58 +0500<br>
>> Message-ID: <CAEJvTzXap+qMGLt7SQ-1hPgf=<wbr>aNuAYEsu_JQYd695hac0WagkA@<wbr>mail.<br>
>> <a href="http://gmail.com" rel="noreferrer" target="_blank">gmail.com</a>><br>
>><br>
>> > Hi<br>
>> ><br>
>> > Please use this document. The image quality of the previously shared<br>
>> > version was not up to the mark.<br>
>> ><br>
>> > Thanks<br>
>> > Best regards<br>
>> > Muhammad Usama<br>
>> ><br>
>> > On Thu, Mar 9, 2017 at 12:53 AM, Muhammad Usama <<a href="mailto:m.usama@gmail.com">m.usama@gmail.com</a>><br>
>> wrote:<br>
>> ><br>
>> >> Hi Ishii-San<br>
>> >><br>
>> >> I have tried to create a detailed proposal to explain why and where the<br>
>> >> quorum aware backend failover mechanism would be useful.<br>
>> >> Can you please take a look at the attached pdf document and share your<br>
>> >> thoughts.<br>
>> >><br>
>> >> Thanks<br>
>> >> Kind Regards<br>
>> >> Muhammad Usama<br>
>> >><br>
>> >><br>
>> >> On Wed, Jan 25, 2017 at 2:04 PM, Muhammad Usama <<a href="mailto:m.usama@gmail.com">m.usama@gmail.com</a>><br>
>> wrote:<br>
>> >><br>
>> >>><br>
>> >>><br>
>> >>> On Wed, Jan 25, 2017 at 9:05 AM, Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>><br>
>> wrote:<br>
>> >>><br>
>> >>>> Usama,<br>
>> >>>><br>
>> >>>> > This is correct. If Pgpool-II is used in master-standby mode (with an<br>
>> >>>> > elastic or virtual IP, and clients connect to only one Pgpool-II server)<br>
>> >>>> > then there are not many issues that could be caused by the<br>
>> >>>> > interruption of the link between AZ1 and AZ2 as you defined above.<br>
>> >>>> ><br>
>> >>>> > But the issue arises when Pgpool-II is used in master-master<br>
>> >>>> > mode (clients connect to all available Pgpool-II nodes); consider the<br>
>> >>>> > following scenario.<br>
>> >>>> ><br>
>> >>>> > a) The link between AZ1 and AZ2 broke; at that time, B1 was the master<br>
>> >>>> > while B2 was the standby.<br>
>> >>>> ><br>
>> >>>> > b) Pgpool-C in AZ2 promotes B2 to master, since Pgpool-C is not able<br>
>> >>>> > to connect to the old master (B1).<br>
>> >>>><br>
>> >>>> I thought Pgpool-C suicides because it cannot get quorum in this<br>
>> >>>> case, no?<br>
>> >>>><br>
>> >>><br>
>> >>> No, Pgpool-II commits suicide only when it loses all network<br>
>> >>> connections. Otherwise, the master watchdog node is de-escalated when the<br>
>> >>> quorum is lost.<br>
>> >>> Committing suicide every time the quorum is lost is very risky and not<br>
>> >>> feasible, since it would shut down the whole cluster as soon as the quorum<br>
>> >>> is lost, even because of a small glitch.<br>
>> >>><br>
>> >>><br>
>> >>>> > c) A client connects to Pgpool-C and issues a write statement. It<br>
>> will<br>
>> >>>> land<br>
>> >>>> > on the B2 PostgreSQL server, which was promoted as master in step<br>
>> b.<br>
>> >>>> ><br>
>> >>>> > c-1) Another client connects to Pgpool-A and also issues a write<br>
>> >>>> > statement that will land on the B1 PostgreSQL server, as it is the<br>
>> >>>> > master node in AZ1.<br>
>> >>>> ><br>
>> >>>> > d) The link between AZ1 and AZ2 is restored, but now the PostgreSQL<br>
>> >>>> > nodes B1 and B2 have different sets of data, with no easy way to get<br>
>> >>>> > both sets of changes in one place and restore the cluster to its<br>
>> >>>> > original state.<br>
>> >>>> ><br>
>> >>>> > The above scenario becomes more complicated if both availability zones<br>
>> >>>> > AZ1 and AZ2 have multiple Pgpool-II nodes, since the logic for retiring<br>
>> >>>> > the multiple Pgpool-II nodes becomes more complex when the link between<br>
>> >>>> > AZ1 and AZ2 is disrupted.<br>
>> >>>> ><br>
>> >>>> > So the proposal tries to solve this by making sure that we always<br>
>> >>>> > have only one master PostgreSQL node in the cluster, and never end up<br>
>> >>>> > in the situation where we have different sets of data on different<br>
>> >>>> > PostgreSQL nodes.<br>
>> >>>> ><br>
>> >>>> ><br>
>> >>>> ><br>
>> >>>> >> > There is also a question ("[pgpool-general: 5179] Architecture<br>
>> >>>> Questions<br>
>> >>>> >> > <<a href="http://www.sraoss.jp/pipermail/pgpool-general/2016-December" rel="noreferrer" target="_blank">http://www.sraoss.jp/<wbr>pipermail/pgpool-general/2016-<wbr>December</a><br>
>> >>>> /005237.html<br>
>> >>>> >> >")<br>
>> >>>> >> > posted by a user in pgpool-general mailing list who wants a<br>
>> similar<br>
>> >>>> type<br>
>> >>>> >> of<br>
>> >>>> >> > network that spans over two AWS availability zones and Pgpool-II<br>
>> >>>> has no<br>
>> >>>> >> > good answer to avoid split-brain of backend nodes if the<br>
>> corporate<br>
>> >>>> link<br>
>> >>>> >> > between two zones suffers a glitch.<br>
>> >>>> >><br>
>> >>>> >> That seems a totally different story to me, because there are two<br>
>> >>>> >> independent streaming replication primary servers in the east and<br>
>> >>>> >> west regions.<br>
>> >>>> >><br>
>> >>>> >><br>
>> >>>> > I think the original question statement was a little bit confusing.<br>
>> >>>> > My understanding of the user's requirements, from later in the thread,<br>
>> >>>> > is as follows.<br>
>> >>>> > The user has a couple of PostgreSQL nodes in each of two availability<br>
>> >>>> > zones (4 PG nodes in total), and all four nodes are part of a single<br>
>> >>>> > streaming replication setup.<br>
>> >>>> > Both zones have two Pgpool-II nodes each (4 Pgpool-II nodes in the<br>
>> >>>> > cluster in total).<br>
>> >>>> > Each availability zone has one application server that connects to one<br>
>> >>>> > of the two Pgpool-II nodes in that availability zone. (That makes it a<br>
>> >>>> > master-master Pgpool-II setup.) And the user is concerned about<br>
>> >>>> > split-brain of the PostgreSQL servers when the corporate link between<br>
>> >>>> > zones becomes unavailable.<br>
>> >>>> ><br>
>> >>>> > Thanks<br>
>> >>>> > Best regards<br>
>> >>>> > Muhammad Usama<br>
>> >>>> ><br>
>> >>>> ><br>
>> >>>> ><br>
>> >>>> >> Best regards,<br>
>> >>>> >> --<br>
>> >>>> >> Tatsuo Ishii<br>
>> >>>> >> SRA OSS, Inc. Japan<br>
>> >>>> >> English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_<wbr>en.php</a><br>
>> >>>> >> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.<wbr>jp</a><br>
>> >>>> >><br>
>> >>>> >> > Thanks<br>
>> >>>> >> > Best regards<br>
>> >>>> >> > Muhammad Usama<br>
>> >>>> >> ><br>
>> >>>> >> ><br>
>> >>>> >> ><br>
>> >>>> >> >><br>
>> >>>> >> >> Best regards,<br>
>> >>>> >> >> --<br>
>> >>>> >> >> Tatsuo Ishii<br>
>> >>>> >> >> SRA OSS, Inc. Japan<br>
>> >>>> >> >> English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_<wbr>en.php</a><br>
>> >>>> >> >> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.<wbr>jp</a><br>
>> >>>> >> >><br>
>> >>>> >> >> >> > Hi Hackers,<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > This is the proposal to make the failover of backend<br>
>> >>>> PostgreSQL<br>
>> >>>> >> nodes<br>
>> >>>> >> >> >> > quorum aware to make it more robust and fault tolerant.<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > Currently, Pgpool-II proceeds to fail over a backend node as<br>
>> >>>> >> >> >> > soon as the health check detects a failure, or when an error<br>
>> >>>> >> >> >> > occurs on the backend connection (when<br>
>> >>>> >> >> >> > fail_over_on_backend_error is set). This is good enough for a<br>
>> >>>> >> >> >> > standalone Pgpool-II server.<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > But consider a scenario where we have more than one<br>
>> >>>> >> >> >> > Pgpool-II (say Pgpool-A, Pgpool-B and Pgpool-C) in the<br>
>> >>>> >> >> >> > cluster, connected through the watchdog, and each Pgpool-II<br>
>> >>>> >> >> >> > node is configured with two PostgreSQL backends (B1 and B2).<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > Now, if due to some network glitch or issue Pgpool-A fails or<br>
>> >>>> >> >> >> > loses its network connection with backend B1, Pgpool-A will<br>
>> >>>> >> >> >> > detect the failure, detach (fail over) the B1 backend, and<br>
>> >>>> >> >> >> > also pass this information to the other Pgpool-II nodes<br>
>> >>>> >> >> >> > (Pgpool-B and Pgpool-C). Although backend B1 was perfectly<br>
>> >>>> >> >> >> > healthy and also reachable from the Pgpool-B and Pgpool-C<br>
>> >>>> >> >> >> > nodes, because of a network glitch between Pgpool-A and<br>
>> >>>> >> >> >> > backend B1 it will get detached from the cluster. The worst<br>
>> >>>> >> >> >> > part is, if B1 was the master PostgreSQL node (in a<br>
>> >>>> >> >> >> > master-standby configuration), the Pgpool-II failover would<br>
>> >>>> >> >> >> > also promote the B2 PostgreSQL node as the new master, hence<br>
>> >>>> >> >> >> > opening the way for split-brain and/or data corruption.<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > So my proposal is that when the watchdog is configured in<br>
>> >>>> >> >> >> > Pgpool-II, the backend health check of Pgpool-II should<br>
>> >>>> >> >> >> > consult the other attached Pgpool-II nodes over the watchdog<br>
>> >>>> >> >> >> > to decide whether the backend node has actually failed or<br>
>> >>>> >> >> >> > whether it is just a localized glitch/false alarm. And the<br>
>> >>>> >> >> >> > failover of the node should only be performed when the<br>
>> >>>> >> >> >> > majority of cluster members agrees on the failure of the<br>
>> >>>> >> >> >> > node.<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > This quorum-aware architecture of failover will prevent<br>
>> >>>> >> >> >> > false failovers and split-brain scenarios in the backend<br>
>> >>>> >> >> >> > nodes.<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > What are your thoughts and suggestions on this?<br>
>> >>>> >> >> >> ><br>
>> >>>> >> >> >> > Thanks<br>
>> >>>> >> >> >> > Best regards<br>
>> >>>> >> >> >> > Muhammad Usama<br>
>> >>>> >> >> >><br>
>> >>>> >> >><br>
>> >>>> >><br>
>> >>>><br>
>> >>><br>
>> >>><br>
>> >><br>
>><br>
</div></div></blockquote></div><br></div></div>
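Postscript: the majority-agreement failover check described in the quoted proposal could be sketched roughly like this (illustrative names and data shapes only, not Pgpool-II's actual implementation):

```python
def decide_failover(reports: dict) -> bool:
    """Each Pgpool-II watchdog member reports whether it sees the backend as failed.

    Fail over only when a strict majority of members agrees on the failure,
    so a glitch seen by a single node does not detach a healthy backend.
    """
    failure_votes = sum(1 for failed in reports.values() if failed)
    return failure_votes > len(reports) // 2

# Only Pgpool-A lost its link to backend B1: no failover, just a local alarm.
assert decide_failover({"Pgpool-A": True, "Pgpool-B": False, "Pgpool-C": False}) is False
# Two of three members confirm B1 is down: the failover proceeds.
assert decide_failover({"Pgpool-A": True, "Pgpool-B": True, "Pgpool-C": False}) is True
```

In the scenario from the proposal, Pgpool-A alone would cast one failure vote out of three, so the healthy (and possibly master) backend B1 would stay attached and no false promotion of B2 would occur.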