<div dir="ltr">Hi Tatsuo,<div><br></div><div>          Please find attached the zip file.</div><div><br></div><div>Thanks And Regards,</div><div><br></div><div>  Lakshmi Y M</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Sep 2, 2019 at 5:13 AM Tatsuo Ishii &lt;<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Lakshmi,<br>

<br>

Your attached files are too large to be accepted by the mailing list. Can<br>

you compress them and post the message along with the compressed attached<br>

files?<br>

<br>

Best regards,<br>

--<br>

Tatsuo Ishii<br>

SRA OSS, Inc. Japan<br>

English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>

Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

<br>

From: Lakshmi Raghavendra &lt;<a href="mailto:lakshmiym108@gmail.com" target="_blank">lakshmiym108@gmail.com</a>&gt;<br>

Subject: Fwd: [pgpool-general: 6672] Query<br>

Date: Sun, 1 Sep 2019 23:14:30 +0530<br>

Message-ID: &lt;<a href="mailto:CAHHVJ5sRoVFEEW4EoZLgudCTTm0cqGjXhbbkpnOiimcs4euUSw@mail.gmail.com" target="_blank">CAHHVJ5sRoVFEEW4EoZLgudCTTm0cqGjXhbbkpnOiimcs4euUSw@mail.gmail.com</a>&gt;<br>

<br>

&gt; ---------- Forwarded message ---------<br>

&gt; From: Lakshmi Raghavendra &lt;<a href="mailto:lakshmiym108@gmail.com" target="_blank">lakshmiym108@gmail.com</a>&gt;<br>

&gt; Date: Sat, Aug 31, 2019 at 10:17 PM<br>

&gt; Subject: Re: [pgpool-general: 6672] Query<br>

&gt; To: Tatsuo Ishii &lt;<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>&gt;<br>

&gt; Cc: Muhammad Usama &lt;<a href="mailto:m.usama@gmail.com" target="_blank">m.usama@gmail.com</a>&gt;, &lt;<a href="mailto:pgpool-general@pgpool.net" target="_blank">pgpool-general@pgpool.net</a>&gt;<br>

&gt; <br>

&gt; <br>

&gt; Hi Usama / Tatsuo,<br>

&gt; <br>

&gt;          Received the email notification today, sorry for the delayed<br>

&gt; response.<br>

&gt; Please find attached the pgpool-II log for the same.<br>

&gt; <br>

&gt; Below is a short summary of the issue:<br>

&gt; <br>

&gt; <br>

&gt; Node-1 : Pgpool Master + Postgres Master<br>

&gt; <br>

&gt; Node-2 : Pgpool Standby + Postgres Standby<br>

&gt; <br>

&gt; Node-3 : Pgpool Standby + Postgres Standby<br>

&gt; <br>

&gt; <br>

&gt; When a network failure happens and Node-1 drops off the network, the<br>

&gt; status is as follows:<br>

&gt; <br>

&gt; Node-1 : Pgpool Lost status + Postgres Standby (down)<br>

&gt; <br>

&gt; Node-2 : Pgpool Master + Postgres Master<br>

&gt; <br>

&gt; Node-3 : Pgpool Standby + Postgres Standby<br>

&gt; <br>

&gt; <br>

&gt; Now, when Node-1 comes back onto the network, the status is as shown below,<br>

&gt; which leaves the pgpool cluster imbalanced:<br>

&gt; <br>

&gt; <br>

&gt; <br>

&gt; lcm-34-189:~ # psql -h 10.198.34.191 -p 9999 -U pgpool postgres -c &quot;show pool_nodes&quot;<br>

&gt; Password for user pgpool:<br>

&gt;  node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change<br>

&gt; ---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------<br>

&gt;  0       | 10.198.34.188 | 5432 | up     | 0.333333  | primary | 0          | true              | 0                 | 2019-08-31 16:40:26<br>

&gt;  1       | 10.198.34.189 | 5432 | up     | 0.333333  | standby | 0          | false             | 1013552           | 2019-08-31 16:40:26<br>

&gt;  2       | 10.198.34.190 | 5432 | up     | 0.333333  | standby | 0          | false             | 0                 | 2019-08-31 16:40:26<br>

&gt; (3 rows)<br>

&gt; <br>

&gt; lcm-34-189:~ # /usr/local/bin/pcp_watchdog_info -p 9898 -h 10.198.34.191 -U pgpool<br>

&gt; Password:<br>

&gt; 3 NO lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188<br>

&gt; <br>

&gt; lcm-34-189.dev.lcm.local:9999 Linux lcm-34-189.dev.lcm.local lcm-34-189.dev.lcm.local 9999 9000 7 STANDBY<br>

&gt; lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188 9999 9000 4 MASTER<br>

&gt; lcm-34-190.dev.lcm.local:9999 Linux lcm-34-190.dev.lcm.local 10.198.34.190 9999 9000 4 MASTER<br>

&gt; lcm-34-189:~ #<br>

&gt; <br>

&gt; <br>

&gt; <br>

&gt; Thanks And Regards,<br>

&gt; <br>

&gt;    Lakshmi Y M<br>

&gt; <br>

&gt; On Tue, Aug 20, 2019 at 8:55 AM Tatsuo Ishii &lt;<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>&gt; wrote:<br>

&gt; <br>

&gt;&gt; &gt; On Sat, Aug 17, 2019 at 12:28 PM Tatsuo Ishii &lt;<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>&gt;<br>

&gt;&gt; wrote:<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; Hi Pgpool Team,<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; *We are nearing a production release and running into the<br>

&gt;&gt; &gt;&gt; &gt; issues below.*<br>

&gt;&gt; &gt;&gt; &gt; A reply at the earliest would be very helpful and greatly appreciated.<br>

&gt;&gt; &gt;&gt; &gt; Please let us know how to get rid of these issues.<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; We have a 3-node pgpool + postgres cluster: M1, M2, M3. The pgpool.conf<br>

&gt;&gt; &gt;&gt; &gt; is attached.<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; *Case I :  *<br>

&gt;&gt; &gt;&gt; &gt; M1 - Pgpool Master + Postgres Master<br>

&gt;&gt; &gt;&gt; &gt; M2 , M3 - Pgpool slave + Postgres slave<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; - M1 drops off the network. It is marked as LOST in the pgpool cluster<br>

&gt;&gt; &gt;&gt; &gt; - M2 becomes postgres master<br>

&gt;&gt; &gt;&gt; &gt; - M3 becomes pgpool master.<br>

&gt;&gt; &gt;&gt; &gt; - When M1 comes back to the network, pgpool is able to resolve the split brain.<br>

&gt;&gt; &gt;&gt; &gt; However, it changes the postgres master back to M1, logging the statement<br>

&gt;&gt; &gt;&gt; &gt; &quot;LOG:  primary node was chenged after the sync from new master&quot;. Since<br>

&gt;&gt; &gt;&gt; &gt; M2 was already the postgres master (and its trigger file is not touched), it is<br>

&gt;&gt; &gt;&gt; &gt; not able to sync to the new master.<br>

&gt;&gt; &gt;&gt; &gt; *I want to avoid this postgres master change. Please let us know if<br>

&gt;&gt; &gt;&gt; &gt; there is a way to avoid it.*<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; Sorry, but I don&#39;t know how to prevent this. Probably, when the former<br>

&gt;&gt; &gt;&gt; watchdog master recovers from a network outage and there&#39;s already a<br>

&gt;&gt; &gt;&gt; PostgreSQL primary server, the watchdog master should not sync the<br>

&gt;&gt; &gt;&gt; state. What do you think, Usama?<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Yes, that&#39;s true: no functionality exists in Pgpool-II to<br>

&gt;&gt; &gt; disable the backend node status sync. In fact, it<br>

&gt;&gt; &gt; would be hazardous if we somehow disabled the node status syncing.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Having said that, in the scenario mentioned, when M1 comes back and<br>

&gt;&gt; &gt; joins the watchdog cluster, Pgpool-II should have<br>

&gt;&gt; &gt; kept M2 as the true master while resolving the split-brain. The<br>

&gt;&gt; &gt; algorithm used to determine the true master considers quite a<br>

&gt;&gt; &gt; few parameters, and for the scenario you explained, M2 should have kept the<br>

&gt;&gt; &gt; master node status while M1 should have resigned<br>

&gt;&gt; &gt; after rejoining the cluster. Effectively, the M1 node should have been<br>

&gt;&gt; &gt; syncing the status from M2 (keeping the proper primary node),<br>

&gt;&gt; &gt; not the other way around.<br>

&gt;&gt; &gt; Can you please share the Pgpool-II log files so that I can have a look at<br>

&gt;&gt; &gt; what went wrong in this case?<br>

&gt;&gt;<br>

&gt;&gt; Usama,<br>

&gt;&gt;<br>

&gt;&gt; OK, so the scenario (PostgreSQL primary x 2 in the end) should not have<br>

&gt;&gt; happened. That&#39;s good news.<br>

&gt;&gt;<br>

&gt;&gt; Lakshmi,<br>

&gt;&gt;<br>

&gt;&gt; Can you please provide the Pgpool-II log files as Usama requested?<br>

&gt;&gt;<br>

&gt;&gt; Best regards,<br>

&gt;&gt; --<br>

&gt;&gt; Tatsuo Ishii<br>

&gt;&gt; SRA OSS, Inc. Japan<br>

&gt;&gt; English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>

&gt;&gt; Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

&gt;&gt;<br>

</blockquote></div>