<div dir="ltr">Hi Ishii-San<div><br></div><div>I have been thinking about this issue in the past days and unfortunately</div><div>couldn't think of a good solution.</div><div><br></div><div>As you mentioned in native replication mode and with snapshot isolation</div><div>as well almost all queries are sent to multiple PostgreSQL backends and</div><div>even when we issue pg_terminate_backend for one backend process</div><div>the Pgpool-II keeps on waiting for the other backends to finish their tasks.</div><div>Now there could be few solutions to this problem but each has its own drawback.</div><div><br></div><div>The first solution would be to identify all the relevant backend PIDs</div><div>and issue individual pg_terminate_backend function to each backend</div><div>node after substituting the appropriate PID argument to the function.</div><div>But the problem with this solution is not only this is complex to</div><div>implement but also has some fundamental differences from the</div><div>actual functionality provided by the pg_terminate_backend(),</div><div>Since the pg_terminate_backend is supposed to kill a single troubling</div><div>backend process that was identified by the user from pg_stat_activity or any</div><div>other similar means so the user might not expect the function to influence</div><div>or kill some other backend on a different PG server. SO that might</div><div>come as a surprise for some users.</div><div><br></div><div>The second solution is to clearly mention this behavior in the</div><div>documentation and informs the users about how to proceed</div><div>with pg_terminate_backend in native replication and snapshot</div><div>isolation modes.</div><div>i.e.</div><div>step1: get the PIDs of all the backend process related to the particular</div><div>pgpool child process</div><div>step2: issue pg_terminate_backend for each PID individually.</div><div><br></div><div>The third solution I can think of is to add a new pgpool-II local</div><div>function for the purpose.</div><div>For example, we can add a new function in Pgpool-II say</div><div>pool_terminate_all_backends(pgpool-child-pid) and that new function</div><div>issues pg_terminate_backed() from within for each backend process.</div><div><br></div><div>What are your thoughts on these possible solutions and on a way forward?</div><div><br></div><div>Thanks</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 20, 2020 at 11:01 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Another fundamental problem with pg_terminate_backend() with native<br>
replication mode (and snapshot isolation mode) is, pgpool needs to<br>
send pg_termnaite_backend() with different argument because the<br>
argument is a process id, which is not incosistent among backends.<br>
<br>
> Hi Usma,<br>
> <br>
> While looking into the 073.pg_terminate_backend test failure I found<br>
> interesting issue.<br>
> <br>
> Supoose we execute following SQL in native replication mode:<br>
> <br>
> ssesion 1: select pg_sleep(60); /* at time 't1' */<br>
> <br>
> session 2: select pg_terminate_backend('7615'); /* at time 't2' */<br>
> <br>
> The pg_sleep() should be canceled at time t2, but actually it is<br>
> canceled at t2 + 60 seconds. Also after the cancel we get:<br>
> <br>
> WARNING: packet kind of backend 1 ['D'] does not match with master/majority nodes packet kind ['E']<br>
> WARNING: write on backend 0 failed with error :"Success"<br>
> DETAIL: while trying to write data from offset: 0 wlen: 5<br>
> FATAL: failed to read kind from backend<br>
> DETAIL: kind mismatch among backends. Possible last query was: "select pg_sleep(60);" kind details are: 0[E: terminating connection due to administrator command] 1[D]<br>
> HINT: check data consistency among db nodes<br>
> <br>
> What actually happening here is:<br>
> <br>
> 2020-08-20 13:01:46: psql pid 7603: LOG: DB node id: 0 backend pid: 7615 statement: BEGIN<br>
> 2020-08-20 13:01:46: psql pid 7603: LOG: DB node id: 1 backend pid: 7616 statement: BEGIN<br>
> 2020-08-20 13:01:46: psql pid 7603: LOG: DB node id: 0 backend pid: 7615 statement: select pg_sleep(60); <-- pgpool 7603 waiting for response from backend 0.<br>
> 2020-08-20 13:02:06: psql pid 7598: LOG: DB node id: 0 backend pid: 7632 statement: SELECT version()<br>
> 2020-08-20 13:02:06: psql pid 7598: LOG: DB node id: 0 backend pid: 7632 statement: SELECT count(*) FROM pg_catalog.pg_proc AS p, pg_catalog.pg_namespace AS n WHERE p.proname = 'pg_terminate_backend' AND n.oid = p.pronamespace AND n.nspname ~ '.*' AND p.provolatile = 'v'<br>
> 2020-08-20 13:02:06: psql pid 7598: LOG: found the pg_terminate_backend request for backend pid:7615 on backend node:0<br>
> 2020-08-20 13:02:06: psql pid 7598: DETAIL: setting the connection flag<br>
> 2020-08-20 13:02:06: psql pid 7598: LOG: DB node id: 0 backend pid: 7632 statement: select pg_terminate_backend(7615);<br>
> 2020-08-20 13:02:06: psql pid 7603: LOG: DB node id: 1 backend pid: 7616 statement: select pg_sleep(60); <--- pgpool 7603 got response because pg_terminate_backend executed. pgpool 7603 started to wait for response from backend 1.<br>
> 2020-08-20 13:03:06: psql pid 7603: WARNING: packet kind of backend 1 ['D'] does not match with master/majority nodes packet kind ['E'] <-- after 60 seconds passed, pgpool 7603 got response from bacnend 0 and 1. <-- since backend 0 got error while backend 1 sucessfully executed pg_sleep(60), there were difference in packet kind.<br>
> 2020-08-20 13:03:06: psql pid 7603: FATAL: failed to read kind from backend <-- and pgpool get angry!<br>
> 2020-08-20 13:03:06: psql pid 7603: DETAIL: kind mismatch among backends. Possible last query was: "select pg_sleep(60);" kind details are: 0[E: terminating connection due to administrator command] 1[D]<br>
> 2020-08-20 13:03:06: psql pid 7603: HINT: check data consistency among db nodes<br>
> 2020-08-20 13:03:06: psql pid 7603: WARNING: write on backend 0 failed with error :"Success"<br>
> 2020-08-20 13:03:06: psql pid 7603: DETAIL: while trying to write data from offset: 0 wlen: 5<br>
> 2020-08-20 13:03:06: main pid 7572: LOG: child process with pid: 7603 exits with status 512<br>
> <br>
> <br>
> Any idea how to deal with this problem?<br>
> <br>
> Best regards,<br>
> --<br>
> Tatsuo Ishii<br>
> SRA OSS, Inc. Japan<br>
> English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>
> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
> _______________________________________________<br>
> pgpool-hackers mailing list<br>
> <a href="mailto:pgpool-hackers@pgpool.net" target="_blank">pgpool-hackers@pgpool.net</a><br>
> <a href="http://www.pgpool.net/mailman/listinfo/pgpool-hackers" rel="noreferrer" target="_blank">http://www.pgpool.net/mailman/listinfo/pgpool-hackers</a><br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div style="color:rgb(0,0,0)"><div><div><span style="color:rgb(68,68,68)">...<br></span></div><div><span style="color:rgb(68,68,68)">Muhammad Usama<br></span></div><div><span><span style="font-family:Lato"><span style="font-size:14px"><span style="color:rgb(68,68,68)">Highgo Software (Canada/China/Pakistan)</span><span><span style="color:rgb(68,68,68)"> </span></span></span></span></span><span style="color:rgb(68,68,68)"><br></span></div><div><span><span style="font-family:Lato"><span style="font-size:14px"><span style="color:rgb(68,68,68)">URL :</span><span><span style="color:rgb(68,68,68)"> </span></span></span></span></span><a href="http://www.highgo.ca/" style="color:rgb(89,143,222);font-family:Lato;font-size:14px" target="_blank"><span style="color:rgb(68,68,68)">http://www.highgo.ca</span></a><span><span style="font-family:Lato"><span style="font-size:14px"><span><span style="color:rgb(68,68,68)"> </span></span></span></span></span><span style="color:rgb(68,68,68)"><br></span></div><div><span><span style="font-family:Lato"><span style="font-size:14px"><span style="color:rgb(68,68,68)">ADDR: 10318 WHALLEY BLVD, Surrey, BC</span><span><span style="color:rgb(68,68,68)"> </span></span></span></span></span><br></div></div></div></div></div>