<div dir="ltr">Hi Ishii-San<div><br></div><div>I have tried like a dozen times but never got the failure even once.</div><div><br></div><div>Looking at the log snippet you shared it appears that this failure</div><div>is a random timing-based and it is not really a problem with the</div><div>handling of pg_terminate_backend but somehow it is related to</div><div>missing connection_count_down() call.</div><div><br></div><div>I suppose connection_count_down() is not called in one of the code</div><div>path that gets executed when we get this "write on backend failed"</div><div>warning</div><div>i.e.</div><div><font face="monospace">.WARNING: write on backend 0 failed with error :"Success"<br>DETAIL: while trying to write data from offset: 0 wlen: 5<br></font></div><div><br></div><div>The reason I suspect that is, I never saw this above warning message in</div><div>pg_terminate_backend test case logs, but it is present in the log snippet</div><div>you shared above so I suspect if the backend gets terminated when</div><div>pgpool process is about to write to the backend socket then in that code</div><div>patch somehow we don't execute the connection_count_down() function</div><div>that leads to this regression test failure.</div><div><br></div><div><br></div><div><br></div>
<div>Thanks</div><div>Best regards</div><div>Muhammad Usama</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Aug 3, 2020 at 10:40 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Usama,<br>
<br>
Recently 073.pg_terminate_backend sometimes fails (like 5 of 10).<br>
<br>
count_up: 2<br>
count_down: 1<br>
connection count is not matched.<br>
<br>
Is this somewhat related to the regression failure?<br>
<br>
creating test environment...PostgreSQL major version: 123<br>
Starting set up in streaming replication mode<br>
creating startall and shutdownall<br>
creating failover script<br>
creating database cluster /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/data0...done.<br>
update postgresql.conf<br>
creating pgpool_remote_start<br>
creating basebackup.sh<br>
creating recovery.conf<br>
creating database cluster /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/data1...done.<br>
update postgresql.conf<br>
creating pgpool_remote_start<br>
creating basebackup.sh<br>
creating recovery.conf<br>
temporarily start data0 cluster to create extensions<br>
temporarily start pgpool-II to create standby nodes<br>
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change <br>
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------<br>
0 | /tmp | 11002 | up | 0.500000 | primary | 0 | true | 0 | | | 2020-08-01 15:34:15<br>
1 | /tmp | 11003 | down | 0.500000 | standby | 0 | false | 0 | | | 2020-08-01 15:34:14<br>
(2 rows)<br>
<br>
recovery node 1...pcp_recovery_node -- Command Successful<br>
done.<br>
creating follow master script<br>
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change <br>
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------<br>
0 | /tmp | 11002 | up | 0.500000 | primary | 0 | false | 0 | | | 2020-08-01 15:34:15<br>
1 | /tmp | 11003 | up | 0.500000 | standby | 0 | true | 0 | | | 2020-08-01 15:34:19<br>
(2 rows)<br>
<br>
shutdown all<br>
<br>
pgpool-II setting for streaming replication mode is done.<br>
To start the whole system, use /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/startall.<br>
To shutdown the whole system, use /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/shutdownall.<br>
pcp command user name is "t-ishii", password is "t-ishii".<br>
Each PostgreSQL, pgpool-II and pcp port is as follows:<br>
#1 port is 11002<br>
#2 port is 11003<br>
pgpool port is 11000<br>
pcp port is 11001<br>
The info above is in README.port.<br>
done.<br>
waiting for server to start....15620 2020-08-01 15:34:25 JST LOG: starting PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit<br>
15620 2020-08-01 15:34:25 JST LOG: listening on IPv4 address "0.0.0.0", port 11002<br>
15620 2020-08-01 15:34:25 JST LOG: listening on IPv6 address "::", port 11002<br>
15620 2020-08-01 15:34:25 JST LOG: listening on Unix socket "/tmp/.s.PGSQL.11002"<br>
15620 2020-08-01 15:34:25 JST LOG: redirecting log output to logging collector process<br>
15620 2020-08-01 15:34:25 JST HINT: Future log output will appear in directory "log".<br>
done<br>
server started<br>
waiting for server to start....15633 2020-08-01 15:34:25 JST LOG: starting PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit<br>
15633 2020-08-01 15:34:25 JST LOG: listening on IPv4 address "0.0.0.0", port 11003<br>
15633 2020-08-01 15:34:25 JST LOG: listening on IPv6 address "::", port 11003<br>
15633 2020-08-01 15:34:25 JST LOG: listening on Unix socket "/tmp/.s.PGSQL.11003"<br>
15633 2020-08-01 15:34:25 JST LOG: redirecting log output to logging collector process<br>
15633 2020-08-01 15:34:25 JST HINT: Future log output will appear in directory "log".<br>
done<br>
server started<br>
pg_terminate_backend <br>
----------------------<br>
t<br>
(1 row)<br>
<br>
FATAL: terminating connection due to administrator command<br>
ERROR: unable to forward message to frontend<br>
DETAIL: FATAL error occured on backend<br>
server closed the connection unexpectedly<br>
This probably means the server terminated abnormally<br>
before or while processing the request.<br>
connection to server was lost<br>
2020-08-01 15:34:28: psql pid 15671: LOG: found the pg_terminate_backend request for backend pid:15693 on backend node:0<br>
count_up: 2<br>
count_down: 2<br>
2020-08-01 15:34:33: main pid 15719: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:33: main pid 15719: DETAIL: num_backends: 2 total_weight: 2.000000<br>
2020-08-01 15:34:33: main pid 15719: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:33: main pid 15719: DETAIL: backend 0 weight: 1073741823.500000 flag: 0000<br>
2020-08-01 15:34:33: main pid 15719: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:33: main pid 15719: DETAIL: backend 1 weight: 1073741823.500000 flag: 0000<br>
2020-08-01 15:34:33: main pid 15719: LOG: stop request sent to pgpool. waiting for termination...<br>
.done.<br>
2020-08-01 15:34:34: main pid 15719: DEBUG: shmem_exit(-1): 2 callbacks to make<br>
2020-08-01 15:34:34: main pid 15719: DEBUG: proc_exit(-1): 0 callbacks to make<br>
waiting for server to shut down.... done<br>
server stopped<br>
waiting for server to shut down.... done<br>
server stopped<br>
creating test environment...PostgreSQL major version: 123<br>
Starting set up in native replication mode<br>
creating startall and shutdownall<br>
creating failover script<br>
creating database cluster /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/data0...done.<br>
update postgresql.conf<br>
creating pgpool_remote_start<br>
creating basebackup.sh<br>
creating database cluster /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/data1...done.<br>
update postgresql.conf<br>
creating pgpool_remote_start<br>
creating basebackup.sh<br>
start all<br>
waiting for pgpool-II coming up...done.<br>
create extensions<br>
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change <br>
---------+----------+-------+--------+-----------+--------+------------+-------------------+-------------------+-------------------+------------------------+---------------------<br>
0 | /tmp | 11002 | up | 0.500000 | master | 0 | true | 0 | | | 2020-08-01 15:34:37<br>
1 | /tmp | 11003 | up | 0.500000 | slave | 0 | false | 0 | | | 2020-08-01 15:34:37<br>
(2 rows)<br>
<br>
shutdown all<br>
<br>
pgpool-II setting for native replication mode is done.<br>
To start the whole system, use /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/startall.<br>
To shutdown the whole system, use /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/tests/073.pg_terminate_backend/testdir/shutdownall.<br>
pcp command user name is "t-ishii", password is "t-ishii".<br>
Each PostgreSQL, pgpool-II and pcp port is as follows:<br>
#1 port is 11002<br>
#2 port is 11003<br>
pgpool port is 11000<br>
pcp port is 11001<br>
The info above is in README.port.<br>
done.<br>
waiting for server to start....15952 2020-08-01 15:34:45 JST LOG: starting PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit<br>
15952 2020-08-01 15:34:45 JST LOG: listening on IPv4 address "0.0.0.0", port 11002<br>
15952 2020-08-01 15:34:45 JST LOG: listening on IPv6 address "::", port 11002<br>
15952 2020-08-01 15:34:45 JST LOG: listening on Unix socket "/tmp/.s.PGSQL.11002"<br>
15952 2020-08-01 15:34:45 JST LOG: redirecting log output to logging collector process<br>
15952 2020-08-01 15:34:45 JST HINT: Future log output will appear in directory "log".<br>
done<br>
server started<br>
waiting for server to start....15967 2020-08-01 15:34:45 JST LOG: starting PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit<br>
15967 2020-08-01 15:34:45 JST LOG: listening on IPv4 address "0.0.0.0", port 11003<br>
15967 2020-08-01 15:34:45 JST LOG: listening on IPv6 address "::", port 11003<br>
15967 2020-08-01 15:34:45 JST LOG: listening on Unix socket "/tmp/.s.PGSQL.11003"<br>
15967 2020-08-01 15:34:45 JST LOG: redirecting log output to logging collector process<br>
15967 2020-08-01 15:34:45 JST HINT: Future log output will appear in directory "log".<br>
done<br>
server started<br>
pg_terminate_backend <br>
----------------------<br>
t<br>
(1 row)<br>
<br>
2020-08-01 15:34:48: psql pid 16011: LOG: found the pg_terminate_backend request for backend pid:16028 on backend node:0<br>
count_up: 2<br>
count_down: 1<br>
connection count is not matched.<br>
2020-08-01 15:34:53: main pid 16057: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:53: main pid 16057: DETAIL: num_backends: 2 total_weight: 2.000000<br>
2020-08-01 15:34:53: main pid 16057: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:53: main pid 16057: DETAIL: backend 0 weight: 1073741823.500000 flag: 0000<br>
2020-08-01 15:34:53: main pid 16057: DEBUG: initializing pool configuration<br>
2020-08-01 15:34:53: main pid 16057: DETAIL: backend 1 weight: 1073741823.500000 flag: 0000<br>
2020-08-01 15:34:53: main pid 16057: LOG: stop request sent to pgpool. waiting for termination...<br>
.WARNING: write on backend 0 failed with error :"Success"<br>
DETAIL: while trying to write data from offset: 0 wlen: 5<br>
server closed the connection unexpectedly<br>
This probably means the server terminated abnormally<br>
before or while processing the request.<br>
connection to server was lost<br>
done.<br>
2020-08-01 15:34:54: main pid 16057: DEBUG: shmem_exit(-1): 2 callbacks to make<br>
2020-08-01 15:34:54: main pid 16057: DEBUG: proc_exit(-1): 0 callbacks to make<br>
waiting for server to shut down....... done<br>
server stopped<br>
waiting for server to shut down....... done<br>
server stopped<br>
</blockquote></div>