<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi,<br>
<br>
We are using pgpool 3.6.5 with replication and load balancing. The
option "failover_if_affected_tuples_mismatch" is set to ON.<br>
The option worked well for some time: when a DELETE statement
affected a different number of tuples on the nodes, the mismatching
node was degenerated. For some reason it now works only
intermittently. In most cases the node stays up, even though the
pgpool log says it was shut down.<br>
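<br>
For reference, a setup like the one described would typically contain
the following in pgpool.conf. This is a sketch based on the Pgpool-II
3.6 parameter names, not a copy of the actual file; all other settings
are omitted.<br>
<pre># pgpool.conf (excerpt, assumed; other parameters omitted)
replication_mode = on                      # native replication mode
load_balance_mode = on                     # distribute SELECTs across nodes
failover_if_affected_tuples_mismatch = on  # degenerate nodes whose INSERT/UPDATE/DELETE row count differs</pre>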
<br>
Example with 2 database nodes:<br>
<br>
<i>test=# show pool_nodes;</i><i><br>
</i><i> node_id | hostname | port | status | lb_weight | role |
select_cnt | load_balance_node | replication_delay</i><i><br>
</i><i>---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------</i><i><br>
</i><i> 0 | 10.0.1.9 | 5432 | up | 0.500000 | master |
32 | false | 0</i><i><br>
</i><i> 1 | 10.0.1.11 | 5432 | up | 0.500000 | slave |
12 | true | 0</i><i><br>
</i><i>(2 rows)</i><i><br>
</i><br>
<br>
I manually insert a row directly on one of the nodes to get them out
of sync, then send a DELETE through pgpool:<br>
<i><br>
</i><i>test=# DELETE FROM test;</i><i><br>
</i><i>ERROR: pgpool detected difference of the number of inserted,
updated or deleted tuples. Possible last query was: "DELETE FROM
test;"</i><i><br>
</i><i>HINT: check data consistency between master and other db
node</i><i><br>
</i><i>server closed the connection unexpectedly</i><i><br>
</i><i> This probably means the server terminated abnormally</i><i><br>
</i><i> before or while processing the request.</i><i><br>
</i><i>The connection to the server was lost. Attempting reset:
Succeeded.</i><br>
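<br>
A minimal sketch of how such a mismatch can be produced. The node and
the INSERT statement are assumptions: the post does not say which
backend received the extra row or what columns "test" has. The
"1 0" tuple counts in the log suggest the extra row was on node 0.<br>
<pre>-- Step 1: in psql connected directly to one backend
-- (for example 10.0.1.9, port 5432), bypassing pgpool.
-- Assumption: every column of "test" is nullable or has a default.
INSERT INTO test DEFAULT VALUES;

-- Step 2: in psql connected through pgpool.
-- The replicated DELETE now affects a different number of rows per node.
DELETE FROM test;</pre>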
<br>
<br>
The log file states that the node was shut down:<br>
<br>
<i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: LOG: pgpool detected difference of the number
of inserted, updated or deleted tuples. Possible last query was:
"DELETE FROM test;"</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: LOG: processing command complete</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: DETAIL: CommandComplete: Number of affected
tuples are: 1 0</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: LOG: processing ready for query message</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: DETAIL: ReadyForQuery: Degenerate backends: 1</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: LOG: processing ready for query message</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: DETAIL: ReadyForQuery: Number of affected
tuples are: 1 0</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3967: LOG: received degenerate backend request for
node_id: 1 from pid [3967]</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 798: LOG: Pgpool-II parent process has received
failover request</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 798: LOG: starting degeneration. shutdown host
10.0.1.11(5432)</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 798: LOG: Restart all children</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 798: LOG: failover: set new primary node: -1</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 798: LOG: failover: set new master node: 0</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:09: pid 3969: LOG: worker process received restart request</i><i><br>
</i><i>Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: failover done.
shutdown host 10.0.1.11(5432)2017-09-19 19:10:09: pid 798: LOG:
failover done. shutdown host 10.0.1.11(5432)</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 3968: LOG: restart request received in pcp child
process</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: PCP child 3968 exits with status 0 in
failover()</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: fork a new PCP child pid 5163 in
failover()</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3451 exits with
status 0</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3451 exited with
success and will not be restarted</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3452 exits with
status 0</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3452 exited with
success and will not be restarted</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3453 exits with
status 0</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3453 exited with
success and will not be restarted</i><i><br>
</i><i>Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19
19:10:10: pid 798: LOG: child process with pid: 3454 exits with
status 0</i><br>
<br>
<br>
But the node is still reported as up:<br>
<br>
<i>test=# show pool_nodes;</i><i><br>
</i><i> node_id | hostname | port | status | lb_weight | role |
select_cnt | load_balance_node | replication_delay</i><i><br>
</i><i>---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------</i><i><br>
</i><i> 0 | 10.0.1.9 | 5432 | up | 0.500000 | master |
44 | true | 0</i><i><br>
</i><i> 1 | 10.0.1.11 | 5432 | up | 0.500000 | slave |
54 | false | 0</i><i><br>
</i><i>(2 rows)</i><br>
<br>
<br>
If I send another DELETE, the same sequence repeats and the node still stays up.<br>
<br>
<br>
Is this a bug, or am I misunderstanding how the option works?<br>
<br>
<br>
Thanks in advance<br>
Benjamin Firl<br>
<br>
<br>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-abbreviated" href="http://www.wisit.com">www.wisit.com</a>
<a class="moz-txt-link-abbreviated" href="http://www.knodge.de">www.knodge.de</a>
wisit media GmbH
Ehrenbergstr. 11
D-98693 Ilmenau
</pre>
</body>
</html>