[pgpool-general: 5728] pgpool says it's degenerating a backend, but it does not...

Benjamin Firl bf at wisit.com
Wed Sep 20 02:18:13 JST 2017


Hi,

we are using pgpool 3.6.5 with replication and load balancing. The
option "failover_if_affected_tuples_mismatch" is set to ON.
The option worked well for some time: if a tuple mismatch was detected
on a DELETE statement, the node was degenerated. But for some reason it
now works only intermittently. In most cases the node stays up, even
though the pgpool log says it was shut down.

Example with 2 database nodes:

test=# show pool_nodes;
 node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 32         | false             | 0
 1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 12         | true              | 0
(2 rows)

I manually insert a row into the table on one of the nodes to get them
out of sync, then send a DELETE through pgpool:
test=# DELETE FROM test;
ERROR:  pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "DELETE FROM test;"
HINT:  check data consistency between master and other db node
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.


The log file states that the node was shut down:

Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "DELETE FROM test;"
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing command complete
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  CommandComplete: Number of affected tuples are: 1 0
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing ready for query message
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  ReadyForQuery: Degenerate backends: 1
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing ready for query message
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  ReadyForQuery: Number of affected tuples are: 1 0
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  received degenerate backend request for node_id: 1 from pid [3967]
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  Pgpool-II parent process has received failover request
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  starting degeneration. shutdown host 10.0.1.11(5432)
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  Restart all children
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  failover: set new primary node: -1
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  failover: set new master node: 0
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3969: LOG:  worker process received restart request
Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: failover done. shutdown host 10.0.1.11(5432)2017-09-19 19:10:09: pid 798: LOG:  failover done. shutdown host 10.0.1.11(5432)
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 3968: LOG:  restart request received in pcp child process
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  PCP child 3968 exits with status 0 in failover()
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  fork a new PCP child pid 5163 in failover()
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3451 exits with status 0
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3451 exited with success and will not be restarted
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3452 exits with status 0
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3452 exited with success and will not be restarted
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3453 exits with status 0
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3453 exited with success and will not be restarted
Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3454 exits with status 0
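
To make the mismatch in the log easier to spot, here is a minimal Python sketch that pulls the per-node counts out of a "Number of affected tuples are: ..." DETAIL line. It only reads the log text quoted above. The function names are made up for illustration, and comparing every count against node 0 is an assumption that fits this two-node example (node 0 is the master); it is not a statement about how pgpool itself picks the node to degenerate.

```python
import re

# Matches the counts at the end of a DETAIL line such as
# "... CommandComplete: Number of affected tuples are: 1 0"
AFFECTED_RE = re.compile(r"Number of affected tuples are:((?:\s+\d+)+)")


def affected_tuples(log_line):
    """Return the per-node tuple counts found in a log line, or None."""
    match = AFFECTED_RE.search(log_line)
    if match is None:
        return None
    return [int(n) for n in match.group(1).split()]


def mismatched_nodes(counts):
    """Return node ids whose count differs from node 0 (assumed reference)."""
    return [i for i, c in enumerate(counts) if c != counts[0]]


line = ("2017-09-19 19:10:09: pid 3967: DETAIL:  CommandComplete: "
        "Number of affected tuples are: 1 0")
counts = affected_tuples(line)
print(counts, mismatched_nodes(counts))
```

For the line from the log this yields the counts 1 and 0 and flags node 1, which agrees with the "Degenerate backends: 1" entry.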


However, "show pool_nodes" still reports the node as up:

test=# show pool_nodes;
 node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 44         | true              | 0
 1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 54         | false             | 0
(2 rows)


If I send another DELETE, the same sequence repeats and the node still stays up.
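
The check described here (log says degenerated, "show pool_nodes" says up) could be automated roughly as follows. This is only a sketch under stated assumptions: the helper names are hypothetical, the "show pool_nodes" output is assumed to be available as text with one row per line and "status" as the fourth column (as in the output above), and the log is assumed to contain the "received degenerate backend request for node_id: N" message exactly as quoted.

```python
import re

DEGENERATE_RE = re.compile(
    r"received degenerate backend request for node_id: (\d+)")


def node_status(pool_nodes_text):
    """Map node_id -> status from pipe-separated `show pool_nodes` rows."""
    statuses = {}
    for row in pool_nodes_text.splitlines():
        cells = [c.strip() for c in row.split("|")]
        # Data rows start with a numeric node_id; header, rule and
        # "(2 rows)" footer lines are skipped by this test.
        if len(cells) >= 4 and cells[0].isdigit():
            statuses[int(cells[0])] = cells[3]
    return statuses


def degenerated_ids(log_text):
    """Return the node ids for which the log shows a degenerate request."""
    return {int(n) for n in DEGENERATE_RE.findall(log_text)}


def still_up(log_text, pool_nodes_text):
    """Return node ids the log reports as degenerated but that show 'up'."""
    statuses = node_status(pool_nodes_text)
    return sorted(n for n in degenerated_ids(log_text)
                  if statuses.get(n) == "up")


sample_log = ("pid 3967: LOG:  received degenerate backend request "
              "for node_id: 1 from pid [3967]")
sample_nodes = """\
 node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 44         | true              | 0
 1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 54         | false             | 0
(2 rows)
"""
print(still_up(sample_log, sample_nodes))
```

With the log and table from this report, node 1 is returned, which is exactly the inconsistency in question.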


Is this a bug or is my understanding wrong somewhere?


Thanks in advance
Benjamin Firl


