[pgpool-general: 5729] Re: pgpool says its degenerating a backend, but it does not...

Tatsuo Ishii ishii at sraoss.co.jp
Wed Sep 20 10:47:03 JST 2017


Hi Benjamin,

I think you have hit a bug in Pgpool-II. I will look into this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hi,
> 
> We are using pgpool 3.6.5 with replication and load balancing. The
> option "failover_if_affected_tuples_mismatch" is set to ON.
> The option worked well for some time: if a mismatched tuple count was
> detected on a DELETE statement, the node was degenerated. But for some
> reason it now works only occasionally. In most cases the node stays up,
> despite the pgpool log saying it was shut down.
> 
> Example with 2 database nodes:
> 
> test=# show pool_nodes;
>  node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
> ---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
>  0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 32         | false             | 0
>  1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 12         | true              | 0
> (2 rows)
> 
> I manually insert a row into the table on one of the nodes to get them
> out of sync, then send a DELETE through pgpool:
> 
> test=# DELETE FROM test;
> ERROR:  pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "DELETE FROM test;"
> HINT:  check data consistency between master and other db node
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
> 
> 
> The log file states that the node was shut down:
> 
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "DELETE FROM test;"
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing command complete
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  CommandComplete: Number of affected tuples are: 1 0
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing ready for query message
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  ReadyForQuery: Degenerate backends: 1
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  processing ready for query message
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: DETAIL:  ReadyForQuery: Number of affected tuples are: 1 0
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3967: LOG:  received degenerate backend request for node_id: 1 from pid [3967]
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  Pgpool-II parent process has received failover request
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  starting degeneration. shutdown host 10.0.1.11(5432)
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  Restart all children
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  failover: set new primary node: -1
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 798: LOG:  failover: set new master node: 0
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:09: pid 3969: LOG:  worker process received restart request
> Sep 19 19:10:09 sumak-test-pgpool pgpool[798]: failover done. shutdown host 10.0.1.11(5432)2017-09-19 19:10:09: pid 798: LOG:  failover done. shutdown host 10.0.1.11(5432)
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 3968: LOG:  restart request received in pcp child process
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  PCP child 3968 exits with status 0 in failover()
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  fork a new PCP child pid 5163 in failover()
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3451 exits with status 0
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3451 exited with success and will not be restarted
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3452 exits with status 0
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3452 exited with success and will not be restarted
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3453 exits with status 0
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3453 exited with success and will not be restarted
> Sep 19 19:10:10 sumak-test-pgpool pgpool[798]: 2017-09-19 19:10:10: pid 798: LOG:  child process with pid: 3454 exits with status 0
> 
> 
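The key events in the log excerpt above (the per-node affected tuple counts, the degenerate request, and the host being shut down) can be pulled out mechanically when reproducing this report. The following is a minimal Python sketch and not part of Pgpool-II. The function name `summarize_log` and the regular expressions are assumptions derived only from the message wording quoted above; other Pgpool-II versions may word these messages differently.

```python
import re

# Assumption: patterns mirror the wording in the quoted 3.6.5 log only.
TUPLES_RE = re.compile(r"Number of affected tuples are:((?:[ \t]+\d+)+)")
DEGENERATE_RE = re.compile(r"received degenerate backend request for node_id: (\d+)")
SHUTDOWN_RE = re.compile(r"starting degeneration\. shutdown host (\S+)\((\d+)\)")


def summarize_log(text):
    """Collect tuple counts, degenerate requests and shutdown targets from log text."""
    # Tuple counts sit on a single line, so match them on the raw text.
    counts = [tuple(int(n) for n in m.group(1).split())
              for m in TUPLES_RE.finditer(text)]
    # Other entries may be wrapped across lines, so collapse whitespace first.
    flat = " ".join(text.split())
    nodes = sorted({int(m.group(1)) for m in DEGENERATE_RE.finditer(flat)})
    hosts = [(m.group(1), int(m.group(2))) for m in SHUTDOWN_RE.finditer(flat)]
    return {"tuple_counts": counts,
            "degenerate_requests": nodes,
            "shutdown_hosts": hosts}


# Three entries copied from the excerpt, with the syslog prefix trimmed.
SAMPLE = """\
pid 3967: DETAIL:  CommandComplete: Number of affected tuples are: 1 0
pid 3967: LOG:  received degenerate backend request for node_id: 1 from
pid [3967]
pid 798: LOG:  starting degeneration. shutdown host 10.0.1.11(5432)
"""

if __name__ == "__main__":
    print(summarize_log(SAMPLE))
```

A tuple count entry whose values differ, such as `(1, 0)`, is the mismatch that triggers the degenerate request for the node named in the same log.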
> but the node is still up:
> 
> test=# show pool_nodes;
>  node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
> ---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
>  0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 44         | true              | 0
>  1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 54         | false             | 0
> (2 rows)
> 
> 
> If I send another DELETE the process repeats, but the node stays up.
> 
> 
> Is this a bug or is my understanding wrong somewhere?
> 
> 
> Thanks in advance
> Benjamin Firl
> 
> 
> 
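The symptom reported above, a node that the log says was degenerated but that `show pool_nodes` still lists as up, can be checked from the psql output alone. The following is a rough Python sketch under stated assumptions: the helper names `parse_pool_nodes` and `still_up` are hypothetical, the column names are taken from the output quoted in the report, and the status value `up` is assumed to be spelled as it appears there.

```python
def parse_pool_nodes(text):
    """Parse psql-style `show pool_nodes` output into a list of dicts."""
    # Keep only header and data rows; the separator and "(2 rows)" have no "|".
    rows = [line for line in text.splitlines() if "|" in line]
    header = [cell.strip() for cell in rows[0].split("|")]
    return [dict(zip(header, (cell.strip() for cell in row.split("|"))))
            for row in rows[1:]]


def still_up(nodes, degenerated_ids):
    """Return ids that were reported degenerated yet still show status 'up'."""
    return [int(n["node_id"]) for n in nodes
            if int(n["node_id"]) in degenerated_ids and n["status"] == "up"]


# Output quoted in the report after the DELETE, with wrapped rows rejoined.
SAMPLE = """\
 node_id | hostname  | port | status | lb_weight |  role  | select_cnt | load_balance_node | replication_delay
---------+-----------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 10.0.1.9  | 5432 | up     | 0.500000  | master | 44         | true              | 0
 1       | 10.0.1.11 | 5432 | up     | 0.500000  | slave  | 54         | false             | 0
(2 rows)
"""

if __name__ == "__main__":
    # Node 1 is the one named in the degenerate request in the log.
    print(still_up(parse_pool_nodes(SAMPLE), {1}))
```

For the output in the report, node 1 is returned, which is the inconsistency the report describes.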

