[pgpool-hackers: 3449] Signal unblock leak in failover

Tatsuo Ishii ishii at sraoss.co.jp
Thu Oct 3 21:15:01 JST 2019


When a failover event occurs, register_node_operation_request() is
called to enqueue failover/failback requests. If the request queue is
full, the function unlocks the semaphore and returns false, but it
forgets to restore the signal mask. As a result all signals stay
blocked, including SIGTERM, so pgpool cannot be shut down.
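
To illustrate the pattern, here is a minimal sketch in C. It is not the
actual pgpool source: the function names, the toy counter standing in
for the request queue, the no-op lock helpers and the use of plain
sigprocmask() are all assumptions made for illustration. It shows an
early return on the queue-full path that skips restoring the mask, and
a single-exit variant that always restores it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 10        /* queue length quoted in this post */

static int queue_count = 0; /* toy stand-in for the request queue */

/* Hypothetical stand-ins for the semaphore protecting the queue.
 * They are no-ops because this sketch is single-process. */
static void queue_lock(void)   { }
static void queue_unlock(void) { }

/* Buggy shape: the queue-full path unlocks the semaphore but returns
 * without restoring the signal mask, so every signal stays blocked. */
bool enqueue_request_buggy(void)
{
    sigset_t all, old;

    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old);  /* block all signals */
    queue_lock();
    if (queue_count >= QUEUE_LEN)
    {
        queue_unlock();
        return false;                    /* BUG: mask is not restored */
    }
    queue_count++;
    queue_unlock();
    sigprocmask(SIG_SETMASK, &old, NULL);
    return true;
}

/* Fixed shape: one exit path, so the mask is restored in every case. */
bool enqueue_request_fixed(void)
{
    sigset_t all, old;
    bool accepted = false;
    int rc;

    sigfillset(&all);
    rc = sigprocmask(SIG_BLOCK, &all, &old);
    assert(rc == 0);
    queue_lock();
    if (queue_count < QUEUE_LEN)
    {
        queue_count++;
        accepted = true;
    }
    queue_unlock();
    rc = sigprocmask(SIG_SETMASK, &old, NULL);  /* always restore */
    assert(rc == 0);
    (void) rc;
    return accepted;
}

/* Returns true if SIGTERM is currently blocked in the caller.
 * With a NULL set, sigprocmask() only reports the current mask. */
bool sigterm_is_blocked(void)
{
    sigset_t cur;

    sigprocmask(SIG_BLOCK, NULL, &cur);
    return sigismember(&cur, SIGTERM) == 1;
}
```

Calling enqueue_request_fixed() more than QUEUE_LEN times leaves
SIGTERM deliverable, while a single rejected call to
enqueue_request_buggy() leaves it blocked from then on, which matches
the symptom of a process that no longer reacts to a shutdown request.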

I found the bug while investigating occasional failures of the
regression test 055.backend_all_down. Whether the test fails or
succeeds, subsequent tests occasionally fail because a pgpool process
created in the 055 test remains. I think the bug explains why the
process remains.

In the test, pgpool starts with no PostgreSQL running, which
immediately generates multiple failover requests. Currently the queue
length is 10, so the queue may become full if failover requests rush
in, and I guess that could trigger the bug.

I am going to push a fix for this to all supported branches.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

