[pgpool-hackers: 1150] Modifying the pgpool-II startup mechanism when the watchdog is enabled.

Muhammad Usama m.usama at gmail.com
Tue Nov 10 20:40:16 JST 2015


Hi All


While working on the watchdog enhancements, I noticed that there is a
scenario at the time of pgpool-II startup which can cause unexpected
results when the pgpool-II watchdog is enabled.

Consider the situation where pgpool-II is started while a PostgreSQL
backend is already down. As soon as pgpool-II starts up, it goes on to
process the failover for the down backend node. But the watchdog process
may still be in startup mode and not yet fully connected to the watchdog
cluster (the watchdog cluster is the logical set of all the pgpool-II nodes
connected through the pgpool-II watchdog). At that point the failover
process of pgpool-II tries to enter the interlocking state, so that the
failover and/or follow master scripts get executed by only a single
pgpool-II node. But because the watchdog is still in the startup state, it
declines the failover process's request to acquire the command lock that
ensures the "failover"/"failback"/"follow master" command is executed on
only a single pgpool-II node.
When the watchdog declines the interlocking request (note that a declined
interlocking request is different from a failure to acquire the lock), the
failover process goes ahead and executes the respective failover command.
At the same time, another pgpool-II node started at the same instant will
go through the very same steps. Eventually we end up with multiple
pgpool-II nodes calling the failover commands, which leads to undesired
behavior.
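
To make the sequence above concrete, here is a small toy model of it in
Python. This is only a sketch and not pgpool-II code: the names Cluster,
Node, WatchdogState, LockReply and simulate are all hypothetical, and the
model assumes that a node whose lock acquisition fails simply skips the
failover command. It only illustrates why "declined" and "failed" lead to
different outcomes.

```python
from enum import Enum


class WatchdogState(Enum):
    STARTING = "starting"  # not yet connected to the watchdog cluster
    STABLE = "stable"


class LockReply(Enum):
    GRANTED = "granted"
    FAILED = "failed"      # another node already holds the command lock
    DECLINED = "declined"  # watchdog cannot process the request yet


class Cluster:
    """Hypothetical stand-in for state shared by the watchdog cluster."""

    def __init__(self):
        self.lock_holder = None
        self.failover_runs = []  # names of nodes that ran the failover command


class Node:
    """Hypothetical model of one pgpool-II node and its local watchdog."""

    def __init__(self, name, cluster, watchdog_state):
        self.name = name
        self.cluster = cluster
        self.watchdog_state = watchdog_state

    def request_command_lock(self):
        # A watchdog that is still starting up declines the request outright.
        if self.watchdog_state is WatchdogState.STARTING:
            return LockReply.DECLINED
        if self.cluster.lock_holder in (None, self.name):
            self.cluster.lock_holder = self.name
            return LockReply.GRANTED
        return LockReply.FAILED

    def handle_down_backend(self):
        reply = self.request_command_lock()
        # Assumption of this model: only a FAILED acquisition stops the node
        # from running the failover command; a DECLINED request does not.
        if reply is not LockReply.FAILED:
            self.cluster.failover_runs.append(self.name)
        return reply


def simulate(watchdog_state):
    """Start two nodes at the same time against an already-down backend."""
    cluster = Cluster()
    for name in ("pgpool-A", "pgpool-B"):
        Node(name, cluster, watchdog_state).handle_down_backend()
    return cluster.failover_runs
```

In this model, simulate(WatchdogState.STARTING) records the failover command
on both nodes, while simulate(WatchdogState.STABLE) records it on one node
only.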

So my suggestion is to change the startup mechanism of pgpool-II when the
watchdog is enabled. With the watchdog enabled, pgpool-II should wait
until the watchdog process gets into a stable state before starting its
normal routine tasks, i.e. health checking (if enabled) and spawning the
child processes.
Although this will make pgpool-II startup take a little more time, it
will always yield the expected behavior.
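
The proposed ordering could be sketched roughly as follows, again in Python
and again only as an illustration. The function names, the callbacks, and in
particular the timeout are assumptions made for the sketch; the proposal
itself does not say how long to wait or what to do if the watchdog never
becomes stable.

```python
import time


def wait_for_watchdog(is_stable, timeout_s=30.0, poll_s=0.1,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll is_stable() until it returns True or timeout_s elapses.

    Returns True if the watchdog became stable, False on timeout.
    The timeout is an assumption of this sketch, not part of the proposal.
    """
    deadline = clock() + timeout_s
    while not is_stable():
        if clock() >= deadline:
            return False
        sleep(poll_s)
    return True


def start_pgpool(watchdog_enabled, is_stable, start_health_check,
                 spawn_children, **wait_options):
    """Hypothetical startup routine that defers routine tasks.

    With the watchdog enabled, health checking and child spawning start
    only after the watchdog reports a stable state.
    """
    if watchdog_enabled and not wait_for_watchdog(is_stable, **wait_options):
        raise RuntimeError("watchdog did not reach a stable state")
    start_health_check()
    spawn_children()
```

With the watchdog disabled, start_pgpool() starts the routine tasks right
away, so the extra startup delay only applies to watchdog setups.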

Your comments and suggestions are welcome.

Thanks
Best regards
Muhammad Usama