[pgpool-hackers: 887] Re: Making Failover more robust.

Thu Apr 23 17:35:55 JST 2015

> Hi
> 
> Please see my response inline.
> 
> On Mon, Apr 20, 2015 at 6:34 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> 
>> Hi,
>>
>> It seems like a good idea, but I have some questions.
> 
> 
>> I'm not sure how this differs from using health_check_max_retries.
>>
> 
> health_check_max_retries only waits for a specific amount of time for the
> node to get back online and does not care about what has caused the node to
> become unavailable. Raising this value to cover transient errors also
> delays failover in cases of actual node failure.

I have thought about your idea a little bit more. Suppose we have a
primary PostgreSQL node and two standby PostgreSQL nodes. If the
primary returns the "max_connections reached" error, then the remaining
two standby nodes will return the same error sooner or later, because
pgpool-II tries to connect to all of the DB nodes and it's rare that
standbys have a different max_connections than the primary. This will
result in an "all backend down" error, which is the same as the
current situation.

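To illustrate the point, here is a toy model in Python. It is not
pgpool-II code: the node names, the limit of 100, and both function names
are made up for the illustration, and it assumes every client session
holds one connection on each node.

```python
# Toy model only, not pgpool-II code. All names and numbers are assumptions.
MAX_CONNECTIONS = 100  # assumed to be identical on the primary and standbys


def try_connect(node, sessions_per_node):
    """Return True if the node would accept one more connection."""
    return sessions_per_node[node] < MAX_CONNECTIONS


def reachable_nodes(nodes, sessions_per_node):
    """Nodes on which a health check connection would still succeed."""
    return [n for n in nodes if try_connect(n, sessions_per_node)]


nodes = ["primary", "standby1", "standby2"]
sessions = {n: 0 for n in nodes}

# Each client session connects to every node, so the counters grow together.
for _ in range(MAX_CONNECTIONS):
    for n in nodes:
        sessions[n] += 1

# Every node hits the limit at the same time, so none is left reachable.
print(reachable_nodes(nodes, sessions))  # []
```

With all three nodes marked NODE_TEMP_DOWN at once, clients see the same
outcome as with three NODE_DOWN nodes.
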
>> How should NODE_TEMP_DOWN be defined? Should it include only the
>> max_connections error, or also other cases such as health check
>> errors within health_check_max_retries?
>>
> 
> I am thinking of NODE_TEMP_DOWN only for temporary kinds of errors,
> where the PostgreSQL node is reachable but the connection is explicitly
> closed by the PG server. At the moment I can only think of the
> max_connections reached error, but I am sure there are other cases.
> 
> 
>> How are NODE_TEMP_DOWN nodes treated by child processes?
>> While the status is NODE_TEMP_DOWN, are children allowed to
>> send queries to these nodes?
>>
> 
> I think it should be treated the same way as the NODE_DOWN status by
> child processes.

Suppose one of the standby nodes comes back from the NODE_TEMP_DOWN state
first. Since there's no writable DB node, DML queries will fail. This is
not good from the user's point of view. Maybe we could allow standby
nodes to come back from the state only if the primary is online. However,
this is too complex and I am not sure it's worth the trouble.

In summary, the "NODE_TEMP_DOWN" idea itself is great, but I am not
sure the max_connections problem is best handled by the
state. Probably the NODE_TEMP_DOWN state should be applied to errors
that are temporary *and* that do not happen equally on all DB nodes
(as the max_connections problem does). Certain kinds of network errors
might be candidates, but I'm not sure we could reliably detect
that kind of state from the error code which the OS returns.
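
As a rough sketch of what such a classification might look like (Python,
illustration only): SQLSTATE 53300 is PostgreSQL's too_many_connections
code, but the category names, the function name, and above all the choice
of which errno values count as "maybe transient" are assumptions, not
something the OS guarantees.

```python
import errno

TOO_MANY_CONNECTIONS = "53300"  # PostgreSQL SQLSTATE: too_many_connections

# Assumption: these OS errors *might* indicate a temporary network problem.
# Whether they reliably do so is exactly the open question.
MAYBE_TRANSIENT_ERRNOS = {errno.ETIMEDOUT, errno.EHOSTUNREACH, errno.ENETUNREACH}


def classify(sqlstate=None, os_errno=None):
    """Map a failed health check to a hypothetical category."""
    if sqlstate == TOO_MANY_CONNECTIONS:
        # Temporary, but likely to hit every node at about the same time.
        return "shared_transient"
    if os_errno in MAYBE_TRANSIENT_ERRNOS:
        # Temporary and possibly limited to one node: a TEMP_DOWN candidate.
        return "temp_down_candidate"
    # Anything else (e.g. ECONNREFUSED) is treated as a real failure.
    return "down"


print(classify(sqlstate="53300"))              # shared_transient
print(classify(os_errno=errno.ETIMEDOUT))      # temp_down_candidate
print(classify(os_errno=errno.ECONNREFUSED))   # down
```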

>> How long does the NODE_TEMP_DOWN state last? Forever, until the
>> health check succeeds again? Or should this be controlled
>> by another parameter?
>>
> 
> This one needs to be thought out a little more. Some of the options are:
> the node always remains NODE_TEMP_DOWN until it comes back or dies
> permanently, or we control it with a new configuration parameter, which
> could put a time limit on this status before failing the node.
> 
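
The two options could be combined in one transition rule, sketched here
in Python for illustration. The parameter name temp_down_timeout is
hypothetical (no such pgpool-II setting exists), and treating 0 as "no
time limit" is an assumption of this sketch.

```python
def next_status(health_ok, seconds_in_temp_down, temp_down_timeout):
    """Decide the next status of a node that is currently TEMP_DOWN.

    temp_down_timeout is a hypothetical setting in seconds; 0 means the
    node stays TEMP_DOWN until the health check succeeds again.
    """
    if health_ok:
        return "UP"
    if temp_down_timeout > 0 and seconds_in_temp_down >= temp_down_timeout:
        return "DOWN"  # time limit exceeded: fail the node as usual
    return "TEMP_DOWN"


print(next_status(False, 30, 60))   # TEMP_DOWN
print(next_status(False, 60, 60))   # DOWN
print(next_status(False, 600, 0))   # TEMP_DOWN (no limit)
print(next_status(True, 10, 60))    # UP
```
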
> 
> Thanks
> Kind regards
> Muhammad Usama
> 
> 
>> On Fri, 17 Apr 2015 20:07:10 +0500
>> Muhammad Usama <m.usama at gmail.com> wrote:
>>
>> > Hi
>> >
>> > Currently pgpool-II does not discriminate between the types and nature
>> > of backend failures, especially when performing the backend health
>> > check, and it triggers the node failover as soon as the health check
>> > fails to connect to the backend PostgreSQL server (of course, after the
>> > retries have expired). This is a big problem in the case of transient
>> > failures. For example, if max_connections is reached on the backend
>> > node and the health check connection gets denied, pgpool-II will still
>> > consider it a backend node failure and go on to trigger a failover,
>> > despite the fact that the node is actually working fine and pgpool-II
>> > child processes are successfully connected to it.
>> >
>> > So I think the pgpool-II health check should consider the cause and type
>> > of the error that happened on the backend and, depending on the type of
>> > error, it should either register the failover request, ignore the
>> > error, or maybe just change the backend node status. We could introduce
>> > a new node status to identify these types of situations (e.g.
>> > NODE_TEMP_DOWN) and have a new configuration parameter to control the
>> > behavior of this state. Instead of straight away initiating the
>> > failover on a node, the health check keeps on probing the node with this
>> > new NODE_TEMP_DOWN status and automatically makes the node available
>> > when the health check succeeds on the node.
>> >
>> > Thoughts, suggestions and design ideas are most welcome
>> >
>> > Thanks
>> > Best regards!
>> > Muhammad Usama
>>
>>
>> --
>> Yugo Nagata <nagata at sraoss.co.jp>
>>