[pgpool-hackers: 3239] Re: Deal with recovery failure by an abnormally exiting child process

Yugo Nagata nagata at sraoss.co.jp
Tue Feb 12 14:01:05 JST 2019


On Tue, 08 Jan 2019 11:16:19 +0900 (JST)
Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> >> In bug 431, it was reported that the second stage of recovery fails if
> >> a child process has exited abnormally (typically because of SIGKILL
> >> or a segfault). This is because the global connection counter
> >> (Req_info->conn_counter) is not decremented when the child process
> >> exits abnormally. In general there is nothing we can do about an
> >> abnormally exiting process, and we recommend restarting the whole
> >> Pgpool-II in this case.
> >> 
> >> However, I found a tricky solution for one particular situation: when
> >> client_idle_limit_in_recovery is properly set (i.e.
> >> client_idle_limit_in_recovery >= recovery_timeout).
> 
> Sorry, this should have been: 0 < client_idle_limit_in_recovery <= recovery_timeout || client_idle_limit_in_recovery == -1
> 
> >> The logic is shown in the patch:
> >> 
> >> 	/*
> >> 	 * recovery_timeout has expired. Before returning with failure status,
> >> 	 * let's check whether this was caused by a stale conn_counter. If a child
> >> 	 * process exits abnormally (killed by SIGKILL or a segfault, for example),
> >> 	 * then conn_counter is not decremented at process exit, so it will
> >> 	 * never return to 0. This can be detected by checking whether
> >> 	 * client_idle_limit_in_recovery is enabled and less than
> >> 	 * recovery_timeout, because all clients must have been kicked out by the
> >> 	 * time client_idle_limit_in_recovery expires. If so, we should reset
> >> 	 * conn_counter to 0 as well.
> >> 
> >> Should we employ this? Is it too tricky? Comments are welcome.

I think it is a good assumption that if client_idle_limit_in_recovery has
expired at this point, then conn_counter can safely be reset to 0.
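
To check that I read the proposed logic correctly, here is a minimal
sketch of it in C. This is not the patch itself: the ReqInfo struct and
both function names are hypothetical stand-ins, and only conn_counter,
client_idle_limit_in_recovery and recovery_timeout come from this thread.
It assumes the corrected condition quoted above, including the -1 case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared Req_info area; only conn_counter
 * is taken from the thread. */
typedef struct
{
	int			conn_counter;
} ReqInfo;

/*
 * Returns true when the settings guarantee that every client has been
 * disconnected by the time recovery_timeout expires, i.e.
 * 0 < client_idle_limit_in_recovery <= recovery_timeout, or the limit is -1.
 */
static bool
idle_limit_guarantees_no_clients(int client_idle_limit_in_recovery,
								 int recovery_timeout)
{
	if (client_idle_limit_in_recovery == -1)
		return true;
	return client_idle_limit_in_recovery > 0 &&
		client_idle_limit_in_recovery <= recovery_timeout;
}

/*
 * Meant to be called after recovery_timeout has expired. Returns true if
 * recovery may continue; resets a stale counter when that is safe.
 */
static bool
resolve_stale_conn_counter(ReqInfo *req,
						   int client_idle_limit_in_recovery,
						   int recovery_timeout)
{
	assert(req != NULL);

	if (req->conn_counter == 0)
		return true;			/* all clients really are gone */

	if (idle_limit_guarantees_no_clients(client_idle_limit_in_recovery,
										 recovery_timeout))
	{
		/* Remaining count can only come from abnormally exited children. */
		req->conn_counter = 0;
		return true;
	}

	return false;				/* cannot tell stale from live; fail as before */
}
```

With client_idle_limit_in_recovery = 0 (disabled) the sketch leaves the
counter alone and reports failure, which matches the current behaviour.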

BTW, will this fix be included in the next minor release planned for Feb 21st?
The customer who reported this issue wants to know when the fixed version will
be available.


> > 
> > Forgot to attach patch.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese:http://www.sraoss.co.jp
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers


-- 
Yugo Nagata <nagata at sraoss.co.jp>

