[pgpool-hackers: 3475] Re: Proposal: health check statistics

Tue Dec 10 11:14:28 JST 2019

On Tue, 10 Dec 2019 09:18:58 +0900 (JST)
Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Currently Pgpool-II's health check process logs various information,
> including backend connection problems, retries to recover from them,
> and so on. This information is very important for users because it
> reports health problems of PostgreSQL. For example, an increase in
> the retry count may suggest that the network connection between
> Pgpool-II and PostgreSQL is having trouble, so that users could
> replace the switch before an actual failure occurs. The problem is
> that it is annoying to look for such information in log files
> afterward, since it may already have disappeared or may never have
> been logged because of other problems (such as a full disk).
> 
> I would like to propose a new feature:
> 
> - Accumulate health check statistics in shared memory so that later on
>   users can look into the stats using PCP commands.
> 
> - Such statistics include:
>   - failure count per backend node
>   - retry count per backend node
>   - success count after retries

How about also collecting response time statistics? For example:
- average response time per backend node
- maximum response time of a successful check

If these are available, they may help users tune timeout values in
the configuration, such as health_check_timeout.
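
To make the idea a bit more concrete, here is a rough sketch in C of
what a per-backend record could look like if the counters from the
proposal and the response time figures were kept together. All names
(HealthCheckStats, hc_stats_record, and so on) are made up for
illustration and are not existing Pgpool-II code. The sketch also leaves
out the shared memory allocation and any locking, and assumes response
times are measured in milliseconds.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-backend health check statistics (one per node). */
typedef struct {
    uint64_t total_count;               /* health checks performed */
    uint64_t success_count;             /* checks that finally succeeded */
    uint64_t fail_count;                /* checks that finally failed */
    uint64_t retry_count;               /* retries summed over all checks */
    uint64_t success_after_retry_count; /* successes needing >= 1 retry */
    uint64_t total_duration_ms;         /* sum of successful check times */
    uint32_t max_duration_ms;           /* slowest successful check */
} HealthCheckStats;

/* Reset all counters to zero. */
static void hc_stats_init(HealthCheckStats *s)
{
    memset(s, 0, sizeof(*s));
}

/*
 * Record the outcome of one health check.
 * success:     non-zero if the check finally succeeded
 * retries:     number of retries that were needed
 * duration_ms: response time; only counted for successful checks
 */
static void hc_stats_record(HealthCheckStats *s, int success,
                            unsigned retries, uint32_t duration_ms)
{
    s->total_count++;
    s->retry_count += retries;

    if (!success) {
        s->fail_count++;
        return;
    }

    s->success_count++;
    if (retries > 0)
        s->success_after_retry_count++;

    s->total_duration_ms += duration_ms;
    if (duration_ms > s->max_duration_ms)
        s->max_duration_ms = duration_ms;
}

/* Average response time of successful checks, or 0 if there are none. */
static double hc_stats_avg_ms(const HealthCheckStats *s)
{
    if (s->success_count == 0)
        return 0.0;
    return (double) s->total_duration_ms / (double) s->success_count;
}
```

Keeping the sum and the success count rather than a running average
keeps each update cheap, and the average can be computed only when
someone asks for it, for example from a PCP command.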

> 
> Comments and suggestions are welcome.
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers

-- 
Yugo Nagata <nagata at sraoss.co.jp>