<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 7, 2015 at 5:54 AM, Tatsuo Ishii <span dir="ltr">&lt;<a href="mailto:ishii@postgresql.org" target="_blank">ishii@postgresql.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">&gt; I&#39;ve done some more thinking on this, and I think the solution is<br>

&gt; simply *not* to record pgpool_status when *all* nodes are down (i.e.<br>

&gt; only record it when at least one node is up).  So pgpool_status will<br>

&gt; always reflect the last set of nodes to which any data was written.<br>

&gt; Upon restart, if the up-to-date (previously &quot;up&quot;) node is in fact down<br>

&gt; (regardless of whether the stale (&quot;down&quot;) node is back up), pgpool<br>

&gt; will detect this in its health check and will fail; if the up-to-date<br>

&gt; (previously &quot;up&quot;) node is back up, then pgpool will commence using it.<br>

<br>

</span>A downside of this approach is that the first health check after pgpool<br>

restarts may take a long time, depending on the health check retry<br>

settings, if the node is still down. However, this downside is not new to<br>

your approach (and there&#39;s a workaround: users can manually edit the<br>

pgpool_status file to set the node status to &quot;down&quot;). Besides this,<br>

your approach seems attractive.<br></blockquote><div><br></div><div>I have been looking into this and considering other possible solutions. I think this approach, not recording the status of the last working node when it goes down, should work without any problem. It is a very good solution in that it requires minimal code changes. The downside, that startup may take a long time when the last node that was alive is still down, should not be a concern: that situation always requires user intervention to fix, so a little extra time should not hurt.</div><div><br></div><div>While we are at it, I think it would be good to record some more information about a failed node in the pgpool_status file. If, along with the node status, we also record the timestamp of the status change and the last successful query processed by that backend node, it could help in deciding the direction of recovery and perhaps in debugging the cause of the node failure.</div><div><br></div><div>Kind regards</div><div>Muhammad Usama</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class="im HOEnZb"><br>

Best regards,<br>

--<br>

Tatsuo Ishii<br>

SRA OSS, Inc. Japan<br>

English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>

Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

</span><div class="HOEnZb"><div class="h5">_______________________________________________<br>

pgpool-general mailing list<br>

<a href="mailto:pgpool-general@pgpool.net">pgpool-general@pgpool.net</a><br>

<a href="http://www.pgpool.net/mailman/listinfo/pgpool-general" rel="noreferrer" target="_blank">http://www.pgpool.net/mailman/listinfo/pgpool-general</a><br>

</div></div></blockquote></div><br></div></div>
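<div>For illustration only, the rule discussed in this thread (record pgpool_status only while at least one node is up) might be sketched as below. This is not pgpool code: the function name record_status, the one-word-per-line file layout, and the "up"/"down" strings are assumptions made for the example.</div>

```python
import os
import tempfile


def record_status(path, statuses):
    """Write one node status per line, unless every node is down.

    `statuses` is a list of "up"/"down" strings (a hypothetical layout,
    chosen for this sketch). Returns True if the file was rewritten and
    False if it was deliberately left untouched.
    """
    if not any(s == "up" for s in statuses):
        # All nodes are down: keep the previous file, so it still names
        # the last set of nodes to which data could have been written.
        return False

    # Write to a temporary file in the same directory, then rename it
    # over the target so a crash cannot leave a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        for s in statuses:
            f.write(s + "\n")
    os.replace(tmp_path, path)
    return True
```

<div>With this rule, a sequence of calls with ["up", "up"], then ["up", "down"], then ["down", "down"] leaves the file describing node 0 as up and node 1 as down, which is the state a restart would then health-check against.</div>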