<div dir="ltr">Hi Tatsuo<div><br></div><div>Thanks a bundle for looking into the patch.<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Dec 14, 2013 at 2:35 PM, Tatsuo Ishii <span dir="ltr"><<a href="mailto:ishii@postgresql.org" target="_blank">ishii@postgresql.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Usama,<br>
<br>
Thanks for the great work.<br>
I have take a look at your patches.<br>
<br>
Compiling the source code:<br>
<br>
I see some "//" comments. I believe they are temporary ones. We follow<br>
the PostgreSQL coding standard and it does not allow to use "//"<br>
comments. I also see some compiler warnings because of function<br>
prototypes are not declared.<br></blockquote><div><br></div><div>Yes I intend to follow the PostgreSQL coding style and standards, the "//" style comments were left</div><div>in the code unintentionally, and I had this thing in the mind and last TODO of this effort is to clean up</div>
<div>the code for these kind of mistakes. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
I noticed you commented out following line which causes lots of<br>
compiler warnings. Is that intended?<br>
<br>
extern size_t strlcpy(char *dst, const char *src, size_t siz);<br></blockquote><div><br></div><div>Actually I use Mac OS/X for development and since OS/X is BSD based which already</div><div>contains the native strlcpy function defined in string.h, So I was getting the compilation error.</div>
<div><br></div><div>Please find the attached patch (strlcpy_fix.patch) to properly fix this problem, The patch adds the check</div><div>in configure to test if strlcpy is provided by OS and in that case does not externs the function in pool.h.</div>
<div>Since this issue also exists in the master branch so If you feel the comfortable with the patch and the fix</div><div>,can you please also apply this on master branch. </div><div>Note: You will require to run the autoconf after this patch.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div class="im"><br>
> pgpool_main.c PgpoolMain()<br>
><br>
> The patch changes the flow of how health check failures are processed,<br>
> health_check() function which<br>
> previously used to return health check failing node number or success is<br>
> changed to do_health_check(),<br>
> which reports an error through ereport in case of failed health check on<br>
> backend node and than the failure is<br>
> processed by "process_backend_health_check_failure()" function.<br>
> This portion of the code requires the most review and suggestions, Since<br>
> there are many different<br>
> ways to implement this health checking.<br>
<br>
</div>I saw followings:<br>
<br>
LOG: starting health check<br>
LOCATION: pgpool_main.c:369<br>
LOG: after retry 0, All backends are returned to healthy state<br>
LOCATION: pgpool_main.c:411<br>
<br>
Before "starting health check" message was debug message and should<br>
remain debug IMO. Otherwise they flood the log file.<br>
<br>
I also see problem with "after retry 0, All backends are returned to<br>
healthy state" message since the message appears even there's no retry<br>
(again before we do not spit the message if we are not doing retry).<br></blockquote><div><br></div><div>Thanks for pointing these. The 2nd attached patch "exmgr_fix.patch" tries to fix these, which can be applied</div>
<div>on EXCEPTION_MGR development branch.</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
Most importantly, the message style is pretty different from current<br>
one. no timestamp, no pid.<br></blockquote><div><br></div><div>Yes this one is known thing, I some how missed to point it out specifically. This is the part of the TODO item mentioned</div><div>upthread "changes and cleanup of elog.c and elog.h". I have intentionally left it as a last item because it deals mostly with</div>
<div>cosmetic issues. Apart from that this TODO item also have couple of more things which are needed to be done</div><div>by elog api. e.g. adding the configuration parameters similar to PostgreSQL's <br>log_min_error_statement, log_min_messages, client_min_messages to control the level of message generated and reported by pgpool.</div>
<div><br></div><div><br></div><div><br></div><div>Thanks</div><div>Muhammad Usama</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div class="im">
> Another way could be to use PG_TRY and PG_CATCH inside health_check<br>
> function and if<br>
> "make_persistent_db_connection()" throws an error, the PG_CATCH block<br>
> catches the error<br>
> and return the failing node number<br>
><br>
> something like<br>
><br>
> health_check()<br>
> {<br>
> ....<br>
> for(i =0; i< MAX_BACKEND; i++)<br>
> PG_TRY()<br>
> {<br>
> make_persistent_db_connection();<br>
> ...<br>
> }<br>
> PG_CATCH();<br>
> {<br>
> ....<br>
> return i;<br>
> }<br>
> PG_END_TRY();<br>
> return 0;<br>
> }<br>
><br>
><br>
> And than the main loop process the failure, the similar way it it<br>
> processing it now (with out the patch).<br>
> But the current approach followed by the patch seems to come more natural<br>
> with the exception manager<br>
> and yet on the negative side the current approach requires to add a state<br>
> machine to the main loop.<br>
<br>
</div>Agreed.<br>
<div class=""><div class="h5"><br>
> So suggestions and comments on this portion of code and implementation<br>
> approach would be very helpful.<br>
><br>
> pgpool child process loop<br>
> ==================<br>
> src/protocol/child.c do_child()<br>
><br>
> do_child() function is the work horse of pg_pool child process, and<br>
> basically contains two loops those<br>
> spans the life cycle of child process. First step is to wait for the client<br>
> connection, and once this part<br>
> is successful, the child is connected with the front end, the process goes<br>
> into the query process mode.<br>
> So I have used two memory contexts instead of one per loop memory context<br>
> for this process.<br>
> QueryProcessMemoryContext and ChildLoopMemoryContext, where later is used<br>
> for the pahse<br>
> where child is waiting for the front end connections and the former is for<br>
> the query process phase.<br>
><br>
> Secondly for the query processing function all my current code browsing<br>
> leads me to the believe that<br>
> although "pool_process_query()" caller has handled five possible return<br>
> values from the function<br>
> i.e POOL_END, POOL_ERROR, POOL_FATAL, POOL_IDLE and POOL_CONTINUE.<br>
> But in actual the function returns only POOL_END, POOL_ERROR and the<br>
> success which is<br>
> POOL_CONTINUE. and to squeeze in the managers in query processing function<br>
> I have moved forward<br>
> with the above mentioned assumptions. Also my current code understanding on<br>
> child process flow is when<br>
> ever some error occurs in query process phase of child process that error<br>
> actually terminates that particular client<br>
> So I have added the below code in elog.c file to treat the ERROR in child<br>
> process as FATAL in case the<br>
> front end clint is connected with the child process.<br>
><br>
> if(elevel == ERROR)<br>
> {<br>
> if(processType == PT_CHILD)<br>
> {<br>
> /*<br>
> * If the frontend connection exists, treat this error as FATAL<br>
> */<br>
> if(is_session_connected())<br>
> elevel = FATAL;<br>
> }<br>
> }<br>
><br>
> Reviews comments and suggestions are most welcome<br>
><br>
> Thanks<br>
><br>
> Muhammad Usama<br>
</div></div></blockquote></div><br></div></div></div>