[pgpool-hackers: 1721] Re: WIP handling pg_terminate_backend function in pgpool-II

Muhammad Usama m.usama at gmail.com
Tue Jul 26 05:38:36 JST 2016


please use this v2 patch, there was a little mistake in the previous
version

Regards
Muhammad Usama

On Mon, Jul 25, 2016 at 6:57 PM, Muhammad Usama <m.usama at gmail.com> wrote:

> Hi
>
> Please find the attached work in progress patch for the handling of
> pg_terminate_backend() in pgpool-II. There is still some work need to be
> done on the patch, but I am sharing it to get the thoughts of a wider group
> of audience on the idea and the design.
>
> Problems needed to be addressed for pg_terminated_backend()
> ======
> 1- Since the pg_terminate_backend just go on and kills the child
> connection it is treated as the backend failure by pgpool-II and depending
> on the pgpool-II configuration pgpool-II failover that backend node.
>
> 2- The current logic in pgpool-II have no idea of which backend node the
> pg_terminate_backend call is intended to, and it is routed on the basis of
> logic in pool_where_to_send() function that can send the query to the node
> that does not have the connection with the pid.
>
>
> How is this patch trying to solve this?
> ===========
> After parsing the query in SimpleQuery() function the patch runs the
> function walker on the parsed query tree to find the
> pg_terminate_backend("const integer") call. And if the function is found in
> the query the second step is to locate the pgpool-II child process that
> contains the backend connection with PID specified in
> pg_terminate_backend() function call. For that
> the patch loops on all pgpool-II child process and try to locate
> the backend connection with the specific PID by comparing the PID stored in
> the shared memory ConnectionInfos. If the search is successful and the
> child process with the backend connection having the same PID is found, the
> shared memory "ConnectionInfo->backend_terminated"(added by this patch)
> flag is set to true for that connection.
> Once, after the backend node is identified and the flag is set, the next
> step is to rout the query to the exact backend node that contains the
> connection. For that the patch refrains from calling the
> pool_where_to_send() for this query and explicitly sets the query
> destination to the backend node that has the connection so that the query
> should lands on the proper backend node.
>
> Now when the query is successfully executed on the backend server and
> consequently the backend node kills the said connection to the pgpool, It
> results in the communication error in pgpool-II child process. So to make
> sure that the child process does not failover that backend node and the
> child process checks the "ConnectionInfo->backend_terminated" flag before
> performing the failover, and if the flag is set, it just does not proceed
> with failover and bails out by FATAL error that kills that
> particular pgpool-II child process which eventually gets re-spawned by
> pgpool main.
>
>
> Issues with this approach
> ====
> 1- What to do if somehow two or more connections to different backend
> PostgreSQL servers have the same PID.
>
> 2- We can only support pg_terminate_backend(constant number) function
> calls. If the expression or sub query is used in the argument of
> pg_terminate_backend then it would not be handled
>  e.g
> pgpool=# select pg_terminate_backend((select 1));
> pgpool=# select pg_terminate_backend( 2 +1);
>
>
> 3- One very remote scenario where after setting the flag
> ConnectionInfo->backend_terminated to true and before the execution of
> pg_terminate_backend the backend actually gets fail, will still be treated
> as pg_terminate_backend case. But I believe it would be harmless since that
> backend node failure will be caught by the child process sending the
> pg_terminate_backend to that same backend.
>
> TODOs
> =====
>
> If the pg_terminate_backend() fails on the backend because of permission
> or any other issue ,we need to reset the ConnectionInfo->backend_terminated
> flag.
>
>
> What are your thoughts and suggestions on this approach?
>
> Many thanks in anticipation.
> Best regards
> Muhammad Usama
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20160726/691124b7/attachment-0001.html>
-------------- next part --------------
diff --git a/src/auth/pool_auth.c b/src/auth/pool_auth.c
index 2f259df..14b4926 100644
--- a/src/auth/pool_auth.c
+++ b/src/auth/pool_auth.c
@@ -406,6 +406,8 @@ int pool_do_auth(POOL_CONNECTION *frontend, POOL_CONNECTION_POOL *cp)
 			strlcpy(cp->info[i].database, sp->database, sizeof(cp->info[i].database));
 			strlcpy(cp->info[i].user, sp->user, sizeof(cp->info[i].user));
 			cp->info[i].counter = 1;
+			CONNECTION(cp, i)->info = &cp->info[i];
+			cp->info[i].backend_terminated = 0;
 		}
 	}
 
diff --git a/src/context/pool_query_context.c b/src/context/pool_query_context.c
index 49b0056..5f1cfc5 100644
--- a/src/context/pool_query_context.c
+++ b/src/context/pool_query_context.c
@@ -311,6 +311,24 @@ int pool_virtual_master_db_node_id(void)
 	return my_master_node_id;
 }
 
+void pool_force_query_node_to_backend(POOL_QUERY_CONTEXT *query_context, int backend_id)
+{
+	int i;
+	CHECK_QUERY_CONTEXT_IS_VALID;
+	ereport(LOG,
+		(errmsg("forcing query destination node to backend node:%d",backend_id)));
+
+	pool_set_node_to_be_sent(query_context,backend_id);
+	for (i=0;i<NUM_BACKENDS;i++)
+	{
+		if (query_context->where_to_send[i])
+		{
+			query_context->virtual_master_node_id = i;
+			break;
+		}
+	}
+}
+
 /*
  * Decide where to send queries(thus expecting response)
  */
diff --git a/src/include/context/pool_query_context.h b/src/include/context/pool_query_context.h
index 207e239..34de352 100644
--- a/src/include/context/pool_query_context.h
+++ b/src/include/context/pool_query_context.h
@@ -103,5 +103,5 @@ extern bool pool_is_cache_exceeded(void);
 extern void pool_set_cache_exceeded(void);
 extern void pool_unset_cache_exceeded(void);
 extern bool pool_is_transaction_read_only(Node *node);
-
+extern void pool_force_query_node_to_backend(POOL_QUERY_CONTEXT *query_context, int backend_id);
 #endif /* POOL_QUERY_CONTEXT_H */
diff --git a/src/include/pcp/libpcp_ext.h b/src/include/pcp/libpcp_ext.h
index 42e83f2..7b727ea 100644
--- a/src/include/pcp/libpcp_ext.h
+++ b/src/include/pcp/libpcp_ext.h
@@ -95,6 +95,7 @@ typedef struct {
 							 * might be out of control of
 							 * pgpool-II. So we use "char" here.
 							 */
+	char		backend_terminated;
 } ConnectionInfo;
 
 /*
diff --git a/src/include/pool.h b/src/include/pool.h
index 1f43efd..af58357 100644
--- a/src/include/pool.h
+++ b/src/include/pool.h
@@ -235,6 +235,7 @@ typedef struct {
 	char *auth_arg;
 	char *database;
 	char *username;
+	ConnectionInfo *info; /* shared memory coninfo used for pg_terminate_backend*/
 } POOL_CONNECTION;
 
 /*
diff --git a/src/include/utils/pool_select_walker.h b/src/include/utils/pool_select_walker.h
index d3e6fa0..8bdd774 100644
--- a/src/include/utils/pool_select_walker.h
+++ b/src/include/utils/pool_select_walker.h
@@ -40,6 +40,7 @@ typedef struct {
 	bool    has_unlogged_table; /* True if unlogged table is used */
 	bool    has_view; /* True if view is used */
 	bool    has_function_call;  /* True if write function call is used */
+	int		pg_terminate_backend_pid; /* pid argument of pg_terminate_backedn_call(if used) */
 	bool    has_non_immutable_function_call;    /* True if non immutable functions are used */
 	bool    has_insertinto_or_locking_clause;   /* True if it has SELECT INTO or FOR SHARE/UPDATE */
 	int     num_oids;   /* number of oids */
@@ -47,6 +48,7 @@ typedef struct {
 	char    table_names[POOL_MAX_SELECT_OIDS][POOL_NAMEDATALEN];  /* table names */
 } SelectContext;
 
+extern int pool_get_terminate_backend_pid(Node *node);
 extern bool pool_has_function_call(Node *node);
 extern bool pool_has_non_immutable_function_call(Node *node);
 extern bool pool_has_system_catalog(Node *node);
diff --git a/src/protocol/pool_connection_pool.c b/src/protocol/pool_connection_pool.c
index 23c6665..a941ffc 100644
--- a/src/protocol/pool_connection_pool.c
+++ b/src/protocol/pool_connection_pool.c
@@ -167,6 +167,7 @@ POOL_CONNECTION_POOL *pool_get_cp(char *user, char *database, int protoMajor, in
 					info = connection_pool->info;
 					memset(connection_pool, 0, sizeof(POOL_CONNECTION_POOL));
 					connection_pool->info = info;
+					info->backend_terminated = 0;
 					memset(connection_pool->info, 0, sizeof(ConnectionInfo) * MAX_NUM_BACKENDS);
 					POOL_SETMASK(&oldmask);
 					return NULL;
diff --git a/src/protocol/pool_process_query.c b/src/protocol/pool_process_query.c
index 2f12132..38381dd 100644
--- a/src/protocol/pool_process_query.c
+++ b/src/protocol/pool_process_query.c
@@ -4719,8 +4719,18 @@ SELECT_RETRY:
 						was_error = 1;
 						if (!VALID_BACKEND(i))
 							break;
-						notice_backend_error(i, true);
-						sleep(5);
+						/* check if the pg_terminate_backend was issued on this connection */
+						if (CONNECTION(backend, i)->info->backend_terminated == 1)
+						{
+							ereport(FATAL,
+								(errmsg("postmaster on DB node %d connection was lost due to pg_terminate_backend", i),
+									 errdetail("pg_terminate_backend was called on the backend")));
+						}
+						else
+						{
+							notice_backend_error(i, true);
+							sleep(5);
+						}
 						break;
 					}
 
@@ -4760,7 +4770,7 @@ SELECT_RETRY:
             ereport(ERROR,
                 (errmsg("unable to read from frontend socket"),
                      errdetail("exception occured on frontend socket")));
-    
+
 		else if (FD_ISSET(frontend->fd, &readmask))
 		{
 			status = ProcessFrontendResponse(frontend, backend);
diff --git a/src/protocol/pool_proto_modules.c b/src/protocol/pool_proto_modules.c
index 6277b61..073136f 100644
--- a/src/protocol/pool_proto_modules.c
+++ b/src/protocol/pool_proto_modules.c
@@ -409,11 +409,62 @@ POOL_STATUS SimpleQuery(POOL_CONNECTION *frontend,
 		}
 
 		/*
-		 * Decide where to send query
+		 * pg_terminate function needs special handling.
 		 */
-		pool_where_to_send(query_context, query_context->original_query,
-						   query_context->parse_tree);
+		int terminate_backend_id = -1;
+		int terminate_pid = pool_get_terminate_backend_pid(query_context->parse_tree);
+		if (terminate_pid > 0)
+		{
+			int child;
+			/*
+			 * pg_terminate backene found in the query
+			 * look for the child process that has the backend
+			 * with the pid
+			 */
+
+			ereport(LOG,
+				(errmsg("found pg_terminate_backend request for backend pid %d",terminate_pid)));
+
+			for (child = 0; child < pool_config->num_init_children; child++)
+			{
+				int pool;
+				ProcessInfo *pi = pool_get_process_info(process_info[child].pid);
 
+				for (pool = 0; pool < pool_config->max_pool; pool++)
+				{
+					int backend_id;
+					for (backend_id = 0; backend_id < NUM_BACKENDS; backend_id++)
+					{
+						int poolBE = pool*MAX_NUM_BACKENDS+backend_id;
+						if (ntohl(pi->connection_info[poolBE].pid) == terminate_pid)
+						{
+							ereport(LOG,
+								(errmsg("processing pg_terminate_backend request"),
+									 errdetail("found pid:%d child pid:%d",ntohl(pi->connection_info[poolBE].pid), pi->pid)));
+							/* set flag for in the shared memory for the connection info */
+							pi->connection_info[poolBE].backend_terminated = 1;
+							terminate_backend_id = backend_id;
+							/* Todo Reset the flag somehow in case the function was not successfull on the
+							 * backend
+							 */
+						}
+					}
+				}
+			}
+		}
+		if (terminate_backend_id >=0 )
+		{
+			/* It was the pg_terminate_backend call so send the query to appropriate backend */
+			pool_force_query_node_to_backend(query_context, terminate_backend_id);
+		}
+		else
+		{
+			/*
+			 * Decide where to send query
+			 */
+			pool_where_to_send(query_context, query_context->original_query,
+							   query_context->parse_tree);
+		}
 		/*
 		 * if this is DROP DATABASE command, send USR1 signal to parent and
 		 * ask it to close all idle connections.
diff --git a/src/utils/pool_select_walker.c b/src/utils/pool_select_walker.c
index be5c38a..7c063dd 100644
--- a/src/utils/pool_select_walker.c
+++ b/src/utils/pool_select_walker.c
@@ -58,6 +58,7 @@ bool pool_has_function_call(Node *node)
 		return false;
 
 	ctx.has_function_call = false;
+	ctx.pg_terminate_backend_pid = -1;
 
 	raw_expression_tree_walker(node, function_call_walker, &ctx);
 
@@ -65,6 +66,25 @@ bool pool_has_function_call(Node *node)
 }
 
 /*
+ * Search the pg_terminate_backend() call in the query
+ */
+int pool_get_terminate_backend_pid(Node *node)
+{
+	SelectContext	ctx;
+
+	if (!IsA(node, SelectStmt))
+		return false;
+
+	ctx.has_function_call = false;
+	ctx.pg_terminate_backend_pid = 0;
+
+	raw_expression_tree_walker(node, function_call_walker, &ctx);
+
+	return ctx.pg_terminate_backend_pid;
+
+}
+
+/*
  * Return true if this SELECT has system catalog table.
  */
 bool pool_has_system_catalog(Node *node)
@@ -286,6 +306,20 @@ static bool function_call_walker(Node *node, void *context)
 			ereport(DEBUG1,
 				(errmsg("function call walker, function name: \"%s\"", fname)));
 
+			if (ctx->pg_terminate_backend_pid == 0 && strcmp("pg_terminate_backend", fname) == 0)
+			{
+				if (list_length(fcall->args) == 1)
+				{
+					Node *arg = linitial(fcall->args);
+					if (IsA(arg, A_Const) &&
+					   ((A_Const *)arg)->val.type == T_Integer)
+					{
+						ctx->pg_terminate_backend_pid = ((A_Const *)arg)->val.val.ival;
+						ereport(DEBUG1,
+								(errmsg("pg_terminate_backend pid = %d",ctx->pg_terminate_backend_pid)));
+					}
+				}
+			}
 			/*
 			 * Check white list if any.
 			 */
diff --git a/src/utils/pool_stream.c b/src/utils/pool_stream.c
index 39903bf..892c07d 100644
--- a/src/utils/pool_stream.c
+++ b/src/utils/pool_stream.c
@@ -207,8 +207,16 @@ int pool_read(POOL_CONNECTION *cp, void *buf, int len)
 			cp->socket_state = POOL_SOCKET_ERROR;
 			if (cp->isbackend)
 			{
-				ereport(ERROR,
+				if (cp->info->backend_terminated == 1)
+				{
+					cp->info->backend_terminated = 0;
+					ereport(FATAL,
 						(errmsg("unable to read data from DB node %d",cp->db_node_id),
+							 errdetail("pg_terminate_backend was called on the backend")));
+				}
+
+				ereport(ERROR,
+					(errmsg("unable to read data from DB node %d",cp->db_node_id),
 						 errdetail("socket read failed with an error \"%s\"", strerror(errno))));
 
 				/* if fail_over_on_backend_error is true, then trigger failover */
@@ -347,6 +355,14 @@ char *pool_read2(POOL_CONNECTION *cp, int len)
 			cp->socket_state = POOL_SOCKET_ERROR;
 			if (cp->isbackend)
 			{
+				if (cp->info->backend_terminated == 1)
+				{
+					cp->info->backend_terminated = 0;
+					ereport(FATAL,
+						(errmsg("unable to read data from DB node %d",cp->db_node_id),
+							 errdetail("pg_terminate_backend was called on the backend")));
+				}
+
 				/* if fail_over_on_backend_error is true, then trigger failover */
 				if (pool_config->fail_over_on_backend_error)
 				{
@@ -603,6 +619,14 @@ int pool_flush(POOL_CONNECTION *cp)
 	{
 		if (cp->isbackend)
 		{
+			if (cp->info->backend_terminated == 1)
+			{
+				cp->info->backend_terminated = 0;
+				ereport(FATAL,
+						(errmsg("unable to read data from DB node %d",cp->db_node_id),
+						 errdetail("pg_terminate_backend was called on the backend")));
+			}
+
 			/* if fail_over_on_backend_error is true, then trigger failover */
 			if (pool_config->fail_over_on_backend_error)
 			{
@@ -650,6 +674,14 @@ int pool_flush_noerror(POOL_CONNECTION *cp)
     {
         if (cp->isbackend)
         {
+			if (cp->info->backend_terminated == 1)
+			{
+				cp->info->backend_terminated = 0;
+				ereport(FATAL,
+						(errmsg("unable to read data from DB node %d",cp->db_node_id),
+						 errdetail("pg_terminate_backend was called on the backend")));
+			}
+
             /* if fail_over_on_backend_erro is true, then trigger failover */
             if (pool_config->fail_over_on_backend_error)
             {
@@ -810,6 +842,14 @@ char *pool_read_string(POOL_CONNECTION *cp, int *len, int line)
 			cp->socket_state = POOL_SOCKET_ERROR;
 			if (cp->isbackend)
 			{
+				if (cp->info->backend_terminated == 1)
+				{
+					cp->info->backend_terminated = 0;
+					ereport(FATAL,
+							(errmsg("unable to read data from DB node %d",cp->db_node_id),
+							 errdetail("pg_terminate_backend was called on the backend")));
+				}
+
 				notice_backend_error(cp->db_node_id, true);
 				child_exit(POOL_EXIT_AND_RESTART);
                 ereport(ERROR,


More information about the pgpool-hackers mailing list