[pgpool-hackers: 3166] Re: Query hang issue with extended protocol

Tatsuo Ishii ishii at sraoss.co.jp
Thu Dec 6 07:52:38 JST 2018


> Hi Ishii San
> 
> I was investigating the issue faced by one of Pgpool-II user who reported
> that after executing a certain transaction (large amount of data DELETE /
> INSERT / UPDATE, COMMIT), the next execution of the BEGIN statement
> hangs(using the extended query protocol).
> 
> I think the problem is in pool_is_node_to_be_sent_in_current_query()
> function. The issue is in streaming replication mode, when
> pool_is_query_in_progress is not set, the function always returns true. Now
> consider that we have two backend servers 1 primary and one standby. And
> for a particular session the primary node gets selected as a load balance
> node. So effectively that session should only be sending queries and
> reading responses from primary server alone. But with the current
> implementation of pool_is_node_to_be_sent_in_current_query() the
> VALID_BACKEND macro will return true for standby node as well when the
> query in progress flag is not set(which is wrong when standby node is not
> selected for load balancing). and if this happens in let say
> read_kind_from_backend() function we will get a stuck, since the
> read_kind_from_backend() will be trying to read the response from standby
> node while we have never sent any query to that node.
> 
> 
> Can you please take a look at the attached patch, I am not sure if it will
> have some side effects or not.

Unfortunately the patch would not help and probably would break many
cases which are working today.

For the scenario you mentioned is already handled by
pool_pending_message_* functions defined in
context/query_context.c. When a message, for example parse, is sent
from frontend and forwarded to backends, the info is recorded in an
FIFO queue by pool_pending_message_add() along with info which of
backend the message is forwarded to. When the parse message will be
replied back with "parse complete message" from the backend,
read_kind_message_from_backend() can know from which of backend it
should read a message by looking at the head of the queue.  Once the
message is retrieved (in this example parse complete message), an
entry in the queue (in this example parse message) is pulled out and
removed.

This mechanism usually works well unless backend sends unpredicted
message. Recently one of such a case was reported:
https://www.pgpool.net/mantisbt/view.php?id=448

In the report PostgreSQL sends an error message caused by
idle_in_transaction_timeout setting which Pgpool-II does not expect.

Does the user has any uncommon setting in postgresql.conf?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


More information about the pgpool-hackers mailing list