[pgpool-general: 686] Re: Transaction never finishes

Luiz Pasqual luiz at pasquall.com
Sat Jun 30 10:23:02 JST 2012


On Fri, Jun 29, 2012 at 8:54 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> On Fri, Jun 29, 2012 at 12:09 PM, Luiz Pasqual <luiz at pasquall.com> wrote:
>>> On Thu, Jun 28, 2012 at 8:06 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>>> Hello,
>>>>>
>>>>> I'm using pgpool-II 3.1.2 with streaming replication and it's working
>>>>> pretty well. But I'm dealing with a weird situation and I don't know
>>>>> how to debug:
>>>>>
>>>>> Sometimes, some transactions never finishes in the master, here is an
>>>>> example, the following query:
>>>>> select * from pg_stat_activity where xact_start < current_timestamp -
>>>>> '10 minutes'::interval
>>>>>
>>>>> Results:
>>>>> 20994;"****";2445;16385;"****";"";"192.168.**.**";"";44083;"2012-06-27
>>>>> 05:55:39.525881-03";"2012-06-27 11:17:46.475347-03";"2012-06-27
>>>>> 11:18:10.044718-03";f;"<IDLE> in transaction"
>>>>>
>>>>> This transaction gets AccessShareLock in the relations:
>>>>> pg_class_relname_nsp_index
>>>>> pg_class
>>>>> pg_class_oid_index
>>>>> pg_namespace
>>>>> pg_namespace_oid_index
>>>>> pg_namespace_nspname_index
>>>>>
>>>>> And one ExclusiveLock that I couldn't identify the relation.
>>>>>
>>>>> Sometimes, depending on the relations that are locked everything fails
>>>>> (timeout) and a pgpool restart must be done. Anyone knows what is
>>>>> going on?
>>>>
>>>> You want to identify the process id of pgpool which is dealing with
>>>> PostgreSQL backend process id(in the case above it's 2445).
>>>>
>>>> Here is the step to find pgpool process id:
>>>>
>>>> 1) Execute pcp_proc_count to find pgpool process list. This command
>>>>   returns all pgpool process ids.
>>>>
>>>> 2) For each process id in #1, execute pcp_proc_info. This will tell
>>>>   what are the PostgreSQL process ids. Note that the command returns
>>>>   multiple rows sorting by node id. Usually node id 0 (thus the first
>>>>   line) is for primary.
>>>>
>>>> 3) Look for 2445 in #2 to find the pgpool process id.
>>>>
>>>> 4) If you could find the pgpool process id (say 12345), grep pgpool
>>>>   log by using 12345. This will show what's going on with 12345.
>>>
>>> Well, they are all "idle in transaction".
>>
>> With a little more debugging, I discovered that the client who uses
>> this specific process already disconnects from pgpool:
>>
>> # ps aux | grep 25003
>> pgpool: *** *** 127.0.0.1(51456) idle in transaction
>>
>> # lsof -i -P -n | grep 25003
>> pgpool    25003       root    5u  IPv4 39609135      0t0  TCP *:5432 (LISTEN)
>> pgpool    25003       root    6u  IPv4 39621031      0t0  TCP
>> 127.0.0.1:5432->127.0.0.1:51456 (CLOSE_WAIT)
>> pgpool    25003       root    7u  IPv4 39621043      0t0  TCP
>> 192.168.10.55:33684->192.168.10.101:5432 (ESTABLISHED)
>> pgpool    25003       root    8u  IPv4 39621044      0t0  TCP
>> 192.168.10.55:51761->192.168.10.11:5432 (CLOSE_WAIT)
>>
>> There is no connection between the client (127.0.0.1:51456) and ( <- )
>> pgpool (127.0.0.1:5432).
>>
>> For some reason, this transaction was aborted, client disconnected
>> from pgpool, pgpool aborted the transaction on the slave server
>> (192.168.10.11) but not on the master (192.168.10.55).
>>
>> I'm running out of ideas.
>
>> 192.168.10.55:33684->192.168.10.101:5432 (ESTABLISHED)
>
> What is on 192.168.10.101:5432?

It's my database, a PostgresSQL 9.1.3, streaming replication to 192.168.10.11


> Also can you attach gdb to 25003 and take back trace?

Sure:

(gdb) bt
#0  0x00007f801e2c5073 in __select_nocancel () from /lib64/libc.so.6
#1  0x000000000041661e in pool_check_fd (cp=<value optimized out>) at
pool_process_query.c:1050
#2  0x0000000000419e64 in pool_read (cp=0x70f1d0, buf=0x7fff24248d3f,
len=1) at pool_stream.c:138
#3  0x00000000004158db in read_kind_from_backend (frontend=<value
optimized out>, backend=0x70b420, decided_kind=<value optimized out>)
at pool_process_query.c:3490
#4  0x0000000000443cf9 in ProcessBackendResponse (frontend=0x70c450,
backend=0x70b420, state=0x7fff24248fb8, num_fields=0x7fff24248fbc) at
pool_proto_modules.c:2164
#5  0x00000000004188b0 in pool_process_query (frontend=0x70c450,
backend=0x70b420, reset_request=0) at pool_process_query.c:416
#6  0x000000000040a4d0 in do_child (unix_fd=4, inet_fd=5) at child.c:354
#7  0x0000000000404245 in fork_a_child (unix_fd=4, inet_fd=5, id=29)
at main.c:1072
#8  0x000000000040726f in main (argc=<value optimized out>,
argv=<value optimized out>) at main.c:549
(gdb)

Thanks for your support.

-- 
Luiz Fernando Pasqual S. Souza
mail: luiz at pasquall.com


More information about the pgpool-general mailing list