[pgpool-general: 1783] Re: insert_lock hangs
Karsten Düsterloh
pgpool-general-ml at tal.de
Tue May 28 16:03:23 JST 2013
Hi!
Sorry, it took some time to further isolate the problem, but now I have
a testcase!
Nozomi Anzai wrote:
> Could you tell us some hints to reproduce this?
pgpool with two nodes,
pgpool running on node 0, node 1 offline because of the current problem.
pgpool-II version 3.2.4 under Debian 6.0.6
postgresql 9.1.9 under Debian 6.0.6
> - pgpool.conf
> - The mode do you use (replication mode, master slave mode with SR, etc.)
> - Using load balancing or not
replication_mode = true
load_balance_mode = true
replicate_select = false
white_function_list = ''
black_function_list = 'nextval,setval'
insert_lock = true
lobj_lock_table = ''
enable_pool_hba = true
master_slave_mode = false
parallel_mode = false
The table pgpool_catalog.insert_lock exists in all our databases
(including template1), created by insert_lock.sql.
> - Steps you executed
As a test scenario, I created the following two tables:
test=# \d kd_test_serial_pgpool
Table "public.kd_test_serial_pgpool"
Column | Type | Modifiers
---------+---------+--------------------------------------------------------------------
id | integer | not null default
. . nextval('kd_test_serial_pgpool_id_seq'::regclass)
content | text |
test=# \d kd_test_nonserial_pgpool
Table "public.kd_test_nonserial_pgpool"
Column | Type | Modifiers
---------+---------+-----------
id | integer |
content | text |
Inserting stuff into kd_test_serial_pgpool should (and does) grab a lock
on pgpool_catalog.insert_lock, while doing so for
kd_test_nonserial_pgpool should (and does) not.
The inserting script is written in perl using prepared queries as
follows (line wrapping caused by MUA):
#!/usr/bin/perl
use strict;
use DBI;
my $dbh;
my $sth;
$dbh = DBI->connect('DBI:Pg:dbname=XXX;host=XXX','XXX','XXX');
if ($dbh)
{
$sth = $dbh->prepare("insert into kd_test_serial_pgpool(content)
values (?)"); # hangs
# $sth = $dbh->prepare("insert into kd_test_nonserial_pgpool(content)
values (?)"); # does not hang
$sth->execute('perl lkhzfjktfjktf');
$sth->finish;
}
$dbh->disconnect();
Using the "insert into kd_test_serial_pgpool" line will hang
indefinitely (I let it hang for about 20 days!) on calling
$sth->execute, while "insert into kd_test_nonserial_pgpool" won't.
(For log file content see below, as posted previously.)
I made sure that no actual lock was set on insert_lock before running
the test by checking the following query (and killing threads via select
pg_terminate_backend($PID); as needed):
select
l.pid,a.datname,a.usename,t.relname,l.locktype,l.mode,l.granted,a.waiting,a.query_start,a.backend_start,a.current_query
from pg_locks l, pg_stat_all_tables t,pg_stat_activity a where
l.relation=t.relid and l.pid=a.procpid order by datname,relname,granted
desc,query_start;
I tested the perl script on several systems, with equal behaviour:
- Debian 4.0 (32 bit), perl 5.8.8-7etch6, libdbi-perl 1.53-1etch1
- Debian 6.0.7 (32 bit), perl 5.10.1-17squeeze6, libdbi-perl 1.612-1
- Kubuntu 12.10 (64 bit), perl 5.14.2-13ubuntu0.2, libdbi-perl 1.622-1
Log excerpt as posted previously (different tables, but same symptom):
>> pgpool.log:
>> > 2013-05-06 14:22:02.235 CEST [13003] [unknown] 00000 LOG: duration: 0.035 ms statement: BEGIN
>> > 2013-05-06 14:22:02.235 CEST [13003] [unknown] 00000 LOG: duration: 0.070 ms parse <unnamed>:
>> > DELETE FROM sessions
>> > WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG: duration: 0.088 ms bind <unnamed>:
>> > DELETE FROM sessions
>> > WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 DETAIL: parameters: $1 = '1eae4e7eb3f0adf2e2bc3f50f1446530f'
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG: duration: 0.063 ms execute <unnamed>:
>> > DELETE FROM sessions
>> > WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 DETAIL: parameters: $1 = '1eae4e7eb3f0adf2e2bc3f50f1446530f'
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG: duration: 0.024 ms statement: COMMIT
>> > 2013-05-06 14:22:02.238 CEST [13003] [unknown] 00000 LOG: duration: 0.426 ms parse pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.238 CEST [13003] [unknown] 00000 LOG: duration: 0.054 ms bind pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.239 CEST [13003] [unknown] 00000 LOG: duration: 0.993 ms execute pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.240 CEST [13003] [unknown] 00000 LOG: duration: 0.757 ms parse pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.240 CEST [13003] [unknown] 00000 LOG: duration: 0.087 ms bind pgpool12980/pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.241 CEST [13003] [unknown] 00000 LOG: duration: 0.787 ms execute pgpool12980/pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.241 CEST [13003] [unknown] 00000 LOG: duration: 0.026 ms statement: BEGIN
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG: duration: 21.331 ms parse pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG: duration: 0.067 ms bind pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG: duration: 0.038 ms execute pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.264 CEST [13003] [unknown] 00000 LOG: duration: 0.231 ms parse pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> > 2013-05-06 14:22:02.264 CEST [13003] [unknown] 00000 LOG: duration: 0.044 ms bind pgpool12980/pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> > 2013-05-06 14:22:02.267 CEST [13003] [unknown] 00000 LOG: duration: 3.819 ms execute pgpool12980/pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> (… then nothing happens in this thread anymore.)
>>
>>
>> The original hanging query isn't logged by pgpool, not even when turning
>> on pgpool's extensive debug output, but running OTRS against postgresql
>> directly (without pgpool) shows this as the offending query:
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG: duration: 0.068 ms parse <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG: duration: 0.045 ms bind <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 DETAIL: parameters: $1 = '11a98dce334f5dd318594a76edf08819b', $2 = 'LastScreenOverview', $3 = 'Action=AgentTicketQueue;QueueID=9;View=', $4 = '0'
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG: duration: 0.106 ms execute <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 DETAIL: parameters: $1 = '11a98dce334f5dd318594a76edf08819b', $2 = 'LastScreenOverview', $3 = 'Action=AgentTicketQueue;QueueID=9;View=', $4 = '0'
>>
>> The context of this query suggests that no other query is using/blocking
>> pgpool's insert_lock table at this moment, especially not for
>> minutes/hours, hence I'm quite at a loss here.
Karsten
More information about the pgpool-general
mailing list