[pgpool-general: 1783] Re: insert_lock hangs

Karsten Düsterloh pgpool-general-ml at tal.de
Tue May 28 16:03:23 JST 2013


Hi!

Sorry, it took some time to further isolate the problem, but now I have
a testcase!

Nozomi Anzai wrote:
> Could you tell us some hints to reproduce this?

pgpool with two nodes,
pgpool running on node 0, node 1 offline because of the current problem.

pgpool-II version 3.2.4 under Debian 6.0.6
postgresql 9.1.9 under Debian 6.0.6

> - pgpool.conf
>   - The mode do you use (replication mode, master slave mode with SR, etc.)
>   - Using load balancing or not

replication_mode = true
load_balance_mode = true
replicate_select = false
white_function_list = ''
black_function_list = 'nextval,setval'
insert_lock = true
lobj_lock_table = ''

enable_pool_hba = true
master_slave_mode = false
parallel_mode = false

The table pgpool_catalog.insert_lock exists in all our databases
(including template1), created by insert_lock.sql.

> - Steps you executed

As a test scenario, I created the following two tables:

test=# \d kd_test_serial_pgpool
                          Table "public.kd_test_serial_pgpool"
 Column  |  Type   |                             Modifiers
---------+---------+--------------------------------------------------------------------
 id      | integer | not null default
         .         . nextval('kd_test_serial_pgpool_id_seq'::regclass)
 content | text    |

test=# \d kd_test_nonserial_pgpool
Table "public.kd_test_nonserial_pgpool"
 Column  |  Type   | Modifiers
---------+---------+-----------
 id      | integer |
 content | text    |

Inserting stuff into kd_test_serial_pgpool should (and does) grab a lock
on pgpool_catalog.insert_lock, while doing so for
kd_test_nonserial_pgpool should (and does) not.

The inserting script is written in perl using prepared queries as
follows (line wrapping caused by MUA):


#!/usr/bin/perl
use strict;
use DBI;
my $dbh;
my $sth;
$dbh = DBI->connect('DBI:Pg:dbname=XXX;host=XXX','XXX','XXX');
if ($dbh)
{
  $sth = $dbh->prepare("insert into kd_test_serial_pgpool(content)
values (?)");      # hangs
#  $sth = $dbh->prepare("insert into kd_test_nonserial_pgpool(content)
values (?)");   # does not hang
  $sth->execute('perl lkhzfjktfjktf');
  $sth->finish;
}
$dbh->disconnect();


Using the "insert into kd_test_serial_pgpool" line will hang
indefinitely (I let it hang for about 20 days!) on calling
$sth->execute, while "insert into kd_test_nonserial_pgpool" won't.
(For log file content see below, as posted previously.)


I made sure that no actual lock was set on insert_lock before running
the test by checking the following query (and killing threads via select
pg_terminate_backend($PID); as needed):

select
l.pid,a.datname,a.usename,t.relname,l.locktype,l.mode,l.granted,a.waiting,a.query_start,a.backend_start,a.current_query
from pg_locks l, pg_stat_all_tables t,pg_stat_activity a where
l.relation=t.relid and l.pid=a.procpid order by datname,relname,granted
desc,query_start;


I tested the perl script on several systems, with equal behaviour:
- Debian 4.0 (32 bit), perl 5.8.8-7etch6, libdbi-perl 1.53-1etch1
- Debian 6.0.7 (32 bit), perl 5.10.1-17squeeze6, libdbi-perl 1.612-1
- Kubuntu 12.10 (64 bit), perl 5.14.2-13ubuntu0.2, libdbi-perl 1.622-1


Log excerpt as posted previously (different tables, but same symptom):
>> pgpool.log:
>> > 2013-05-06 14:22:02.235 CEST [13003] [unknown] 00000 LOG:  duration: 0.035 ms  statement: BEGIN
>> > 2013-05-06 14:22:02.235 CEST [13003] [unknown] 00000 LOG:  duration: 0.070 ms  parse <unnamed>: 
>> >                     DELETE FROM sessions
>> >                     WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG:  duration: 0.088 ms  bind <unnamed>: 
>> >                     DELETE FROM sessions
>> >                     WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 DETAIL:  parameters: $1 = '1eae4e7eb3f0adf2e2bc3f50f1446530f'
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG:  duration: 0.063 ms  execute <unnamed>: 
>> >                     DELETE FROM sessions
>> >                     WHERE session_id = $1
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 DETAIL:  parameters: $1 = '1eae4e7eb3f0adf2e2bc3f50f1446530f'
>> > 2013-05-06 14:22:02.236 CEST [13003] [unknown] 00000 LOG:  duration: 0.024 ms  statement: COMMIT
>> > 2013-05-06 14:22:02.238 CEST [13003] [unknown] 00000 LOG:  duration: 0.426 ms  parse pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.238 CEST [13003] [unknown] 00000 LOG:  duration: 0.054 ms  bind pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.239 CEST [13003] [unknown] 00000 LOG:  duration: 0.993 ms  execute pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND d.adsrc ~ 'nextval' AND c.oid = pgpool_regclass('sessions')
>> > 2013-05-06 14:22:02.240 CEST [13003] [unknown] 00000 LOG:  duration: 0.757 ms  parse pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a  LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.240 CEST [13003] [unknown] 00000 LOG:  duration: 0.087 ms  bind pgpool12980/pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a  LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.241 CEST [13003] [unknown] 00000 LOG:  duration: 0.787 ms  execute pgpool12980/pgpool12980: SELECT attname, d.adsrc, coalesce((d.adsrc LIKE '%now()%' OR d.adsrc LIKE '%''now''::text%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a  LEFT JOIN pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = pgpool_regclass('sessions') ORDER BY a.attnum
>> > 2013-05-06 14:22:02.241 CEST [13003] [unknown] 00000 LOG:  duration: 0.026 ms  statement: BEGIN
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG:  duration: 21.331 ms  parse pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG:  duration: 0.067 ms  bind pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.263 CEST [13003] [unknown] 00000 LOG:  duration: 0.038 ms  execute pgpool12980/pgpool12980: SELECT count(*) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON (c.relnamespace = n.oid) WHERE nspname = 'pgpool_catalog' AND relname = 'insert_lock'
>> > 2013-05-06 14:22:02.264 CEST [13003] [unknown] 00000 LOG:  duration: 0.231 ms  parse pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> > 2013-05-06 14:22:02.264 CEST [13003] [unknown] 00000 LOG:  duration: 0.044 ms  bind pgpool12980/pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> > 2013-05-06 14:22:02.267 CEST [13003] [unknown] 00000 LOG:  duration: 3.819 ms  execute pgpool12980/pgpool12980: SELECT 1 FROM pgpool_catalog.insert_lock WHERE reloid = pgpool_regclass('sessions') FOR UPDATE
>> (… then nothing happens in this thread anymore.)
>> 
>> 
>> The original hanging query isn't logged by pgpool, not even when turning
>> on pgpool's extensive debug output, but running OTRS against postgresql
>> directly (without pgpool) shows this as the offending query:
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG:  duration: 0.068 ms  parse <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG:  duration: 0.045 ms  bind <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 DETAIL:  parameters: $1 = '11a98dce334f5dd318594a76edf08819b', $2 = 'LastScreenOverview', $3 = 'Action=AgentTicketQueue;QueueID=9;View=', $4 = '0'
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 LOG:  duration: 0.106 ms  execute <unnamed>: INSERT INTO sessions (session_id, data_key, data_value, serialized) VALUES ($1,$2,$3,$4)
>> > 2013-05-06 14:47:10.487 CEST [14141] [unknown] 00000 DETAIL:  parameters: $1 = '11a98dce334f5dd318594a76edf08819b', $2 = 'LastScreenOverview', $3 = 'Action=AgentTicketQueue;QueueID=9;View=', $4 = '0'
>> 
>> The context of this query suggests that no other query is using/blocking
>> pgpool's insert_lock table at this moment, especially not for
>> minutes/hours, hence I'm quite at a loss here.


Karsten


More information about the pgpool-general mailing list