[pgpool-general: 2207] Re: Parallel Query Hangs w/ specific rows

Yugo Nagata nagata at sraoss.co.jp
Tue Oct 22 19:08:19 JST 2013


Hi,

Thanks for yor providing information. However, I can't still reproduced 
this hang, using the data as folowing;

test=# select * from my_table ;
 id  | character_id | my_col_one | my_col_two | quality | properties 
-----+--------------+------------+------------+---------+------------
   2 |          102 | t          |        200 |    2000 | \x62626262
  22 |          102 | t          |        200 |    2000 | \x62626262
   0 |          100 | t          |         99 |    9999 | \x62626262
  10 |          100 | t          |         99 |    9999 | \x62626262
 100 |          100 | t          |         99 |    9999 | \x62626262
 333 |          103 | t          |        300 |    3000 | \x63636363
  33 |          103 | t          |        300 |    3000 | \x63636363
   3 |          103 | t          |        300 |    3000 | \x63636363
   1 |          101 | t          |        100 |    1000 | \x62626262


I can't understand why the hang occurs when only caracter_id is even number.
Maybe, contents of your 'my_table' or backend server #0 which receives
data of even charcter_id.

So, could you please me contens of 'my_table' using which you can
reproduce the hang in your environment?

In addition, could you try to reverse the partiaioning condition?
That is, even number characters are on node #1, and odd number characters are on
node #0.




On Tue, 15 Oct 2013 14:56:50 +0800
Fantix King <fantix.king at gmail.com> wrote:

> Oh sure, thank you very much for helping here!
> 
> Table definition:
> 
> CREATE TABLE my_table (
>     id BIGINT NOT NULL,
>     character_id BIGINT NOT NULL,
>     my_col_one BOOLEAN NOT NULL,
>     my_col_two INTEGER NOT NULL,
>     quality INTEGER NOT NULL,
>     properties BYTEA,
>     PRIMARY KEY (id)
> );
> 
> dist_def for this table:
> 
> INSERT INTO pgpool_catalog.dist_def VALUES (
>     'my_db',
>     'public',
>     'my_table',
>     'character_id',
>     ARRAY['id', 'character_id', 'my_col_one', 'my_col_two', 'quality',
> 'properties'],
>     ARRAY['bigint', 'bigint', 'boolean', 'integer', 'integer', 'bytea'],
>     'pgpool_catalog.dist_def_demo'
> );
> CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_demo(anyelement)
> RETURNS integer AS $$
>     SELECT CASE WHEN $1 % 2 = 0 THEN 0
>         ELSE 1
>     END;
> $$ LANGUAGE sql;
> 
> There're also some other irrelevant definitions in dist_def and
> replicate_def, but they have nothing to do with this table at all (no
> foreign key, no constraints, etc.). Please let me know if you need the full
> file. :)
> 
> 
> BR,
> Fantix
> 
> 
> On Tue, Oct 15, 2013 at 1:33 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> 
> > Hi,
> >
> > Thanks for your reporting and analyzing. I'll handle the problem. However,
> > it may
> > take some time for me to resoleve sinse I have other issues now.
> >
> > For analysis, could you please tell me the definition of table "my_table",
> > and contents of dist_def, replicate_def ?
> >
> > On Mon, 14 Oct 2013 20:08:07 +0800
> > Fantix King <fantix.king at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I met a pretty weird issue in our production use of pgpool-II 3.3.1.
> > > Executing SELECT through pgpool on some certain rows of a dist_def-ed
> > table
> > > always hang - you have to kill the "psql".
> > >
> > > It keeps reproducing in our production system while I cannot find a
> > similar
> > > report anywhere else. I'm still trying to reproduce this issue
> > individually.
> > >
> > > My setup is like this:
> > >
> > >  * 1 pgpool in parallel mode
> > >  * 2 PostgreSQL 9.1.9 backends
> > >  * 1 PostgreSQL 9.1.9 system db
> > >
> > > The table "my_table" is partitioned with "character_id" key column, where
> > > even number characters are on node #0, and odd number characters are on
> > > node #1. This query hangs:
> > >
> > > statement: SELECT my_table.id, my_table.character_id,
> > my_table.my_col_one,
> > > my_table.my_col_two, my_table.quality, my_table.properties FROM my_table
> > > WHERE  my_table.character_id = 100;
> > >
> > > I found that:
> > >
> > >  * All other hanging queries (the same table) are on node #0 - even
> > numbers
> > >  * If I reduce any column from the SELECT list, the query works
> > >  * Not all of the queries are hanging, e.g. character_id = 102 works fine
> > >
> > > Then I diff-ed the pgpool log querying row 100(-) and 102(+):
> > >
> > >   1  : ProcessFrontendResponse: kind from frontend Q(51)
> > >   2  : pool_unset_doing_extended_query_message: done
> > >   3 -: statement: SELECT my_table.id, my_table.character_id,
> > > my_table.my_col_one, my_table.my_col_two, my_table.quality,
> > > my_table.properties FROM my_table WHERE  my_table.character_id = 100;
> > >   4  : pool_set_query_in_progress: done
> > >   5  : initSelectStmt: ANALYZE now(0)
> > >   6  : inside build_range_info num= 1 current_select=0
> > >   7  : inside build_range_info dist 0
> > >   8  : inside build_virtual_info dist state=P  my_table
> > >   9  : append_virtual_table select=0,
> > > no=0,col=id,type=bigint,table=my_table,state=P,valid=-1
> > >  10  : append_virtual_table select=0,
> > > no=1,col=character_id,type=bigint,table=my_table,state=P,valid=-1
> > >  11  : append_virtual_table select=0,
> > > no=2,col=my_col_one,type=boolean,table=my_table,state=P,valid=-1
> > >  12  : append_virtual_table select=0,
> > > no=3,col=my_col_two,type=integer,table=my_table,state=P,valid=-1
> > >  13  : append_virtual_table select=0,
> > > no=4,col=quality,type=integer,table=my_table,state=P,valid=-1
> > >  14  : append_virtual_table select=0,
> > > no=5,col=properties,type=bytea,table=my_table,state=P,valid=-1
> > > ...
> > >  17  : pool_parallel_query:  0 th FD_SET: 19
> > >  18  : pool_parallel_query:  1 th FD_SET: 21
> > >  19  : pool_parallel_query: num_fds: 22
> > >  20  : read_kind_from_one_backend: read kind from 0 th backend T
> > >  21  : pool_parallel_exec: kind from backend: T
> > >  22  : pool_parallel_exec: dummy from backend: C
> > >  23 +: pool_parallel_query:  1 th FD_SET: 21
> > >  24 +: pool_parallel_query: num_fds: 22
> > >  25  : read_kind_from_one_backend: read kind from 1 th backend T
> > >  26  : pool_parallel_exec: dummy kind from backend: T
> > >  27 -: pool_parallel_exec: kind from backend: D 3
> > >  28 +: pool_parallel_exec: kind from backend: D 4
> > >  29  : pool_parallel_exec: kind from backend: C
> > >  30  : pool_unset_query_in_progress: done
> > >  31  : pool_unset_query_in_progress: done
> > > * 32 -: pool_process_query: discard Z packet from backend 1*
> > >  33 -: detect_error: kind: Z
> > >  34 -: detect_error: kind: Z
> > >  35 -: detect_error: kind: Z
> > >  36  : read_kind_from_backend: kind: Z from 0 th backend
> > >  37  : read_kind_from_backend: read kind from 0 th backend Z
> > NUM_BACKENDS: 2
> > >  38 +: read_kind_from_backend: kind: Z from 1 th backend
> > >  39 +: read_kind_from_backend: read kind from 1 th backend Z
> > NUM_BACKENDS: 2
> > >  40 +: ProcessBackendResponse: kind from backend: Z
> > >  41 +: pool_read_message_length: slot: 0 length: 5
> > >  42 +: pool_read_message_length: slot: 1 length: 5
> > >  43 +: ReadyForQuery: transaction state:I
> > >  44 +: ReadyForQuery: transaction state:I
> > >  45 +: ProcessBackendResponse: Ready For Query
> > >
> > > FYI some of the pgpool.conf:
> > >
> > > backend_weight0 = 1
> > > backend_weight1 = 1
> > > load_balance_mode = on
> > > replication_mode = on
> > > parallel_mode = on
> > >
> > > Please advice or let me know if I'm missing any info, thank you!
> > >
> > >
> > > BR,
> > > Fantix
> >
> >
> > --
> > Yugo Nagata <nagata at sraoss.co.jp>
> >


-- 
Yugo Nagata <nagata at sraoss.co.jp>


More information about the pgpool-general mailing list