[pgpool-general: 2210] Re: Parallel Query Hangs w/ specific rows

Tue Oct 22 19:35:19 JST 2013

Hi Yugo,

Thank you very much for helping! Please allow me to do some cleaning up and
send you the dump.

I'll also try to reverse the partitioning condition and get back to you
soon.

BR,
Fantix

On Tue, Oct 22, 2013 at 6:08 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:

> Hi,
>
> Thanks for yor providing information. However, I can't still reproduced
> this hang, using the data as folowing;
>
> test=# select * from my_table ;
>  id  | character_id | my_col_one | my_col_two | quality | properties
> -----+--------------+------------+------------+---------+------------
>    2 |          102 | t          |        200 |    2000 | \x62626262
>   22 |          102 | t          |        200 |    2000 | \x62626262
>    0 |          100 | t          |         99 |    9999 | \x62626262
>   10 |          100 | t          |         99 |    9999 | \x62626262
>  100 |          100 | t          |         99 |    9999 | \x62626262
>  333 |          103 | t          |        300 |    3000 | \x63636363
>   33 |          103 | t          |        300 |    3000 | \x63636363
>    3 |          103 | t          |        300 |    3000 | \x63636363
>    1 |          101 | t          |        100 |    1000 | \x62626262
>
>
> I can't understand why the hang occurs when only caracter_id is even
> number.
> Maybe, contents of your 'my_table' or backend server #0 which receives
> data of even charcter_id.
>
> So, could you please me contens of 'my_table' using which you can
> reproduce the hang in your environment?
>
> In addition, could you try to reverse the partiaioning condition?
> That is, even number characters are on node #1, and odd number characters
> are on
> node #0.
>
>
>
>
> On Tue, 15 Oct 2013 14:56:50 +0800
> Fantix King <fantix.king at gmail.com> wrote:
>
> > Oh sure, thank you very much for helping here!
> >
> > Table definition:
> >
> > CREATE TABLE my_table (
> >     id BIGINT NOT NULL,
> >     character_id BIGINT NOT NULL,
> >     my_col_one BOOLEAN NOT NULL,
> >     my_col_two INTEGER NOT NULL,
> >     quality INTEGER NOT NULL,
> >     properties BYTEA,
> >     PRIMARY KEY (id)
> > );
> >
> > dist_def for this table:
> >
> > INSERT INTO pgpool_catalog.dist_def VALUES (
> >     'my_db',
> >     'public',
> >     'my_table',
> >     'character_id',
> >     ARRAY['id', 'character_id', 'my_col_one', 'my_col_two', 'quality',
> > 'properties'],
> >     ARRAY['bigint', 'bigint', 'boolean', 'integer', 'integer', 'bytea'],
> >     'pgpool_catalog.dist_def_demo'
> > );
> > CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_demo(anyelement)
> > RETURNS integer AS $$
> >     SELECT CASE WHEN $1 % 2 = 0 THEN 0
> >         ELSE 1
> >     END;
> > $$ LANGUAGE sql;
> >
> > There're also some other irrelevant definitions in dist_def and
> > replicate_def, but they have nothing to do with this table at all (no
> > foreign key, no constraints, etc.). Please let me know if you need the
> full
> > file. :)
> >
> >
> > BR,
> > Fantix
> >
> >
> > On Tue, Oct 15, 2013 at 1:33 PM, Yugo Nagata <nagata at sraoss.co.jp>
> wrote:
> >
> > > Hi,
> > >
> > > Thanks for your reporting and analyzing. I'll handle the problem.
> However,
> > > it may
> > > take some time for me to resoleve sinse I have other issues now.
> > >
> > > For analysis, could you please tell me the definition of table
> "my_table",
> > > and contents of dist_def, replicate_def ?
> > >
> > > On Mon, 14 Oct 2013 20:08:07 +0800
> > > Fantix King <fantix.king at gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I met a pretty weird issue in our production use of pgpool-II 3.3.1.
> > > > Executing SELECT through pgpool on some certain rows of a dist_def-ed
> > > table
> > > > always hang - you have to kill the "psql".
> > > >
> > > > It keeps reproducing in our production system while I cannot find a
> > > similar
> > > > report anywhere else. I'm still trying to reproduce this issue
> > > individually.
> > > >
> > > > My setup is like this:
> > > >
> > > >  * 1 pgpool in parallel mode
> > > >  * 2 PostgreSQL 9.1.9 backends
> > > >  * 1 PostgreSQL 9.1.9 system db
> > > >
> > > > The table "my_table" is partitioned with "character_id" key column,
> where
> > > > even number characters are on node #0, and odd number characters are
> on
> > > > node #1. This query hangs:
> > > >
> > > > statement: SELECT my_table.id, my_table.character_id,
> > > my_table.my_col_one,
> > > > my_table.my_col_two, my_table.quality, my_table.properties FROM
> my_table
> > > > WHERE  my_table.character_id = 100;
> > > >
> > > > I found that:
> > > >
> > > >  * All other hanging queries (the same table) are on node #0 - even
> > > numbers
> > > >  * If I reduce any column from the SELECT list, the query works
> > > >  * Not all of the queries are hanging, e.g. character_id = 102 works
> fine
> > > >
> > > > Then I diff-ed the pgpool log querying row 100(-) and 102(+):
> > > >
> > > >   1  : ProcessFrontendResponse: kind from frontend Q(51)
> > > >   2  : pool_unset_doing_extended_query_message: done
> > > >   3 -: statement: SELECT my_table.id, my_table.character_id,
> > > > my_table.my_col_one, my_table.my_col_two, my_table.quality,
> > > > my_table.properties FROM my_table WHERE  my_table.character_id = 100;
> > > >   4  : pool_set_query_in_progress: done
> > > >   5  : initSelectStmt: ANALYZE now(0)
> > > >   6  : inside build_range_info num= 1 current_select=0
> > > >   7  : inside build_range_info dist 0
> > > >   8  : inside build_virtual_info dist state=P  my_table
> > > >   9  : append_virtual_table select=0,
> > > > no=0,col=id,type=bigint,table=my_table,state=P,valid=-1
> > > >  10  : append_virtual_table select=0,
> > > > no=1,col=character_id,type=bigint,table=my_table,state=P,valid=-1
> > > >  11  : append_virtual_table select=0,
> > > > no=2,col=my_col_one,type=boolean,table=my_table,state=P,valid=-1
> > > >  12  : append_virtual_table select=0,
> > > > no=3,col=my_col_two,type=integer,table=my_table,state=P,valid=-1
> > > >  13  : append_virtual_table select=0,
> > > > no=4,col=quality,type=integer,table=my_table,state=P,valid=-1
> > > >  14  : append_virtual_table select=0,
> > > > no=5,col=properties,type=bytea,table=my_table,state=P,valid=-1
> > > > ...
> > > >  17  : pool_parallel_query:  0 th FD_SET: 19
> > > >  18  : pool_parallel_query:  1 th FD_SET: 21
> > > >  19  : pool_parallel_query: num_fds: 22
> > > >  20  : read_kind_from_one_backend: read kind from 0 th backend T
> > > >  21  : pool_parallel_exec: kind from backend: T
> > > >  22  : pool_parallel_exec: dummy from backend: C
> > > >  23 +: pool_parallel_query:  1 th FD_SET: 21
> > > >  24 +: pool_parallel_query: num_fds: 22
> > > >  25  : read_kind_from_one_backend: read kind from 1 th backend T
> > > >  26  : pool_parallel_exec: dummy kind from backend: T
> > > >  27 -: pool_parallel_exec: kind from backend: D 3
> > > >  28 +: pool_parallel_exec: kind from backend: D 4
> > > >  29  : pool_parallel_exec: kind from backend: C
> > > >  30  : pool_unset_query_in_progress: done
> > > >  31  : pool_unset_query_in_progress: done
> > > > * 32 -: pool_process_query: discard Z packet from backend 1*
> > > >  33 -: detect_error: kind: Z
> > > >  34 -: detect_error: kind: Z
> > > >  35 -: detect_error: kind: Z
> > > >  36  : read_kind_from_backend: kind: Z from 0 th backend
> > > >  37  : read_kind_from_backend: read kind from 0 th backend Z
> > > NUM_BACKENDS: 2
> > > >  38 +: read_kind_from_backend: kind: Z from 1 th backend
> > > >  39 +: read_kind_from_backend: read kind from 1 th backend Z
> > > NUM_BACKENDS: 2
> > > >  40 +: ProcessBackendResponse: kind from backend: Z
> > > >  41 +: pool_read_message_length: slot: 0 length: 5
> > > >  42 +: pool_read_message_length: slot: 1 length: 5
> > > >  43 +: ReadyForQuery: transaction state:I
> > > >  44 +: ReadyForQuery: transaction state:I
> > > >  45 +: ProcessBackendResponse: Ready For Query
> > > >
> > > > FYI some of the pgpool.conf:
> > > >
> > > > backend_weight0 = 1
> > > > backend_weight1 = 1
> > > > load_balance_mode = on
> > > > replication_mode = on
> > > > parallel_mode = on
> > > >
> > > > Please advice or let me know if I'm missing any info, thank you!
> > > >
> > > >
> > > > BR,
> > > > Fantix
> > >
> > >
> > > --
> > > Yugo Nagata <nagata at sraoss.co.jp>
> > >
>
>
> --
> Yugo Nagata <nagata at sraoss.co.jp>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20131022/957678a8/attachment.htm>