<div dir="ltr"><div>All child processes hang at read_packets_and_process</div><div><br></div><div>Hi,</div><div><br></div><div>We're having an occasional problem where all child processes hang at</div><div>`read_packets_and_process`, waiting for packets from the backend even though the backend</div>
<div>connection is actually idle and not executing any query. When this happens,</div><div>our only option is to restart pgpool.</div><div><br></div><div>I'm aware of the `child_idle_limit` parameter, but unfortunately we can't set it to</div>
<div>a small value, because we know that some of our queries take a few minutes,</div><div>so it doesn't help much in reducing the downtime.</div><div><br></div><div>We have no idea when or why this happens. Our pgpool process currently</div>
<div>handles around 500 connections/second, and the hang occurs roughly once every couple of days.</div><div><br></div><div>Any idea what could have gone wrong? Any help is greatly appreciated.</div><div><br></div><div><div># Call stack of each child process (all identical):</div>
<div><br></div><div> (gdb) where</div><div> #0 0x0000003d222cdaf3 in __select_nocancel () from /lib64/libc.so.6</div><div> #1 0x000000000041addf in read_packets_and_process (frontend=0xac6cb60, backend=0xac5da40, reset_request=0, state=0x7fffb871aad0, num_fields=0x7fffb871aad6, cont=0x7fffb871aacc "\001")</div>
<div> at pool_process_query.c:4859</div><div> #2 0x000000000041b741 in pool_process_query (frontend=0xac6cb60, backend=0xac5da40, reset_request=0) at pool_process_query.c:260</div><div> #3 0x000000000040ad7a in do_child (unix_fd=5, inet_fd=6) at child.c:355</div>
<div> #4 0x000000000040455f in fork_a_child (unix_fd=5, inet_fd=6, id=91) at main.c:1258</div><div> #5 0x0000000000404887 in reaper () at main.c:2482</div><div> #6 0x0000000000404c15 in pool_sleep (second=&lt;value optimized out&gt;) at main.c:2679</div>
<div> #7 0x00000000004079fa in main (argc=&lt;value optimized out&gt;, argv=&lt;value optimized out&gt;) at main.c:856</div><div><br></div></div><div># `ps` output when it happens:</div><div><br></div><div> pgpool 12894 0.0 0.0 22452 1800 ? S Mar26 0:00 \_ /db/pgpool/bin/pgpool -n -D -f /db/pgpool/conf/pg8/pgpool.conf</div>
<div> pgpool 12918 0.0 0.0 22536 1356 ? S Mar26 0:23 | \_ pgpool: PCP: wait for connection request</div><div> pgpool 12919 0.0 0.0 22452 1004 ? S Mar26 0:00 | \_ pgpool: worker process</div>
<div> pgpool 26236 0.0 0.0 27240 2616 ? S 07:54 0:10 | \_ pgpool: cloud cloud 10.27.18.56(17208) idle</div><div> pgpool 26936 0.0 0.0 27068 2468 ? S 07:59 0:06 | \_ pgpool: cloud cloud 10.30.25.96(24663) idle</div>
<div> pgpool 27422 0.1 0.0 27308 2684 ? S 08:02 0:12 | \_ pgpool: cloud cloud 10.27.18.56(49400) idle</div><div> pgpool 29357 0.0 0.0 27252 2644 ? S 08:13 0:11 | \_ pgpool: cloud cloud 10.27.33.88(17186) idle</div>
<div> pgpool 2363 0.1 0.0 27244 2620 ? S 08:45 0:10 | \_ pgpool: cloud cloud 10.30.25.96(35038) idle</div><div> pgpool 3672 0.1 0.0 27216 2612 ? S 08:54 0:10 | \_ pgpool: cloud cloud 10.27.18.56(49399) idle</div>
<div> pgpool 4969 0.1 0.0 27168 2568 ? S 09:00 0:09 | \_ pgpool: cloud cloud 10.27.33.88(16792) idle</div><div> pgpool 5981 0.1 0.0 27144 2540 ? S 09:06 0:08 | \_ pgpool: cloud cloud 10.27.33.88(16821) idle</div>
<div> pgpool 10072 0.1 0.0 27308 2704 ? S 09:30 0:11 | \_ pgpool: cloud cloud 10.30.25.96(47224) idle</div><div> pgpool 10825 0.0 0.0 27032 2404 ? S 09:35 0:05 | \_ pgpool: cloud cloud 10.27.33.88(22482) idle</div>
<div> pgpool 12839 0.0 0.0 26960 2344 ? S 09:46 0:04 | \_ pgpool: cloud cloud 10.27.33.88(22429) idle</div><div> pgpool 13630 0.0 0.0 26924 2304 ? S 09:51 0:03 | \_ pgpool: cloud cloud 10.27.18.56(45126) idle</div>
<div> pgpool 13650 0.0 0.0 26932 2320 ? S 09:51 0:03 | \_ pgpool: cloud cloud 10.30.25.96(35105) idle</div><div> pgpool 13869 0.0 0.0 26924 2300 ? S 09:52 0:03 | \_ pgpool: cloud cloud 10.30.25.96(34504) idle</div>
<div> pgpool 15023 0.0 0.0 26916 2284 ? S 09:58 0:03 | \_ pgpool: cloud cloud 10.27.18.56(44532) idle</div><div> pgpool 18577 0.2 0.0 27200 2580 ? S 10:16 0:07 | \_ pgpool: cloud cloud 10.27.33.88(29036) idle</div>
<div> pgpool 27679 0.2 0.0 27080 2464 ? S 10:37 0:05 | \_ pgpool: cloud cloud 10.27.33.88(36875) idle</div><div> pgpool 29064 0.1 0.0 26988 2364 ? S 10:39 0:03 | \_ pgpool: cloud cloud 10.30.25.96(46962) idle</div>
<div> pgpool 30220 0.1 0.0 26936 2336 ? S 10:40 0:03 | \_ pgpool: cloud cloud 10.27.18.56(56669) idle</div><div> pgpool 10215 0.0 0.0 26752 2148 ? S 10:55 0:00 | \_ pgpool: cloud cloud 10.30.25.96(46863) idle</div>
<div> pgpool 26719 0.0 0.0 26752 1616 ? S 11:16 0:00 | \_ pgpool: cloud cloud 10.27.18.56(58985) idle</div><div><br></div><div><br></div><div>Thanks,</div><div>Junegunn Choi</div><div><br></div>
</div>