<div dir="ltr"><div><div>Hello,<br><br></div>I&#39;ve got a single pgpool server load balancing and streaming between 1 RDS primary and 1 RDS replica. In general, things work pretty well. But I find that each night, at the same time, pgpool network IO spikes, because of some jobs we kick off and pgpool fails to gracefully handle the load. <br><br></div>As the number of connections and queries increases, we see an increase in processes receiving SIGKILL. Within a minute, we see all of the processes receive SIGKILL. Then we lose the connection to our backends. It seems that new children are spawned and the cycle repeats itself. Suddenly, the parent process can no longer fork, because pgpool cannot allocate memory. After the memory error no children are forked (waited 6 hours, never recovered). If I connect to pgpool my connection hangs indefinitely. Once pgpool reports the memory error, nstat show TcpExtListenDrops are ever increasing.  <br><br>Has anyone else run into issues like this? I tried moving to a larger instance (from 8gb RAM to 15). The 8pm job deluge was passed, pgpool climbed to 11.2gb of RAM. But just now, I ran into the issue again... Any insight would be appreciated.<div><br></div><div>Thanks!<br><div><br><span style="font-family:monospace">2018-01-28 00:14:23: pid 8874: LOG:  child process with pid: 13059 exits with status 9 by signal 9<br>...<br>2018-01-28 00:15:13: pid 2408: WARNING:  failed to connect to PostgreSQL server, getaddrinfo() failed with error &quot;System error&quot;<br>...<br>2018-01-28 00:23:33: pid 8874: LOG:  child process with pid: 9155 exits with status 9 by signal 9<br>...<br>2018-01-28 00:23:34: pid 3034: WARNING:  failed to connect to PostgreSQL server, getaddrinfo() failed with error &quot;System error&quot;<br>...<br></span><pre class="m_3061486987743571053inbox-inbox-m m_3061486987743571053inbox-inbox-m-p m_3061486987743571053inbox-inbox-wr" id="m_3061486987743571053inbox-inbox-" style="overflow-y:hidden;max-height:165px">2018-01-28 00:23:48: pid 8874: FATAL:  failed to fork a child<br>2018-01-28 00:23:48: pid 8874: DETAIL:  system call fork() failed with reason: Cannot allocate memory<br>2018-01-28 00:23:49: pid 9105: LOG:  pool_ssl: &quot;SSL_write&quot;: &quot;bad write retry&quot;<br>2018-01-28 00:23:49: pid 9105: LOCATION:  pool_ssl.c:314<br>2018-01-28 00:23:49: pid 9105: WARNING:  write on backend 0 failed with <span class="m_3061486987743571053inbox-inbox-dt1 m_3061486987743571053inbox-inbox-dt1-h">error</span> :&quot;Success&quot;<br>2018-01-28 00:23:49: pid 9105: DETAIL:  while trying to write data from offset: 0 wlen: 5<br>2018-01-28 00:23:49: pid 9105: LOCATION:  pool_stream.c:678</pre><pre class="m_3061486987743571053inbox-inbox-m m_3061486987743571053inbox-inbox-m-p m_3061486987743571053inbox-inbox-wr" id="m_3061486987743571053inbox-inbox-" style="overflow-y:hidden;max-height:165px"><br></pre><pre class="m_3061486987743571053inbox-inbox-m m_3061486987743571053inbox-inbox-m-p m_3061486987743571053inbox-inbox-wr" id="m_3061486987743571053inbox-inbox-" style="overflow-y:hidden;max-height:165px">listen_addresses          = &#39;*&#39;
port                      = '9999'
socket_dir                = '/var/run/pgpool'
pcp_listen_addresses      = 'localhost'
pcp_port                  = 9898
pcp_socket_dir            = '/var/run/pgpool'
listen_backlog_multiplier = 2
serialize_accept          = off
backend_hostname0         = 'primary-host'
backend_port0             = 5432
backend_weight0           = 0
backend_flag0             = 'ALWAYS_MASTER'
backend_hostname1         = 'secondary-host'
backend_port1             = 5432
backend_weight1           = 1
backend_flag1             = 'ALLOW_TO_FAILOVER'
enable_pool_hba           = on
pool_passwd               = 'pool_passwd'
authentication_timeout    = 60
ssl                       = on
num_init_children         = 450
max_pool                  = 2
child_life_time           = 300
child_max_connections     = 0
connection_life_time      = 300
client_idle_limit         = 0
log_destination           = 'stderr'
log_line_prefix           = '%t: pid %p: '
log_connections           = off
log_hostname              = off
log_statement             = off
log_per_node_statement    = off
log_standby_delay         = 'if_over_threshold'
log_error_verbosity       = 'verbose'
log_min_messages          = 'warning'
pid_file_name             = '/var/run/pgpool/pgpool.pid'
logdir                    = '/var/log/pgpool'
connection_cache          = on
reset_query_list          = 'ABORT; DISCARD ALL'
replication_mode          = off
replicate_select          = off
insert_lock               = off
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode         = on
ignore_leading_white_space = on
black_function_list       = 'currval,lastval,nextval,setval'
allow_sql_comments        = off
master_slave_mode         = on
master_slave_sub_mode     = 'stream'
sr_check_period           = 0
delay_threshold           = 10000000
health_check_period       = 5
health_check_timeout      = 20
health_check_password     = 'pw'
health_check_user         = 'user'
health_check_database     = 'postgres'
health_check_max_retries  = 20
health_check_retry_delay  = 1
connect_timeout           = 10000
fail_over_on_backend_error = off
use_watchdog              = off
clear_memqcache_on_escalation = on
check_temp_table          = on
check_unlogged_table      = on
memory_cache_enabled      = off
ssl_key                   = '/etc/pgpool/pgpool.key'
ssl_cert                  = '/etc/pgpool/pgpool.pem'
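
In case it helps, the numbers line up like this (a rough sketch of the worst cases implied by the settings above, using the documented pgpool-II sizing formulas, not anything I measured):

# Worst-case limits implied by the pgpool.conf above (sketch, not measurements).
num_init_children = 450         # pre-forked pgpool worker processes
max_pool = 2                    # cached backend connection pools per child
listen_backlog_multiplier = 2

max_client_connections  = num_init_children                              # 450 concurrent clients
max_backend_connections = num_init_children * max_pool                   # up to 900 connections per backend
listen_backlog          = num_init_children * listen_backlog_multiplier  # backlog passed to listen(): 900

print(max_client_connections, max_backend_connections, listen_backlog)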
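
And the memory side, a back-of-the-envelope from the numbers mentioned above (the per-child figure is an average inferred from those numbers, not something I measured per process):

# Rough per-child memory implied by the 11.2 GB peak seen with 450 children.
total_rss_gb = 11.2
num_init_children = 450

avg_per_child_mb = total_rss_gb * 1024 / num_init_children
print(round(avg_per_child_mb, 1))   # ~25.5 MB per child on average

# At that average, 15 GB of RAM only covers roughly 600 children, so my guess
# is that once the nightly jobs push per-child memory up, fork() fails with
# ENOMEM and new connections just queue up and overflow the listen backlog,
# which would explain the growing TcpExtListenDrops.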