<div dir="ltr"><div>Hi!</div><div><br></div><div>The summary is generated by a script called "pool": <a href="https://gist.github.com/dansimau/1582492">https://gist.github.com/dansimau/1582492</a></div><div>Unfortunately there are no logs at all; logging was temporarily disabled.</div>
<div><br></div><div>The current (normal) status is:</div><div><br></div><div>root@control-1:~# salt postg\* cmd.run "pool status"</div><div>postgres-1:</div><div> Node: 0</div><div> Host: postgres-1</div>
<div>
Port: 5433</div><div> Weight: 0.500000</div><div> Status: Up, in pool (1)</div><div> Role: Master</div><div><br></div><div> Node: 1</div><div> Host: postgres-2</div><div> Port: 5433</div><div> Weight: 0.500000</div>
<div> Status: Up, in pool (1)</div><div> Role: Master</div><div>postgres-2:</div><div> Node: 0</div><div> Host: postgres-2</div><div> Port: 5433</div><div> Weight: 0.500000</div><div> Status: Up, in pool and connected (2)</div>
<div> Role: Master</div><div><br></div><div> Node: 1</div><div> Host: postgres-1</div><div> Port: 5433</div><div> Weight: 0.500000</div><div> Status: Up, in pool and connected (2)</div><div> Role: Master</div>
<div><br></div><div>IP address status</div><div><br></div><div>postgres-2:</div><div> 2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc mq state UP qlen 1000</div><div> link/ether 00:50:56:8f:7e:7b brd ff:ff:ff:ff:ff:ff</div>
<div> inet 10.6.14.11/24 brd 10.6.14.255 scope global eth0</div><div> valid_lft forever preferred_lft forever</div><div> inet 10.6.14.15/24 scope global secondary eth0</div>
<div> valid_lft forever preferred_lft forever</div><div> inet6 fe80::250:56ff:fe8f:7e7b/64 scope link</div><div> valid_lft forever preferred_lft forever</div><div>postgres-1:</div><div> 2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc mq state UP qlen 1000</div>
<div> link/ether 00:50:56:8f:58:ab brd ff:ff:ff:ff:ff:ff</div><div> inet 10.6.14.10/24 brd 10.6.14.255 scope global eth0</div><div> valid_lft forever preferred_lft forever</div>
<div> inet6 fe80::250:56ff:fe8f:58ab/64 scope link</div><div> valid_lft forever preferred_lft forever</div><div><br></div><div>Config files with modified passwords:</div><div><br></div><div>postgres-2:</div>
<div> listen_addresses = '*'</div><div> port = 5432</div><div> socket_dir = '/var/run/postgresql'</div><div> pcp_port = 9898</div><div> pcp_socket_dir = '/var/run/postgresql'</div><div>
backend_hostname0 = 'postgres-2'</div><div> backend_port0 = 5433</div><div> backend_weight0 = 1</div><div> backend_data_directory0 = '/var/lib/postgresql/9.3'</div><div> backend_flag0 = 'ALLOW_TO_FAILOVER'</div>
<div> backend_hostname1 = 'postgres-1'</div><div> backend_port1 = 5433</div><div> backend_weight1 = 1</div><div> backend_data_directory1 = '/var/lib/postgresql/9.3'</div><div> backend_flag1 = 'ALLOW_TO_FAILOVER'</div>
<div> enable_pool_hba = on</div><div> pool_passwd = ''</div><div> authentication_timeout = 60</div><div> ssl = off</div><div> num_init_children = 32</div><div> max_pool = 4</div><div> child_life_time = 300</div>
<div> child_max_connections = 0</div><div> connection_life_time = 0</div><div> client_idle_limit = 0</div><div> log_destination = 'syslog'</div><div> print_timestamp = on</div><div> log_connections = off</div>
<div> log_hostname = on</div><div> log_statement = off</div><div> log_per_node_statement = on</div><div> log_standby_delay = 'none'</div><div> syslog_facility = 'LOCAL0'</div><div> syslog_ident = 'pgpool'</div>
<div> debug_level = 0</div><div> pid_file_name = '/var/run/postgresql/pgpool.pid'</div><div> logdir = '/var/log/postgresql'</div><div> connection_cache = on</div><div> reset_query_list = 'ABORT; DISCARD ALL'</div>
<div> replication_mode = on</div><div> replicate_select = off</div><div> insert_lock = on</div><div> lobj_lock_table = ''</div><div> replication_stop_on_mismatch = off</div><div> failover_if_affected_tuples_mismatch = off</div>
<div> health_check_period = 15</div><div> health_check_timeout = 5</div><div> health_check_user = 'pg_admin'</div><div> health_check_password = 'P4ssw0rd'</div><div> health_check_max_retries = 2</div>
<div> health_check_retry_delay = 1</div><div> failover_command = 'echo $(date): host:%h, new master id:%m, old master id:%M >> /var/lib/pgsql/failover.log'</div><div> failback_command = 'echo $(date): host:%h, new master id:%m, old master id:%M >> /var/lib/pgsql/failback.log'</div>
<div> fail_over_on_backend_error = on</div><div> recovery_user = 'pg_admin'</div><div> recovery_password = 'P4ssw0rd'</div><div> recovery_1st_stage_command = 'copy_base_backup'</div><div>
recovery_2nd_stage_command = 'switch_xlog'</div><div> recovery_timeout = 90</div><div> client_idle_limit_in_recovery = 0</div><div> use_watchdog = on</div><div> trusted_servers = 'postgres-1'</div>
<div> delegate_IP = '10.6.14.15'</div><div> wd_hostname = 'postgres-2'</div><div> wd_port = 9000</div><div> wd_interval = 5</div><div> ping_path = '/bin'</div><div> ifconfig_path = '/var/lib/postgresql'</div>
<div> if_up_cmd = 'ip add add 10.6.14.15/24 dev eth0'</div><div> if_down_cmd = 'ip add del 10.6.14.15/24 dev eth0'</div><div> arping_path = '/var/lib/postgresql'</div>
<div> arping_cmd = 'arping -U 10.6.14.15 -w 1'</div><div> wd_life_point = 3</div><div> wd_lifecheck_query = 'SELECT 1'</div><div> wd_escalation_command = ''</div><div> wd_lifecheck_method = 'heartbeat'</div>
<div> wd_interval = 15</div><div> wd_heartbeat_port = 9694</div><div> wd_heartbeat_keepalive = 5</div><div> wd_heartbeat_deadtime = 30</div><div> heartbeat_destination0 = 'postgres-1'</div><div> heartbeat_destination_port0 = 9694</div>
<div> heartbeat_device0 = ''</div><div> other_pgpool_hostname0 = 'postgres-1'</div><div> other_pgpool_port0 = 5432</div><div> other_wd_port0 = 9000</div><div> relcache_expire = 0</div><div>
relcache_size = 256</div>
<div> check_temp_table = on</div><div> memory_cache_enabled = off</div><div> memqcache_method = 'shmem'</div><div> memqcache_memcached_host = 'localhost'</div><div> memqcache_memcached_port = 11211</div>
<div> memqcache_total_size = 67108864</div><div> memqcache_max_num_cache = 1000000</div><div> memqcache_expire = 0</div><div> memqcache_auto_cache_invalidation = on</div><div> memqcache_maxcache = 409600</div>
<div> memqcache_cache_block_size = 1048576</div><div> memqcache_oiddir = '/var/log/pgpool/oiddir'</div><div> white_memqcache_table_list = ''</div><div> black_memqcache_table_list = ''</div>
<div>postgres-1:</div><div> listen_addresses = '*'</div><div> port = 5432</div><div> socket_dir = '/var/run/postgresql'</div><div> pcp_port = 9898</div><div> pcp_socket_dir = '/var/run/postgresql'</div>
<div> backend_hostname0 = 'postgres-1'</div><div> backend_port0 = 5433</div><div> backend_weight0 = 1</div><div> backend_data_directory0 = '/var/lib/postgresql/9.3'</div><div> backend_flag0 = 'ALLOW_TO_FAILOVER'</div>
<div> backend_hostname1 = 'postgres-2'</div><div> backend_port1 = 5433</div><div> backend_weight1 = 1</div><div> backend_data_directory1 = '/var/lib/postgresql/9.3'</div><div> backend_flag1 = 'ALLOW_TO_FAILOVER'</div>
<div> enable_pool_hba = on</div><div> pool_passwd = ''</div><div> authentication_timeout = 60</div><div> ssl = off</div><div> num_init_children = 32</div><div> max_pool = 4</div><div> child_life_time = 300</div>
<div> child_max_connections = 0</div><div> connection_life_time = 0</div><div> client_idle_limit = 0</div><div> log_destination = 'syslog'</div><div> print_timestamp = on</div><div> log_connections = off</div>
<div> log_hostname = on</div><div> log_statement = off</div><div> log_per_node_statement = on</div><div> log_standby_delay = 'none'</div><div> syslog_facility = 'LOCAL0'</div><div> syslog_ident = 'pgpool'</div>
<div> debug_level = 0</div><div> pid_file_name = '/var/run/postgresql/pgpool.pid'</div><div> logdir = '/var/log/postgresql'</div><div> connection_cache = on</div><div> reset_query_list = 'ABORT; DISCARD ALL'</div>
<div> replication_mode = on</div><div> replicate_select = off</div><div> insert_lock = on</div><div> lobj_lock_table = ''</div><div> replication_stop_on_mismatch = off</div><div> failover_if_affected_tuples_mismatch = off</div>
<div> health_check_period = 15</div><div> health_check_timeout = 5</div><div> health_check_user = 'pg_admin'</div><div> health_check_password = 'P4ssw0rd'</div><div> health_check_max_retries = 2</div>
<div> health_check_retry_delay = 1</div><div> failover_command = 'echo $(date): host:%h, new master id:%m, old master id:%M >> /var/lib/pgsql/failover.log'</div><div> failback_command = 'echo $(date): host:%h, new master id:%m, old master id:%M >> /var/lib/pgsql/failback.log'</div>
<div> fail_over_on_backend_error = on</div><div> recovery_user = 'pg_admin'</div><div> recovery_password = 'P4ssw0rd'</div><div> recovery_1st_stage_command = 'copy_base_backup'</div><div>
recovery_2nd_stage_command = 'switch_xlog'</div><div> recovery_timeout = 90</div><div> client_idle_limit_in_recovery = 0</div><div> use_watchdog = on</div><div> trusted_servers = 'postgres-2'</div>
<div> delegate_IP = '10.6.14.15'</div><div> wd_hostname = 'postgres-1'</div><div> wd_port = 9000</div><div> wd_interval = 5</div><div> ping_path = '/bin'</div><div> ifconfig_path = '/var/lib/postgresql'</div>
<div> if_up_cmd = 'ip add add 10.6.14.15/24 dev eth0'</div><div> if_down_cmd = 'ip add del 10.6.14.15/24 dev eth0'</div><div> arping_path = '/var/lib/postgresql'</div>
<div> arping_cmd = 'arping -U 10.6.14.15 -w 1'</div><div> wd_life_point = 3</div><div> wd_lifecheck_query = 'SELECT 1'</div><div> wd_escalation_command = ''</div><div> wd_lifecheck_method = 'heartbeat'</div>
<div> wd_interval = 15</div><div> wd_heartbeat_port = 9694</div><div> wd_heartbeat_keepalive = 5</div><div> wd_heartbeat_deadtime = 30</div><div> heartbeat_destination0 = 'postgres-2'</div><div> heartbeat_destination_port0 = 9694</div>
<div> heartbeat_device0 = ''</div><div> other_pgpool_hostname0 = 'postgres-2'</div><div> other_pgpool_port0 = 5432</div><div> other_wd_port0 = 9000</div><div> relcache_expire = 0</div><div>
relcache_size = 256</div>
<div> check_temp_table = on</div><div> memory_cache_enabled = off</div><div> memqcache_method = 'shmem'</div><div> memqcache_memcached_host = 'localhost'</div><div> memqcache_memcached_port = 11211</div>
<div> memqcache_total_size = 67108864</div><div> memqcache_max_num_cache = 1000000</div><div> memqcache_expire = 0</div><div> memqcache_auto_cache_invalidation = on</div><div> memqcache_maxcache = 409600</div>
<div> memqcache_cache_block_size = 1048576</div><div> memqcache_oiddir = '/var/log/pgpool/oiddir'</div><div> white_memqcache_table_list = ''</div><div> black_memqcache_table_list = ''</div>
<div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-04-18 10:36 GMT+02:00 Yugo Nagata <span dir="ltr"><<a href="mailto:nagata@sraoss.co.jp" target="_blank">nagata@sraoss.co.jp</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<div class=""><br>
On Tue, 15 Apr 2014 12:24:51 +0200<br>
Attila Heidrich <<a href="mailto:attila.heidrich@gmail.com">attila.heidrich@gmail.com</a>> wrote:<br>
<br>
> Dear guys!<br>
><br>
> Where to find the problem in the situation above?<br>
><br>
> No logs at all, for some reason pgpool stopped logging (usually it uses<br>
> syslog).<br>
<br>
</div>The following seems to be a summarised result of some command (a pcp command or<br>
show pool_nodes?). I want to know the actual command results. Could you<br>
please send me these, the log messages, and pgpool.conf?<br>
<div class="HOEnZb"><div class="h5"><br>
><br>
> root@postgres-1:/etc/pgpool2# pool status<br>
> Node: 0<br>
> Host: postgres-1<br>
> Port: 5433<br>
> Weight: 0.500000<br>
> Status: Up, in pool (1)<br>
> Role: Master<br>
><br>
> Node: 1<br>
> Host: postgres-2<br>
> Port: 5433<br>
> Weight: 0.500000<br>
> Status: Up, in pool (1)<br>
> Role: Master<br>
><br>
> root@postgres-2:/etc/pgpool2# pool status<br>
> Node: 0<br>
> Host: postgres-2<br>
> Port: 5433<br>
> Weight: 0.500000<br>
> Status: Up, in pool and connected (2)<br>
> Role: Master<br>
><br>
> Node: 1<br>
> Host: postgres-1<br>
> Port: 5433<br>
> Weight: 0.500000<br>
> Status: Up, in pool and connected (2)<br>
> Role: Master<br>
><br>
> This isn't the first time; it usually happens in a high load situation.<br>
><br>
> Attila<br>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">--<br>
Yugo Nagata <<a href="mailto:nagata@sraoss.co.jp">nagata@sraoss.co.jp</a>><br>
</font></span></blockquote></div><br></div>