[pgpool-general: 5856] What about load balancing?

Тараканов Олег tarakanov at ir-tech.ru
Fri Jan 12 23:28:18 JST 2018


Hello!

We 100% microsoft-product company. Just start to build PostgreSQL HA 
Cluster based on:
Ubuntu 16.04.3 (in docker) + PostgreSQL 10 + repmgr (for stream 
replication) + pgpool2 (version 3.7.1 - as load balance for one primary 
and two standby nodes) + pgbouncer (for connection pool).

Now we start load test for our solution (without pgbouncer) on i7-3770, 
32GB RAM, SSD 240 Kingstone KC300.

Our changes in default "postgresql.conf":
> listen_addresses = '*'
> port = 5433
> max_connections = 4096
> shared_buffers = 4096MB
> effective_cache_size = 16GB
> checkpoint_timeout = 30min
> work_mem = 1MB
> maintenance_work_mem = 128MB
> synchronous_commit = off
> wal_level = hot_standby
> wal_log_hints = on
> wal_keep_segments = 0
> max_wal_senders = 10
> max_replication_slots = 5
> hot_standby = on
> archive_mode = on
> archive_command = '/bin/true'
> shared_preload_libraries = 'repmgr_funcs, pg_stat_statements'

Our changes in defauilt "pgpool.conf":
> listen_addresses = '*'
> pid_file_name = '/var/run/pgpool.pid'
> logdir = '/var/log'
>
> backend_hostname0 = 'node1'
> backend_port0 = .*/backend_port0 = 5433
> backend_data_directory0 = '/var/lib/postgresql/10/main/base'
> backend_flag0 = 'ALLOW_TO_FAILOVER'
> backend_hostname1 = 'node2'
> backend_port1 = 5433
> backend_weight1 = 1
> backend_data_directory1 = '/var/lib/postgresql/10/main/base'
> backend_flag1 = 'ALLOW_TO_FAILOVER'
> backend_hostname2 = 'node3'
> backend_port2 = 5433
> backend_weight2 = 1
> backend_data_directory2 = '/var/lib/postgresql/10/main/base'
> backend_flag2 = 'ALLOW_TO_FAILOVER'
>
> enable_pool_hba = on
> authentication_timeout = 30
> num_init_children = 64 (set "net.core.somaxconn = 65535" in 
> /etc/sysctl.conf)
> max_pool = 32
> log_destination = 'stderr,syslog'
> insert_lock = off
> load_balance_mode = on
> black_function_list = 'nextval,setval'
> master_slave_mode = on
> master_slave_sub_mode = 'stream'
> sr_check_period = 0
> sr_check_user = 'repmgr'
> health_check_period = 10
> health_check_user = 'repmgr'
> health_check_max_retries = 5
> health_check_retry_delay = 5
> failover_command = 'echo \">>> Failover - that will initialize new 
> primary node search\"'
> fail_over_on_backend_error = off
> search_primary_node_timeout = 0

With pgbench we emulate SELECT workload on nodes (no pgpool):
> ## pgbench -h nodeX -p 5433 -c 64 -j 8 -T 60 -S
> ## (standby -> master)
> # number of transactions actually processed: 2733742
> # latency average = 1.405 ms
> # tps = 45551.063683 (including connections establishing)
> # tps = 45559.700752 (excluding connections establishing)
> ## (standby -> standby)
> # number of transactions actually processed: 2712321
> # latency average = 1.416 ms
> # tps = 45199.740907 (including connections establishing)
> # tps = 45212.390333 (excluding connections establishing)
> ## (standby -> local)
> # number of transactions actually processed: 3161213
> # latency average = 1.215 ms
> # tps = 52675.138511 (including connections establishing)
> # tps = 52686.008651 (excluding connections establishing)
Then we give workload in pgpool node, we have half of one node result:
> pgbench -h pool -p 9999 -U postgres -c 64 -j 8 -T 60 -S
> starting vacuum...end.
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: simple
> number of clients: 64
> number of threads: 8
> duration: 60 s
> number of transactions actually processed: 1413496
> latency average = 2.719 ms
> tps = 23535.490222 (including connections establishing)
> tps = 23543.872064 (excluding connections establishing)
With command
>  psql -h pool -p 9999 -c "show pool_nodes"
and resource monitoring we see what all nodes use in workload.

We understand about hardware limitations, and check work with cpu 
limitation - 2 core per node (pgpool no limitation).
Results:
> ### node3 => localhost
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: simple
> number of clients: 64
> number of threads: 8
> duration: 60 s
> number of transactions actually processed: 1116951
> latency average = 3.439 ms
> tps = 18611.745016 (including connections establishing)
> tps = 18616.499012 (excluding connections establishing)
> ### node3 => node2
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: simple
> number of clients: 64
> number of threads: 8
> duration: 60 s
> number of transactions actually processed: 838843
> latency average = 4.579 ms
> tps = 13977.347353 (including connections establishing)
> tps = 13985.630946 (excluding connections establishing)
Best result for pgpool - tps = 23 000.

What we do wrong? Maybe pgpool have good configuration for work only in 
load balancing mode?
Thanks!
--
Best regards
Tarakanov Oleg


More information about the pgpool-general mailing list