[pgpool-general: 6507] Re: New pgpool-II-95 4.0.4 install in front of a 3-node repmgr95 postgresql-9.5 cluster - not finding all the nodes

Pierre Timmermans ptim007 at yahoo.com
Sat Apr 13 02:08:58 JST 2019


Hello,
In your config you have
backend_hostname0 = '192.x.y.a'

3 times; each should appear only once, as backend_hostname0, backend_hostname1 and backend_hostname2 (the same for backend_port0, etc.)
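As an illustration, a minimal sketch of what the backend section could look like (the IP addresses echo the placeholders from the config; the ports, weights and data directories are assumptions, not taken from the attached file):

```
# pgpool.conf -- one numbered parameter block per PostgreSQL backend
backend_hostname0 = '192.x.y.a'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/9.5/data'

backend_hostname1 = '192.x.y.b'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/9.5/data'

backend_hostname2 = '192.x.y.c'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/var/lib/pgsql/9.5/data'
```

With only backend_hostname0 defined (even three times, since later definitions just overwrite the earlier ones), pgpool only knows about a single backend, which matches the single row you see in "show pool_nodes".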

Pierre 

    On Friday, April 12, 2019, 5:05:05 PM GMT+2, Rob Reinhardt <rreinhardt at eitccorp.com> wrote:  
 
 I "feel like" it should be working since so much of it is working, except the main function of the s/w seems to be failing me.
my repmgr95 says this:
 ID | Name    | Role    | Status    | Upstream | Location | Connection string
----+---------+---------+-----------+----------+----------+----------------------------------------------------------
 1  | r01sv05 | standby |   running | r01sv04  | default  | host=r01sv05 user=repmgr dbname=repmgr connect_timeout=2
 2  | r01sv04 | primary | * running |          | default  | host=r01sv04 user=repmgr dbname=repmgr connect_timeout=2
 3  | r01sv03 | standby |   running | r01sv04  | default  | host=r01sv03 user=repmgr dbname=repmgr connect_timeout=2
(actually 05 is now the primary, that is an old shot)
r01sv02 is the pgpool server btw, and they are all on the same subnet.
my pgpool says this:
-bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv02 -c "show pool_nodes"
 node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | r01sv03  | 5432 | up     | 1.000000  | standby | 0          | true              | 0                 | 2019-04-11 19:48:43
(1 row)
pgpool keeps logging this:
Apr 12 14:03:03 r01sv02.change.me pgpool[14630]: [259-1] 2019-04-12 14:03:03: pid 14630: LOG:  find_primary_node: standby node is 0
Apr 12 14:03:03 r01sv02.change.me pgpool[14630]: [259-2] 2019-04-12 14:03:03: pid 14630: LOCATION:  pgpool_main.c:3438
Apr 12 14:03:04 r01sv02.change.me pgpool[14630]: [260-1] 2019-04-12 14:03:04: pid 14630: LOG:  find_primary_node: standby node is 0
Apr 12 14:03:04 r01sv02.change.me pgpool[14630]: [260-2] 2019-04-12 14:03:04: pid 14630: LOCATION:  pgpool_main.c:3438
Apr 12 14:03:05 r01sv02.change.me pgpool[14630]: [261-1] 2019-04-12 14:03:05: pid 14630: LOG:  find_primary_node: standby node is 0
and occasionally the find_primary_node_repeatedly line.
Quick summary of my setup:
3 postgresql-9.5 db nodes: one is primary, the other two are standby, in a streaming replication cluster built and managed with repmgr95. This is working fine.
1 pgpool 4.0.4 server that has the same version of postgresql-9.5 and the same postgres user setup as the other 3.
- pgpool is running as postgres
what does work:
- the postgres user has ssh access to/from any of the four servers. I can remotely run repmgr from the pgpool server as the postgres user with no problem
- psql can access all the dbs with a simple \list or \dt or whatever, asking for 5432 access from any of the four nodes, even from the pgpool server
- I can use the postgres user or the pgpool user with psql
- DNS is working too, but I changed from hostnames to IPs in the config file in case it made a difference; it did not.
I've even run these commands by hand and they get the right answers:
-bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv02 -c "SELECT pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 t
(1 row)

-bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv03 -c "SELECT pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 t
(1 row)

-bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv04 -c "SELECT pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 t
(1 row)

-bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv05 -c "SELECT pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 f
(1 row)
pgpool for some reason finds only one of the three nodes, a standby node, and it shows the right status for it.
The pgpool database I created, I created on my primary. I had thought that when pgpool started up it might put some stuff in that database, but I haven't seen anything, in case that is the problem. I found notes on creating said database and user, but have seen nothing about actually putting anything in it by hand. Anyway, I was just looking at that in case it is something.
Main question -- where are the other two nodes?
Also, I've noted that each time I start pgpool, it throws those log lines (above) until the count reaches 300, then it finally says "successfully started", and at that point the pcp_* commands will work; before then it has not yet created the pcp socket. Don't know if that is normal/expected or not. Seemed odd to me, for basic commands to take 5 minutes to even be available.
The other thing is that while it will come up for a while, pgpool seems to be stopping itself after about 10 minutes or so. The log just says that pgpool was told to stop (but I didn't do it).
I've attached a sanitized version of my pgpool.conf file
In case it helps, here also are the sanitized contents of the .pgpass and .pcppass files in the postgres home dir of all four of my servers, plus pcp.conf and pool_passwd, in case you see a problem with these (they are mode 600, owned by postgres).
-bash-4.2$ cat .pgpass
r01sv02:5432:*:pgpool:sanitized
r01sv05:5432:*:postgres:pgpool:sanitized
r01sv04:5432:*:postgres:pgpool:sanitized
r01sv03:5432:*:postgres:pgpool:sanitized
r01sv05:5432:replication:repmgr:pgpool:sanitized
r01sv04:5432:replication:repmgr:pgpool:sanitized
r01sv03:5432:replication:repmgr:pgpool:sanitized
-bash-4.2$ cat .pcppass
*:*:pgpool:pgpool:sanitized
*:*:postgres:pgpool:sanitized
pcp.conf
pgpool:sanitized
nrpe:sanitized
postgres:sanitized
pool_passwd
pgpool:sanitized
nrpe:sanitized
postgres:sanitized
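For reference, libpq expects each .pgpass line to have exactly five colon-separated fields; a line with more or fewer fields will silently fail to match. A sketch of the expected shape (the host and password here are hypothetical, not from your files):

```
# .pgpass format: hostname:port:database:username:password
r01sv04:5432:*:pgpool:secret
```

It may be worth double-checking the entries above against that format, since several of them appear to contain six fields.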

-bash-4.2$ cat pool_hba.conf
# pgpool Client Authentication Configuration File
# "local" is for Unix domain socket connections only
local   all         all                               trust
# IPv4 local connections:
host    all         all         127.0.0.1/32          trust
host    all         all         ::1/128               trust
host    all         all         192.x.y.0/24          md5
Thanks,
Rob





_______________________________________________
pgpool-general mailing list
pgpool-general at pgpool.net
http://www.pgpool.net/mailman/listinfo/pgpool-general
  