[pgpool-hackers: 1256] Re: changing the pcp_watchdog_info

Yugo Nagata nagata at sraoss.co.jp
Mon Dec 21 21:10:54 JST 2015


Hi Usama,

On Thu, 10 Dec 2015 20:29:50 +0500
Muhammad Usama <m.usama at gmail.com> wrote:

> Hi Ishii San
> 
> pcp_watchdog_info only gives the information of a single watchdog
> node, which might not be enough in certain situations. As we are
> currently working on watchdog enhancements, I thought it would be good to
> also enhance the pcp_watchdog_info utility. I have created a patch to add
> a little more information about the watchdog cluster state and nodes in the
> output of pcp_watchdog_info.
> 
> Can you please have a look at the attached patch, especially for
> 
> 1-) If you are good with all the new information shown by pcp_watchdog_info
> utility or you want to add/remove something?

What is "In Network Error"?  Looking at the code, this is the value of
g_cluster.network_error. However, it is not used anywhere in the watchdog
code, so I feel it is meaningless. In addition, network_error_time is not
used either.

I think it would be good to add delegate_IP and QuorumStatus to the
"Watchdog Cluster Information" section in verbose mode.

In get_node_list_json(), g_cluster.quorum_status is put into jNode, but
update_quorum_status() isn't called before that. I guess
update_connected_node_count() is called there to update
g_cluster.aliveNodeCount. I think update_quorum_status() also needs to be
called in the same way.
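
To illustrate the ordering I mean, here is a rough sketch in Python rather than the actual C code. The ClusterState class, its fields, the JSON key names, and the majority rule used for quorum are all assumptions made for illustration; only the three function names come from the patch.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical stand-in for g_cluster; field names are illustrative."""
    remote_alive: list = field(default_factory=list)  # one bool per remote node
    alive_node_count: int = 0   # cached value derived from remote_alive
    quorum_status: int = -1     # cached value derived from alive_node_count

    def update_connected_node_count(self):
        self.alive_node_count = sum(1 for alive in self.remote_alive if alive)

    def update_quorum_status(self):
        # Assumed rule: the local node plus the alive remote nodes must form
        # a strict majority of the whole cluster.
        total = len(self.remote_alive) + 1
        has_majority = (self.alive_node_count + 1) * 2 > total
        self.quorum_status = 1 if has_majority else -1


def get_node_list_json(state):
    # Refresh both cached values before serializing them, so that the
    # reported quorum status cannot lag behind the reported node count.
    state.update_connected_node_count()
    state.update_quorum_status()
    return json.dumps({
        "AliveNodeCount": state.alive_node_count,   # hypothetical key name
        "QuorumStatus": state.quorum_status,        # hypothetical key name
    })
```

Without the second refresh call, the serialized quorum status would be whatever was computed the last time something else happened to update it.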

I think users won't understand what "Local Node Escalated" stands for,
that is, how it differs from the COORDINATOR status of the local node.
It might be better to use "VIP is up on Local Node", "Local Node has VIP",
or something similar.  Alternatively, each Node Information section could
show whether that node holds the VIP. This would let users know whether
the VIP exists anywhere in the cluster rather than only on the local node,
and users could notice the problem if the VIP doesn't exist on any node or
if multiple nodes hold it.
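
A rough sketch of the check a user or a monitoring script could then do, in Python. The per-node "has_vip" flag is hypothetical; it is the field I am proposing, not something the current patch outputs.

```python
def check_vip_holders(nodes):
    """Summarize VIP ownership across the cluster.

    `nodes` is a list of dicts with a "name" key and a hypothetical
    boolean "has_vip" key (the proposed per-node field).
    """
    holders = [node["name"] for node in nodes if node.get("has_vip")]
    if not holders:
        return "no node holds the VIP"
    if len(holders) > 1:
        return "multiple nodes hold the VIP: " + ", ".join(holders)
    return "VIP is held by " + holders[0]


nodes = [
    {"name": "Linux_yugo-n-ubuntu_11000", "has_vip": True},
    {"name": "Linux_yugo-n-ubuntu_12000", "has_vip": False},
]
print(check_vip_holders(nodes))
```

With only "Local Node Escalated" in the cluster section, neither of the two problem cases can be detected from a single node's output.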

In non-verbose mode, "Master Node Name" is shown as a node name.
However, the node name is not shown in each Node Information row.
Although users might know that the COORDINATOR is the master, it would be
good to add the node name to each node information row.

 $ ~/pgpool/bin/pcp_watchdog_info -p 11001 
 
 2 NO YES Linux_yugo-n-ubuntu_11000 

 [0] localhost 11000 9000 4 COORDINATOR
 [1] localhost 12000 9001 7 STANDBY


If there is no master (coordinator), what should be shown as "Master Node
Name"? Or can this situation never occur?
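
For example, a script reading the non-verbose output would have to handle the missing coordinator itself. The following Python sketch assumes that the columns of each node row are index, host name, pgpool port, watchdog port, status code and status name; that column order is my reading of the output above, not a documented format.

```python
import re

# Assumed row layout: [index] host pgpool_port watchdog_port status status_name
ROW = re.compile(r"^\s*\[(\d+)\]\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s*$")


def parse_node_rows(text):
    """Return one dict per node row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "index": int(match.group(1)),
                "host": match.group(2),
                "pgpool_port": int(match.group(3)),
                "watchdog_port": int(match.group(4)),
                "status": int(match.group(5)),
                "status_name": match.group(6),
            })
    return rows


def find_coordinator(rows):
    """Return the COORDINATOR row, or None when no node is the coordinator."""
    for row in rows:
        if row["status_name"] == "COORDINATOR":
            return row
    return None
```

Note that the row carries only the host name and ports, so it cannot be matched against "Master Node Name" without rebuilding the node name by hand.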

> 
> 2-) inform watchdog info in pcp_worker departs from the data serialization
> technique used by the PCP server for its other functions and instead uses
> a JSON-formatted payload to transmit the watchdog information to the
> client side. I am of the view that someday we should shift all the other
> functions to JSON or some other serialization technique that is more
> adaptable than the current proprietary format, but for the time being the
> watchdog part of PCP is different from all the others.
> 
> With the new pcp_watchdog_info, when a node ID is given the utility shows
> the information of that specific node, while ID = 0 means the local
> watchdog node. When no node ID is provided by the user, information for
> all nodes is shown.

To be frank, I'm not sure it is a good idea to use ID=0 to stand for the
local watchdog, because the ID of the Nth remote pgpool
(other_pgpool_hostnameN in pgpool.conf) then becomes N+1, and I think this
is slightly misleading. However, it is not a bad idea to allow
pcp_watchdog_info to show information for all nodes.
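
The mapping as I understand it from the patch can be written down as the following Python sketch. The function name and its return strings are made up for illustration; the point is only the off-by-one between the ID and the pgpool.conf suffix.

```python
def describe_node_id(node_id, remote_count):
    """Explain which node a pcp_watchdog_info node ID refers to.

    node_id:      the ID given by the user, or None when omitted
    remote_count: number of other_pgpool_hostnameN entries in pgpool.conf
    """
    if node_id is None:
        return "all nodes"
    if node_id == 0:
        return "local watchdog node"
    if 1 <= node_id <= remote_count:
        # The ID is one larger than the suffix used in pgpool.conf.
        return "other_pgpool_hostname%d" % (node_id - 1)
    raise ValueError("no such watchdog node ID: %r" % (node_id,))


print(describe_node_id(1, remote_count=2))
```

So a user who wants other_pgpool_hostname0 has to ask for ID 1, which is the part I find misleading.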


BTW, when I try to get information about Node 1 (the 1st remote pgpool),
the output shows that Node Number is 0. Is "Node Number" different from
"Node ID"? I find this confusing. Does Node Number need to be shown
instead of Node ID?

$ pcp_watchdog_info -p 11001 -n 1 -v

Watchdog Cluster Information 
Total Nodes         : 2
Remote Nodes        : 1
Alive Remote Nodes  : 1
In Network Error    : NO
Local Node Escalated: YES
Master Node Name    : Linux_yugo-n-ubuntu_11000

Watchdog Node Information 
Node Number    : 0
Node Name      : Linux_yugo-n-ubuntu_12000
Host Name      : localhost
Pgpool port    : 12000
Watchdog port  : 9001
Node priority  : 1
status         : 7
status Name    : STANDBY




> 
> --example--
> 
> [usama at localhost pgpool]$ bin/pcp_watchdog_info -h localhost -p 9893 -U
> postgres -v
> Password:
> Watchdog Cluster Information
> Total Nodes         : 3
> Remote Nodes        : 2
> Alive Remote Nodes  : 2
> In Network Error    : NO
> Local Node Escalated: NO
> Master Node Name    : Linux_localhost.localdomain_9992
> 
> Watchdog Node Information
> Node Number    : 0
> Node Name      : Linux_localhost.localdomain_9993
> Host Name      : localhost
> Pgpool port    : 9993
> Watchdog port  : 9003
> Node priority  : 1
> status         : 7
> status Name    : STANDBY
> 
> Node Number    : 1
> Node Name      : Linux_localhost.localdomain_9992
> Host Name      : localhost
> Pgpool port    : 9992
> Watchdog port  : 9002
> Node priority  : 1
> status         : 4
> status Name    : COORDINATOR
> 
> Node Number    : 2
> Node Name      : Linux_localhost.localdomain_9991
> Host Name      : localhost
> Pgpool port    : 9991
> Watchdog port  : 9001
> Node priority  : 1
> status         : 7
> status Name    : STANDBY
> 
> 
> 
> Thanks
> Best regards
> Muhammad Usama


-- 
Yugo Nagata <nagata at sraoss.co.jp>

