[pgpool-general: 1167] Re: Watchdog error - wd_init: delegate_IP already exists
Will Ferguson
will.ferguson at vyre.com
Tue Nov 6 22:03:36 JST 2012
Ah I see what the problem is.
There was another process listening on 9000! Stopping that process allowed
the watchdog process to start and it all works well now.
Thanks ever so much for all your help. To note, there was no bind error in
the logs - it would be great if future releases would log a bind error.
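
For anyone else who hits this, here is a minimal sketch of the check that would have saved me some time: try to bind the watchdog port and print the OS error if that fails. It assumes the watchdog listens on TCP port 9000, as in my setup. The name check_bind is made up for illustration and is not part of pgpool.

```python
import socket


def check_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port).

    Returns None if the bind succeeds (the port is free), otherwise
    the error message reported by the operating system.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            # Typically "address in use" when another process holds the port.
            return exc.strerror or str(exc)
    return None


if __name__ == "__main__":
    # 9000 is an assumption: the watchdog port used in this thread.
    error = check_bind(9000)
    if error is None:
        print("port 9000 is free")
    else:
        print("cannot bind port 9000:", error)
```

Run it on the pgpool server before starting pgpool. If it reports an error, a tool such as netstat or lsof should show which process is holding the port.
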
Thanks again,
Will
On 6 November 2012 12:37, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> There is no watchdog process on pgpool 1's server for some reason.
> Is there any error message in the pgpool 1 log?
>
> In this situation, pgpool 2 judges that pgpool 1 hasn't started up yet
> and tries to bring up virtual ip. However, it fails since the ip is on
> the first server already.
>
> To resolve it, you have to kill all pgpool processes on the first
> server and start pgpool again.
>
> Procedure:
>
> 1) bring down the vip on the first server.
> 3) kill all pgpool processes on the first server.
> 4) start pgpool on the first server.
> 2) start pgpool on the second server.
>
> On Tue, 6 Nov 2012 11:59:59 +0000
> Will Ferguson <will.ferguson at vyre.com> wrote:
>
> > *pgpool 1*
> >
> > root at will-pgpool1:~# ps aux | grep pgpool
> > avahi 848 0.0 0.1 33000 2612 ? S Oct12 9:11 avahi-daemon: running [will-pgpool1.local]
> > root 2279 0.0 0.3 98940 6032 ? S Nov05 0:12 pgpool -n
> > root 2286 0.0 0.1 98940 1740 ? S Nov05 0:06 pgpool: lifecheck
> > root 2288 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2289 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2291 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2296 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2298 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2299 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2300 0.0 0.3 104296 5888 ? S Nov05 0:35 pgpool: onbrand onbrand_unify del1.local(55305) idle
> > root 2304 0.0 0.3 104288 5944 ? S Nov05 0:42 pgpool: onbrand onbrand_unify del1.local(55304) idle
> > root 2306 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2308 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2311 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2312 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2313 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2315 0.0 0.3 104288 5888 ? S Nov05 0:00 pgpool: onbrand onbrand_unify del1.local(55308) idle
> > root 2316 0.0 0.3 104288 5888 ? S Nov05 0:38 pgpool: onbrand onbrand_unify del1.local(55307) idle
> > root 2317 0.1 0.3 104288 5888 ? S Nov05 1:46 pgpool: onbrand onbrand_unify del1.local(55306) idle
> > root 2318 0.0 0.3 104288 5888 ? S Nov05 0:00 pgpool: onbrand onbrand_unify del1.local(55303) idle
> > root 2319 0.0 0.0 98940 1132 ? S Nov05 0:00 pgpool: PCP: wait for connection request
> > root 2320 0.0 0.1 98940 1828 ? S Nov05 0:06 pgpool: worker process
> > root 2391 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2486 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2522 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2546 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2551 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2595 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2672 0.0 0.0 98940 1156 ? S Nov05 0:00 pgpool: wait for connection request
> > root 2785 0.0 0.0 98940 1156 ? S 01:06 0:00 pgpool: wait for connection request
> > root 2827 0.0 0.0 98940 1156 ? S 01:55 0:00 pgpool: wait for connection request
> > root 2888 0.0 0.0 98940 1156 ? S 03:13 0:00 pgpool: wait for connection request
> > root 3000 0.0 0.0 98940 1156 ? S 05:52 0:00 pgpool: wait for connection request
> > root 3006 0.0 0.0 98940 1156 ? S 06:07 0:00 pgpool: wait for connection request
> > root 3724 0.0 0.0 98940 1156 ? S 10:39 0:00 pgpool: wait for connection request
> > root 3730 0.0 0.1 98944 2284 ? S 10:55 0:00 pgpool: wait for connection request
> > root 3731 0.0 0.0 98940 1156 ? S 11:00 0:00 pgpool: wait for connection request
> > root 3775 0.0 0.0 9392 924 pts/0 S+ 11:55 0:00 grep --color=auto pgpool
> >
> > *pgpool2*
> >
> > root at will-pgpool2:~# ps aux | grep pgool
> > root 3394 0.0 0.0 9388 916 pts/0 S+ 11:57 0:00 grep --color=auto pgool
> >
> >
> > Thanks for all your help.
> >
> > Will
> >
> > On 6 November 2012 11:52, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> >
> > > Could you show me lists of pgpool processes?
> > >
> > > > Execute the following on each server.
> > >
> > > $ ps aux | grep pgpool
> > >
> > > On Tue, 6 Nov 2012 11:43:57 +0000
> > > Will Ferguson <will.ferguson at vyre.com> wrote:
> > >
> > > > No the IP doesn't come up.
> > > >
> > > > On 6 November 2012 11:06, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > >
> > > > > > Does the virtual IP come up on the second pgpool server?
> > > > >
> > > > > On Tue, 6 Nov 2012 10:59:16 +0000
> > > > > Will Ferguson <will.ferguson at vyre.com> wrote:
> > > > >
> > > > > > > Thanks but leaving the first pgpool instance running for hours before
> > > > > > > starting the second still doesn't help. Strange that it is fine the other
> > > > > > > way around. I'll keep my eye out for the fix.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Will
> > > > > >
> > > > > > > On 6 November 2012 00:20, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >
> > > > > > > Wait a few seconds after the first pgpool has started.
> > > > > > > Then, you would be able to start the second successfully.
> > > > > > >
> > > > > > > I recognize this problem and am trying to resolve it.
> > > > > > >
> > > > > > > On Mon, 5 Nov 2012 07:07:25 -0800
> > > > > > > Lonni J Friedman <netllama at gmail.com> wrote:
> > > > > > >
> > > > > > > > > I've seen this happen often as well. I'm guessing that there's a race
> > > > > > > > > condition somewhere.
> > > > > > > >
> > > > > > > > > On Mon, Nov 5, 2012 at 7:00 AM, Will Ferguson <will.ferguson at vyre.com> wrote:
> > > > > > > > > Hello
> > > > > > > > >
> > > > > > > > > > I have two pgpool servers: will-pgpool1 & will-pgpool2 with watchdog enabled
> > > > > > > > > >
> > > > > > > > > > I can start will-pgpool2 first which takes the virtual IP, and then start
> > > > > > > > > > will-pgpool1 and all works fine.
> > > > > > > > > >
> > > > > > > > > > But if I start will-pgpool1 first which takes the virtual IP, and then start
> > > > > > > > > > will-pgpool2 the pgpool2 server doesn't start.
> > > > > > > > >
> > > > > > > > > Both pgpool.conf files are the same except wd_hostname and
> > > > > > > > > other_pgpool_hostname0
> > > > > > > > >
> > > > > > > > > Both instances are started as root with -d -n
> > > > > > > > >
> > > > > > > > > Below is the log debug output.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Will
> > > > > > > > >
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: listen_addresses
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '*' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: port
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 9999 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: socket_dir
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/tmp' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: pcp_port
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 9898 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: pcp_socket_dir
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/tmp' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db1' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 0 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 0 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db2' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 1 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 1 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db3' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 2 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 2 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: enable_pool_hba
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: authentication_timeout
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 60 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: ssl
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: num_init_children
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 32 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: max_pool
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 4 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: child_life_time
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 300 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: child_max_connections
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: connection_life_time
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: client_idle_limit
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_destination
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'stderr' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: print_timestamp
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_connections
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_hostname
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_statement
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_per_node_statement
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: w
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value:
> > > > > > > > > kind: 7
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_standby_delay
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'always' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: syslog_facility
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'LOCAL0' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: syslog_ident
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'pgpool' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: debug_level
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: loading "/usr/local/etc/pool_hba.conf" for client authentication configuration file
> > > > > > > > > > 2012-11-05 14:58:40 LOG: pid 2145: wd_chk_sticky: ifup[/sbin/ifconfig] doesn't have sticky bit
> > > > > > > > > > 2012-11-05 14:58:40 LOG: pid 2145: wd_create_send_socket: connect() reports failure (Connection refused). You can safely ignore this while starting up.
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: wd_init: delegate_IP already exists
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: watchdog: wd_init failed
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: wd_main error
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: unlink(/tmp/.s.PGSQL.9898) failed: No such file or directory
> > > > > > > > > 2012-11-05 14:58:42 DEBUG: pid 2145: shmem_exit(1)
> > > > > > > > _______________________________________________
> > > > > > > > pgpool-general mailing list
> > > > > > > > pgpool-general at pgpool.net
> > > > > > > > http://www.pgpool.net/mailman/listinfo/pgpool-general
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > > > >
> > > > > >
> > > > > --
> > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > >
> > >
> > >
> > > --
> > > Yugo Nagata <nagata at sraoss.co.jp>
> > >
>
>
> --
> Yugo Nagata <nagata at sraoss.co.jp>