[pgpool-general: 1167] Re: Watchdog error - wd_init: delegate_IP already exists

Will Ferguson will.ferguson at vyre.com
Tue Nov 6 22:03:36 JST 2012


Ah, I see what the problem is.

There was another process listening on port 9000! Stopping that process
allowed the watchdog process to start, and it all works well now.

Thanks ever so much for all your help. To note, there was no bind error in
the logs - it would be great if future releases would log a bind error.
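
For anyone who hits the same symptom, one quick way to confirm that
something else already holds the port is to try binding it yourself. The
sketch below is only an illustration and is not pgpool code: check_port is
a made-up helper name, and 9000 is simply the port from this thread (use
whichever watchdog port your pgpool.conf sets). It returns the kind of
bind error message that would have helped here.

```python
import errno
import socket


def check_port(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port).

    Returns None if the bind succeeds (the port is free), otherwise a
    human-readable message describing why the bind failed.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return "bind to %s:%d failed: address already in use" % (host, port)
        return "bind to %s:%d failed: %s" % (host, port, exc.strerror)
    finally:
        # Always release the probe socket so the check has no side effects.
        sock.close()
    return None


if __name__ == "__main__":
    # 9000 is the port mentioned in this thread (an assumption for your setup).
    problem = check_port(9000)
    print(problem or "port 9000 is free")
```

Run it on the server where the watchdog process is missing, with pgpool
stopped. If it reports "address already in use", some other process is
holding the port.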

Thanks again,

Will

On 6 November 2012 12:37, Yugo Nagata <nagata at sraoss.co.jp> wrote:

> There is no watchdog process on pgpool 1's server for some reason.
> Is there any error message in pgpool 1 log?
>
> In this situation, pgpool 2 judges that pgpool 1 hasn't started up yet
> and tries to bring up the virtual IP. However, it fails since the IP is
> already on the first server.
>
> To resolve it, you have to kill all pgpool processes on the first
> server and start pgpool again.
>
> Procedure:
>
> 1) bring down the vip on the first server.
> 3) kill all pgpool processes on the first server.
> 4) start pgpool on the first server.
> 2) start pgpool on the second server.
>
> On Tue, 6 Nov 2012 11:59:59 +0000
> Will Ferguson <will.ferguson at vyre.com> wrote:
>
> > *pgpool 1*
> >
> > root@will-pgpool1:~# ps aux | grep pgpool
> > avahi      848  0.0  0.1  33000  2612 ?        S    Oct12   9:11 avahi-daemon: running [will-pgpool1.local]
> > root      2279  0.0  0.3  98940  6032 ?        S    Nov05   0:12 pgpool -n
> > root      2286  0.0  0.1  98940  1740 ?        S    Nov05   0:06 pgpool: lifecheck
> > root      2288  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2289  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2291  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2296  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2298  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2299  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2300  0.0  0.3 104296  5888 ?        S    Nov05   0:35 pgpool: onbrand onbrand_unify del1.local(55305) idle
> > root      2304  0.0  0.3 104288  5944 ?        S    Nov05   0:42 pgpool: onbrand onbrand_unify del1.local(55304) idle
> > root      2306  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2308  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2311  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2312  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2313  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2315  0.0  0.3 104288  5888 ?        S    Nov05   0:00 pgpool: onbrand onbrand_unify del1.local(55308) idle
> > root      2316  0.0  0.3 104288  5888 ?        S    Nov05   0:38 pgpool: onbrand onbrand_unify del1.local(55307) idle
> > root      2317  0.1  0.3 104288  5888 ?        S    Nov05   1:46 pgpool: onbrand onbrand_unify del1.local(55306) idle
> > root      2318  0.0  0.3 104288  5888 ?        S    Nov05   0:00 pgpool: onbrand onbrand_unify del1.local(55303) idle
> > root      2319  0.0  0.0  98940  1132 ?        S    Nov05   0:00 pgpool: PCP: wait for connection request
> > root      2320  0.0  0.1  98940  1828 ?        S    Nov05   0:06 pgpool: worker process
> > root      2391  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2486  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2522  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2546  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2551  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2595  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2672  0.0  0.0  98940  1156 ?        S    Nov05   0:00 pgpool: wait for connection request
> > root      2785  0.0  0.0  98940  1156 ?        S    01:06   0:00 pgpool: wait for connection request
> > root      2827  0.0  0.0  98940  1156 ?        S    01:55   0:00 pgpool: wait for connection request
> > root      2888  0.0  0.0  98940  1156 ?        S    03:13   0:00 pgpool: wait for connection request
> > root      3000  0.0  0.0  98940  1156 ?        S    05:52   0:00 pgpool: wait for connection request
> > root      3006  0.0  0.0  98940  1156 ?        S    06:07   0:00 pgpool: wait for connection request
> > root      3724  0.0  0.0  98940  1156 ?        S    10:39   0:00 pgpool: wait for connection request
> > root      3730  0.0  0.1  98944  2284 ?        S    10:55   0:00 pgpool: wait for connection request
> > root      3731  0.0  0.0  98940  1156 ?        S    11:00   0:00 pgpool: wait for connection request
> > root      3775  0.0  0.0   9392   924 pts/0    S+   11:55   0:00 grep --color=auto pgpool
> >
> > *pgpool 2*
> >
> > root@will-pgpool2:~# ps aux | grep pgool
> > root      3394  0.0  0.0   9388   916 pts/0    S+   11:57   0:00 grep --color=auto pgool
> >
> >
> > Thanks for all your help.
> >
> > Will
> >
> > On 6 November 2012 11:52, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> >
> > > Could you show me the list of pgpool processes?
> > >
> > > Execute the following on each server.
> > >
> > > $ ps aux | grep pgpool
> > >
> > > On Tue, 6 Nov 2012 11:43:57 +0000
> > > Will Ferguson <will.ferguson at vyre.com> wrote:
> > >
> > > > No the IP doesn't come up.
> > > >
> > > > On 6 November 2012 11:06, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > >
> > > > > Does the virtual IP come up on the second pgpool server?
> > > > >
> > > > > On Tue, 6 Nov 2012 10:59:16 +0000
> > > > > Will Ferguson <will.ferguson at vyre.com> wrote:
> > > > >
> > > > > > > Thanks, but leaving the first pgpool instance running for hours before
> > > > > > > starting the second still doesn't help. Strange that it is fine the other
> > > > > > > way around. I'll keep my eye out for the fix.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Will
> > > > > >
> > > > > > > On 6 November 2012 00:20, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >
> > > > > > > > Wait a few seconds after the first pgpool has started.
> > > > > > > > Then you should be able to start the second one successfully.
> > > > > > > >
> > > > > > > > I am aware of this problem and am trying to resolve it.
> > > > > > >
> > > > > > > On Mon, 5 Nov 2012 07:07:25 -0800
> > > > > > > Lonni J Friedman <netllama at gmail.com> wrote:
> > > > > > >
> > > > > > > > > I've seen this happen often as well.  I'm guessing that there's a race
> > > > > > > > > condition somewhere.
> > > > > > > >
> > > > > > > > > On Mon, Nov 5, 2012 at 7:00 AM, Will Ferguson <will.ferguson at vyre.com> wrote:
> > > > > > > > > Hello
> > > > > > > > >
> > > > > > > > > > I have two pgpool servers: will-pgpool1 & will-pgpool2 with watchdog enabled
> > > > > > > > > >
> > > > > > > > > > I can start will-pgpool2 first which takes the virtual IP, and then start
> > > > > > > > > > will-pgpool1 and all works fine.
> > > > > > > > > >
> > > > > > > > > > But if I start will-pgpool1 first which takes the virtual IP, and then start
> > > > > > > > > > will-pgpool2 the pgpool2 server doesn't start.
> > > > > > > > > >
> > > > > > > > > > Both pgpool.conf files are the same except wd_hostname and
> > > > > > > > > > other_pgpool_hostname0
> > > > > > > > >
> > > > > > > > > Both instances are started as root with -d -n
> > > > > > > > >
> > > > > > > > > Below is the log debug output.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Will
> > > > > > > > >
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: listen_addresses
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '*' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: port
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 9999 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: socket_dir
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/tmp' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: pcp_port
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 9898 kind: 2
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: pcp_socket_dir
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/tmp' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db1' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 0 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag0
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 0 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db2' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 1 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 1 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_hostname2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'will-db3' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_port2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 5432 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: port slot number 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_weight2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 1 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: weight slot number 2 weight: 1.000000
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_data_directory2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: '/data/postgres' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: backend_flag2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'ALLOW_TO_FAILOVER' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: extract_string_tokens: token: ALLOW_TO_FAILOVER
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: allow_to_failover on
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: pool_config: slot number 2 flag: 0000
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: enable_pool_hba
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: authentication_timeout
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 60 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: ssl
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: num_init_children
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 32 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: max_pool
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 4 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: child_life_time
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 300 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: child_max_connections
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: connection_life_time
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: client_idle_limit
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_destination
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'stderr' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: print_timestamp
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_connections
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_hostname
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: on kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_statement
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_per_node_statement
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: off kind: 1
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: w
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value:
> > > > > > > > > >  kind: 7
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: log_standby_delay
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'always' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: syslog_facility
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'LOCAL0' kind: 4
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: syslog_ident
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 'pgpool' kind: 4
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: key: debug_level
> > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: value: 0 kind: 2
> > > > > > > > > > 2012-11-05 14:58:40 DEBUG: pid 2145: loading "/usr/local/etc/pool_hba.conf" for client authentication configuration file
> > > > > > > > > > 2012-11-05 14:58:40 LOG:   pid 2145: wd_chk_sticky: ifup[/sbin/ifconfig] doesn't have sticky bit
> > > > > > > > > > 2012-11-05 14:58:40 LOG:   pid 2145: wd_create_send_socket: connect() reports failure (Connection refused). You can safely ignore this while starting up.
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: wd_init: delegate_IP already exists
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: watchdog: wd_init failed
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: wd_main error
> > > > > > > > > > 2012-11-05 14:58:42 ERROR: pid 2145: unlink(/tmp/.s.PGSQL.9898) failed: No such file or directory
> > > > > > > > > > 2012-11-05 14:58:42 DEBUG: pid 2145: shmem_exit(1)
> > > > > > > > _______________________________________________
> > > > > > > > pgpool-general mailing list
> > > > > > > > pgpool-general at pgpool.net
> > > > > > > > http://www.pgpool.net/mailman/listinfo/pgpool-general
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > > > >
> > > > > >
> > > > > --
> > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > >
> > >
> > >
> > > --
> > > Yugo Nagata <nagata at sraoss.co.jp>
> > >
>
>
> --
> Yugo Nagata <nagata at sraoss.co.jp>