[pgpool-general: 1141] Possible race condition in pool_get_passwd
raistlin at tacorp.net
Sun Oct 28 14:12:52 JST 2012
I've been digging into an issue we've been having when many threads
connect to the pgpool backend at once. Some of the threads are failing
md5 authentication without ever sending an auth packet.
We're currently running pgpool 3.1.3 on FreeBSD.
After several hours of digging, I determined that pool_get_passwd was
returning NULL. I littered the thing with pool_debugs and started
digging in. The interesting thing was that the more debugging I added, the
more often the issue occurred. This got me looking more.
It appears there is a race condition caused by the fact that the
underlying fds are shared between the child processes. Note this from
the fork(2) man page on FreeBSD:
    The child process has its own copy of the parent's descriptors.
    These descriptors reference the same underlying objects, so
    that, for instance, file pointers in file objects are shared
    between the child and the parent, so that an lseek(2) on a
    descriptor in the child process can affect a subsequent read(2)
    or write(2) by the parent. This descriptor copying is also
    used by the shell to establish standard input and output for
    newly created processes as well as to set up pipes.
The Linux man page has a similar note about parent and child sharing file offsets.
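To make the man page note concrete, here is a minimal sketch of the offset sharing using raw descriptors. It is not pgpool code: the helper name, the scratch path passed in, and the 256-byte limit are all made up for the demonstration. The child reads from a descriptor it inherited, and the parent then asks the kernel where its own copy of the descriptor is positioned.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of pgpool): writes `len` bytes to
 * `path`, opens the file read-only, forks, lets the child read `len`
 * bytes, and returns the offset the PARENT sees afterwards.
 * Returns -1 on any error. The scratch file is removed before returning.
 */
off_t parent_offset_after_child_read(const char *path, size_t len)
{
    char buf[256];
    assert(path != NULL && len > 0 && len <= sizeof buf);
    memset(buf, 'x', len);

    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, buf, len);
    close(fd);
    if (written != (ssize_t) len) {
        unlink(path);
        return -1;
    }

    fd = open(path, O_RDONLY);          /* opened BEFORE fork, so it is inherited */
    if (fd < 0) {
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: this read advances the offset of the shared file object. */
        ssize_t n = read(fd, buf, len);
        _exit(n == (ssize_t) len ? 0 : 1);
    }

    int status = 0;
    off_t pos = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* Parent never read anything, yet its offset has moved. */
        pos = lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    unlink(path);
    return pos;
}
```

Called with a length of 48, the parent should see its offset at 48 even though it never issued a read itself.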
In all cases where I see the error, the first call to fgetc in
pool_get_passwd returns EOF, even though the file pointer in the FILE
struct is at 0 right before the call.
In this snippet, QQQ reports the position returned by ftell(), while NNN
reports the position obtained by calling lseek(fileno(passwd_fd),0,SEEK_CUR):
Oct 28 01:10:09 apu pgpool: QQQ Pos 0
Oct 28 01:10:09 apu pgpool: NNN Pos 48
As you can see, the FILE struct thinks we're at 0, while the underlying
kernel fd shows us to be at 48. This causes the subsequent fgetc to fail.
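The same effect can be sketched at the stdio level. This is an illustration only, assuming a 48-byte file and a FILE opened before fork(); the helper name and the -2 error code are invented for the example, and it does not reproduce pgpool's actual code path. The child's first fgetc fills its private stdio buffer, which moves the shared kernel offset to the end of the file, so the parent's first fgetc has nothing left to read.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of pgpool): creates a 48-byte file at
 * `path`, fopen()s it, forks, lets the child call fgetc() once, then
 * returns the result of the PARENT's first fgetc(). The kernel offset seen
 * by the parent just before that call is stored in *kernel_pos.
 * Returns -2 on setup errors, so it cannot be confused with EOF.
 */
int parent_first_fgetc_after_child_read(const char *path, off_t *kernel_pos)
{
    assert(path != NULL && kernel_pos != NULL);

    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -2;
    for (int i = 0; i < 48; i++)
        fputc('x', out);
    if (fclose(out) != 0)
        return -2;

    FILE *fp = fopen(path, "r");        /* shared with the child after fork */
    if (fp == NULL)
        return -2;

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        remove(path);
        return -2;
    }
    if (pid == 0) {
        /* Child: one fgetc() makes stdio read a whole buffer from the fd. */
        int c = fgetc(fp);
        /* _exit() skips stdio cleanup, so the offset is not moved back. */
        _exit(c == 'x' ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        fclose(fp);
        remove(path);
        return -2;
    }

    /* The parent's FILE buffer is still empty, but the kernel offset moved. */
    *kernel_pos = lseek(fileno(fp), 0, SEEK_CUR);
    int c = fgetc(fp);                  /* read(2) starts at the shared offset */
    fclose(fp);
    remove(path);
    return c;
}
```

What ftell() reports in the parent at that point depends on the C library (some implementations return a cached value, others ask the kernel), so the sketch only looks at the lseek() position and the fgetc() result.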
Is sharing the FILE struct, and consequently the underlying fd, between
processes safe? Would a better approach be to cache the contents of
the passwd file in memory, and provide a mechanism to refresh them on reload?
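A rough sketch of what I mean by caching, to make the question concrete. Everything here is hypothetical: the struct and function names, the fixed size limits, and the assumption that each line of the file looks like "username:hash". It is not how pgpool is implemented. The file is opened, parsed and closed in one call, so no FILE or fd stays shared across fork(), and lookups only touch process-private memory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_MAX_ENTRIES 64    /* arbitrary limit for the sketch */
#define CACHE_MAX_FIELD   128   /* arbitrary limit for the sketch */

struct passwd_entry {
    char user[CACHE_MAX_FIELD];
    char hash[CACHE_MAX_FIELD];
};

struct passwd_cache {
    struct passwd_entry entries[CACHE_MAX_ENTRIES];
    size_t count;
};

/*
 * Hypothetical loader: parses "user:hash" lines from `path` into `cache`.
 * The old contents are replaced only if the file could be opened, so the
 * same call can serve as the reload hook. Returns 0 on success, -1 on error.
 */
int passwd_cache_load(struct passwd_cache *cache, const char *path)
{
    assert(cache != NULL && path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    struct passwd_cache fresh;
    fresh.count = 0;

    char line[2 * CACHE_MAX_FIELD + 2];
    while (fresh.count < CACHE_MAX_ENTRIES && fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';     /* strip the line ending */
        char *colon = strchr(line, ':');
        if (colon == NULL)
            continue;                           /* skip malformed lines */
        *colon = '\0';
        if (strlen(line) >= CACHE_MAX_FIELD ||
            strlen(colon + 1) >= CACHE_MAX_FIELD)
            continue;                           /* skip oversized fields */
        struct passwd_entry *e = &fresh.entries[fresh.count++];
        strcpy(e->user, line);
        strcpy(e->hash, colon + 1);
    }
    fclose(fp);                                 /* nothing stays open */

    *cache = fresh;
    return 0;
}

/* Returns the cached hash for `user`, or NULL if the user is not cached. */
const char *passwd_cache_lookup(const struct passwd_cache *cache,
                                const char *user)
{
    assert(cache != NULL && user != NULL);
    for (size_t i = 0; i < cache->count; i++) {
        if (strcmp(cache->entries[i].user, user) == 0)
            return cache->entries[i].hash;
    }
    return NULL;
}
```

The idea would be to call the loader once at startup and again from whatever handles a reload. Children forked after the load get their own copy of the cache, so concurrent lookups cannot disturb each other the way a shared file offset can.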