[pgpool-hackers: 3249] Re: select(2) storm

Jesper Pedersen jesper.pedersen at redhat.com
Fri Feb 22 03:29:27 JST 2019


Hi,

On 2/20/19 4:24 PM, Jesper Pedersen wrote:
> On 2/19/19 8:33 PM, Tatsuo Ishii wrote:
>> Attached is a patch trying eliminate calls to pool_check_fd() (the
>> same one I sent to you privately. I just wanted to share with other
>> developers).
>>
>> The idea is checking select(2) timeout parameter set in a static
>> variable in pool_read() and pool_read2(). If it's -1, it means no
>> select timeout will be set in pool_check_fd(), which implies we can
>> avoid to call pool_check_fd().
>>
>> Also I moved pool_check_fd() and friends to pool_stream.c from a
>> modularity point of view.
>>
> 
> I see a minor speed up and the patch looks good, so I think it is a win 
> overall. Please, merge.
> 
>> I heard from you that the patch did no performance improvement.  I am
>> thinking about it and came to an idea; even if we avoid to call
>> pool_check_fd(), subsequent read(2) has to wait for packets from
>> frontend or backend. So the total time for processing has not been
>> changed by the patch. Of course this is just my guess. I need a proof
>> of the theory...
>>
> 
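
For readers following along, here is a minimal sketch of the idea quoted above. It is not the actual patch: read_timeout_sec, check_fd() and read_with_optional_check() are hypothetical stand-ins for the static timeout variable, pool_check_fd() and pool_read(). It only shows that the select(2) call can be skipped when no timeout is configured (-1).

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for the static timeout variable; -1 means "no timeout". */
static int read_timeout_sec = -1;

/* Counts how often select(2) is actually issued, to make the effect visible. */
static int select_calls = 0;

/* Hypothetical analogue of pool_check_fd(): wait until fd becomes readable. */
static int check_fd(int fd)
{
    fd_set rfds;
    struct timeval tv;
    struct timeval *tvp = NULL;   /* NULL makes select(2) block indefinitely */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (read_timeout_sec >= 0) {
        tv.tv_sec = read_timeout_sec;
        tv.tv_usec = 0;
        tvp = &tv;
    }
    select_calls++;
    return select(fd + 1, &rfds, NULL, NULL, tvp) > 0 ? 0 : -1;
}

/* Hypothetical analogue of pool_read(): only pay for select(2) when a
 * timeout has to be enforced; otherwise let read(2) do the blocking. */
static ssize_t read_with_optional_check(int fd, void *buf, size_t len)
{
    if (read_timeout_sec >= 0 && check_fd(fd) != 0)
        return -1;
    return read(fd, buf, len);
}
```

With read_timeout_sec left at -1 the read path issues one system call instead of two, but read(2) still blocks until data arrives, which would be consistent with the guess that total processing time barely changes.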

Here are some flamegraphs.

select.svg is your eliminate-select patch, which shows that the select(2)
calls are now "isolated" to startup and wait_for_query_response(). The
distribution is

__libc_write: 57.2%
__libc_read :  8.1%
__select    : 14.0%
-------------------
Total         79.3%

running at 2629 TPS.

However, looking at wait_for_query_response() and its call site,
wait_for_query_response_with_trans_cleanup(), it doesn't really look as
if we need the select(2) here unless we are in replication mode (maybe I'm
missing something further up the stack?). By always setting status = 0
we get select_nowait.svg, which is more interesting

__libc_write: 60.5%
__libc_read : 14.9%
__select    :  3.6%
-------------------
Total         79.0%

running at 2718 TPS (+3.4%). Not a lot, but a much better distribution of
the system calls. Can we short-circuit this function if !REPLICATION, or
should there be a configuration setting that turns this check off?

Removing the 3 system calls from the flamegraphs, we get nosys.svg,
showing what pgpool is spending time on. Yes, there are still system
calls in there, but I haven't really looked into those areas yet.

For reference, proxy v1.1 gives 5697 TPS and direct access gives 8505 TPS.
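
The relative numbers can be checked with a small helper. pct_change() and pct_of() are just illustrative names; the inputs are the TPS figures quoted in this mail.

```c
/* Percentage change from a baseline throughput to a new throughput. */
static double pct_change(double baseline_tps, double new_tps)
{
    return (new_tps - baseline_tps) / baseline_tps * 100.0;
}

/* Throughput expressed as a percentage of a reference throughput. */
static double pct_of(double tps, double reference_tps)
{
    return tps / reference_tps * 100.0;
}
```

pct_change(2629, 2718) gives roughly 3.4, while pct_of(2718, 5697) and pct_of(2718, 8505) put the select_nowait run at a bit under 48% of the proxy and about 32% of direct access.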

Hope that this is useful.

Best regards,
  Jesper
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nosys.svg
Type: image/svg+xml
Size: 419956 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20190221/85cd1994/attachment-0003.svg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: select.svg
Type: image/svg+xml
Size: 931480 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20190221/85cd1994/attachment-0004.svg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: select_nowait.svg
Type: image/svg+xml
Size: 888642 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20190221/85cd1994/attachment-0005.svg>

