[pgpool-hackers: 96] Re: "COMMIT" sent to read-only slave in streaming replication

Tatsuo Ishii ishii at postgresql.org
Wed Jul 25 10:37:43 JST 2012


I have tried with following test case(primary is node 1 and standby is
node 0, black_function_list = '.*'):

psql -p 11002 test
Pager usage is off.
psql (9.2beta2)
Type "help" for help.

test=# select 3; -- make sure that load balance node is 0
 ?column? 
----------
        3
(1 row)

test=# begin;
BEGIN
test=# select current_setting('transaction_isolation');
 current_setting 
-----------------
 read committed
(1 row)

test=# end;
COMMIT

It seems I couldn't reproduce your problem. Here is pgpool log.

2012-07-25 10:08:45 LOG:   pid 26030: connection received: host=[local]
2012-07-25 10:08:49 LOG:   pid 26030: DB node id: 0 backend pid: 26048 statement: select 3;
2012-07-25 10:08:57 LOG:   pid 26030: DB node id: 0 backend pid: 26048 statement: begin;
2012-07-25 10:08:57 LOG:   pid 26030: DB node id: 1 backend pid: 26049 statement: begin;
2012-07-25 10:09:01 LOG:   pid 26030: DB node id: 1 backend pid: 26049 statement: select current_setting('transaction_isolation');
2012-07-25 10:09:18 LOG:   pid 26030: DB node id: 1 backend pid: 26049 statement: end;
2012-07-25 10:09:18 LOG:   pid 26030: DB node id: 0 backend pid: 26048 statement: end;

select current_setting is as expected sent to node 1, that is primary
although load balance node is 0(standby). If pgpool thought that
"select current_setting" could be load balanced, it should have been
sent to node 0. So it seems working correctly.

Any advice to reproduce your problem?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Thanks for the report. I will look into this.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
>> Sorry for the slow reply, but I had to build a test setup in our
>> development environment --the problem occurred in our production
>> environment.
>> 
>> I can reproduce the problem reliably now.  The key seems to be having
>> the primary as backend1 and slave as backend0 when pgpool starts.
>> We're using python with sqlalchemy which, for its own reasons, does a
>> "SELECT current_setting('transaction_isolation')" followed by a
>> COMMIT.  Really neither of those statements should go to the slave,
>> the COMMIT because there's no transaction and the SELECT because that
>> function isn't whitelisted.
>> 
>> Let me know what other information I can provide to help.
>> 
>> 
>>   alex
>> 
>> 
>> On 7/19/12 4:02 PM, Tatsuo Ishii wrote:
>>> Ok. Do you know the way to reproduce the problem reliably?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>>> I meant I turned off query caching; I'm still using connection
>>>> caching.  This is pgpool-II version 3.2alpha1 (hatsuiboshi).
>>>>
>>>>
>>>>    alex
>>>>
>>>>
>>>> On 7/19/12 3:00 PM, Tatsuo Ishii wrote:
>>>>> Thanks for the report. I need more info however. What version is
>>>>> pgpool-II?
>>>>>
>>>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>>>> You mean you turn off connection caching? or query caching?
>>>>> --
>>>>> Tatsuo Ishii
>>>>> SRA OSS, Inc. Japan
>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>> Japanese: http://www.sraoss.co.jp
>>>>>
>>>>>> I had an interesting problem yesterday and wanted to report it before
>>>>>> spending too much time tracking it down.
>>>>>>
>>>>>> I have two databases configured with postgres streaming replication
>>>>>> that I was putting together with a new instance of pgpool.  I'm using
>>>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>>>>> blacklist is ".*" (so, everything).  I had the primary configured as
>>>>>> backend1 and the slave configured as backend0.  I know pgpool prefers
>>>>>> it the other way around, but also that it's smart enough to figure out
>>>>>> which is which; backend0 is the normal primary but we'd had a failover
>>>>>> recently and haven't failed back yet.
>>>>>>
>>>>>> What I saw was the queries correctly went to backend1, but the COMMIT
>>>>>> went to both, causing errors that filtered up to the application.
>>>>>> When I switched backend0 and backend1 in the configs everything worked
>>>>>> fine.
>>>>>>
>>>>>> (from the postgres logs)
>>>>>> LOG:  statement: COMMIT
>>>>>> WARNING:  there is no transaction in progress
>>>>>>
>>>>>> (and from pgpool's logs)
>>>>>> ProcessBackendResponse: kind from backend: Z
>>>>>> pool_read_message_length: slot: 1 length: 5
>>>>>> ReadyForQuery: transaction state:T
>>>>>> pool_unset_query_in_progress: done
>>>>>> pool_unset_query_in_progress: done
>>>>>> ProcessBackendResponse: Ready For Query
>>>>>> ProcessFrontendResponse: kind from frontend Q(51)
>>>>>> pool_unset_doing_extended_query_message: done
>>>>>> statement: COMMIT
>>>>>> pool_set_query_in_progress: done
>>>>>> send_to_where: 3 query: COMMIT
>>>>>> DB node id: 1 backend pid: 18487 statement: COMMIT
>>>>>> wait_for_query_response: waiting for backend 1 completing the query
>>>>>> DB node id: 0 backend pid: 7692 statement: COMMIT
>>>>>> wait_for_query_response: waiting for backend 0 completing the query
>>>>>> pool_send_and_wait: Error or notice message from backend: : DB node
>>>>>> id: 0 backend pid: 7692 statement: COMMIT message: there is no tr
>>>>>> ansaction in progress
>>>>>> read_kind_from_backend: read kind from 0 th backend N NUM_BACKENDS: 2
>>>>>> read_kind_from_backend: read kind from 1 th backend C NUM_BACKENDS: 2
>>>>>> read_kind_from_backend: 1 th kind C does not match with master or
>>>>>> majority connection kind Nkind mismatch among backends. Possible last
>>>>>> query was: "COMMIT" kind details are: 0[N: there is no transaction in
>>>>>> progress] 1[C]
>>>>>> do_child: exits with status 1 due to error
>>>>>>
>>>>>>
>>>>>> It looks like find_primary_node works as intended, but for at least
>>>>>> one check PRIMARY_NODE_ID isn't being tested for type stream. Before I
>>>>>> spend too long wandering the code I wanted to see if this was intended
>>>>>> (configurable) behavior, even though it doesn't seem to be.
>>>>>> _______________________________________________
>>>>>> pgpool-hackers mailing list
>>>>>> pgpool-hackers at pgpool.net
>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>> 
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers


More information about the pgpool-hackers mailing list