[pgpool-hackers: 94] Re: "COMMIT" sent to read-only slave in streaming replication

alex alex at smalldemons.com
Wed Jul 25 08:27:29 JST 2012


Sorry for the slow reply, but I had to build a test setup in our 
development environment --the problem occurred in our production 
environment.

I can reproduce the problem reliably now.  The key seems to be having 
the primary as backend1 and slave as backend0 when pgpool starts.  We're 
using python with sqlalchemy which, for its own reasons, does a "SELECT 
current_setting('transaction_isolation')" followed by a COMMIT.  Really 
neither of those statements should go to the slave, the COMMIT because 
there's no transaction and the SELECT because that function isn't 
whitelisted.

Let me know what other information I can provide to help.


   alex


On 7/19/12 4:02 PM, Tatsuo Ishii wrote:
> Ok. Do you know the way to reproduce the problem reliably?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>> I meant I turned off query caching; I'm still using connection
>> caching.  This is pgpool-II version 3.2alpha1 (hatsuiboshi).
>>
>>
>>    alex
>>
>>
>> On 7/19/12 3:00 PM, Tatsuo Ishii wrote:
>>> Thanks for the report. I need more info however. What version is
>>> pgpool-II?
>>>
>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>> You mean you turn off connection caching? or query caching?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>>> I had an interesting problem yesterday and wanted to report it before
>>>> spending too much time tracking it down.
>>>>
>>>> I have two databases configured with postgres streaming replication
>>>> that I was putting together with a new instance of pgpool.  I'm using
>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>>> blacklist is ".*" (so, everything).  I had the primary configured as
>>>> backend1 and the slave configured as backend0.  I know pgpool prefers
>>>> it the other way around, but also that it's smart enough to figure out
>>>> which is which; backend0 is the normal primary but we'd had a failover
>>>> recently and haven't failed back yet.
>>>>
>>>> What I saw was the queries correctly went to backend1, but the COMMIT
>>>> went to both, causing errors that filtered up to the application.
>>>> When I switched backend0 and backend1 in the configs everything worked
>>>> fine.
>>>>
>>>> (from the postgres logs)
>>>> LOG:  statement: COMMIT
>>>> WARNING:  there is no transaction in progress
>>>>
>>>> (and from pgpool's logs)
>>>> ProcessBackendResponse: kind from backend: Z
>>>> pool_read_message_length: slot: 1 length: 5
>>>> ReadyForQuery: transaction state:T
>>>> pool_unset_query_in_progress: done
>>>> pool_unset_query_in_progress: done
>>>> ProcessBackendResponse: Ready For Query
>>>> ProcessFrontendResponse: kind from frontend Q(51)
>>>> pool_unset_doing_extended_query_message: done
>>>> statement: COMMIT
>>>> pool_set_query_in_progress: done
>>>> send_to_where: 3 query: COMMIT
>>>> DB node id: 1 backend pid: 18487 statement: COMMIT
>>>> wait_for_query_response: waiting for backend 1 completing the query
>>>> DB node id: 0 backend pid: 7692 statement: COMMIT
>>>> wait_for_query_response: waiting for backend 0 completing the query
>>>> pool_send_and_wait: Error or notice message from backend: : DB node
>>>> id: 0 backend pid: 7692 statement: COMMIT message: there is no tr
>>>> ansaction in progress
>>>> read_kind_from_backend: read kind from 0 th backend N NUM_BACKENDS: 2
>>>> read_kind_from_backend: read kind from 1 th backend C NUM_BACKENDS: 2
>>>> read_kind_from_backend: 1 th kind C does not match with master or
>>>> majority connection kind Nkind mismatch among backends. Possible last
>>>> query was: "COMMIT" kind details are: 0[N: there is no transaction in
>>>> progress] 1[C]
>>>> do_child: exits with status 1 due to error
>>>>
>>>>
>>>> It looks like find_primary_node works as intended, but for at least
>>>> one check PRIMARY_NODE_ID isn't being tested for type stream. Before I
>>>> spend too long wandering the code I wanted to see if this was intended
>>>> (configurable) behavior, even though it doesn't seem to be.
>>>> _______________________________________________
>>>> pgpool-hackers mailing list
>>>> pgpool-hackers at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers



More information about the pgpool-hackers mailing list