[pgpool-hackers: 95] Re: "COMMIT" sent to read-only slave in streaming replication

Tatsuo Ishii ishii at postgresql.org
Wed Jul 25 08:33:16 JST 2012


Thanks for the report. I will look into this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Sorry for the slow reply, but I had to build a test setup in our
> development environment --the problem occurred in our production
> environment.
> 
> I can reproduce the problem reliably now.  The key seems to be having
> the primary as backend1 and slave as backend0 when pgpool starts.
> We're using python with sqlalchemy which, for its own reasons, does a
> "SELECT current_setting('transaction_isolation')" followed by a
> COMMIT.  Really neither of those statements should go to the slave,
> the COMMIT because there's no transaction and the SELECT because that
> function isn't whitelisted.
> 
> Let me know what other information I can provide to help.
> 
> 
>   alex
> 
> 
> On 7/19/12 4:02 PM, Tatsuo Ishii wrote:
>> Ok. Do you know the way to reproduce the problem reliably?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> I meant I turned off query caching; I'm still using connection
>>> caching.  This is pgpool-II version 3.2alpha1 (hatsuiboshi).
>>>
>>>
>>>    alex
>>>
>>>
>>> On 7/19/12 3:00 PM, Tatsuo Ishii wrote:
>>>> Thanks for the report. I need more info however. What version is
>>>> pgpool-II?
>>>>
>>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>>> You mean you turn off connection caching? or query caching?
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>>> I had an interesting problem yesterday and wanted to report it before
>>>>> spending too much time tracking it down.
>>>>>
>>>>> I have two databases configured with postgres streaming replication
>>>>> that I was putting together with a new instance of pgpool.  I'm using
>>>>> load balancing with no caching (yet).  The whitelist is empty and the
>>>>> blacklist is ".*" (so, everything).  I had the primary configured as
>>>>> backend1 and the slave configured as backend0.  I know pgpool prefers
>>>>> it the other way around, but also that it's smart enough to figure out
>>>>> which is which; backend0 is the normal primary but we'd had a failover
>>>>> recently and haven't failed back yet.
>>>>>
>>>>> What I saw was the queries correctly went to backend1, but the COMMIT
>>>>> went to both, causing errors that filtered up to the application.
>>>>> When I switched backend0 and backend1 in the configs everything worked
>>>>> fine.
>>>>>
>>>>> (from the postgres logs)
>>>>> LOG:  statement: COMMIT
>>>>> WARNING:  there is no transaction in progress
>>>>>
>>>>> (and from pgpool's logs)
>>>>> ProcessBackendResponse: kind from backend: Z
>>>>> pool_read_message_length: slot: 1 length: 5
>>>>> ReadyForQuery: transaction state:T
>>>>> pool_unset_query_in_progress: done
>>>>> pool_unset_query_in_progress: done
>>>>> ProcessBackendResponse: Ready For Query
>>>>> ProcessFrontendResponse: kind from frontend Q(51)
>>>>> pool_unset_doing_extended_query_message: done
>>>>> statement: COMMIT
>>>>> pool_set_query_in_progress: done
>>>>> send_to_where: 3 query: COMMIT
>>>>> DB node id: 1 backend pid: 18487 statement: COMMIT
>>>>> wait_for_query_response: waiting for backend 1 completing the query
>>>>> DB node id: 0 backend pid: 7692 statement: COMMIT
>>>>> wait_for_query_response: waiting for backend 0 completing the query
>>>>> pool_send_and_wait: Error or notice message from backend: : DB node
>>>>> id: 0 backend pid: 7692 statement: COMMIT message: there is no tr
>>>>> ansaction in progress
>>>>> read_kind_from_backend: read kind from 0 th backend N NUM_BACKENDS: 2
>>>>> read_kind_from_backend: read kind from 1 th backend C NUM_BACKENDS: 2
>>>>> read_kind_from_backend: 1 th kind C does not match with master or
>>>>> majority connection kind Nkind mismatch among backends. Possible last
>>>>> query was: "COMMIT" kind details are: 0[N: there is no transaction in
>>>>> progress] 1[C]
>>>>> do_child: exits with status 1 due to error
>>>>>
>>>>>
>>>>> It looks like find_primary_node works as intended, but for at least
>>>>> one check PRIMARY_NODE_ID isn't being tested for type stream. Before I
>>>>> spend too long wandering the code I wanted to see if this was intended
>>>>> (configurable) behavior, even though it doesn't seem to be.
>>>>> _______________________________________________
>>>>> pgpool-hackers mailing list
>>>>> pgpool-hackers at pgpool.net
>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> 


More information about the pgpool-hackers mailing list