[pgpool-hackers: 2043] [pgpool-II 0000271]: balanced requests after started transaction
Sergey Kim
skim at odin.com
Wed Feb 15 06:41:20 JST 2017
Dear Mr Tatsuo Ishii and Pgpool Community,
It is regarding the issue #271.
We see what you do, you've created a tool that simulates interaction to the Postgres on a low level - it is free of any application logic and thus it gives flexibility which is usually not available on the application layer. Frankly speaking we were looking for such a tool and had plans to create it. But now we see we can use the tool which you created and even try to invest into it!
We see the solution is not provided yet but we see what actually and why you do that. If you could share us the current build for CentOS 7 x86_64 that includes the current changes? We will be glad to test and verify possible weak places which you can ask us to verify. Also we can observe it from the performance point of view and compare to the latest versions. We promise you that the preliminary results will be not available on the Internet! You can ask us to verify the certain cases under different stress logic.
Thus we want to invest into the Pgpool project. We think that extra tests would be productive for your product.
Sergey Kim,
HA Architect.
skim at odin.com
ODIN Ingram Micro Cloud
________________________________
From: pgpool Bug Tracker <bugtracker at pgpool.net>
Sent: Tuesday, February 14, 2017 10:12:10 AM
To: Sergey Kim
Subject: [pgpool-II 0000271]: balanced requests after started transaction
A NOTE has been added to this issue.
==========================
==========================
====================
http://www.pgpool.net/mantisbt/view.php?id=271
==========================
==========================
====================
Reported By: supp_k
Assigned To: t-ishii
==========================
==========================
====================
Project: pgpool-II
Issue ID: 271
Category: Bug
Reproducibility: random
Severity: major
Priority: urgent
Status: assigned
==========================
==========================
====================
Date Submitted: 2016-12-16 04:46 JST
Last Modified: 2017-02-14 16:12 JST
==========================
==========================
====================
Summary: balanced requests after started transaction
Description:
We see that in transaction pgpool balances requests to standby read-only
backend. But this should happen because slave doesn't contain required data
until the transaction committed on slave.
The bug is related to pgpool v3.6.
Additional Information:
We analysed the log records and did the conclusion:
First 5 lines show the transaction was initiated and for some period pgpool
routes requests to the backend #0 only which is master. But then we see the
request (2016-12-15T12:46:13.229382+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79083-1]) was routed to
standby backend http://www.pgpool.net/mantisbt/view.php?id=1. At this we =
moment
we face data inconsistency in the frontend because for sure the backend
http://www.pgpool.net/mantisbt/view.php?id=1 doesn't contain the data whi=
ch was
created in backend #0 until it is commited. The SELECT for the data is perf=
ormed
within the same transaction in which it was INSERT-ed.
2016-12-15T12:46:12.007225+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78172-1] LOG: DB node id: 0 backend p=
id:
17007 statement: B message
2016-12-15T12:46:12.007229+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78172-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.007240+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78175-1] LOG: DB node id: 1 backend p=
id:
2946 statement: B message
2016-12-15T12:46:12.007242+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78175-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.007446+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78189-1] LOG: DB node id: 0 backend p=
id:
17007 statement: Execute: BEGIN
2016-12-15T12:46:12.007455+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78189-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.007480+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78194-1] LOG: DB node id: 1 backend p=
id:
2946 statement: Execute: BEGIN
2016-12-15T12:46:12.007481+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78194-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.008519+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78260-1] LOG: DB node id: 1 backend p=
id:
2946 statement: Parse: SELECT 1
2016-12-15T12:46:12.008525+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78260-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.008653+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78274-1] LOG: DB node id: 1 backend p=
id:
2946 statement: B message
2016-12-15T12:46:12.008659+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78274-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.008823+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78287-1] LOG: DB node id: 1 backend p=
id:
2946 statement: D message
2016-12-15T12:46:12.008825+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78287-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:12.008906+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78297-1] LOG: DB node id: 1 backend p=
id:
2946 statement: Execute: SELECT 1
2016-12-15T12:46:12.008915+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle[16742]: [78297-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.218025+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78372-1] LOG: DB node =
id: 1
backend pid: 2946 statement: B message
2016-12-15T12:46:13.218034+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78372-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.218197+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78386-1] LOG: DB node =
id: 1
backend pid: 2946 statement: Execute: SELECT CASE WHEN sr.sub_limit = -=
1 THEN
1 ELSE 0 END AS is_inf, sr.sub_limit - sr.curr_usage AS res_avail,
sr.rt_instance_id AS rt_instance_id FROM resource_classes rc JOIN resource_=
types
rt ON (rc.class_id = rt.class_id) JOIN subs_resources sr ON (sr.rt_id =
=
rt.rt_id) JOIN subscriptions s ON (sr.sub_id = s.sub_id) WHERE rc.nam=
e = $1
AND s.owner_id = $2 AND s.is_active = 1 ORDER BY is_inf DESC, res_a=
vail
DESC, sr.rt_instance_id
2016-12-15T12:46:13.218203+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78386-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.221080+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78446-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Parse: INSERT INTO accounts_serial DEFAULT VA=
LUES
2016-12-15T12:46:13.221083+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78446-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.221220+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78460-1] LOG: DB node =
id: 0
backend pid: 17007 statement: D message
2016-12-15T12:46:13.221223+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78460-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.222001+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78520-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Parse: INSERT INTO accounts_serial DEFAULT VA=
LUES
2016-12-15T12:46:13.222006+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78520-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.222146+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78534-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.222150+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78534-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.222272+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78547-1] LOG: DB node =
id: 0
backend pid: 17007 statement: D message
2016-12-15T12:46:13.222276+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78547-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.222351+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78557-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: INSERT INTO accounts_serial DEFAULT
VALUES
2016-12-15T12:46:13.222355+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78557-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.223461+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78629-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Parse: select currval('accounts_serial_seq')
2016-12-15T12:46:13.223465+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78629-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.223592+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78642-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.223594+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78642-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.223708+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78655-1] LOG: DB node =
id: 0
backend pid: 17007 statement: D message
2016-12-15T12:46:13.223711+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78655-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.223814+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78665-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: select currval('accounts_serial_seq'=
)
2016-12-15T12:46:13.223822+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78665-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.224918+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78742-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Parse: DELETE FROM accounts_serial WHERE
account_id = $1
2016-12-15T12:46:13.224926+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78742-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.225052+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78755-1] LOG: DB node =
id: 0
backend pid: 17007 statement: D message
2016-12-15T12:46:13.225056+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78755-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.225842+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78815-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Parse: DELETE FROM accounts_serial WHERE
account_id = $1
2016-12-15T12:46:13.225846+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78815-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.225965+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78829-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.225973+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78829-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.226088+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78842-1] LOG: DB node =
id: 0
backend pid: 17007 statement: D message
2016-12-15T12:46:13.226092+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78842-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.226179+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78852-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: DELETE FROM accounts_serial WHERE
account_id = $1
2016-12-15T12:46:13.226183+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78852-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.227219+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78921-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.227222+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78921-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.227380+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78935-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into addresses (address, addr=
ess2,
address3, city, country, postal_code, state) values ($1, $2, $3, $4, $5, $6=
, $7)
2016-12-15T12:46:13.227383+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78935-2] #011RETURNING =
*
2016-12-15T12:46:13.227386+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78935-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.228185+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78995-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.228188+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [78995-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.228343+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79009-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contacts (display_name)
values ($1)
2016-12-15T12:46:13.228345+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79009-2] #011RETURNING =
*
2016-12-15T12:46:13.228348+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79009-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.229216+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79069-1] LOG: DB node =
id: 1
backend pid: 2946 statement: B message
2016-12-15T12:46:13.229220+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79069-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.229382+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79083-1] LOG: DB node =
id: 1
backend pid: 2946 statement: Execute: select account0_.account_id as
account_1_3_0_, account0_.address_id as address21_3_0_, account0_.adm_conta=
ct_id
as adm_con22_3_0_, account0_.aps_uuid as aps_uuid2_3_0_, account0_.ext_syst=
em_id
as ext_sys23_3_0_, account0_.bill_contact_id as bill_co24_3_0_,
account0_.brand_name as brand_na3_3_0_, account0_.ccp_version as ccp_vers4_=
3_0_,
account0_.company_name as company_5_3_0_, account0_.customers_quantity as
customer6_3_0_, account0_.l_country_code_default as l_countr7_3_0_,
account0_.l_language_code_default as l_langua8_3_0_, account0_.l_variant_de=
fault
as l_varian9_3_0_, account0_.ext_account_id as ext_acc10_3_0_, account0_.c_=
time
as c_time11_3_0_, account0_.l_country_code as l_count12_3_0_,
account0_.l_language_code as l_langu13_3_0_, account0_.l_variant as
l_varia14_3_0_, account0_.is_locked as is_lock15_3_0_, account0_.note
2016-12-15T12:46:13.229388+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79083-2] as note16_3_0=
_,
account0_.owner_id as owner_i25_3_0_, account0_.path as path17_3_0_,
account0_.is_personal as is_pers18_3_0_, account0_.rt_instance_id as
rt_inst19_3_0_, acc
2016-12-15T12:46:13.229390+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79083-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.230519+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79133-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.230524+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79133-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.230658+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79147-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.230665+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79147-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.231363+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79199-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.231371+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79199-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.231498+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79213-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.231502+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79213-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.232170+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79265-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.232174+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79265-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.232311+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79279-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.232314+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79279-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.232971+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79331-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.232974+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79331-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.233138+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79345-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.233146+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79345-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.233854+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79397-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.233856+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79397-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.233973+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79411-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.233978+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79411-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.234643+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79463-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.234647+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79463-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.234790+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79477-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.234799+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79477-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.235426+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79529-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.235430+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79529-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.235583+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79543-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.235587+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79543-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.236258+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79595-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.236266+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79595-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.236424+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79609-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.236429+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79609-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.237049+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79661-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.237054+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79661-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.237188+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79675-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.237191+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79675-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.237771+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79727-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.237782+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79727-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.237907+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79741-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.237915+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79741-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.238572+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79793-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.238575+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79793-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.238715+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79807-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.238719+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79807-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.239370+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79859-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.239373+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79859-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.239513+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79873-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into accounts (address_id,
adm_contact_id, aps_uuid, ext_system_id, bill_contact_id, brand_name,
ccp_version, company_name, customers_quantity, l_country_code_default,
l_language_code_default, l_variant_default, ext_account_id, c_time,
l_country_code, l_language_code, l_variant, is_locked, note, owner_id, path=
,
is_personal, rt_instance_id, tech_contact_id, account_type, account_id) val=
ues
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17=
,
$18, $19, $20, $21, $22, $23, $24, $25, $26)
2016-12-15T12:46:13.239518+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79873-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.240744+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79925-1] LOG: DB node =
id: 1
backend pid: 2946 statement: B message
2016-12-15T12:46:13.240748+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79925-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.240901+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-1] LOG: DB node =
id: 1
backend pid: 2946 statement: Execute: select user0_.user_id as user_id1_193=
_0_,
identity1_.identity_id as identity1_102_1_, account2_.account_id as
account_1_3_2_, user0_.scope_id as scope_i16_193_0_, user0_.aps_uuid as
aps_uuid2_193_0_, user0_.contact_id as contact17_193_0_, user0_.date_format=
as
date_for3_193_0_, user0_.deleting as deleting4_193_0_,
user0_.dns_path_of_domain_part as dns_path5_193_0_, user0_.domain_id as
domain_i6_193_0_, user0_.is_enabled as is_enabl7_193_0_, user0_.auth_identi=
ty_id
as auth_id18_193_0_, user0_.l_country_code as l_countr8_193_0_,
user0_.l_language_code as l_langua9_193_0_, user0_.l_variant as
l_varia10_193_0_, user0_.locked_by as locked_11_193_0_, user0_.member_id as
member_12_193_0_, user0_.sub_id as sub_id13_193_0_, user0_.timezone as
timezon14_193_0_, user0_.type as type15_193_0_, identity1_.system_id as
system_i4_102_1_,
2016-12-15T12:46:13.240906+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-2] identity1_.lo=
gin
as login2_102_1_, identity1_.login_lowered as login_lo3_102_1_,
account2_.address_id as address21_3_2_, account2_.adm_contact_id as
adm_con22_3_2_, account2_.aps_u
2016-12-15T12:46:13.240908+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.241819+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79997-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.241827+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79997-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.241969+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80011-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: select localident0_.identity_id as
identity1_109_0_, localident0_.pwd_expired as pwd_expi2_109_0_,
localident0_.pwd_ctime as pwd_ctim3_109_0_, localident0_.pwd_hash as
pwd_hash4_109_0_, localident0_.pwd_type as pwd_type5_109_0_ from
local_identities localident0_ where localident0_.identity_id=$1
2016-12-15T12:46:13.241977+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80011-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.242963+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80069-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.242971+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80069-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.243102+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80083-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: select role0_.role_id as role_id1_16=
0_,
role0_.account_id as account_6_160_, role0_.description as descript2_160_,
role0_.name as name3_160_, role0_.admin as admin4_160_, role0_.role_type as
role_typ5_160_ from roles role0_ where role0_.account_id=$1 and role0_.na=
me=$2
and (role0_.role_type=$3 or role0_.role_type=$4)
2016-12-15T12:46:13.243106+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80083-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.243851+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80133-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.243853+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80133-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.244015+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80147-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: insert into roles (account_id,
description, name, admin, role_type) values ($1, $2, $3, $4, $5)
2016-12-15T12:46:13.244024+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80147-2] #011RETURNING =
*
2016-12-15T12:46:13.244028+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80147-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.245087+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80207-1] LOG: DB node =
id: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.245091+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80207-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.245241+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [80221-1] LOG: DB node =
id: 0
backend pid: 17007 statement: Execute: select privilege0_.privilege_id as
privileg1_144_, privilege0_.allow_locked as allow_lo2_144_, privilege0_.are=
a as
area3_144_, privilege0_.name as name4_144_, privilege0_1_.group_id as
group_id1_143_ from privileges privilege0_ left outer join
privilege_groups_content privilege0_1_ on
privilege0_.privilege_id=privilege0_1_.privilege_id where privilege0_.nam=
e=$1
==========================
==========================
====================
----------------------------------------------------------------------
(0001225) supp_k (reporter) - 2016-12-16 04:59
http://www.pgpool.net/mantisbt/view.php?id=271#c1225
----------------------------------------------------------------------
One more strange case:
if we do pcp_detach_node for the SLAVE backend and then pcp_attach_node the=
n the
problem of "incorrect balancing" disappeares.
----------------------------------------------------------------------
(0001226) supp_k (reporter) - 2016-12-17 02:33
http://www.pgpool.net/mantisbt/view.php?id=271#c1226
----------------------------------------------------------------------
Please pay attention to the following information. In the following piece o=
f
code the requests are coming with interval less then ONE millisecond. It is
possible that the 1st request is not complete yet. Does it make sense? Prob=
ably
pgpool has not created the "flag" that balancing is prohibited after this
moment?
2016-12-15T12:46:13.238715+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79807-1] LOG: DB node i=
d: 0
backend pid: 17007 statement: Execute: insert into contact_properties (valu=
e,
contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.238719+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79807-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.239370+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79859-1] LOG: DB node i=
d: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.239373+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79859-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.239513+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79873-1] LOG: DB node i=
d: 0
backend pid: 17007 statement: Execute: insert into accounts (address_id,
adm_contact_id, aps_uuid, ext_system_id, bill_contact_id, brand_name,
ccp_version, company_name, customers_quantity, l_country_code_default,
l_language_code_default, l_variant_default, ext_account_id, c_time,
l_country_code, l_language_code, l_variant, is_locked, note, owner_id, path=
,
is_personal, rt_instance_id, tech_contact_id, account_type, account_id) val=
ues
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17=
,
$18, $19, $20, $21, $22, $23, $24, $25, $26)
2016-12-15T12:46:13.239518+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79873-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.240744+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79925-1] LOG: DB node i=
d: 1
backend pid: 2946 statement: B message
2016-12-15T12:46:13.240748+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79925-2] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.240901+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-1] LOG: DB node i=
d: 1
backend pid: 2946 statement: Execute: select user0_.user_id as user_id1_193=
_0_,
identity1_.identity_id as identity1_102_1_, account2_.account_id as
account_1_3_2_, user0_.scope_id as scope_i16_193_0_, user0_.aps_uuid as
aps_uuid2_193_0_, user0_.contact_id as contact17_193_0_, user0_.date_format=
as
date_for3_193_0_, user0_.deleting as deleting4_193_0_,
user0_.dns_path_of_domain_part as dns_path5_193_0_, user0_.domain_id as
domain_i6_193_0_, user0_.is_enabled as is_enabl7_193_0_, user0_.auth_identi=
ty_id
as auth_id18_193_0_, user0_.l_country_code as l_countr8_193_0_,
user0_.l_language_code as l_langua9_193_0_, user0_.l_variant as
l_varia10_193_0_, user0_.locked_by as locked_11_193_0_, user0_.member_id as
member_12_193_0_, user0_.sub_id as sub_id13_193_0_, user0_.timezone as
timezon14_193_0_, user0_.type as type15_193_0_, identity1_.system_id as
system_i4_102_1_,
2016-12-15T12:46:13.240906+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-2] identity1_.log=
in as
login2_102_1_, identity1_.login_lowered as login_lo3_102_1_,
account2_.address_id as address21_3_2_, account2_.adm_contact_id as
adm_con22_3_2_, account2_.aps_u
2016-12-15T12:46:13.240908+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79939-3] LOCATION:
pool_proto_modules.c:3085
2016-12-15T12:46:13.241819+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79997-1] LOG: DB node i=
d: 0
backend pid: 17007 statement: B message
2016-12-15T12:46:13.241827+03:00 srv-60b2f6e99b9b lesk plesk
192.168.102.240(59893) idle in transaction[16742]: [79997-2] LOCATION:
pool_proto_modules.c:3085
----------------------------------------------------------------------
(0001231) serk (reporter) - 2016-12-19 06:10
http://www.pgpool.net/mantisbt/view.php?id=271#c1231
----------------------------------------------------------------------
Hi Pgpool developers!
Please consider the logic that reproduces the error.
In approximately 30% of cases the attached class (TestPgpool.java) reproduc=
es
the issue.
The target testing DB has 2 tables used for testing (tables.ddl). We did se=
veral
modifications in pgpool code that show the cases (as we could detect it) wh=
en
Pgpool does balancing instead of forwarding the SELECT request to MASTER ba=
ckend
after DDL statement was processed.
In several runnings one will see messages: "USE BACKEND: {1} THOUGH THE WRI=
TE
TRANSACTION IS ACTIVE!!!!!!!!" which say that previously the UPDATE stateme=
nt
was executed but the next SELECT method is forwarded to SLAVE - whi=
ch should
not happen.
We enforced balancing ratio in pgpool by proportion MASTER/SLAVE as 0/1000.=
Thus
the majority of requests will be forwarded to SLAVE. The attached configura=
tion
assumes the 2 backends are synchronous and backend #0 is master
Notes: The attached DIFF file also contains patch for the issue 0000231.
----------------------------------------------------------------------
(0001242) t-ishii (developer) - 2016-12-20 17:47
http://www.pgpool.net/mantisbt/view.php?id=271#c1242
----------------------------------------------------------------------
It seems there's no evidence that the SELECT in question was performed in a=
n
explicit transaction in your log.
----------------------------------------------------------------------
(0001243) supp_k (reporter) - 2016-12-20 18:07
http://www.pgpool.net/mantisbt/view.php?id=271#c1243
----------------------------------------------------------------------
Please have a look at the attached log file.
The issue is reproduced when time lag between requests is very small. This =
is
why we create attached java multithread java test.
Regarding the attached log:
1) at 2016-12-20T12:02:38.200806 there was a write statement
2) Later (2016-12-20T12:02:38.202535) one can see that the new SELECT reque=
st
was balanced to the node 1 which is read-only slave.
----------------------------------------------------------------------
(0001244) supp_k (reporter) - 2016-12-20 18:08
http://www.pgpool.net/mantisbt/view.php?id=271#c1244
----------------------------------------------------------------------
Multuthread java test:
import java.sql.*;
import java.util.*;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.locks.*;
public class TestPgpoolLoad {
private static final int TEST_THREADS_NUM = 40;
public static void main(String[] args) throws Exception {
try (Connection conn =
DriverManager.getConnection("jdbc:postgresql://10.28.78.103:5432/postgres",
"postgres", "password")) {
Lock lock = new ReentrantLock();
conn.setAutoCommit(false);
CyclicBarrier cb = new CyclicBarrier(TEST_THREADS_NUM);
try (PreparedStatement stmtSelect = conn.prepareStatement("selec=
t * from
read_table where id <= ?");
PreparedStatement stmtWrite = conn.prepareStatement("update w=
rite_table set
id = 1 where id = ?")) {
List<Thread> thrs = new ArrayList<>();
for (int i = 0; i < TEST_THREADS_NUM; i++) {
boolean isRead = i < TEST_THREADS_NUM - 1;
boolean isSleep = i < (TEST_THREADS_NUM / 2);
thrs.add(new TestReader(isSleep, cb, isRead, isRead ? stmtSe=
lect :
stmtWrite, lock));
}
for (Thread thr : thrs) {
thr.start();
}
Thread.sleep(1000);
for (Thread thr : thrs) {
thr.join();
}
System.out.println("ok");
} catch (Exception ex) {
ex.printStackTrace();
throw ex;
}
conn.commit();
}
}
public static class TestReader extends Thread {
private final CyclicBarrier cb;
private final boolean readThread;
private final PreparedStatement stmt;
private final Lock lock;
private final boolean isSleep;
private final boolean isRead;
TestReader(boolean isSleep, CyclicBarrier cb, boolean readThread,
PreparedStatement readStmt, Lock lock) {
this.isSleep = isSleep;
this.cb = cb;
this.isRead = readThread;
this.readThread = readThread;
this.stmt = readStmt;
this.lock = lock;
}
@Override
public void run() {
try {
cb.await();
if (!isRead) {
Thread.sleep(1);
}
lock.lock();
stmt.setInt(1, 1);
stmt.execute();
} catch (Exception ex) {
ex.printStackTrace();
} finally {
lock.unlock();
}
}
}
}
----------------------------------------------------------------------
(0001245) supp_k (reporter) - 2016-12-20 18:11
http://www.pgpool.net/mantisbt/view.php?id=271#c1245
----------------------------------------------------------------------
Also there is the transaction start mentioned in the log:
2016-12-20T12:02:38.032665+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [208-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 bac=
kend
pid: 31091 statement: Parse: BEGIN
2016-12-20T12:02:38.032667+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [208-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
2016-12-20T12:02:38.032800+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [211-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 bac=
kend
pid: 13293 statement: Parse: BEGIN
2016-12-20T12:02:38.032806+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [211-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
2016-12-20T12:02:38.033030+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [224-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 bac=
kend
pid: 31091 statement: B message
2016-12-20T12:02:38.033032+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [224-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
2016-12-20T12:02:38.033178+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [227-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 bac=
kend
pid: 13293 statement: B message
2016-12-20T12:02:38.033179+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [227-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
2016-12-20T12:02:38.033435+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [239-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 bac=
kend
pid: 31091 statement: Execute: BEGIN
2016-12-20T12:02:38.033437+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [239-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
2016-12-20T12:02:38.033577+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [244-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 bac=
kend
pid: 13293 statement: Execute: BEGIN
2016-12-20T12:02:38.033578+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(4=
3669)
idle[30813]: [244-2] 2016-12-20 12:02:38: pid 30813:LOCATION:
pool_proto_modules.c:3089
----------------------------------------------------------------------
(0001247) t-ishii (developer) - 2016-12-20 18:15
http://www.pgpool.net/mantisbt/view.php?id=271#c1247
----------------------------------------------------------------------
But the process id is different (30813 vs. 16742). So I can not confirm tha=
t the
SELECT and DML are in a same transaction.
----------------------------------------------------------------------
(0001248) supp_k (reporter) - 2016-12-20 18:24
http://www.pgpool.net/mantisbt/view.php?id=271#c1248
----------------------------------------------------------------------
It seems you missed the freshest attached log file. Please have a look at t=
he
freshest "test7.log".
Also the pgpool settings are:
backend_hostname0 = 'a.db.node'
backend_port0 = 15432
backend_weight0 = 0
backend_data_directory0 = '/var/lib/pgsql/9.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'b.db.node'
backend_port1 = 15432
backend_weight1 = 1000
backend_data_directory1 = '/var/lib/pgsql/9.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 240
max_pool = 1
child_life_time = 0
child_max_connections = 1
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
log_line_prefix = '%t: pid %p:'
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = on
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 5
log_error_verbosity = verbose
client_min_messages = warning
log_min_messages = debug1
----------------------------------------------------------------------
(0001249) supp_k (reporter) - 2016-12-20 19:47 (private)
http://www.pgpool.net/mantisbt/view.php?id=271#c1249
----------------------------------------------------------------------
Have you reproduced the case with the help of the attached java test? Can y=
ou
see the evidence?
----------------------------------------------------------------------
(0001250) supp_k (reporter) - 2016-12-21 05:42
http://www.pgpool.net/mantisbt/view.php?id=271#c1250
----------------------------------------------------------------------
I have updated the test scenario. Please look at the source code below.
There are N-1 threads that read Value which was previously inserted by the
Single thread. The N threads start simultaneously in front of the barrier. =
There
is a semaphore (ReentrantLock) that prevents logic from parallel queries bu=
t we
still have very high concurrency mode and minimal time period between recei=
ving
response and sending of the next query. After the INSERT threads completes
insert operation it sets valatile flag (writePerformed) and all SELECT thre=
ads
start trying to fetch the value. If the value was not returned then it defi=
nely
means the SELECT was balanced to the node which doesn't contain the value
inserted so far.
Try to launch the test several times and you will see the exception in stdo=
ut.
In our environment we reproduce the issue in 50%.
If you wish I can create fully functional test case: debug it in the local =
QA
environment and provide you so that you will be able to use it in your CI
solution.
package com;
import java.sql.*;
import java.util.*;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.locks.*;
public class TestPgpool {
private static final int TEST_THREADS_NUM = 40;
static volatile boolean writePerformed = false; // variable will be se=
t into
TRUE when the write thread did INSERT operation
public static void main(String[] args) throws Exception {
Random random = new Random(System.currentTimeMillis());
try (Connection conn =
DriverManager.getConnection("jdbc:postgresql://10.28.78.103/plesk", "plesk"=
,
"jjHCF7sGHq")) {
Lock lock = new ReentrantLock();
conn.setAutoCommit(false); // here we define explicit transaction
CyclicBarrier cb = new CyclicBarrier(TEST_THREADS_NUM);
final int valueTest = random.nextInt();
System.out.println("test value:" + valueTest);
try (PreparedStatement stmtSelect = conn.prepareStatement("selec=
t * from
write_table where id = ?");
PreparedStatement stmtWrite = conn.prepareStatement("inser=
t into
write_table(id) values(?)")) {
List<Thread> thrs = new ArrayList<>();
for (int i = 0; i < TEST_THREADS_NUM; i++) {
boolean isRead = i > 0;
//boolean isSleep = i < (TEST_THREADS_NUM / 2);
thrs.add(new TestReader(valueTest, cb, isRead, isRead ? stmt=
Select :
stmtWrite, lock));
}
for (Thread thr : thrs) {
thr.start();
}
Thread.sleep(1000);
for (Thread thr : thrs) {
thr.join();
}
} catch (Exception ex) {
ex.printStackTrace();
throw ex;
}
conn.commit();
}
}
public static class TestReader extends Thread {
private final CyclicBarrier cb;
private final PreparedStatement stmt;
private final Lock lock;
private final boolean isRead;
private final int valueTest;
TestReader(int valueTest, CyclicBarrier cb, boolean readThread,
PreparedStatement stmt, Lock lock) {
this.valueTest = valueTest;
this.cb = cb;
this.isRead = readThread;
this.stmt = stmt;
this.lock = lock;
}
@Override
public void run() {
try {
cb.await();
lock.lock();
stmt.setInt(1, valueTest);
if (isRead) {
try (ResultSet rs = stmt.executeQuery()) {
if (writePerformed) {
if (!rs.next()) {
throw new RuntimeException("The request was balance=
d. The value was
inserted but the the following select returned nothing");
} else {
int val = rs.getInt(1);
System.out.println("recevied value:" + val);
}
} else {
// write was not performed so one should not expect th=
e value is in the
database
}
}
} else {
stmt.execute();
writePerformed = true; // set mark the write performed. si=
nce this moment
the READ threads will be trying to read the value.
}
} catch (Exception ex) {
ex.printStackTrace();
throw new RuntimeException(ex);
} finally {
lock.unlock();
}
}
}
}
----------------------------------------------------------------------
(0001254) t-ishii (developer) - 2016-12-21 13:39
http://www.pgpool.net/mantisbt/view.php?id=271#c1254
----------------------------------------------------------------------
I have took a look at test7.log and confirmed the issue. Looking into the s=
ource
code now...
----------------------------------------------------------------------
(0001255) t-ishii (developer) - 2016-12-21 14:24
http://www.pgpool.net/mantisbt/view.php?id=271#c1255
----------------------------------------------------------------------
Is it possible to provide a Java debug trace when the issue occurs?
Pgpool-II decides the transaction is writing when a write query is sent and
backend returns the "ready for query" message to Pgpool-II. My guess is, cl=
ient
sends parse message for a select *before* Pgpool-II receives the "ready for
query message". If it actually happens, the select can be sent to standby
because the writing flag is not set yet.
----------------------------------------------------------------------
(0001256) supp_k (reporter) - 2016-12-21 15:18
http://www.pgpool.net/mantisbt/view.php?id=271#c1256
----------------------------------------------------------------------
Yes, please have a look at the attached test10.zip file. It contains pgpool=
log
and the corresponding java/jdbc log.
----------------------------------------------------------------------
(0001257) supp_k (reporter) - 2016-12-21 18:11
http://www.pgpool.net/mantisbt/view.php?id=271#c1257
----------------------------------------------------------------------
'Probably it makes sense to set the flag on the receiving of a WRITE query.=
In
this case the pgpool's state machine is free of the problem and even the
possibility (in this particular case only).
----------------------------------------------------------------------
(0001258) t-ishii (developer) - 2016-12-21 21:35
http://www.pgpool.net/mantisbt/view.php?id=271#c1258
----------------------------------------------------------------------
It's not that simple. By looking at the jdbc log, I found that:
1) "prepare select" is was sent to node 1
2) write query was sent to node 0
3) bind/execute request for 1) was coming. They were sent to node 1 (oops!)
If we simply redirect 3) to node 0, it will fail because there's no prepare=
d
statement for the select on node 0. So the only solution is sending prepare
select to node 0 before 3). Actually the code for the logic was in 3.4. I s=
eem
to forgot to port it to 3.5 or later:-)
----------------------------------------------------------------------
(0001259) supp_k (reporter) - 2016-12-21 21:54
http://www.pgpool.net/mantisbt/view.php?id=271#c1259
----------------------------------------------------------------------
Still I don't understand the real cause of the problem because I'm not an e=
xpert
in Pgpool's state machine and Postgres protocol.
But from my point of view it seems the Parse message should be sent to all
backends if load balancing is activated. This is True becase if for instanc=
e
there are no DML requests Pgpool still has this right to balance requests.
Right now from the logs I see the Parse is forwarded to the node 1 (from th=
e
test10.log) and later after the INSERT is complete Pgpool still forwards
BIND/EXECUTE for the Selects to the node 1..... this is what I see.
But I see you already have the solution which was forgotten in the past )))=
It
seems the lost code should be restored )))
----------------------------------------------------------------------
(0001260) t-ishii (developer) - 2016-12-21 23:56
http://www.pgpool.net/mantisbt/view.php?id=271#c1260
----------------------------------------------------------------------
> But from my point of view it seems the Parse message should be sent to al=
l
backends if load balancing is activated.
Actually we did this long time ago and people blamed us because that way wa=
s
ineffective and slow.
----------------------------------------------------------------------
(0001261) supp_k (reporter) - 2016-12-22 00:04
http://www.pgpool.net/mantisbt/view.php?id=271#c1261
----------------------------------------------------------------------
Ok, probably there are some reasons exist. But we still need to fix the pro=
blem.
----------------------------------------------------------------------
(0001262) supp_k (reporter) - 2016-12-22 18:24
http://www.pgpool.net/mantisbt/view.php?id=271#c1262
----------------------------------------------------------------------
Would it be possible to restore the lost code? We still facing problems bec=
ause
of the issue (((
----------------------------------------------------------------------
(0001263) elepuser (reporter) - 2016-12-23 20:57
http://www.pgpool.net/mantisbt/view.php?id=271#c1263
----------------------------------------------------------------------
Hi Guys,
recently under not very high load we faced exactly the same issue. It resul=
ted
in data loses and integrity problems.
The program logic was inserting data but than occasionally it failed to fet=
ch
the same data within the same transaction. First moment we decided that pos=
tgres
violates the ACID concept, but later revealed problem in pgpool's behaviour=
(((
I see there is the fix for the problem exists. Can you please apply it plea=
se
ASAP? We look forward to obtain the fixed build.
----------------------------------------------------------------------
(0001264) t-ishii (developer) - 2016-12-26 15:06
http://www.pgpool.net/mantisbt/view.php?id=271#c1264
----------------------------------------------------------------------
> Would it be possible to restore the lost code?
Of course. But it would not be straight forward because of the message hand=
ling
logic difference between Pgpool-II 3.5 and pre 3.5. Also I need a tool to s=
end
arbitrary message stream for testing/debugging. I'm working on developing i=
t...
----------------------------------------------------------------------
(0001271) supp_k (reporter) - 2016-12-28 20:01
http://www.pgpool.net/mantisbt/view.php?id=271#c1271
----------------------------------------------------------------------
Do you have any estimations when you will be able to provide the patch for
testing? We look forward to obtain it.
----------------------------------------------------------------------
(0001277) t-ishii (developer) - 2017-01-04 18:06
http://www.pgpool.net/mantisbt/view.php?id=271#c1277
----------------------------------------------------------------------
>From December 29 to January 3rd my office was closed because of national
holidays. I would like to finish the work by the end of this month.
----------------------------------------------------------------------
(0001278) supp_k (reporter) - 2017-01-04 19:17
http://www.pgpool.net/mantisbt/view.php?id=271#c1278
----------------------------------------------------------------------
Happy new year and Merry Christmas Mr. T-Ishii!
We look forward to obtain the fixed version. Lots of our activities are del=
ayed
because of this problem. If possible we'd like to participate in testing as=
soon
as you are ready to release the fix. Probably if we participate you will be=
able
to accomplish with the fix faster than the end of the month with a better
quality.
----------------------------------------------------------------------
(0001286) elepuser (reporter) - 2017-01-10 05:15
http://www.pgpool.net/mantisbt/view.php?id=271#c1286
----------------------------------------------------------------------
Yes, is it possible to provide the fix earlier than the end of the month?
Recently we faced several more issues of this type. Every time it takes muc=
h
efforts to rollback the business logic and restore data ((( We are also rea=
dy to
participate in the testing.
PLEASE if possible provide the fix faster!
----------------------------------------------------------------------
(0001295) supp_k (reporter) - 2017-01-25 19:21
http://www.pgpool.net/mantisbt/view.php?id=271#c1295
----------------------------------------------------------------------
Dear Tatsui Ishii,
do you have any information about estimated timelines when you will be able=
to
provide the fix? We are experiencing a lot of troubles because of the issue=
(((
Please is possible provide the fix or the information wehn it will be avail=
able.
Thank you!
----------------------------------------------------------------------
(0001296) supp_k (reporter) - 2017-01-25 19:22
http://www.pgpool.net/mantisbt/view.php?id=271#c1296
----------------------------------------------------------------------
...I'm sorry for the mistake in your name!!
----------------------------------------------------------------------
(0001310) elepuser (reporter) - 2017-01-29 07:35
http://www.pgpool.net/mantisbt/view.php?id=271#c1310
----------------------------------------------------------------------
Any updates on the issue?
>From the message thread I see there is a fundamental problem in pgpool(((
The attached by supp_k test shows that the problem easily reproduced when t=
he
time lag between response from pgpool and the next request is minimal. Also=
the
PARSE message is being sent to one node only. Just wonder how do you guys =
are
hoing to fix the trouble if you dont want to send the parse request to all
backends?
----------------------------------------------------------------------
(0001312) t-ishii (developer) - 2017-01-30 18:05
http://www.pgpool.net/mantisbt/view.php?id=271#c1312
----------------------------------------------------------------------
We don't need to send PARSE message to all backends.
If we know that a write query was sent in the transaction, then succeeding
SELECT is sent to the primary node. If we already send a parse message for =
the
SELECT to other than the primary, then we will send the parse message to th=
e
primary.
I'm working on the implementation in this direction now.
----------------------------------------------------------------------
(0001329) supp_k (reporter) - 2017-02-06 16:40
http://www.pgpool.net/mantisbt/view.php?id=271#c1329
----------------------------------------------------------------------
Hi T-Ishii,
are there any updates about this problem?
We are ready to start testing it.
----------------------------------------------------------------------
(0001336) elepuser (reporter) - 2017-02-07 17:48
http://www.pgpool.net/mantisbt/view.php?id=271#c1336
----------------------------------------------------------------------
Guys,
any updates on the issue? Can you please provide the fix?
----------------------------------------------------------------------
(0001337) t-ishii (developer) - 2017-02-08 10:27
http://www.pgpool.net/mantisbt/view.php?id=271#c1337
----------------------------------------------------------------------
Still working on it. However I'm pretty busy now and I'm not sure when the =
work
will finish.
----------------------------------------------------------------------
(0001341) t-ishii (developer) - 2017-02-14 16:12
http://www.pgpool.net/mantisbt/view.php?id=271#c1341
----------------------------------------------------------------------
I'm getting a slight progress. With over 800 lines diff, now in certain cas=
e the
issue seems to be fixed. Here is the case described by pgproto
(https://github.com/tatsuo-ishii/pgproto).
However I am experiencing a regression with our regression test suit, and
definitely I need to work on it.
# Test data for bug271.
# In an explicit transaction, SELECT is parsed, DML issued, same SELECT/bin=
d
issued.
# In this case Pgpool-II should parse resend to primary node.
#
# Create test table
'Q' "DROP TABLE IF EXISTS pgproto_test1"
'Y'
'Q' "CREATE TABLE pgproto_test1(i INT)"
'Y'
# Start a transaction
'P' "S1" "BEGIN" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"
# Issue SELECT
'P' "S2" "SELECT 1" 0
#'B' "" "S1" 0 0 0
#'E' "" 0
#'C' 'S' "S1"
# Issue INSERT
'P' "S1" "INSERT INTO pgproto_test1 VALUES(1)" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"
#'S'
#'Y'
# Issue SELECT. This should be sent to primary node.
#'P' "S1" "SELECT 1" 0
'B' "" "S2" 0 0 0
'E' "" 0
'C' 'S' "S2"
# Issue COMMIT
'P' "S1" "COMMIT" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"
'S'
'Y'
'X'
Issue History
Date Modified Username Field Change =
==========================
==========================
====================
2016-12-16 04:46 supp_k New Issue =
2016-12-16 04:59 supp_k Note Added: 0001225 =
2016-12-17 02:33 supp_k Note Added: 0001226 =
2016-12-19 06:10 serk File Added: issues_271_244.zip =
2016-12-19 06:10 serk Note Added: 0001231 =
2016-12-20 09:35 t-ishii Assigned To => t-ishii =
2016-12-20 09:35 t-ishii Status new => assigned =
2016-12-20 17:47 t-ishii Note Added: 0001242 =
2016-12-20 17:47 t-ishii Status assigned => feed=
back
2016-12-20 18:07 supp_k File Added: test7.log =
2016-12-20 18:07 supp_k Note Added: 0001243 =
2016-12-20 18:07 supp_k Status feedback => assi=
gned
2016-12-20 18:08 supp_k Note Added: 0001244 =
2016-12-20 18:11 supp_k Note Added: 0001245 =
2016-12-20 18:15 t-ishii Note Added: 0001247 =
2016-12-20 18:24 supp_k Note Added: 0001248 =
2016-12-20 19:47 supp_k Note Added: 0001249 =
2016-12-21 05:42 supp_k Note Added: 0001250 =
2016-12-21 13:39 t-ishii Note Added: 0001254 =
2016-12-21 14:24 t-ishii Note Added: 0001255 =
2016-12-21 14:25 t-ishii Status assigned => feed=
back
2016-12-21 15:18 supp_k File Added: test10.zip =
2016-12-21 15:18 supp_k Note Added: 0001256 =
2016-12-21 15:18 supp_k Status feedback => assi=
gned
2016-12-21 18:11 supp_k Note Added: 0001257 =
2016-12-21 21:35 t-ishii Note Added: 0001258 =
2016-12-21 21:54 supp_k Note Added: 0001259 =
2016-12-21 23:56 t-ishii Note Added: 0001260 =
2016-12-22 00:04 supp_k Note Added: 0001261 =
2016-12-22 18:24 supp_k Note Added: 0001262 =
2016-12-23 20:57 elepuser Note Added: 0001263 =
2016-12-26 15:06 t-ishii Note Added: 0001264 =
2016-12-28 20:01 supp_k Note Added: 0001271 =
2017-01-04 18:06 t-ishii Note Added: 0001277 =
2017-01-04 19:17 supp_k Note Added: 0001278 =
2017-01-10 05:15 elepuser Note Added: 0001286 =
2017-01-25 19:21 supp_k Note Added: 0001295 =
2017-01-25 19:22 supp_k Note Added: 0001296 =
2017-01-29 07:35 elepuser Note Added: 0001310 =
2017-01-30 18:05 t-ishii Note Added: 0001312 =
2017-02-06 16:40 supp_k Note Added: 0001329 =
2017-02-07 17:48 elepuser Note Added: 0001336 =
2017-02-08 10:27 t-ishii Note Added: 0001337 =
2017-02-14 16:12 t-ishii Note Added: 0001341 =
==========================
==========================
====================
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20170214/b4ca1fc7/attachment-0001.html>
More information about the pgpool-hackers
mailing list