[pgpool-hackers: 2426] Re: Load balancing with synchronous replication

Tue Jul 4 21:28:58 JST 2017

> Hello again pgpool hackers,
> 
> I have now renamed that proposed feature to "synchronous replay".  I
> am wondering if it would be a good idea to do a proof-of-concept patch
> for pgpool.  One of the central ideas of my patch is that it should be
> easy to use for end users, which is where pgpool can help.
> 
> Here is my idea for step 1.  Use the regular master/slave load
> balancing mode, but with a couple of small modifications:
> 
> 1.  If error 40P02 is raised (the new error introduced by my patch,
> "synchronous replay is not available"), then remember not to pick that
> server again for N seconds (configurable).
> 
> 2.  If error 40P02 is raised, then automatically retry the statement;
> because of point (1), this will happen on a different server.  There
> is a small limit on the number of retries (configurable).

What would happen if the query is executed in an explicit transaction?
If the error raised, the transaction is already in the abort state. So
the retry needs for whole transaction, which seems very difficult for
me.

> I think that would be better than trying to check the
> pg_stat_replication view periodically on the master (the idea I
> mentioned in the earlier email).  What do you think?

Because using pg_stat_replication view does not have such a problem
(aborted transaction), things would be lot easier. Fortunately there's
a convenient infrastructure in Pgpool-II: the relation cache. It's a
per process cache. We could cache the result for querying
pg_stat_replication. The cache can have an expiration time. So by
setting the expiration, it acts like "checking pg_stat_replication
view periodically".

> Here is my idea for step 2.  Multi-statement transactions that begin
> with "BEGIN ... READ ONLY" could also be sent to read-only servers.

Good point. Probably we could do an optimaization by checking "BEGIN
... READ ONLY" today.

> If the error 40P02 occurs, then perhaps the transaction could be
> automatically retried on another server, but *only* if it happens when
> running the *first* statement in the transaction.  (It doesn't make
> sense to replay the transaction automatically on another node if the
> user has already seen some results, which is why it's OK to do it if
> the first statement fails but not if later statements fail).  In
> practice, this mean that REPEATABLE READ + READ ONLY transactions
> would benefit from automatic load balancing with automatic transparent
> retry, because REPEATABLE READ transactions can only fail with error
> 40P02 on the first statement.  That would provide end users with a
> nice way to achieve load balancing of multi-statement read only
> transactions, as long as they are prepared to wrap their transactions
> in "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY".
> Perhaps READ COMMITTED could be supported too, but then we would have
> to send the 40P02 error to the user: automatic retry would not be
> possible, unless it happens to be on the first statement.  Actually
> that it is very likely to occur on the first statement, so maybe
> that's still quite usable.

I think checking pg_stat_replication view is lot easier. If we cannot
send the queries to read only standbys because they are not sync
standby, then we could just re-route it to the primary.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp