[pgpool-general: 2192] Re: Postgres Sequence numbers different on backends when inserts run concurrently

Tatsuo Ishii ishii at postgresql.org
Sun Oct 13 00:24:24 JST 2013


Ok, here is the first cut of patch trying to fix your problem. Can you
please try it out? The patch is against 3.3-STABLE.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

>> Do you have any update on this problem.  We are getting closer to a
>> PGPool upgrade here and I'd like for this to be in our new production
>> setup.
> 
> Sorry for delay. I tried to inject a LOCK statement into BIND message
> processing. Unfortunately it did not work as expected if client uses
> "unnamed protal" (it's your case), because the injected LOCK smashes
> the unnamed portal. We already solved the problem long time ago for
> SELECT case by using named portal for such that injected SELECT. I
> think we could adopt the similar technique to solve the problem. I
> hope to finish the work in a week or so (I'm pretty busy with my work
> now).
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
>> By the way, PGPool is fantastic. We run millions of transactions/day
>> through our databases using it.
>> 
>> Thanks for your work
>> 
>> Steve
>> 
>> On 09/24/2013 02:36 AM, Tatsuo Ishii wrote:
>>> Ok, I think I found the cause of the problem.
>>>
>>> Pgpool automatically issues a LOCK command(table lock or row lock
>>> depending on your configuration) to keep consistency while handling
>>> INSERT command. This works fine with your test program until 4
>>> iterated INSERT. In each INSERT cycle, pgpool issues LOCK while
>>> processing "parse" message (which is generated by JDBC driver while
>>> handling "prepare") which is followed by "bind" "describe" and
>>> "execute" message. This is good. However, starting from 5th INSERT,
>>> JDBC skips generating the parse message and starts from
>>> "bind". Unfortunately pgpool does not issue LOCK while processing the
>>> bind message. Note that the behavior of JDBC is legal. Just pgpool is
>>> not prepared for the situation. Fixing it is not a trivial work. Give
>>> me some time...
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>>> I confirm the problem using your test program. Will look into this. In
>>>> the mean time, I noticed that if your program runs with autocommit
>>>> off, it seems the problem goes away. Can you please try it out if you
>>>> like?
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>>> I have a PGPool that is connected to 2 backends in replication and
>>>>> load balance mode one master, one slave.
>>>>>
>>>>> We have been using this for a while with no problems.  Occasionally we
>>>>> get problems where serial sequence numbers for the same database
>>>>> record are different on each of the backends.
>>>>>
>>>>> We have been able to replicate the problem by writing a java program
>>>>> to do 10,000 inserts and running it concurrently on 4 systems
>>>>> connected to pgpool.  We have enabled pgpool insert lock and created
>>>>> the insert lock table to try to keep this from happening.
>>>>>
>>>>> Here is a short java program that re-creates the problem reliably when
>>>>> running 4 instances concurrently.
>>>>>
>>>>> import java.sql.Connection;
>>>>> import java.sql.DriverManager;
>>>>> import java.sql.PreparedStatement;
>>>>> import java.sql.SQLException;
>>>>>
>>>>> public class PgTester {
>>>>>
>>>>> 	/**
>>>>> 	 * @param args
>>>>> 	 */
>>>>> 	public static void main(String[] args) {
>>>>> 		 Integer startValue = 0;
>>>>> 	        try{
>>>>> 	            startValue = Integer.parseInt(args[0]);
>>>>> 	        } catch (Exception e){
>>>>> 	            System.out.println("Please provide a valid number as the first
>>>>> 	            argument to the program");
>>>>> 	            System.exit(-1);
>>>>> 	        }
>>>>>
>>>>> 		
>>>>> 		Connection conn=null;
>>>>> 		try {
>>>>> 			Class.forName ("org.postgresql.Driver");
>>>>>
>>>>> 			conn =
>>>>> 			DriverManager.getConnection("jdbc:postgresql://hostname/pharman?user=userid&password=password");
>>>>> 		} catch (ClassNotFoundException e) {
>>>>> 			e.printStackTrace();
>>>>> 		} catch (SQLException e) {
>>>>> 			e.printStackTrace();
>>>>> 		}
>>>>>
>>>>> 		try {
>>>>>
>>>>>              PreparedStatement query = conn.prepareStatement("INSERT INTO
>>>>>              public.sequencetester (recordno) VALUES (?)");
>>>>>              int max = startValue + 10000;
>>>>>              for(int cv = startValue; cv <= max; cv++){
>>>>>                  query.setInt(1, cv);
>>>>>                  query.execute();
>>>>>                  //System.out.println("interted "+cv);
>>>>>              }
>>>>>              query.close();
>>>>>              conn.close();
>>>>>              System.out.println("Done");
>>>>>          } catch (SQLException e) {
>>>>>              e.printStackTrace();
>>>>>          } 	
>>>>> }
>>>>>
>>>>> The table is created using
>>>>>
>>>>> CREATE TABLE sequencetester
>>>>> (
>>>>>    id serial NOT NULL,
>>>>>    recordno integer,
>>>>>    CONSTRAINT sq_pk PRIMARY KEY (id )
>>>>> )
>>>>>
>>>>> After running the program if you compare the id and recordno column on
>>>>> each of the backend postgres systems there are entries that do not
>>>>> have the same values.
>>>>>
>>>>> for example
>>>>>
>>>>> Backend 1
>>>>> id = 5, recordno = 5
>>>>> id = 6, recordno = 6
>>>>> id = 7, recordno = 6000
>>>>> id = 8, recordno = 6001
>>>>>
>>>>> Backend 2
>>>>> id = 5, recordno = 5
>>>>> id = 6, recordno = 6000
>>>>> id = 7, recordno = 6
>>>>> id = 8, recordno = 6001
>>>>>
>>>>> It appears that the serial sequence was run in a different order on
>>>>> each of the back ends.
>>>>>
>>>>> We are running Postgres 9.3 from the Debian apt repository and PgPool
>>>>> 3.3.1 downloaded from the pgpool.net web site.
>>>>>
>>>>> This problem also occurs on our current production system running
>>>>> Postgres 9.2 PgPool 3.2.3.
>>>>>
>>>>> Is this a configuration problem or an internal PgPool/Postgres
>>>>> problem?
>>>>>
>>>>> Steve Kuekes
>>>>> --
>>>>> Steve Kuekes
>>>>>
>>>>> Physicians Pharmacy Alliance
>>>>> 118 MacKenan Drive, Suite 200
>>>>> Cary, NC  27511
>>>>> 919-465-5801 direct
>>>>> 919-463-5555 main
>>>>> 919-463-5566 fax
>>>>>
>>>>> steve.kuekes at physicianspharmacy.com
>>>>> _______________________________________________
>>>>> pgpool-general mailing list
>>>>> pgpool-general at pgpool.net
>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>> _______________________________________________
>>>> pgpool-general mailing list
>>>> pgpool-general at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>> 
>> -- 
>> Steve Kuekes
>> 
>> Physicians Pharmacy Alliance
>> 118 MacKenan Drive, Suite 200
>> Cary, NC  27511
>> 919-465-5801 direct
>> 919-463-5555 main
>> 919-463-5566 fax
>> 
>> steve.kuekes at physicianspharmacy.com
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix.patch
Type: text/x-patch
Size: 5540 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20131013/f042d71e/attachment-0001.bin>


More information about the pgpool-general mailing list