[pgpool-hackers: 169] Re: Improvements

Tatsuo Ishii ishii at postgresql.org
Sun Feb 10 19:26:16 JST 2013


On more thing. pcp/md.c and pcp/md5.h is a symlink to upper directory
and shound not be included in the patch.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Thanks.
> 
> I have tested a little bit your patches and now I would like to hear
> more details of implementation because your changes are big and hard
> to follow implementation details for me(especially
> pool_process_query.c).
> 
> Also here are some requests:
> 
> 1) please include doc changes. (i.e. doc/pgpool-en.html)
> 
> 2) we do not use C++ style codings (we follow the PostgreSQL coding
>    style). So please do not use '//' comments. Also please do not
>    declare variable in the middle of blocks. I mean:
> 
>    if (a < b)
>    {
> 		int d;
> 
> 		a = 1;
> 		b = 2;
> 		c = 3;
> 		:
> 		:
> 	}
> 
> 	is ok but this is not ok:
> 
>    if (a < b)
>    {
> 		a = 1;
> 		b = 2;
> 		int c  = 3;
> 		:
> 		:
> 	}
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
>> Sure.
>> 
>> Best regards,
>> Alexey Fedonin.
>> 
>> 2013/2/1 Tatsuo Ishii <ishii at postgresql.org>
>> 
>>> Alexey,
>>>
>>> I tried to make a diff between current git repo master and you repo in
>>> github and found yoursis based on a little bit older version of
>>> repo. Can you please rebase it? Or even better, could you send diff
>>> against current git repo master?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>> > Hello, everyone!
>>> >
>>> > I've added new features and improvements to pgpool (for parallel mode)
>>> > which I would like to contribute as well as get feedback from the
>>> community.
>>> > Features:
>>> >  - Transaction support added (for SELECT * FROM distributed_table);
>>> >  - CURSORS support added (for distributed tables);
>>> >  - Dynamic connections to System DB (optional);
>>> >  - SELECT rewriting fixes (some improvements and some bugfixes).
>>> >
>>> > 1) Transactions.
>>> > When client sends 'BEGIN' statement, pgpool child connects to another
>>> child
>>> > and resend all client statements to it, except for 'SELECT ... FROM
>>> > distributed_table', which executes in usual way. In this case, 'SELECT
>>> ...
>>> > FROM distributed_table' executes inside the same transaction block as
>>> well
>>> > as other statements.
>>> > The better way would be to use the same child's connections as it uses
>>> for
>>> > non-distributed SELECTs, but I haven't found the means of doing it.
>>> >
>>> > 2) Cursors.
>>> > For example, we have two nodes and distributed table:
>>> >  Node 0             Node 1
>>> >
>>> >  value              value
>>> > -------            -------
>>> >  4                  3
>>> >  6                  1
>>> >  8                  7
>>> >  2                  9
>>> >  0                  5
>>> >
>>> >  a. BEGIN;
>>> >  b. DECLARE cursor_name CURSOR FOR SELECT value FROM distributed_table
>>> > ORDER BY value;
>>> >  c. MOVE FORWARD 2 IN cursor_name;
>>> >  d. FETCH FORWARD 3 IN cursor_name:
>>> >
>>> >  value
>>> > -------
>>> >  2
>>> >  3
>>> >  4
>>> >
>>> >  CURSORS work for FORWARD/BACKWARD ALL/(positive number).
>>> >  Cursors work for any type of distribution, e.g. round-robin.
>>> >
>>> >  3) Dynamic connections (optional).
>>> >  If number of clients is not usually too high, it is not necessary to
>>> have
>>> > connections to System DB for free childs.
>>> >  New parameter 'system_db_dynamic_connection' added to config.
>>> >  If value of parameter is 0, child works as usual.
>>> >  If value > 0, child connects to System DB when client connects. Also,
>>> > child without System DB connections sleeps for
>>> > 'system_db_dynamic_connection' mcs before accepting new client. So new
>>> > client will be accepted by child, which already has connections to System
>>> > DB.
>>> >
>>> >  4) SELECT.
>>> >  SELECT rewriting logic was changed a little, and now more complex
>>> queries
>>> > work correctly, such as:
>>> >   a. SELECT field_1, (SELECT field_2 FROM replicated_table WHERE
>>> > field_3=S1.field_3) AS field_2 FROM distributed_table S1;
>>> >   b. (SELECT field_1, (SELECT field_2 FROM replicated_table_1) AS field_2
>>> > FROM distributed_table_1 S1) UNION ALL (SELECT field_1, (SELECT field_3
>>> > FROM replicated_table_2) AS field_3 FROM distributed_table_2 S2) ORDER BY
>>> > field_1;
>>> >   c. Also, 'ORDER BY + LIMIT + OFFSET' optimization added:
>>> > SELECT field FROM distributed_table ORDER BY field LIMIT 10 OFFSET 100;
>>> > rewrites into:
>>> > SELECT "pool_c$1" AS field FROM dblink('host=localhost.localdomain
>>> > dbname=dbname port=9999 user=username','SELECT pool_parallel("SELECT
>>> > distributed_table.time FROM distributed_table ORDER BY field LIMIT 10 +
>>> > 100")',false) AS pool_t$0("pool_c$1" field_type) ORDER BY "pool_c$1"
>>> OFFSET
>>> > 100 LIMIT 10;
>>> >   d. some bugfixes.
>>> >
>>> > If you are interested in, I could explain each change in detail.
>>> >
>>> > Source code is available at https://github.com/afedonin/repo-test.git
>>> >
>>> > P.S. All changes was made and tested for parallel_mode.
>>> > P.P.S. I don't know pgpool as good as you, so I appreciate any feedback.
>>> >
>>> > Best regards,
>>> > Alexey Fedonin.
>>>
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers


More information about the pgpool-hackers mailing list