Last modified: Sun Jan 30 09:14:28 JST 2011
Welcome to the pgpool-II page
What is pgpool-II?

pgpool-II is a middleware that sits between PostgreSQL servers and a PostgreSQL database client. It provides the following features:

Connection Pooling

pgpool-II maintains established connections to the PostgreSQL servers and reuses them whenever a new connection with the same properties (i.e. user name, database, protocol version) comes in. This reduces connection overhead and improves the system's overall throughput.

Replication

pgpool-II can manage multiple PostgreSQL servers. Activating the replication feature makes it possible to keep a real-time backup on two or more PostgreSQL clusters, so that the service can continue without interruption if one of those clusters fails.

Load Balancing

If a database is replicated (because it is running in either replication mode or master/slave mode), performing a SELECT query on any server will return the same result. pgpool-II takes advantage of the replication feature to reduce the load on each PostgreSQL server by distributing SELECT queries among the available servers, improving the system's overall throughput. In an ideal scenario, read performance can improve proportionally to the number of PostgreSQL servers. Load balancing works best in a scenario where many users execute many read-only queries at the same time.

Limiting Exceeding Connections

There is a limit on the maximum number of concurrent connections with PostgreSQL, and new connections are rejected when this number is reached. Raising this maximum, however, increases resource consumption and has a negative impact on overall system performance. pgpool-II also has a limit on the maximum number of connections, but extra connections are queued instead of returning an error immediately.

Parallel Query

Using the parallel query feature, data can be split among multiple servers, so that a query can be executed on all the servers concurrently, reducing the overall execution time. Parallel query works best when searching large-scale data.

Supported Platforms

pgpool-II works on Linux, Solaris, FreeBSD, and most of the UNIX architectures.

You must also make sure that all of your PostgreSQL servers are using the same major PostgreSQL version. In addition to this, we do not recommend mixing PostgreSQL installations built with different build options: supporting SSL or not, using --disable-integer-datetimes or not, or using a different block size. These might affect part of the functionality of pgpool-II. A difference in PostgreSQL minor versions is not usually a problem; however, we do not test every combination of minor versions, so we recommend using exactly the same minor version of PostgreSQL.

pgpool-II Installation

pgpool-II can be downloaded from the pgpool Development page. Packages are also provided for various platforms including CentOS, RedHat Enterprise Linux, Fedora and Debian. Check the appropriate repository.

pgpool-II's source code can be downloaded from the pgpool development page.

Installing pgpool-II from source code requires gcc 2.9 or higher, and GNU make.
   make
   make install

will install pgpool-II. (If you use Solaris or FreeBSD, replace make with gmake.)

If you are using PostgreSQL 8.0 or later, installing the pgpool_regclass function on all PostgreSQL servers accessed by pgpool-II is strongly recommended, as it is used internally by pgpool-II. Without it, handling of duplicate table names in different schemas might cause trouble (temporary tables are not a problem).

   cd pgpool-II-x.x.x/sql/pgpool-regclass
   make
   make install
   psql -f pgpool-regclass.sql template1

Executing pgpool-regclass.sql must be performed on every database accessed with pgpool-II. You do not need to do this for a database created after the execution of "psql -f pgpool-regclass.sql template1", as this template database will be cloned to create new databases.

If you use insert_lock in replication mode, creating the pgpool_catalog.insert_lock table for mutual exclusion is strongly recommended. Without it, insert_lock still works, but in that case pgpool-II locks the INSERT target table. This behavior is the same as in the pgpool-II 2.2 and 2.3 series. Since the table lock conflicts with VACUUM, INSERT processing may be kept waiting for a long time.

   cd pgpool-II-x.x.x/sql
   psql -f insert_lock.sql template1

Executing insert_lock.sql must be performed on every database accessed with pgpool-II. You do not need to do this for a database created after the execution of "psql -f insert_lock.sql template1", as this template database will be cloned to create new databases.

Installation is done. If you are using Solaris or FreeBSD, replace "make" with "gmake" in the above description, because those operating systems require GNU make.

Configuring pgpool-II

Default configuration files for pgpool-II are created under $prefix/etc during installation.
Function (O: available, X: unavailable) | Raw Mode | Replication Mode | Master/Slave Mode | Parallel Query Mode |
---|---|---|---|---|
Connection Pool | X | O | O | O |
Replication | X | O | X | (*1) |
Load Balance | X | O | O | (*1) |
Failover | O | O | O | X |
Online recovery | X | O | (*2) | X |
Parallel Query | X | X | X | O |
Required # of Servers | 1 or higher | 2 or higher | 2 or higher | 2 or higher |
System DB required? | no | no | no | yes |
insert_lock

If replicating a table with a SERIAL data type, the SERIAL column value may differ between the backends. This problem is avoidable by locking the table explicitly (although transaction parallelism will be severely degraded). To achieve this, however, the following change must be made:

   INSERT INTO ...

to

   BEGIN;
   LOCK TABLE ...
   INSERT INTO ...
   COMMIT;

When insert_lock is true, pgpool-II automatically adds the above queries each time an INSERT is executed (if already in a transaction, it simply adds the LOCK TABLE ...).

As of pgpool-II 2.2, it automatically detects whether the table has SERIAL columns, so only tables having SERIAL columns are locked.

pgpool-II 3.0 or later uses a row lock against the sequence relation, rather than a table lock. This minimizes lock conflicts with VACUUM (including autovacuum).

You might want to have finer (per-statement) control:

- Set insert_lock to true, and add /*NO INSERT LOCK*/ at the beginning of an INSERT statement for which you do not want to acquire the table lock.
- Set insert_lock to false, and add /*INSERT LOCK*/ at the beginning of an INSERT statement for which you want to acquire the table lock.

Default value is false. If insert_lock is enabled, the regression tests for PostgreSQL 8.0 will fail in transactions, privileges, rules, and alter_table. The reason for this is that pgpool-II tries to LOCK the VIEW for the rule test, and will produce the following error message:

   ERROR: current transaction is aborted, commands ignored until
   end of transaction block

For example, the transactions test tries an INSERT into a table which does not exist, and pgpool-II causes PostgreSQL to acquire the lock before that. The transaction will be aborted, and the following INSERT statement produces the above error message.
recovery_user

This parameter specifies the PostgreSQL user name used for online recovery. It can be changed without restarting.

recovery_password

This parameter specifies the PostgreSQL password used for online recovery. It can be changed without restarting.

recovery_1st_stage_command

This parameter specifies a command to be run at the first stage of online recovery. For security reasons, the command file must be placed in the database cluster directory. For example, if recovery_1st_stage_command = 'sync-command', pgpool-II executes $PGDATA/sync-command. Note that pgpool-II accepts connections and queries while recovery_1st_stage_command is executed; you can retrieve and update data during this stage.

This parameter can be changed without restarting.

recovery_2nd_stage_command

This parameter specifies a command to be run at the second stage of online recovery. For security reasons, the command file must be placed in the database cluster directory. For example, if recovery_2nd_stage_command = 'sync-command', pgpool-II executes $PGDATA/sync-command. Note that pgpool-II does not accept connections and queries while recovery_2nd_stage_command is running. Thus, if a client stays connected for a long time, the recovery command will not be executed: pgpool-II waits until all clients have closed their connections, and the command is only executed when no client is connected to pgpool-II anymore.

This parameter can be changed without restarting.

recovery_timeout

pgpool does not accept new connections during the second stage. If a client connects to pgpool during recovery processing, it will have to wait for the end of the recovery.

This parameter specifies the recovery timeout in seconds. If this timeout is reached, pgpool cancels online recovery and accepts connections. 0 means no wait.

This parameter can be changed without restarting.

client_idle_limit_in_recovery

Similar to client_idle_limit, but it only takes effect in the second stage of recovery. A client that has been idle for client_idle_limit_in_recovery seconds since its last query will be disconnected. This is useful to prevent the pgpool recovery from being disturbed by a lazy client, or when the TCP/IP connection between the client and pgpool has accidentally gone down (a cut cable, for instance). If set to -1, the client is disconnected immediately. The default value for client_idle_limit_in_recovery is 0, which means the feature is turned off.

If your clients are very busy, pgpool-II cannot enter the second stage of recovery, whatever value of client_idle_limit_in_recovery you choose. In this case, you can set client_idle_limit_in_recovery to -1 so that pgpool-II immediately disconnects such busy clients before entering the second stage.

You need to reload pgpool.conf if you change client_idle_limit_in_recovery.
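As an illustration of the online recovery parameters described above (the user, password, and timings below are placeholders, and 'sync-command' is the hypothetical sync script from the examples, which you must provide yourself in the database cluster directory), a pgpool.conf fragment might look like this:

   # Online recovery (illustrative values only)
   recovery_user = 'postgres'                      # PostgreSQL user used for online recovery
   recovery_password = 'secret'
   recovery_1st_stage_command = 'sync-command'     # executed as $PGDATA/sync-command; clients stay connected
   recovery_2nd_stage_command = 'sync-command'     # runs only when no client is connected anymore
   recovery_timeout = 90                           # cancel recovery and accept connections after 90 seconds
   client_idle_limit_in_recovery = -1              # disconnect busy clients before entering the 2nd stage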
lobj_lock_table

This parameter specifies a table name used for large object replication control. If it is specified, pgpool will lock the table specified by lobj_lock_table, generate a large object id by looking into the pg_largeobject system catalog, and then call lo_create to create the large object. This procedure guarantees that pgpool will obtain the same large object id on all DB nodes in replication mode. Please note that PostgreSQL 8.0 or older does not have lo_create, so this feature will not work there.

A call to the libpq function lo_creat() will trigger this feature. Large object creation through the Java API (JDBC driver), the PHP API (pg_lo_create, or similar APIs in PHP libraries such as PDO), and the equivalent APIs in various other programming languages are known to use a similar protocol, and thus should work as well.

The following large object create operation will not work:

It does not matter what schema lobj_lock_table is stored in, but this table should be writable by any user. Here is an example showing how to create such a table:

   CREATE TABLE public.my_lock_table ();
   GRANT ALL ON public.my_lock_table TO PUBLIC;

The table specified by lobj_lock_table must be created beforehand. If you create the table in template1, any database created afterward will have it.

If lobj_lock_table is the empty string (''), the feature is disabled (thus large object replication will not work). The default value for lobj_lock_table is ''.
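For illustration, and assuming the lock table created in the example above, the corresponding pgpool.conf entry would simply name that table:

   # Large object replication control (illustrative; matches the table created above)
   lobj_lock_table = 'my_lock_table'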
Condition for load balancing

For a query to be load balanced, all the following requirements must be met:

Note that you can suppress load balancing by inserting an arbitrary comment just in front of the SELECT query:

   /*REPLICATION*/ SELECT ...

The pcp.conf file must be readable by the user who executes pgpool-II.
pgpool.conf

As already explained, each operation mode has its specific configuration parameters in pgpool.conf. A $prefix/etc/pgpool.conf.sample file is created during the installation of pgpool-II. Rename the file to pgpool.conf and edit its contents.

   cp $prefix/etc/pgpool.conf.sample $prefix/etc/pgpool.conf
There are additional sample pgpool.conf files for each mode (pgpool-II 2.3 and later).

Mode | sample file |
---|---|
replication mode | pgpool.conf.sample-replication |
master/slave mode (Slony-I) | pgpool.conf.sample-master-slave |
master/slave mode (Streaming replication) | pgpool.conf.sample-stream |

An empty line or a line starting with "#" is treated as a comment and will be ignored.
!Specifies the hostname or IP address, on which pgpool-II will accept
! TCP/IP connections. '*'
accepts
! all incoming connections. ''
disables TCP/IP
! connections. Default is 'localhost'
. Connections via UNIX
! domain socket are always accepted.
!
! This parameter can only be set at server start.
!The port number used by pgpool-II to listen for connections. Default is 9999.
! This parameter can only be set at server start.
!The directory where the UNIX domain socket accepting connections for
! pgpool-II will be created. Default is '/tmp'
. Be
! aware that this socket might be deleted by a cron job. We recommend to
! set this value to '/var/run'
or such directory.
! This parameter can only be set at server start.
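Putting the three connection settings above together, an illustrative fragment of pgpool.conf could read as follows (the values merely restate the defaults and the /var/run recommendation from the text):

   # Where pgpool-II listens for client connections (illustrative)
   listen_addresses = '*'       # accept TCP/IP connections on all interfaces
   port = 9999                  # default pgpool-II port
   socket_dir = '/var/run'      # keep the UNIX domain socket out of /tmp, as recommended above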
!The port number where PCP process accepts connections. Default is 9898.
!! This parameter can only be set at server start.
!The directory path of the UNIX domain socket accepting
! connections for the PCP process. Default is '/tmp'
.
! Be aware that the socket might be deleted by cron.
! We recommend to set this value to '/var/run'
or such directory.
!
! This parameter can only be set at server start.
!! DEPRECATED : ! This parameter is deprecated for consistency with the default libpq policy. ! See the backend_hostname parameter definition to adapt your configuration accordingly. !
This parameter defined the PostgreSQL server's UNIX domain socket directory.
! Default is '/tmp'
.
!
! This parameter can only be set at server start.
!PCP connection's timeout value in seconds. If a client does not ! respond within the set seconds, PCP closes the connection with ! the client. Default is 10 seconds. 0 means no timeout. !
!! This can be changed with a reload.
The number of preforked pgpool-II server processes. Default is 32. num_init_children is also the limit on concurrent connections to pgpool-II from clients. If more than num_init_children clients try to connect to pgpool-II, they are blocked (not rejected) until a connection to some pgpool-II process is closed. Up to 2*num_init_children connection requests can be queued. The number of connections to each PostgreSQL server is roughly max_pool*num_init_children.

Some hints in addition to the above:

In summary, max_pool, num_init_children, max_connections and superuser_reserved_connections must satisfy the following formula:

   max_pool*num_init_children <= (max_connections - superuser_reserved_connections)    (no query canceling needed)
   max_pool*num_init_children*2 <= (max_connections - superuser_reserved_connections)  (query canceling needed)

This parameter can only be set at server start.
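To make the formula concrete, here is one purely illustrative sizing (the PostgreSQL-side numbers are assumptions, not recommendations):

   # Illustrative sizing against a backend with
   # max_connections = 300 and superuser_reserved_connections = 3 (297 usable slots)
   num_init_children = 32       # concurrent client limit to pgpool-II
   max_pool = 4                 # cached connections per pgpool-II child
   # 32 * 4     = 128 <= 297    -> formula holds when no query canceling is needed
   # 32 * 4 * 2 = 256 <= 297    -> formula still holds when query canceling is needed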
!A pgpool-II child process' life time in seconds. ! When a child is idle for that many seconds, it is terminated and a new child is created. ! This parameter is a measure to prevent memory leaks and other unexpected errors. ! Default value is 300 (5 minutes). ! 0 disables this feature. Note that this doesn't apply for processes ! that have not accepted any connection yet. !
!! You need to reload pgpool.conf if you change this value. !
!A pgpool-II child process will be terminated after this many connections from clients. ! This parameter is useful on a server ! if it is so busy that child_life_time and ! connection_life_time are never triggered. ! Thus this is also useful to prevent PostgreSQL servers from getting too big. !
!! You need to reload pgpool.conf if you change this value. !
Disconnect a client if it has been idle for client_idle_limit seconds after its last query has completed. This is useful to prevent pgpool children from being occupied by a lazy client or by a broken TCP/IP connection between client and pgpool. The default value for client_idle_limit is 0, which means the feature is turned off. This parameter is ignored in the second stage of online recovery.
!! You need to reload pgpool.conf if you change client_idle_limit.
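An illustrative fragment combining the child-process housekeeping parameters described above (the values are examples, not recommendations):

   # Child process housekeeping (illustrative values)
   child_life_time = 300           # recycle an idle child after 5 minutes (the default)
   child_max_connections = 500     # recycle a child after it has served 500 client connections
   client_idle_limit = 0           # 0 = never disconnect idle clients (the default)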
!! If true, use pool_hba.conf for client authentication. ! See setting up pool_hba.conf for client authentication. !
!! You need to restart pgpool-II if you change this value. !
!Specify the timeout for pgpool authentication. 0 disables the time out. ! Default value is 60. !
!! You need to restart pgpool-II if you change authentication_timeout.
!PgPool II supports several methods for logging server messages, ! including stderr and syslog. The default is to log to stderr. !
!Note: you will need to alter the configuration of your system's syslog daemon ! in order to make use of the syslog option for log_destination. ! PgPool can log to syslog facilities LOCAL0 through LOCAL7 (see syslog_facility), ! but the default syslog configuration on most platforms will discard all such messages. ! You will need to add something like !
!! local0.* /var/log/pgpool.log !!
to the syslog daemon's configuration file to make it work.
!Add timestamps to the logs when set to true. Default is true. !
!! You need to reload pgpool.conf if you change print_timestamp. !
!! If true, all incoming connections will be printed to the log. !
!! You need to reload pgpool.conf if you change this value. !
!! If true, ps command status will show the client's hostname instead ! of an IP address. Also, if log_connections is enabled, ! hostname will be logged. !
!! You need to reload pgpool.conf if you change this value. !
!Produces SQL log messages when true. This is similar to the ! log_statement parameter in PostgreSQL. It produces logs even if the ! debug option was not passed to pgpool-II at start up. !
!! You need to reload pgpool.conf if you change this value. !
!Similar to log_statement, ! except that it prints logs for each DB node separately. ! It can be useful to make sure that replication is working, for example. !
!! You need to reload pgpool.conf if you change this value. !
!When logging to syslog is enabled, this parameter determines the syslog "facility" to be used. ! You can choose from LOCAL0, LOCAL1, LOCAL2, LOCAL3, LOCAL4, LOCAL5, LOCAL6, LOCAL7; ! the default is LOCAL0. See also the documentation of your system's syslog daemon. !
!When logging to syslog is enabled, this parameter determines the program name ! used to identify PgPool messages in syslog logs. The default is pgpool. !
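Tying the logging parameters together, here is an illustrative pgpool.conf fragment that routes messages to syslog. It assumes the local0 line shown earlier has been added to the syslog daemon configuration, and the true/false spelling simply follows the wording of this document:

   # Logging via syslog (illustrative)
   log_destination = 'syslog'
   syslog_facility = 'LOCAL0'      # must match the facility configured in the syslog daemon
   syslog_ident = 'pgpool'         # program name used to tag pgpool messages (the default)
   print_timestamp = true
   log_connections = true
   log_per_node_statement = false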
!! Debug message verbosity level. 0 means no message, greater than 1 ! means more verbose message. Default value is 0. !
!Full path to a file which contains pgpool's process id. ! Default is "/var/run/pgpool/pgpool.pid". !
!! You need to restart pgpool-II if you change this value. !
!pgpool_status is written into this directory.
!Caches connections to backends when set to true. Default is true. !
!! You need to restart pgpool-II if you change this value.
!pgpool-II periodically tries to connect to the backends ! to detect any error on the servers or networks. ! This error check procedure is called "health check". ! If an error is detected, pgpool-II tries to perform failover or degeneration. !
!! This parameter serves to prevent the health check from waiting for a long ! time in a case such as unplugged network cable. ! The timeout value is in seconds. Default value is 20. ! 0 disables timeout (waits until TCP/IP timeout). !
!
! This health check requires one extra connection to each backend,
! so max_connections
in the
! postgresql.conf
needs to be incremented as needed.
!
! You need to reload pgpool.conf if you change this value. !
!This parameter specifies the interval between the health ! checks in seconds. Default is 0, which means health check is disabled. !
!! You need to reload pgpool.conf if you change health_check_period. !
!The user name to perform health check. ! This user must exist in all the PostgreSQL backends. ! Otherwise, health check causes an error. !
!! You need to reload pgpool.conf if you change health_check_user. !
!The password of the user to perform health check. !
!! You need to reload pgpool.conf if you change health_check_password. !
!The maximum number of times to retry a failed health check ! before giving up and initiating failover. ! This setting can be useful in spotty networks, when it is expected that health ! checks will fail occasionally even when the master is fine. ! Default is 0, which means do not retry. ! It is advised that you disable fail_over_on_backend_error ! if you want to enable health_check_max_retries. !
!! You need to reload pgpool.conf if you change health_check_max_retries. !
!The amount of time (in seconds) to sleep between failed health check ! retries (not used unless health_check_max_retries is > 0). ! If 0, then retries are immediate (no delay). !
!! You need to reload pgpool.conf if you change health_check_retry_delay.
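An illustrative health-check block following the descriptions above (the user, password and timings are placeholders):

   # Health check (illustrative values)
   health_check_period = 10         # check every 10 seconds (0, the default, disables the check)
   health_check_timeout = 20        # the default timeout
   health_check_user = 'pgpool'     # must exist on every PostgreSQL backend
   health_check_password = 'secret'
   health_check_max_retries = 3     # tolerate occasional failures on a spotty network
   health_check_retry_delay = 5     # seconds between retries
   # As advised above, consider disabling fail_over_on_backend_error
   # when health_check_max_retries is enabled.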
+! This parameter specifies a command to run when a node is detached. ! pgpool-II replaces the following special characters with backend specific ! information. !
!Special character | Description |
---|---|
%d | Backend ID of a detached node. |
%h | Hostname of a detached node. |
%p | Port number of a detached node. |
%D | Database cluster directory of a detached node. |
%M | Old master node ID. |
%m | New master node ID. |
%H | Hostname of the new master node. |
%P | Old primary node ID. |
%r | New master port number. |
%R | New master database cluster directory. |
%% | '%' character |
! You need to reload pgpool.conf if you change failover_command. !
!! When a failover is performed, pgpool kills all its child processes, which ! will in turn terminate all active sessions to pgpool. Then pgpool invokes ! the failover_command and waits for its completion. ! After this, pgpool starts new child processes and is ready again to accept ! connections from clients. !
This parameter specifies a command to run when a node is attached. pgpool-II replaces the following special characters with backend-specific information.
!Special character | Description |
---|---|
%d | Backend ID of an attached node. |
%h | Hostname of an attached node. |
%p | Port number of an attached node. |
%D | Database cluster path of an attached node. |
%M | Old master node ID. |
%m | New master node ID. |
%H | Hostname of the new master node. |
%P | Old primary node ID. |
%r | New master port number. |
%R | New master database cluster directory. |
%% | '%' character |
! You need to reload pgpool.conf if you change failback_command. !
!! This parameter specifies a command to run in master/slave streaming replication mode ! only after a master failover. ! pgpool-II replaces the following special characters with backend specific information. !
!Special character | Description |
---|---|
%d | Backend ID of a detached node. |
%h | Hostname of a detached node. |
%p | Port number of a detached node. |
%D | Database cluster directory of a detached node. |
%M | Old master node ID. |
%m | New master node ID. |
%H | Hostname of the new master node. |
%P | Old primary node ID. |
%r | New master port number. |
%R | New master database cluster directory. |
%% | '%' character |
! You need to reload pgpool.conf if you change follow_master_command. !
If follow_master_command is not empty, when a master failover is completed in master/slave streaming replication mode, pgpool degenerates all nodes except the new master and starts new child processes to be ready again to accept connections from clients. After this, pgpool runs the command set in 'follow_master_command' for each degenerated node. Typically the command should be used to recover the slave from the new master, for example by calling the pcp_recovery_node command.
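A similar sketch for follow_master_command (follow_master.sh is again a hypothetical wrapper that could, as suggested above, call pcp_recovery_node for each degenerated node):

   # Hypothetical follow-master script; only used in master/slave streaming replication mode
   follow_master_command = '/etc/pgpool/follow_master.sh %d %h %p %D %m %M %H %P %r %R'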
If true, and an error occurs when reading from or writing to the backend communication, pgpool-II will trigger the failover procedure. If set to false, pgpool will report an error and disconnect the session. If you set this parameter to off, it is recommended that you turn on health checking. Please note that even if this parameter is set to off, pgpool will still perform failover when it detects the administrative shutdown of the postmaster.
!! You need to reload pgpool.conf if you change this value. !
!pgpool-II ignores white spaces at the beginning of SQL ! queries while in the load balance mode. It is useful if used with ! APIs like DBI/DBD:Pg which adds white spaces against the user's will. !
!! You need to reload pgpool.conf if you change this value. !
!Specifies where to connect with the PostgreSQL backend. ! It is used by pgpool-II to communicate with the server. !
!
! For TCP/IP communication, this parameter can take a hostname or an IP address.
! If this begins with a slash, it specifies Unix-domain communication
! rather than TCP/IP; the value is the name of the directory
! in which the socket file is stored. The default behavior when backend_hostname
! is empty (''
) is to connect to a Unix-domain socket in /tmp
.
!
! Multiple backends can be specified by adding a number at the end
! of the parameter name (e.g.backend_hostname0
).
! This number is referred to as "DB node ID", and it starts from 0.
! The backend which was given the DB node ID of 0 will be called "Master DB".
! When multiple backends are defined, the service can be continued
! even if the Master DB is down (not true in some modes).
! In this case, the youngest DB node ID alive will be the new Master DB.
Please note that the DB node which has ID 0 has no special meaning when operating in streaming replication mode. Rather, you should pay attention to whether the DB node is the "primary node" or not. See Streaming Replication for more details.
!If you plan to use only one PostgreSQL server, specify it by
! backend_hostname0
.
! New nodes can be added in this parameter by reloading a configuration ! file. However, values cannot be updated so you must restart pgpool-II in that case. !
!Specifies the port number of the backends.
! Multiple backends can be specified by adding a number at the end of the parameter name
! (e.g. backend_port0
).
! If you plan to use only one PostgreSQL server, specify it by backend_port0
.
! New backend ports can be added in this parameter by reloading a configuration ! file. However, values cannot be updated so you must restart ! pgpool-II in that case. !
!Specifies the load balance ratio for the backends. Multiple
! backends can be specified by adding a number at the end of the
! parameter name (e.g. backend_weight0
). If you plan
! to use only one PostgreSQL server, specify it by
! backend_weight0
. In the raw mode, set to 1.
! New backend weights can be added in this parameter by reloading a configuration file. !
!! From pgpool-II 2.2.6/2.3 or later, you can change this value by re-loading the configuration file. ! This will take effect only for new established client sessions. ! This is useful if you want to prevent any query sent to slaves to perform ! some administrative work in master/slave mode. !
!Specifies the database cluster directory of the backends.
! Multiple backends can be specified by adding a number
! at the end of the parameter name
! (e.g. backend_data_directory0
).
! If you don't plan to use online recovery, you do not need to specify this parameter.
!
! New backend data directories can be added in this parameter by reloading a configuration file. ! However, values cannot be updated so you must restart pgpool-II in that case. !
!Controls various backend behavior.
! Multiple backends can be specified by adding a number at the end of the parameter name
! (e.g. backend_flag0
).
!
Currently the following flags are allowed. Multiple flags can be specified by using "|".

ALLOW_TO_FAILOVER | Allows failover and detaching of the backend. This is the default. You cannot specify this together with DISALLOW_TO_FAILOVER. |
---|---|
DISALLOW_TO_FAILOVER | Disallows failover and detaching of the backend. This is useful when you protect the backend with HA (High Availability) software such as Heartbeat or Pacemaker. You cannot specify this together with ALLOW_TO_FAILOVER. |
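Putting the backend_* parameters together, a two-node definition might look as follows (host names, directories and flags are placeholders):

   # Backend definitions (illustrative)
   backend_hostname0 = 'db-node-0'                  # DB node ID 0, the "Master DB"
   backend_port0 = 5432
   backend_weight0 = 1
   backend_data_directory0 = '/var/lib/pgsql/data'
   backend_flag0 = 'ALLOW_TO_FAILOVER'

   backend_hostname1 = 'db-node-1'
   backend_port1 = 5432
   backend_weight1 = 1
   backend_data_directory1 = '/var/lib/pgsql/data'
   backend_flag1 = 'DISALLOW_TO_FAILOVER'           # e.g. a node protected by Heartbeat/Pacemaker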
! If true, enable SSL support for both the frontend and backend connections.
! Note that ssl_key
and ssl_cert
! must also be set in order for SSL to work with frontend connections.
!
! SSL is off by default. Note that OpenSSL support must also ! have been configured at compilation time, as mentioned in the ! installation section. !
! !! The pgpool-II daemon must be restarted when updating SSL related settings. !
!! The path to the private key file to use for incoming frontend connections. !
! !! There is no default value for this option, and if left unset SSL will ! be disabled for incoming frontend connections. !
!! The path to the public x509 certificate file to use for incoming ! frontend connections. !
! !! There is no default value for this option, and if left unset SSL will ! be disabled for incoming frontend connections. !
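An illustrative SSL fragment follows (the paths are placeholders; remember that OpenSSL support must have been enabled at compile time and that pgpool-II must be restarted after changing these settings):

   # SSL for frontend and backend connections (illustrative paths)
   ssl = true
   ssl_key = '/usr/local/etc/pgpool_server.key'     # private key for incoming frontend connections
   ssl_cert = '/usr/local/etc/pgpool_server.crt'    # x509 certificate for incoming frontend connections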
Life time of the relation cache in seconds. 0 means no cache expiration (the default). The relation cache is used to cache query results against the PostgreSQL system catalogs that provide various information, including table structures and whether a table is temporary or not. The cache is maintained in a pgpool child's local memory and is kept as long as the child survives. If a table is modified by ALTER TABLE or the like, the relation cache becomes inconsistent. For this purpose, this parameter controls the life time of the cache.
Size of the relation cache. Default is 256. If you see the following message frequently, increase the number.

   "pool_search_relcache: cache replacement happened"

If on, enable the temporary table check in SELECT statements. This check initiates queries against the system catalog of the primary/master and thus increases the load on the primary/master. If you are absolutely sure that your system never uses temporary tables and you want to save access to the primary/master, you can turn this off. Default is on.
!! Certificate handling is outside the scope of this document. The ! ! Secure TCP/IP Connections with SSL page at postgresql.org has pointers ! with sample commands for how to generate self-signed certificates.
! !Failover can be performed in raw mode if multiple servers are
! defined. pgpool-II usually accesses the backend specified by
! backend_hostname0
during normal operation.
! If the backend_hostname0 fails for some reason, pgpool-II tries to access the
! backend specified by backend_hostname1.
! If that fails, pgpool-II tries the backend_hostname2, 3 and so on.
In connection pool mode, all functions in raw mode and the connection pool function can be used. To enable this mode, you need to turn on "connection_cache". The following parameters take effect for connection pooling.
! !The maximum number of cached connections in pgpool-II children processes. ! pgpool-II reuses the cached connection if an incoming connection is ! connecting to the same database with the same username. ! If not, pgpool-II creates a new connection to the backend. ! If the number of cached connections exceeds max_pool, ! the oldest connection will be discarded, and uses that slot for the new connection. !
!
! Default value is 4. Please be aware that the number of
! connections from pgpool-II processes to the backends may reach
! num_init_children
*
! max_pool
.
!
! This parameter can only be set at server start.
!Cached connections expiration time in seconds. An expired ! cached connection will be disconnected. Default is 0, which ! means the cached connections will not be disconnected.
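For illustration, a connection-pool fragment consistent with the defaults described above (true/false spelling follows the wording of this document):

   # Connection pooling (illustrative; these match the documented defaults)
   connection_cache = true         # turn on connection pooling
   max_pool = 4                    # cached connections per child; backend connections may reach num_init_children * max_pool
   connection_life_time = 0        # 0 = cached connections never expire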
!Specifies the SQL commands sent to reset the connection ! to the backend when exiting a session. Multiple commands can be ! specified by delimiting each by ";". ! Default is the following, but can be changed to suit your system. ! !
! reset_query_list = 'ABORT; DISCARD ALL' !! !
! Commands differ in each PostgreSQL versions. Here are the recommended settings. !
! !PostgreSQL version | reset_query_list value |
---|---|
7.1 or before | ABORT |
7.2 to 8.2 | ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT |
8.3 or later | ABORT; DISCARD ALL |
! You need to reload pgpool.conf upon modification of this directive. !
!Failover in the connection pool mode is the same as in the raw mode.
! ! ! ! ! !This mode enables data replication between the backends. The ! configuration parameters below must be set in addition to everything above.
! !Setting to true enables replication mode. Default is false.
!When set to true, SELECT queries will be ! distributed to each backend for load balancing. Default is false.
!This parameter can only be set at server start.
When set to true, if all backends do not return the same packet kind, the backends whose results differ from the most frequent result set are degenerated (detached).

A typical use case is a SELECT statement being part of a transaction, replicate_select set to true, and SELECT returning a different number of rows among backends. Non-SELECT statements might trigger this as well; for example, a backend succeeded in an UPDATE while others failed. Note that pgpool does NOT examine the content of the records returned by SELECT.

If set to false, the session is terminated and the backends are not degenerated. Default is false.
When set to true, if the backends do not return the same number of affected tuples for an INSERT/UPDATE/DELETE, the backends whose results differ from the most frequent result are degenerated. If the frequencies are equal, the group that includes the master DB node (the DB node with the youngest node id) remains and the other groups are degenerated.

If set to false, the session is terminated and the backends are not degenerated. Default is false.
Specify a comma separated list of function names that do not update the database. SELECTs using functions not specified in this list are neither load balanced, nor replicated if in replication mode. In master/slave mode, such SELECTs are sent to the master (primary) only.

You can use regular expressions in the list to match function names (^ and $ are added automatically), for example if you have prefixed all your read-only functions with 'get_' or 'select_':

white_function_list = 'get_.*,select_.*'
Specify a comma separated list of function names that do update the database. SELECTs using functions specified in this list are neither load balanced, nor replicated if in replication mode. In master/slave mode, such SELECTs are sent to the master (primary) only.

You can use regular expressions in the list to match function names (^ and $ are added automatically), for example if you have prefixed all your updating functions with 'set_', 'update_', 'delete_' or 'insert_':

black_function_list = 'nextval,setval,set_.*,update_.*,delete_.*,insert_.*'
Only one of these two lists can be filled in a configuration.

Prior to pgpool-II 3.0, nextval() and setval() were known to write to the database. You can emulate this by using white_function_list and black_function_list:

white_function_list = ''
black_function_list = 'nextval,setval,lastval,currval'

Please note that lastval and currval are listed in addition to nextval and setval. Although lastval() and currval() are not writing functions, it is wise to add them to avoid errors in case these functions are accidentally load balanced to another DB node, because adding them to black_function_list prevents load balancing.
When set to true, pgpool-II replicates SELECTs in replication mode. If false, pgpool-II only sends them to the master DB. Default is false.

If a SELECT query is inside an explicit transaction block, replicate_select and load_balance_mode affect how replication works. Details are shown below.
SELECT is inside a transaction block                              | Y | Y | Y | N | N | N | Y | N
replicate_select is true                                          | Y | Y | N | N | Y | Y | N | N
load_balance_mode is true                                         | Y | N | N | N | Y | N | Y | Y
result (R: replication, M: send only to master, L: load balance)  | R | R | M | M | R | R | M | L
If replicating a table with a SERIAL data type, the SERIAL column value may differ between the backends. This problem is avoidable by locking the table explicitly (although transaction parallelism will be severely degraded). To achieve this, however, the following change must be made:

INSERT INTO ...

to

BEGIN;
LOCK TABLE ...
INSERT INTO ...
COMMIT;
When insert_lock is true, pgpool-II automatically adds the above queries each time an INSERT is executed (if already in a transaction, it simply adds LOCK TABLE ...).
In pgpool-II 2.2 or later, pgpool-II automatically detects whether the table has SERIAL columns or not, so it never locks the table if there are no SERIAL columns.
The pgpool-II 3.0 series up to 3.0.4 uses a row lock against the sequence relation, rather than a table lock. This is intended to minimize lock conflicts with VACUUM (including autovacuum). However, this leads to another problem: after transaction wraparound happens, row locking against the sequence relation causes a PostgreSQL internal error (more precisely, an access error on pg_clog, which keeps transaction status). To prevent this, PostgreSQL core developers decided to disallow row locking against sequences, which of course breaks pgpool-II (the "fixed" versions of PostgreSQL were released as 9.0.5, 8.4.9, 8.3.16 and 8.2.22).
pgpool-II 3.0.5 or later uses a row lock against the pgpool_catalog.insert_lock table instead, because the newer PostgreSQL releases disallow row locks against the sequence relation. So you need to create the insert_lock table beforehand in all databases accessed via pgpool-II. See Creating insert_lock table for more details. If the insert_lock table does not exist, pgpool-II locks the insert target table; this behavior is the same as in the pgpool-II 2.2 and 2.3 series. If you want to use an insert_lock which is compatible with older releases, you can specify the lock method with the configure script. See configure for more details.
You might want finer (per statement) control, as in the sketch below:

- Set insert_lock to true, and add /*NO INSERT LOCK*/ at the beginning of an INSERT statement for which you do not want to acquire the table lock.
- Set insert_lock to false, and add /*INSERT LOCK*/ at the beginning of an INSERT statement for which you want to acquire the table lock.
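For illustration only, a minimal sketch of the two per-statement comments (the table name is hypothetical):

/*NO INSERT LOCK*/ INSERT INTO accounts VALUES (1, 100);  -- insert_lock = true: skip the automatic lock
/*INSERT LOCK*/ INSERT INTO accounts VALUES (2, 200);     -- insert_lock = false: force the lock for this statement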
Default value is false. If insert_lock is enabled, the regression tests for PostgreSQL 8.0 will fail in transactions, privileges, rules, and alter_table. The reason is that pgpool-II tries to LOCK the VIEW for the rule test and produces the following error message:

ERROR: current transaction is aborted, commands ignored until end of transaction block

For example, the transactions test tries an INSERT into a table which does not exist, and pgpool-II makes PostgreSQL acquire the lock before that. The transaction is aborted, and the following INSERT statement produces the above error message.
recovery_user: this parameter specifies the PostgreSQL user name for online recovery. It can be changed without restarting.

recovery_password: this parameter specifies the PostgreSQL password for online recovery. It can be changed without restarting.
This parameter specifies a command to be run by the master (primary) PostgreSQL server at the first stage of online recovery. The command file must be placed in the database cluster directory for security reasons. For example, if recovery_1st_stage_command = 'sync-command', then pgpool-II executes $PGDATA/sync-command.

recovery_1st_stage_command receives 3 parameters, as follows:

Note that pgpool-II accepts connections and queries while the recovery_1st_stage command is executed. You can retrieve and update data during this stage.

This parameter can be changed without restarting.
This parameter specifies a command to be run by the master (primary) PostgreSQL server at the second stage of online recovery. The command file must be placed in the database cluster directory for security reasons. For example, if recovery_2nd_stage_command = 'sync-command', then pgpool-II executes $PGDATA/sync-command.

recovery_2nd_stage_command receives 3 parameters, as follows:

Note that pgpool-II does not accept connections and queries while recovery_2nd_stage_command is running. Thus, if a client stays connected for a long time, the recovery command won't be executed. pgpool-II waits until all clients have closed their connections; the command is only executed when no client is connected to pgpool-II anymore.

This parameter can be changed without restarting.
pgpool does not accept new connections during the second stage. If a client connects to pgpool during recovery processing, it has to wait for the end of the recovery.

This parameter specifies the recovery timeout in seconds. If this timeout is reached, pgpool cancels online recovery and accepts connections. 0 means no wait.

This parameter can be changed without restarting.
Similar to client_idle_limit, but only takes effect in the second stage of recovery. A client that has been idle for client_idle_limit_in_recovery seconds since its last query will get disconnected. This is useful for preventing pgpool recovery from being disturbed by a lazy client, or when the TCP/IP connection between the client and pgpool has accidentally gone down (a cut cable, for instance). If set to -1, the client is disconnected immediately. The default value for client_idle_limit_in_recovery is 0, which means the feature is turned off.

If your clients are very busy, pgpool-II cannot enter the second stage of recovery whatever value of client_idle_limit_in_recovery you choose. In this case, you can set client_idle_limit_in_recovery to -1 so that pgpool-II immediately disconnects such busy clients before entering the second stage.

You need to reload pgpool.conf if you change client_idle_limit_in_recovery.
This parameter specifies a table name used for large object replication control. If it is specified, pgpool will lock the table specified by lobj_lock_table, generate a large object id by looking into the pg_largeobject system catalog, and then call lo_create to create the large object. This procedure guarantees that pgpool obtains the same large object id on all DB nodes in replication mode. Please note that PostgreSQL 8.0 or older does not have lo_create, so this feature will not work there.

A call to the libpq function lo_creat() triggers this feature. Large object creation through the Java API (JDBC driver), the PHP API (pg_lo_create, or similar APIs in PHP libraries such as PDO), and equivalent APIs in various other programming languages are known to use a similar protocol, and thus should work as well.

The following large object create operations will not work:
It does not matter which schema lobj_lock_table is stored in, but the table should be writable by any user. Here is an example showing how to create such a table:

CREATE TABLE public.my_lock_table ();
GRANT ALL ON public.my_lock_table TO PUBLIC;

The table specified by lobj_lock_table must be created beforehand. If you create the table in template1, any database created afterwards will have it.

If lobj_lock_table is set to an empty string (''), the feature is disabled (and large object replication will not work). The default value for lobj_lock_table is ''.
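A minimal sketch of the corresponding pgpool.conf setting, using the example table created above (whether you need the schema qualification depends on your search_path):

lobj_lock_table = 'public.my_lock_table'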
For a query to be load balanced, all of the following requirements must be met:

Note that you can suppress load balancing by inserting an arbitrary comment just in front of the SELECT query:
/*REPLICATION*/ SELECT ...
Note: the JDBC driver has an autocommit option. If autocommit is false, the JDBC driver sends "BEGIN" and "COMMIT" by itself, so pgpool cannot do any load balancing. You need to call setAutoCommit(true) to enable autocommit.
Failover in Replication Mode
pgpool-II degenerates a dead backend and continues the service. The service can be continued as long as at least one backend is alive.
In replication mode, if pgpool finds that the number of affected tuples for an INSERT, UPDATE or DELETE is not the same on all nodes, it sends an erroneous SQL statement to all DB nodes to abort the transaction if failover_if_affected_tuples_mismatch is set to false (degeneration occurs if it is set to true). In this case you will see the following error messages on the client terminal:
=# UPDATE t SET a = a + 1;
ERROR: pgpool detected difference of the number of update tuples Possible last query was: "update t1 set i = 1;"
HINT: check data consistency between master and other db node
You will see the number of updated rows in the pgpool-II log (in this case DB node 0 had 0 updated rows and DB node 1 had 1 updated row):

2010-07-22 13:23:25 LOG: pid 5490: SimpleForwardToFrontend: Number of affected tuples are: 0 1
2010-07-22 13:23:25 LOG: pid 5490: ReadyForQuery: Degenerate backends: 1
2010-07-22 13:23:25 LOG: pid 5490: ReadyForQuery: Number of affected tuples are: 0 1
This mode is used to couple pgpool-II with another master/slave
replication software (like Slony-I and Streaming replication), which is responsible
for doing the actual data replication.
The DB nodes' information (backend_hostname, backend_port, backend_weight, backend_flag, and backend_data_directory if you need the online recovery functionality) must be set, in the same way as in replication mode. In addition to that, set master_slave_mode and load_balance_mode to true.

pgpool-II will then send queries that need to be replicated to the master DB, and other queries will be load balanced if possible. Queries sent to the master DB because they cannot be balanced are of course accounted for in the load balancing algorithm.
In master/slave mode, DDL and DML for temporary tables can be executed on the master node only. SELECT can be forced to be executed on the master as well, but for this you need to put a /*NO LOAD BALANCE*/ comment before the SELECT statement.
In master/slave mode, replication_mode must be set to false and master_slave_mode to true.
The master/slave mode has a 'master_slave_sub_mode'. The default is 'slony', which is suited for Slony-I; set it to 'stream' to work with PostgreSQL's built-in streaming replication (see below). Please restart pgpool-II if you change any of the above parameters.

You can set white_function_list and black_function_list to control load balancing in master/slave mode. See white_function_list for more details.
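For illustration, a minimal pgpool.conf sketch for master/slave mode on top of streaming replication (the backend host names are placeholders, not taken from the text):

master_slave_mode     = true
master_slave_sub_mode = 'stream'
load_balance_mode     = true
replication_mode      = false
backend_hostname0 = 'primary.example.com'   # hypothetical primary
backend_port0     = 5432
backend_weight0   = 1
backend_hostname1 = 'standby.example.com'   # hypothetical standby
backend_port1     = 5432
backend_weight1   = 1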
Streaming Replication (V3.1 -)
As stated above, pgpool-II can work together with Streaming Replication, which is available since PostgreSQL 9.0. To use it, enable master/slave mode and set 'master_slave_sub_mode' to 'stream'. pgpool-II assumes that Streaming Replication is used with Hot Standby at present, which means that the standby database is open read-only. The following directives can be used with this mode:
Specifies the maximum tolerated replication delay of the standby against the primary server, in WAL bytes. If the delay exceeds delay_threshold, pgpool-II does not send SELECT queries to the standby server anymore: everything is sent to the primary server even if load balance mode is enabled, until the standby has caught up. If delay_threshold is 0 or streaming replication checking is disabled, the delay checking is not performed. This check is performed every 'sr_check_period'. The default value for delay_threshold is 0.

You need to reload pgpool.conf if you change this directive.
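For illustration, an arbitrary example value (the right threshold depends entirely on your workload):

delay_threshold = 10000000   # stop load balancing to a standby once it lags by roughly 10MB of WAL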
This parameter specifies the interval between streaming replication delay checks, in seconds. Default is 0, which means the check is disabled.

You need to reload pgpool.conf if you change sr_check_period.
The user name used to perform the streaming replication check. This user must exist on all the PostgreSQL backends; otherwise, the check causes an error. Note that sr_check_user and sr_check_password are used even when sr_check_period is 0: to identify the primary server, pgpool-II sends a function call request to each backend, and sr_check_user and sr_check_password are used for this session.

You need to reload pgpool.conf if you change sr_check_user.
The password of the user used to perform the streaming replication check. If no password is required, specify the empty string ('').

You need to reload pgpool.conf if you change sr_check_password.
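A minimal sketch of these directives together (the user name and interval are arbitrary examples):

sr_check_period   = 10          # check the replication delay every 10 seconds
sr_check_user     = 'postgres'  # must exist on every backend
sr_check_password = ''          # empty string if no password is required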
Specifies how to log the replication delay. If 'none' is specified, no log is written. If 'always', the delay is logged every time health checking is performed. If 'if_over_threshold' is specified, the log is written when the delay exceeds delay_threshold. The default value for log_standby_delay is 'none'. You need to reload pgpool.conf if you change this directive.
You can monitor the replication delay by using the "show pool_status" command as well. The column name is "standby_delay#" (where '#' should be replaced by the DB node id).
In master/slave mode with streaming replication, if the primary or standby node goes down, pgpool-II can be set up to trigger a failover. Nodes can be detached automatically without further setup. While doing streaming replication, the standby node checks for the presence of a "trigger file" and, on finding it, the standby stops continuous recovery and goes into read-write mode. By using this, you can have the standby database take over when the primary goes down.
Caution: If you plan to use multiple standby nodes, we recommend setting delay_threshold to prevent queries directed to other standby nodes from retrieving older data.

If a second standby takes over as primary when the first standby has already taken over, you would get bogus data from the second standby. We recommend not planning this kind of configuration.
Here is how to set up a failover configuration.
$ cd /usr/local/pgsql/bin
$ cat failover_stream.sh
#! /bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments (as passed by failover_command below): $1: failed node id,
# $2: new master hostname, $3: path to the trigger file.

failed_node=$1
new_master=$2
trigger_file=$3

# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
	exit 0;
fi

# Create the trigger file on the new master so that the standby promotes itself.
# (This assumes passwordless ssh from the pgpool host to the new master.)
/usr/bin/ssh -T $new_master /bin/touch $trigger_file

exit 0;

$ chmod 755 failover_stream.sh
Set failover_command in pgpool.conf:

failover_command = '/usr/local/src/pgsql/9.0-beta/bin/failover_stream.sh %d %H /tmp/trigger_file0'

Set recovery.conf on the standby node:

standby_mode = 'on'
primary_conninfo = 'host=name of primary_host user=postgres'
trigger_file = '/tmp/trigger_file0'

Set postgresql.conf on the primary node:

wal_level = hot_standby
max_wal_senders = 1

Set pg_hba.conf on the primary node:

host replication postgres 192.168.0.10/32 trust
Start the primary and secondary PostgreSQL nodes to initiate streaming replication. If the primary node goes down, the standby node will automatically start as a normal PostgreSQL server and will be ready to accept write queries.
While using streaming replication and Hot Standby, it is important to determine which queries can be sent to the primary and which can be sent to the standby.
In an explicit transaction:

- Commands which cannot be executed on the standby, such as INSERT, are sent to the primary.
- After one of these commands, even SELECTs are sent to the primary node. This is because these SELECTs might want to see the result of the INSERT immediately. This behavior continues until the transaction closes or aborts.
In the extended protocol, it is possible to determine at parse time whether the query can be sent to the standby node or must go to the primary.
[Note: If the parse of a SELECT statement is sent to the standby node due to load balancing, and a DML statement, such as an INSERT, is then sent to pgpool-II, the parsed SELECT will have to be executed on the primary node. Therefore, we re-parse the SELECT on the primary node.]
Lastly, queries that pgpool-II's parser considers to be an error are sent to the primary node.
Online recovery with Streaming Replication
In master/slave mode with streaming replication, online recovery can be performed. Only a standby node can be recovered this way. In the online recovery procedure, the primary server acts as the master server and recovers the specified standby server, so the recovery procedure requires that the primary server is up and running. If the primary server goes down and no standby server is promoted, you need to stop pgpool-II and all PostgreSQL servers and recover them manually.
- Set recovery_user. Usually it's "postgres".

  recovery_user = 'postgres'

- Set recovery_password for recovery_user to log in to the database.

  recovery_password = 't-ishii'

- Set recovery_1st_stage_command. The script for this stage should perform a base backup of the primary and restore it on the standby node. Place this script inside the primary database cluster directory and give it execute permission. Here is the sample script (basebackup.sh) for a configuration of one primary and one standby. You need to set up ssh so that recovery_user can log in from the primary to the standby without being asked for a password.

  recovery_1st_stage_command = 'basebackup.sh'

- Leave recovery_2nd_stage_command empty.

  recovery_2nd_stage_command = ''

- Install the required C and SQL functions to perform online recovery into each DB node.
  # psql -f pgpool-recovery.sql template1

- After completing online recovery, pgpool-II will start PostgreSQL on the standby node. Install the script for this purpose on each DB node. A sample script is included in the "sample" directory of the source code. This script uses ssh; you need to allow recovery_user to log in from the primary node to the standby node without being asked for a password.
That's it. Now you should be able to use pcp_recovery_node (as long as the standby node is stopped) or push the "recovery" button of pgpoolAdmin to perform online recovery. If something goes wrong, please examine the pgpool-II log, the primary server log and the standby server log(s).
For your reference, here are the steps taken in the recovery procedure.
Parallel Mode

This mode activates parallel execution of queries. Tables can be split, and data distributed to each node. Moreover, the replication and the load balancing features can be used at the same time. In parallel mode, replication_mode and load_balance_mode are set to true in pgpool.conf, master_slave is set to false, and parallel_mode is set to true. When you change this parameter, restart pgpool-II.
Configuring the System DB

To use the parallel mode, the System DB must be configured properly. The System DB contains rules, stored in a table, to choose an appropriate backend to send partitioned data to. The System DB does not need to be created on the same host as pgpool-II. The System DB's configuration is done in pgpool.conf.
The hostname where the System DB is. Specifying the empty string ('') means the System DB is on the same host as pgpool-II, and will be accessed via a UNIX domain socket.

The port number for the System DB.

The partitioning rules and other information will be defined in the database specified here. Default value is 'pgpool'.

The partitioning rules and other information will be defined in the schema specified here. Default value is 'pgpool_catalog'.

The username to connect to the System DB.

The password for the System DB. If no password is necessary, set the empty string ('').
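For illustration, a sketch of these settings as they might appear in pgpool.conf (the system_db_* directive names and values are shown as an assumption for the example, not taken from the text above):

system_db_hostname = ''               # empty: same host as pgpool-II, accessed via UNIX domain socket
system_db_port     = 5432
system_db_dbname   = 'pgpool'
system_db_schema   = 'pgpool_catalog'
system_db_user     = 'pgpool'
system_db_password = ''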
The path to a PEM format file containing one or more CA root certificates, which can be used to verify the backend server certificate. This is analogous to the -CAfile option of the OpenSSL verify(1) command.

The default value for this option is unset, so no verification takes place. Verification will still occur if this option is not set but a value has been given for ssl_ca_cert_dir.
The path to a directory containing PEM format CA certificate files, which can be used to verify the backend server certificate. This is analogous to the -CApath option of the OpenSSL verify(1) command.

The default value for this option is unset, so no verification takes place. Verification will still occur if this option is not set but a value has been given for ssl_ca_cert.
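As an illustration (the file and directory paths are hypothetical):

ssl_ca_cert     = '/etc/pgpool-II/root.crt'   # single PEM file holding the CA root certificates
ssl_ca_cert_dir = '/etc/pgpool-II/certs'      # or a directory of PEM CA certificate files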
Initial Configuration of the System DB

First, create the database and schema specified in the pgpool.conf file. A sample script is provided for this purpose.
Registering a Partitioning Rule
The rules for data partitioning must be registered into the pgpool_catalog.dist_def table.

CREATE TABLE pgpool_catalog.dist_def(
	dbname TEXT,                                                -- database name
	schema_name TEXT,                                           -- schema name
	table_name TEXT,                                            -- table name
	col_name TEXT NOT NULL CHECK (col_name = ANY (col_list)),   -- partitioning key column name
	col_list TEXT[] NOT NULL,                                   -- names of table attributes
	type_list TEXT[] NOT NULL,                                  -- types of table attributes
	dist_def_func TEXT NOT NULL,                                -- name of the partitioning rule function
	PRIMARY KEY (dbname,schema_name,table_name)
);
Registering a Replication Rule
Tables that are not distributed have to be replicated. When a query joins a distributed table with another table, pgpool gets the replication information from the pgpool_catalog.replicate_def table.
CREATE TABLE pgpool_catalog.replicate_def(
	dbname TEXT,                 -- database name
	schema_name TEXT,            -- schema name
	table_name TEXT,             -- table name
	col_list TEXT[] NOT NULL,    -- names of table attributes
	type_list TEXT[] NOT NULL,   -- types of table attributes
	PRIMARY KEY (dbname,schema_name,table_name)
);
Example for partitioning the pgbench tables
In this example, the accounts table is partitioned, and the branches and tellers tables are replicated. The accounts table and the branches table are joined by bid. The branches table is registered into the replication table. If the three tables (accounts, branches, and tellers) are to be joined, it is necessary to register a replication rule for the tellers table too.
INSERT INTO pgpool_catalog.dist_def VALUES (
	'pgpool',
	'public',
	'accounts',
	'aid',
	ARRAY['aid','bid','abalance','filler'],
	ARRAY['integer','integer','integer','character(84)'],
	'pgpool_catalog.dist_def_accounts'
);

INSERT INTO pgpool_catalog.replicate_def VALUES (
	'pgpool',
	'public',
	'branches',
	ARRAY['bid','bbalance','filler'],
	ARRAY['integer','integer','character(84)']
);
CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_accounts (val ANYELEMENT) RETURNS INTEGER AS '
	SELECT CASE WHEN $1 >= 1 and $1 <= 30000 THEN 0
		WHEN $1 > 30000 and $1 <= 60000 THEN 1
		ELSE 2
	END;
' LANGUAGE SQL;
Just like the pg_hba.conf file for PostgreSQL, pgpool supports a similar client authentication function using a configuration file called "pool_hba.conf".

When you install pgpool, pool_hba.conf.sample will be installed in "/usr/local/etc", which is the default directory for configuration files. Copy pool_hba.conf.sample as pool_hba.conf and edit it if necessary. By default, pool_hba authentication is enabled. See "6. Setting up pgpool.conf" for more detail.

The format of the pool_hba.conf file follows PostgreSQL's pg_hba.conf format very closely.

local DATABASE USER METHOD [OPTION]
host DATABASE USER CIDR-ADDRESS METHOD [OPTION]

See "pool_hba.conf.sample" for a detailed explanation of each field.
Here are the limitations of pool_hba.

- Though "hostssl" is invalid, pgpool-II 2.3 or later supports SSL. See SSL for more details.
- Since pgpool does not know anything about users in the backend server, the database name is simply checked against the entries in the DATABASE field of pool_hba.conf.
- This is for the same reason as for the "samegroup" described above: a user name is simply checked against the entries in the USER field of pool_hba.conf.
- pgpool currently does not support IPv6.
- Again, this is for the same reason as for the "samegroup" described above: pgpool does not have access to user/password information.
To use md5 authentication, you need to register your name and password in "pool_passwd". See Authentication / Access Controls for more details.

Note that everything described in this section is about the authentication taking place between a client and pgpool; a client still has to go through PostgreSQL's authentication process. As far as pool_hba is concerned, it does not matter whether a user name and/or database name given by a client (i.e. psql -U testuser testdb) really exists in the backend. pool_hba only cares whether a match in pool_hba.conf is found or not.
PAM authentication is supported using user information on the host where pgpool is executed. To enable PAM support in pgpool, specify the "--with-pam" option to configure:

configure --with-pam

To enable PAM authentication, you need to create a service-configuration file for pgpool in the system's PAM configuration directory (which is usually "/etc/pam.d"). A sample service-configuration file is installed as "share/pgpool.pam" under the install directory.
The query cache can be used in all modes in pgpool-II. The query cache allows SELECT results to be reused, which boosts performance. Activating it in pgpool.conf is done as follows:

enable_query_cache = true

However, you may have to modify the schema in this statement if you don't use "pgpool_catalog".
Caution: The current query cache implementation creates its cache data in a database table, so enabling the query cache may not actually improve performance. The contents of the query cache are not updated even when the underlying table is updated; you need to delete the data from the cache table or restart pgpool-II with the -c (delete cache) option.
On memory query cache (V3.2 -)

You can use on memory query cache in any mode. It differs from the query cache above in that it is faster, because the cache is stored in memory, and you don't need to restart pgpool-II when the cache becomes outdated because the underlying table got updated.

On memory cache saves pairs of SELECT statements and their results (including Bind parameters if the SELECT is an extended query). If the same SELECT comes in, the result is returned from the cache. Since no SQL parsing nor access to PostgreSQL is involved, it is extremely fast.

On the other hand, it may be slower than the normal path in some cases, because it adds some overhead to store the cache. Moreover, when a table is updated, pgpool automatically deletes all the caches related to that table, so performance will degrade on a system with many updates. If the cache_hit_ratio is lower than 70%, you might want to disable on memory cache.
To enable the memory cache functionality, set this to on (default is off).

memory_cache_enabled = on
You can choose the cache storage: shared memory or memcached (you cannot use both). Query cache with shared memory is fast and easy because you don't have to install and configure memcached, but the maximum cache size is limited by the shared memory size. Query cache with memcached adds network access overhead, but you can set the size as you like.

Memory cache behavior is specified by the memqcache_method directive: either "shmem" (shared memory) or "memcached". Default is shmem.

memqcache_method = 'shmem'
Not all SELECTs and WITH queries can be cached. In some cases, including the following, caching is avoided to keep the caches and the databases consistent:

It can also happen that even though a matching query cache entry exists, pgpool does not return it:
These are the parameters used with both shmem and memcached.

Life time of a query cache entry in seconds. Default is 0, which means no cache expiration: caches remain valid until a table is updated. This parameter and memqcache_auto_cache_invalidation are orthogonal.

If on, pgpool automatically deletes the caches related to updated tables. If off, caches are not deleted. Default is on. This parameter and memqcache_expire are orthogonal.
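For illustration (the expiry value is an arbitrary example):

memqcache_expire = 60                    # expire cached SELECT results after 60 seconds
memqcache_auto_cache_invalidation = on   # also drop caches for a table whenever that table is updated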
If the size of a SELECT result is larger than memqcache_maxcache bytes, it is not cached and the following message is shown:

2012-05-02 15:08:17 LOG: pid 13756: pool_add_temp_query_cache: data size exceeds memqcache_maxcache. current:4095 requested:111 memq_maxcache:4096

To avoid this problem, you have to set memqcache_maxcache larger. However, if you use shared memory as the cache storage, it must be lower than memqcache_cache_block_size; if you use memcached, it must be lower than the slab size (default 1 MB).
Specify a comma separated list of table names whose SELECT results are to be cached even if they are VIEWs or unlogged tables. You can use regular expressions (^ and $ are added automatically).

TABLEs and VIEWs that appear in both white_memqcache_table_list and black_memqcache_table_list are cached.

Specify a comma separated list of table names whose SELECT results are NOT to be cached. You can use regular expressions (^ and $ are added automatically).
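A sketch with hypothetical table names:

white_memqcache_table_list = 'ro_view_.*'   # cache SELECTs on these even though they are views
black_memqcache_table_list = 'audit_log'    # never cache SELECTs on this table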
Full path to the directory where the oids of tables used by SELECTs are stored. Under memqcache_oiddir there are directories named after database oids, and under each of them there are files named after the oids of the tables used by SELECTs. Each file stores pointers to the query cache; these are used as keys for deleting caches.

Directories and files under memqcache_oiddir are not deleted when pgpool-II restarts. If you start pgpool with "pgpool -C", pgpool starts without the old oidmap.
This section explains how to monitor the on memory query cache. To know whether a SELECT result came from the query cache or not, enable log_per_node_statement.

2012-05-01 15:42:09 LOG: pid 20181: query result fetched from cache. statement: select * from t1;

The "show pool_status" command shows the cache hit ratio:

memqcache_stats_start_time | Tue May 1 15:41:59 2012 | Start time of query cache stats
memqcache_no_cache_hits    | 80471                   | Number of SELECTs not hitting query cache
memqcache_cache_hits       | 36717                   | Number of SELECTs hitting query cache

In this example, the hit ratio can be calculated as follows:

memqcache_cache_hits / (memqcache_no_cache_hits + memqcache_cache_hits) = 36717 / (36717 + 80471) = 31.3%

The "show pool_cache" command shows the same information.
These are the parameters used with shared memory as the cache storage.

Specify the size of the shared memory cache storage in bytes.

Specify the number of cache entries. This is used to define the size of the cache management space (which you need in addition to memqcache_total_size). The management space size can be calculated as memqcache_max_num_cache * 48 bytes. A number that is too small causes errors while registering caches; on the other hand, a number that is too large just wastes space.

With shared memory as the cache storage, pgpool divides the memory into blocks of memqcache_cache_block_size bytes and packs SELECT results into these blocks. Because a SELECT result cannot span several blocks, it cannot be cached if it is larger than memqcache_cache_block_size. memqcache_cache_block_size must be greater than or equal to 512.
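For illustration, an arbitrary sizing sketch (not tuning advice):

memqcache_method           = 'shmem'
memqcache_total_size       = 67108864    # 64MB of shared memory for cached results
memqcache_max_num_cache    = 1000000     # management space: 1000000 * 48 bytes = 48MB, on top of the above
memqcache_cache_block_size = 1048576     # 1MB blocks; a single SELECT result must fit in one block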
These are the parameters used with memcached as the cache storage.

Specify the host name or IP address on which memcached runs. If it is the same host as pgpool-II, set 'localhost'.

Specify the port number of memcached. Default is 11211.
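A sketch of the memcached variant:

memqcache_method         = 'memcached'
memqcache_memcached_host = 'localhost'   # memcached running on the same host as pgpool-II
memqcache_memcached_port = 11211         # memcached's default port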
To use memcached as the cache storage, pgpool-II needs a working memcached and the client library libmemcached. They are easy to install from RPMs; this explains how to install them from source.

memcached's source code can be downloaded from: memcached development page

After extracting the source tarball, execute the configure script.

./configure

make
make install
libmemcached is a client library for memcached. You need to install libmemcached after installing memcached.

libmemcached's source code can be downloaded from: libmemcached development page

After extracting the source tarball, execute the configure script.

./configure

If you want non-default values, some options can be set:

--with-memcached=path

make
make install
All the backends and the System DB (if necessary) must be started before starting pgpool-II.
pgpool [-c][-f config_file][-a hba_file][-F pcp_config_file][-n][-D][-d]
-c, --clear-cache                          | deletes query cache
-f config_file, --config-file config_file  | specifies pgpool.conf
-a hba_file, --hba-file hba_file           | specifies pool_hba.conf
-F pcp_config_file, --pcp-password-file    | specifies pcp.conf
-n, --no-daemon                            | no daemon mode (terminal is not detached)
-D, --discard-status                       | discard pgpool_status file and do not restore previous status (V3.0 -)
-C, --clear-oidmaps                        | discard oid maps in memqcache_oiddir for on memory query cache (only when memqcache_method is 'memcached'; with shmem they are discarded whenever pgpool starts) (V3.2 -)
-d, --debug                                | debug mode
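For example, one way to run pgpool-II in the foreground with debug output while testing a configuration (the log file path is arbitrary):

$ pgpool -n -d > /tmp/pgpool.log 2>&1 &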
There are two ways to stop pgpool-II. One is using a PCP command (described later), the other is using a pgpool-II command. Below is an example of the pgpool-II command.

pgpool [-f config_file][-F pcp_config_file] [-m {s[mart]|f[ast]|i[mmediate]}] stop
-m s[mart], --mode s[mart]          | waits for clients to disconnect, then shuts down (default)
-m f[ast], --mode f[ast]            | does not wait for clients; shuts down immediately
-m i[mmediate], --mode i[mmediate]  | the same as '-m f'
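For example, to shut down without waiting for connected clients:

$ pgpool -m fast stop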
pgpool records the backend status into the [logdir]/pgpool_status file. When pgpool restarts, it reads this file and restores the backend status. This prevents a difference in data among DB nodes which might be caused by the following scenario:

If, for example, the stopped DB has been synced with the active DB by another means, pgpool_status can safely be removed before starting pgpool.
pgpool-II can reload configuration files without restarting.
pgpool [-c][-f config_file][-a hba_file][-F pcp_config_file] reload
-f config_file, --config-file config_file  | specifies pgpool.conf
-a hba_file, --hba-file hba_file           | specifies pool_hba.conf
-F pcp_config_file, --pcp-password-file    | specifies pcp.conf
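For example:

$ pgpool reload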
New configuration takes effect for new sessions after a change.
pgpool-II provides some information via the SHOW command. SHOW is a real SQL statement, but pgPool-II intercepts this command if it asks for specific pgPool-II information. Available options are:
Note : The term 'pool' refers to the pool of PostgreSQL sessions owned by one pgpool process, not the whole sessions owned by pgpool. !
the "pool_status" SQL statement was already available in previous releases, but the other ones have appeared in release 3.0.
!"SHOW pool_status" sends back the list of configuration parameters with their name, value, and description. Here is an excerpt of the result:
benchs2=# show pool_status; ! item | value | description ! -------------------------------+---------------------------------+------------------------------------------------------------------ ! listen_addresses | 127.0.0.1 | host name(s) or IP address(es) to listen to ! port | 9999 | pgpool accepting port number ! socket_dir | /tmp | pgpool socket directory ! num_init_children | 5 | # of children initially pre-forked ! child_life_time | 300 | if idle for this seconds, child exits! !
"SHOW pool_nodes" sends back a list of all configured nodes. It ! displays the node id, the hostname, the port, the status, and the weight (only meaningful ! if you use the load balancing mode). The possible values in the status column are explained in the pcp_node_info reference.
benchs2=# show pool_nodes; ! id | hostname | port | status | lb_weight ! ------+-------------+------+--------+----------- ! 0 | 127.0.0.1 | 5432 | 2 | 0.5 ! 1 | 192.168.1.7 | 5432 | 3 | 0.5 (2 lignes)! !
"SHOW pool_processes" sends back a list of all pgPool-II processes waiting for connections and dealing with a connection.
It has 6 columns:
This view will always return num_init_children lines.
benchs2=# show pool_processes; ! pool_pid | start_time | database | username | create_time | pool_counter ----------+---------------------+----------+-----------+---------------------+-------------- ! 8465 | 2010-08-14 08:35:40 | | | | 8466 | 2010-08-14 08:35:40 | benchs | guillaume | 2010-08-14 08:35:43 | 1 ! 8467 | 2010-08-14 08:35:40 | | | | ! 8468 | 2010-08-14 08:35:40 | | | | ! 8469 | 2010-08-14 08:35:40 | | | | (5 lines)!
"SHOW pool_pools" sends back a list of pools handled by pgPool-II. their name, value, and description. Here is an excerpt of the result:
It has 11 columns:
It'll always return num_init_children*max_pool lines.
! pool_pid | start_time | pool_id | backend_id | database | username | create_time | majorversion | minorversion | pool_counter | pool_backendpid | pool_connected ----------+---------------------+---------+------------+----------+-----------+---------------------+--------------+--------------+--------------+-----------------+---------------- ! 8465 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | 8466 | 2010-08-14 08:35:40 | 0 | 0 | benchs | guillaume | 2010-08-14 08:35:43 | 3 | 0 | 1 | 8473 | 1 ! 8466 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8466 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8466 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | (20 lines)!
"SHOW pool_version" displays a string containing the pgPool-II release number. Here is an example of it:
benchs2=# show pool_version; ! pool_version ------------------------ 3.0-dev (umiyameboshi) (1 line)
pgpool-II provides some information via the SHOW command. SHOW is a real SQL statement, but pgpool-II intercepts this command if it asks for specific pgpool-II information. Available options are:

Options other than "pool_status" have been added since pgpool-II 3.0.
Note : The term 'pool' refers to the pool of PostgreSQL sessions owned by one pgpool process, not the whole sessions owned by pgpool. !
the "pool_status" SQL statement was already available in previous releases, but the other ones have appeared in release 3.0.
! !"SHOW pool_status" sends back the list of configuration parameters with their name, value, and description. Here is an excerpt of the result: +
benchs2=# show pool_status; ! item | value | description ! --------------------------------------+--------------------------------+------------------------------------------------------------------ ! listen_addresses | localhost | host name(s) or IP address(es) to listen to ! port | 9999 | pgpool accepting port number ! socket_dir | /tmp | pgpool socket directory ! pcp_port | 9898 | PCP port # to bind ! pcp_socket_dir | /tmp | PCP socket directory! !
"SHOW pool_nodes" sends back a list of all configured nodes. It ! displays the node id, the hostname, the port, the status, the weight (only meaningful ! if you use the load balancing mode) and the role. The possible values in the status column are explained in the pcp_node_info reference. +
benchs2=# show pool_nodes; ! id | hostname | port | status | lb_weight | role ! ------+-------------+------+--------+-----------+--------- ! 0 | 127.0.0.1 | 5432 | 2 | 0.5 | primary ! 1 | 192.168.1.7 | 5432 | 3 | 0.5 | standby (2 lignes)! !
"SHOW pool_processes" sends back a list of all pgPool-II processes waiting for connections and dealing with a connection.
It has 6 columns: pool_pid, start_time, database, username, create_time and pool_counter.
This view will always return num_init_children lines.
benchs2=# show pool_processes; ! pool_pid | start_time | database | username | create_time | pool_counter ----------+---------------------+----------+-----------+---------------------+-------------- ! 8465 | 2010-08-14 08:35:40 | | | | 8466 | 2010-08-14 08:35:40 | benchs | guillaume | 2010-08-14 08:35:43 | 1 ! 8467 | 2010-08-14 08:35:40 | | | | ! 8468 | 2010-08-14 08:35:40 | | | | ! 8469 | 2010-08-14 08:35:40 | | | | (5 lines)! !
"SHOW pool_pools" sends back a list of pools handled by pgPool-II. their name, value, and description. Here is an excerpt of the result:
It has 11 columns:
It'll always return num_init_children * max_pool lines.
! pool_pid | start_time | pool_id | backend_id | database | username | create_time | majorversion | minorversion | pool_counter | pool_backendpid | pool_connected ----------+---------------------+---------+------------+----------+-----------+---------------------+--------------+--------------+--------------+-----------------+---------------- ! 8465 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8465 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | 8466 | 2010-08-14 08:35:40 | 0 | 0 | benchs | guillaume | 2010-08-14 08:35:43 | 3 | 0 | 1 | 8473 | 1 ! 8466 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8466 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8466 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8467 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8468 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 0 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 1 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 2 | 0 | | | | | | | | ! 8469 | 2010-08-14 08:35:40 | 3 | 0 | | | | | | | | (20 lines)! !
"SHOW pool_version" displays a string containing the pgPool-II release number. Here is an example of it:
benchs2=# show pool_version; ! pool_version ------------------------ 3.0-dev (umiyameboshi) (1 line)+ +
"SHOW pool_cache" displays cache storage statistics if on memory query cache is enabled. + Here is an example of it:
+ ++ test=# \x + \x + Expanded display is on. + test=# show pool_cache; + show pool_cache; + -[ RECORD 1 ]---------------+--------- + num_cache_hits | 891703 + num_selects | 99995 + cache_hit_ratio | 0.90 + num_hash_entries | 131072 + used_hash_entries | 99992 + num_cache_entries | 99992 + used_cache_enrties_size | 12482600 + free_cache_entries_size | 54626264 + fragment_cache_entries_size | 0 ++ +
*************** *** 2217,2223 ****
! A recovery target node must be in the detached state before doing online recovery. If you wish to add a PostgreSQL server dynamically, add 'backend_hostname' and --- 3167,3173 ----
! A recovery target node must be in the detached state before doing online recovery. If you wish to add a PostgreSQL server dynamically, add 'backend_hostname' and *************** *** 2229,2235 **** caution: Stop autovacuum on the master node (the first node which is up and running). Autovacuum may change the contents of the database and might cause inconsistency after online recovery if ! it's running. This applies only if you're recovering with a simple copy mecanism, such as the rsync one explained below. This doesn't apply if you're using PostgreSQL's PITR mechanism.
--- 3179,3185 ---- caution: Stop autovacuum on the master node (the first node which is up and running). Autovacuum may change the contents of the database and might cause inconsistency after online recovery if ! it's running. This applies only if you're recovering with a simple copy mechanism, such as the rsync one explained below. This doesn't apply if you're using PostgreSQL's PITR mechanism. *************** *** 2239,2321 ****! pgpool-II performs online recovery in two separated phases. There are a few ! seconds or minutes when client will be waiting to connect to pgpool-II while a recovery node synchronizes database. It follows these steps: - -
! The first step of data synchronization is called "first stage". Data is ! synchronized during the first stage. In the first stage, data can be ! updated or retrieved from any table concurrently.
! You can specify a script executed during the first stage. pgpool-II ! passes three arguments to the script. ! !
! Data synchronization is finalized during what is called "second ! stage". Before entering the second stage, pgpool-II waits until all ! clients have disconnected. It blocks any new incoming connection until ! the second stage is over. ! After all connections have terminated, pgpool-II merges updated data between ! the first stage and the second stage. This is the final data ! synchronization step.
! ! Note that there is a restriction about online recovery. If pgpool-II itself ! is installed on multiple hosts, online recovery does not work ! correctly, because pgpool-II has to stop all clients during the 2nd stage of ! online recovery. If there are several pgpool hosts, only one will have ! received the online recovery command and will block connections.
!Set the following parameters for online recovery in pgpool.conf. - -
! You need to install the following C language function for online recovery into the "template1" database of all backend nodes. Its source code is in pgpool-II tarball.
! pgpool-II-x.x.x/sql/pgpool-recovery/
--- 3189,3271 ----
! pgpool-II performs online recovery in two separate phases. Clients may have to wait a few ! seconds or minutes to connect to pgpool-II while a recovery node synchronizes its database. It follows these steps:
+! The first step of data synchronization is called "first stage". Data is ! synchronized during the first stage. In the first stage, data can be ! updated or retrieved from any table concurrently.
! You can specify a script executed during the first stage. pgpool-II ! passes three arguments to the script.
!! Data synchronization is finalized during what is called "second stage". ! Before entering the second stage, pgpool-II waits until all clients have disconnected. ! It blocks any new incoming connection until the second stage is over. !
!! After all connections have terminated, pgpool-II merges updated data between ! the first stage and the second stage. This is the final data ! synchronization step.
! ! Note that there is a restriction about online recovery. If pgpool-II itself ! is installed on multiple hosts, online recovery does not work correctly, ! because pgpool-II has to stop all clients during the 2nd stage of ! online recovery. If there are several pgpool hosts, only one will have received ! the online recovery command and will block connections.
!Set the following parameters for online recovery in pgpool.conf.
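The parameters in question are described in the pgpool.conf reference. As a minimal sketch only (the values are placeholders, and the two stage commands are the scripts prepared later in this section):

    recovery_user = 'postgres'                          # PostgreSQL user used for online recovery
    recovery_password = 'secret'                        # password of recovery_user
    recovery_1st_stage_command = 'copy-base-backup'     # first stage script, deployed in $PGDATA
    recovery_2nd_stage_command = 'pgpool_recovery_pitr' # second stage script, deployed in $PGDATA
    recovery_timeout = 90                               # give up recovery after this many seconds
    client_idle_limit_in_recovery = -1                  # handling of idle clients during the 2nd stage (see reference)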
+! You need to install the following C language function for online recovery into the "template1" database of all backend nodes. Its source code is in pgpool-II tarball.
! pgpool-II-x.x.x/sql/pgpool-recovery/
*************** *** 2323,2330 ****
! % cd pgpool-II-x.x.x/sql/pgpool-recovery/ ! % make install
--- 3273,3280 ----
! % cd pgpool-II-x.x.x/sql/pgpool-recovery/ ! % make install
*************** *** 2332,2414 ****
! % cd pgpool-II-x.x.x/sql/pgpool-recovery/ ! % psql -f pgpool-recovery.sql template1!
We must deploy some data sync scripts and a remote start script into the database cluster directory ($PGDATA). Sample script files are available in ! pgpool-II-x.x.x/sample directory.
!! Here is how to do online recovery by Point In Time Recovery (PITR), which is available in PostgreSQL 8.2 and later versions. Note that all PostgreSQL servers involved need to have PITR enabled.
A script to get a base backup on a master node and copy it to a recovery target node on the first stage is needed. The script can be named "copy-base-backup" for example. Here is the sample script. -
! #! /bin/sh ! DATA=$1 ! RECOVERY_TARGET=$2 ! RECOVERY_DATA=$3 ! psql -c "select pg_start_backup('pgpool-recovery')" postgres ! echo "restore_command = 'scp $HOSTNAME:/data/archive_log/%f %p'" > /data/recovery.conf ! tar -C /data -zcf pgsql.tar.gz pgsql ! psql -c 'select pg_stop_backup()' postgres ! scp pgsql.tar.gz $RECOVERY_TARGET:$RECOVERY_DATA
! This script puts the master database in backup mode, generates the following ! recovery.conf:
restore_command = 'scp master:/data/archive_log/%f %p'performs the backup, then puts the master database out of backup mode and copies the backup on the chosen target node.
The second stage of the procedure is a script to force an XLOG file switch. This script is named "pgpool_recovery_pitr" here. It enforces a switch of the transaction log. For this purpose, pg_switch_xlog could be used. ! However it may return before the switch is done and this might lead to failure of the online recovery procedure. ! Pgpool-II provides a safer function called "pgpool_switch_xlog" which will wait until the transaction log switching is actually finished. ! pgpool_switch_xlog is installed during the procedure performed in the Installing C functions section.
Here is the sample script.
-
#! /bin/sh # Online recovery 2nd stage script # ! datadir=$1 # master dabatase cluster ! DEST=$2 # hostname of the DB node to be recovered ! DESTDIR=$3 # database cluster of the DB node to be recovered ! port=5432 # PostgreSQL port number ! archdir=/data/archive_log # archive log directory ! # Force to flush current value of sequences to xlog psql -p $port -t -c 'SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn' template1| while read i do --- 3282,3370 ----! % cd pgpool-II-x.x.x/sql/pgpool-recovery/ ! % psql -f pgpool-recovery.sql template1!Recovery script deployment
We must deploy some data sync scripts and a remote start script into the database cluster directory ($PGDATA). Sample script files are available in ! pgpool-II-x.x.x/sample directory.
!Online recovery by PITR
! Here is how to do online recovery by Point In Time Recovery (PITR), which is available in PostgreSQL 8.2 and later versions. Note that all PostgreSQL servers involved need to have PITR enabled.
+1st stage
+A script to get a base backup on a master node and copy it to a recovery target node on the first stage is needed. The script can be named "copy-base-backup" for example. Here is the sample script.
! #! /bin/sh ! DATA=$1 ! RECOVERY_TARGET=$2 ! RECOVERY_DATA=$3 ! psql -c "select pg_start_backup('pgpool-recovery')" postgres ! echo "restore_command = 'scp $HOSTNAME:/data/archive_log/%f %p'" > /data/recovery.conf ! tar -C /data -zcf pgsql.tar.gz pgsql ! psql -c 'select pg_stop_backup()' postgres ! scp pgsql.tar.gz $RECOVERY_TARGET:$RECOVERY_DATA! This script puts the master database in backup mode, generates the following recovery.conf:
restore_command = 'scp master:/data/archive_log/%f %p'+ +performs the backup, then puts the master database out of backup mode and copies the backup on the chosen target node.
+2nd stage
+The second stage of the procedure is a script to force an XLOG file switch. This script is named "pgpool_recovery_pitr" here. It enforces a switch of the transaction log. For this purpose, pg_switch_xlog could be used. !
!V3.1 - ! However it may return before the switch is done and this might lead to failure of the online recovery procedure. ! Pgpool-II provides a safer function called "pgpool_switch_xlog" which will wait until the transaction log switching is actually finished. ! pgpool_switch_xlog is installed during the procedure performed in the Installing C functions section.
+Here is the sample script.
#! /bin/sh # Online recovery 2nd stage script # ! datadir=$1 # master dabatase cluster ! DEST=$2 # hostname of the DB node to be recovered ! DESTDIR=$3 # database cluster of the DB node to be recovered ! port=5432 # PostgreSQL port number ! archdir=/data/archive_log # archive log directory ! # Force to flush current value of sequences to xlog psql -p $port -t -c 'SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn' template1| while read i do *************** *** 2423,2560 ****This flushing of sequences is only useful in replication mode: in this case, sequences have to have the ! same starting point on all nodes. It's not useful in master-slave mode. ! The loop in the script forces PostgreSQL to emit current value of all sequences ! in all databases in the master node to the transaction log so that it is propagated to the recovery target node. -
We deploy these scripts into the $PGDATA directory.
Finally, we edit pgpool.conf. !
! recovery_1st_stage_command = 'copy-base-backup' ! recovery_2nd_stage_command = 'pgpool_recovery_pitr' !!! We have finished preparing online recovery by PITR.
!pgpool_remote_start
! This script starts up the remote host's postmaster process. ! pgpool-II executes it the following way.
!! % pgpool_remote_start remote_host remote_datadir ! remote_host: Hostname of a recovery target. ! remote_datadir: Database cluster path of a recovery target. !!! In this sample script, we start up the postmaster process over ssh. ! So you need to be able to connect over ssh without a password for it to work.
!! If you recover with PITR, you need to deploy a base backup. PostgreSQL will ! automatically start up doing a PITR recovery. Then it will accept ! connections. !
!! #! /bin/sh ! DEST=$1 ! DESTDIR=$2 ! PGCTL=/usr/local/pgsql/bin/pg_ctl ! # Deploy a base backup ! ssh -T $DEST 'cd /data/; tar zxf pgsql.tar.gz' 2>/dev/null 1>/dev/null < /dev/null ! # Startup PostgreSQL server ! ssh -T $DEST $PGCTL -w -D $DESTDIR start 2>/dev/null 1>/dev/null < /dev/null & !!Online recovery with rsync.
!! PostgreSQL 7.4 does not have PITR. rsync can be used to do online ! recovery. In the "sample" directory of pgpool-II's tarball, ! there is a recovery script named "pgpool_recovery". It uses the rsync ! command. pgpool-II calls the script with three arguments. !
!! % pgpool_recovery datadir remote_host remote_datadir ! datadir: Database cluster path of a master node. ! remote_host: Hostname of a recovery target node. ! remote_datadir: Database cluster path of a recovery target node. !!! This script copies physical files with rsync over ssh. So you need to ! be able to connect over ssh without a password. !
!! Note about rsync: !
! If you use pgpool_recovery, add the following lines into pgpool.conf. !
! recovery_1st_stage_command = 'pgpool_recovery' ! recovery_2nd_stage_command = 'pgpool_recovery' !! !
! In order to do online recovery, use the pcp_recovery_node command or ! pgpoolAdmin. !
!! Note that you need to pass a large number to the first argument of ! pcp_recovery_node. It is the timeout parameter in seconds. If you use ! pgpoolAdmin, set "_PGPOOL2_PCP_TIMEOUT " parameter to a large number in ! pgmgt.conf.php. !
This section describes problems and their workarounds while you are using pgpool-II.
--
! Pgpool-II's health checking feature detects DB nodes failure. !
!
2010-07-23 16:42:57 ERROR: pid 20031: health check failed. 1 th host foo at port 5432 is down 2010-07-23 16:42:57 LOG: pid 20031: set 1 th backend down status --- 3379,4026 ----This flushing of sequences is only useful in replication mode: in this case, sequences have to have the ! same starting point on all nodes. It's not useful in master-slave mode. !
!The loop in the script forces PostgreSQL to emit the current value of all sequences ! in all databases on the master node to the transaction log so that it is propagated to the recovery target node.
We deploy these scripts into the $PGDATA directory.
Finally, we edit pgpool.conf. +
+ ++ recovery_1st_stage_command = 'copy-base-backup' + recovery_2nd_stage_command = 'pgpool_recovery_pitr' ++ ++ We have finished preparing online recovery by PITR. +
+ +pgpool_remote_start
++ This script starts up the remote host's postmaster process. + pgpool-II executes it the following way. +
+ ++ % pgpool_remote_start remote_host remote_datadir + remote_host: Hostname of a recovery target. + remote_datadir: Database cluster path of a recovery target. ++ ++ In this sample script, we start up the postmaster process over ssh. + So you need to be able to connect over ssh without a password for it to work. +
+ ++ If you recover with PITR, you need to deploy a base backup. PostgreSQL will + automatically start up doing a PITR recovery. Then it will accept connections. +
+ ++ #! /bin/sh + DEST=$1 + DESTDIR=$2 + PGCTL=/usr/local/pgsql/bin/pg_ctl + + # Deploy a base backup + ssh -T $DEST 'cd /data/; tar zxf pgsql.tar.gz' 2>/dev/null 1>/dev/null < /dev/null + # Startup PostgreSQL server + ssh -T $DEST $PGCTL -w -D $DESTDIR start 2>/dev/null 1>/dev/null < /dev/null & ++ +Online recovery with rsync.
++ PostgreSQL 7.4 does not have PITR, + and PostgreSQL 8.0 and 8.1 cannot force a transaction log switch, + so rsync can be used to do online recovery instead. + In the "sample" directory of pgpool-II's tarball, + there is a recovery script named "pgpool_recovery". It uses the rsync + command. pgpool-II calls the script with three arguments. +
+ ++ % pgpool_recovery datadir remote_host remote_datadir + datadir: Database cluster path of a master node. + remote_host: Hostname of a recovery target node. + remote_datadir: Database cluster path of a recovery target node. ++ ++ This script copies physical files with rsync over ssh. So you need to + be able to connect over ssh without a password. +
+ ++ Note about rsync: +
+ If you use pgpool_recovery, add the following lines into pgpool.conf. +
+ ++ recovery_1st_stage_command = 'pgpool_recovery' + recovery_2nd_stage_command = 'pgpool_recovery' ++ +
+ In order to do online recovery, use the pcp_recovery_node command or + pgpoolAdmin. +
+ ++ Note that you need to pass a large number to the first argument of + pcp_recovery_node. It is the timeout parameter in seconds. If you use + pgpoolAdmin, set "_PGPOOL2_PCP_TIMEOUT " parameter to a large number in + pgmgt.conf.php. +
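For example, a recovery of node 1 could be started as follows; the host, port, user, password and node id are placeholders, and 86400 is just an arbitrarily large timeout as recommended above:

    $ pcp_recovery_node 86400 localhost 9898 postgres password 1

The exact format of pcp_recovery_node is given in the PCP command reference later in this document.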
+ ++ You can upgrade PostgreSQL on each node without stopping pgpool-II if pgpool-II is operating in replication mode. + Please note that active sessions from clients to pgpool-II will be disconnected + while DB nodes are detached and re-attached. + Also please note that you cannot do a major version upgrade with the method described below + (i.e. the upgrade must not require a dump/restore). +
+ ++
+ Prepare online recovery.
+Upgrade the nodes which are not the master node first. + Stop PostgreSQL on a non-master node. + Pgpool-II will detect the PostgreSQL termination, degenerate the node and emit the logs below. + At this point all sessions connected to pgpool-II are disconnected. +
++ 2010-07-27 16:32:29 LOG: pid 10215: set 1 th backend down status + 2010-07-27 16:32:29 LOG: pid 10215: starting degeneration. shutdown host localhost(5433) + 2010-07-27 16:32:29 LOG: pid 10215: failover_handler: set new master node: 0 + 2010-07-27 16:32:29 LOG: pid 10215: failover done. shutdown host localhost(5433) ++
+ Upgrade PostgreSQL on the stopped node. + You can overwrite the old PostgreSQL installation, but we recommend moving the old installation somewhere else + so that you can restore it just in case. +
++ If you install the new PostgreSQL in a different location from the old one and + do not want to update your recovery scripts, + you need to make the paths match, for example with a symbolic link. + If you choose to overwrite, you can skip the following steps up to the C function installation step + and execute online recovery immediately. +
++ Move the installation directory of the old PostgreSQL out of the way. + In the following description the PostgreSQL installation directory is assumed to be /usr/local/pgsql. +
++ $ mv /usr/local/pgsql /usr/local/pgsql-old ++
++ Create a symbolic link to the location where the newer version of PostgreSQL is installed. + This allows you to keep using your current command search path. + In the following description the installation directory of the newer PostgreSQL is assumed to be /usr/local/pgsql-new. +
++ $ ln -s /usr/local/pgsql-new /usr/local/pgsql ++
++ If the database directory is located under the older PostgreSQL installation directory, + you should create a link to it or copy it so that the newer PostgreSQL can access it. + We use a symbolic link in the following example. +
++ $ ln -s /usr/local/pgsql-old/data /usr/local/pgsql/data ++
++ Install the C functions into PostgreSQL. The "Installing C functions" section may help you. + Because online recovery copies the database cluster, the last step of installing the functions + with psql is not necessary; + running make install is enough, as shown below. +
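For reference, these are the same commands shown in the "Installing C functions" section, run against the new source tree:

    % cd pgpool-II-x.x.x/sql/pgpool-recovery/
    % make install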
++ Perform online recovery. The upgrade of this node is now done. + To execute online recovery, you can use pcp_recovery_node or pgpoolAdmin. +
++ Repeat the steps above on each node. The master node should be upgraded last. + You are then done. +
++ You can upgrade standby PostgreSQL servers without stopping pgpool-II. +
+ ++ The procedure to upgrade standby PostgreSQL servers is the same as in replication mode. + Please refer to "Online recovery with Streaming Replication" to set up recovery_1st_stage_command and recovery_2nd_stage_command. +
+ ++ You cannot upgrade the primary server without stopping pgpool-II; + you need to stop pgpool-II while upgrading the primary server. + The procedure to upgrade the primary PostgreSQL server itself is the same as for a standby server. + The overall procedure is as follows: +
++ To back up backend PostgreSQL servers and the system DB, you can use physical + backup, logical backup (pg_dump, pg_dumpall) and PITR in the same manner as with plain PostgreSQL. + Please note that logical backup and PITR should be performed directly against PostgreSQL, + rather than via pgpool-II, to avoid errors caused by load_balance_mode and + replicate_select. +
+ ++ If pgpool-II is operated in replication mode or master/slave mode, take a backup on one of the DB nodes in the cluster. +
+ ++ If you are using master/slave mode with an asynchronous replication + system (Slony-I or streaming replication) and need the latest backup, + you should take the backup on the master node. +
+ ++ pg_dump takes an ACCESS SHARE lock on the database. Commands taking an ACCESS + EXCLUSIVE lock, such as ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, + CLUSTER and VACUUM FULL, will wait for the completion of pg_dump because + of the lock conflict. Also, this may affect the primary node even if you are + running pg_dump on a standby. +
+ ++ If you are using parallel mode and need to take a consistent backup, you need to stop pgpool-II. +
+ ++ To use logical backup, stop the applications and pgpool-II, then perform pg_dump or pg_dumpall on all nodes. + After the backup has finished, start pgpool-II and then the applications. +
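As a sketch (the host name and output file are placeholders), a logical backup taken directly against one backend node, bypassing pgpool-II, could look like this:

    $ pg_dumpall -h backend0.example.com -p 5432 -U postgres > /backup/backend0.sql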
+ ++ To use PITR, please make sure that the system times are identical on all nodes. + Prepare archive logging and take a base backup. + After the backup has finished, stop and restart the applications or pgpool-II, + and record the stop and start times. + This temporary stop produces a consistent state across the whole cluster. + If you need to restore from the base backup and archive logs, + set recovery_target_time in recovery.conf to a point between the stop and start times. +
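For illustration only (the timestamp is a placeholder; pick a time between the recorded stop and start, and reuse your existing restore_command), the relevant recovery.conf lines might look like:

    restore_command = 'scp master:/data/archive_log/%f %p'
    recovery_target_time = '2010-08-14 09:00:00'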
+ ++ You need to back up the system DB if pgpool-II is operated in parallel query + mode or if you are using the query cache. + Back up the database specified by system_db_dbname in pgpool.conf. +
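As a sketch (the host is a placeholder, and 'pgpool' is assumed to be the value of system_db_dbname; adjust to your pgpool.conf):

    $ pg_dump -h systemdb.example.com -p 5432 -U pgpool pgpool > /backup/systemdb.sql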
+ + + + + ++ pgpool-II can run on a dedicated server, on the server where the application + server is running, or on other servers. In this section we discuss these + deployment options and their pros and cons. +
+ ++
+ Pgpool-II runs on a dedicated server. + This is simple, and pgpool-II is not affected by other server software. + The obvious downside is that you need to buy more hardware. + Also, pgpool-II can be a single point of failure with this configuration + (you can avoid this by using pgpool-HA, described below). +
++ pgpool-II is deployed on a server where Apache, JBoss, Tomcat or another web + or application server is running. + Since communication between pgpool-II and the web or application servers + stays on the local machine, socket communication can be faster than inter-server communication. + Also, if you are using multiple web or application servers, you can avoid + the single point of failure problem + (in this case you must have an identical pgpool.conf on each pgpool-II instance). + There are several things you should take care of in this configuration: +
++ pgpool-II runs on the same server as PostgreSQL. + You can avoid the single point of failure problem of pgpool-II with this configuration, + and obviously you do not need to buy an additional dedicated server. + The problem with this configuration is that applications need to be aware of which DB server they should connect to. + To solve this problem you can use a virtual IP with pgpool-HA. +
++ Pgpool-HA is high availability software for pgpool-II based on heartbeat. + Pgpool-HA, like pgpool-II, is a sub-project of the pgpool project. + Pgpool-HA is available from the pgpool development site as open source software. +
+ + + + + ++ "Watchdog" is a sub process of pgpool-II aiming for adding high availability feature to it. + Features added by watchdog include: +
+ ++ Watchdog monitors the responses of the pgpool service rather than the process itself. + It sends queries to PostgreSQL via the pgpool being monitored + and checks the response. +
++ Watchdog also monitors connections from the pgpool to upstream servers + (application servers etc.). + The connection between the upstream servers and the pgpool is monitored as part of the pgpool service. +
++ Watchdog processes exchange information on the monitored servers + to keep the information up to date, + and to allow watchdog processes to mutually monitor each other. +
++ When a fault is detected in the pgpool service, watchdog notifies the other watchdogs of it. + If the previously active pgpool is broken, the watchdogs decide on the new active pgpool + by voting, and change the active/standby states accordingly. +
++ When a standby pgpool server is promoted to active, the new active server brings up the virtual IP interface. + Meanwhile, the previous active server brings down the virtual IP interface. + This enables the active pgpool to keep working under the same IP address even when the servers are switched over. +
+! When a broken server recovers or a new server is attached, ! the watchdog process notifies the other watchdog processes of the information about the new server, ! and receives information on the active server and the other servers. ! Then, the attached server is registered as a standby. !
!! The figure below describes how pgpool-II and the watchdog processes are configured.
+! The watchdog process is started/stopped automatically by pgpool-II, ! therefore there is no dedicated command to start/stop it. ! However, pgpool-II must be started with root privileges ! because the watchdog process needs to control the virtual IP interface.
!! Watchdog configuration parameters are described in pgpool.conf. ! There is a sample configuration in the WATCHDOG section of the pgpool.conf.sample file. ! All of the following options are required for the watchdog process; a configuration sketch follows the parameter descriptions below.
!! The list of trusted servers used to check the upstream connections. ! Each server is required to respond to ping. ! Specify a comma-separated list of servers such as "hostA,hostB,hostC". !
!! This parameter specifies the path to the ping command used for monitoring the connection ! to the upstream servers. ! Set only the directory path, such as "/bin". !
!! This parameter specifies the interval between life checks of pgpool-II ! in seconds ! (a number greater than or equal to 1). !
!! The number of times to retry a failed life check of pgpool-II ! (a number greater than or equal to 1). !
!! The actual query used to check pgpool-II. ! The default is "SELECT 1". !
!! Specifies the virtual IP address (VIP) of pgpool-II that client servers ! (application servers etc.) connect to. ! When a pgpool is switched from standby to active, it takes over this VIP. !
!! This parameter specifies the path to the command used to switch the IP address. ! Set only the directory path, such as "/sbin". !
!! This parameter specifies the command used to bring up the virtual IP. ! Set the command and its parameters, such as ! "ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0". ! $_IP_$ is replaced by the IP address specified in delegate_IP. !
!! This parameter specifies the command used to bring down the virtual IP. ! Set the command and its parameters, such as ! "ifconfig eth0:0 down". !
!! This parameter specifies the path to the command used to send an ARP request ! after the virtual IP is switched. ! Set only the directory path, such as "/usr/sbin". !
!! This parameter specifies the command used to send an ARP request ! after the virtual IP is switched. ! Set the command and its parameters, such as ! "arping -U $_IP_$ -w 1". ! $_IP_$ is replaced by the IP address specified in delegate_IP. !
!! Specifies the hostname or IP address for mutual monitoring of watchdog processes. !
!! Specifies the port number for mutual monitoring of watchdog processes. !
!! Specifies the hostname of the pgpool-II server to be monitored. ! The number at the end of the parameter name is referred to as the "server id", ! and it starts from 0. !
!! Specifies the port number for pgpool on the pgpool-II server to be monitored. ! The number at the end of the parameter name is referred to as the "server id", ! and it starts from 0. !
!! Specifies the port number for watchdog on the pgpool-II server to be monitored. ! The number at the end of the parameter name is referred to as the "server id", ! and it starts from 0. !
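As an illustrative sketch only: the parameter names below follow the WATCHDOG section of pgpool.conf.sample as remembered and may differ between releases, and every value is a placeholder:

    trusted_servers = 'hostA,hostB,hostC'   # upstream servers that must answer ping
    ping_path = '/bin'                      # directory containing the ping command
    wd_hostname = 'pgpool0.example.com'     # this watchdog's own host (mutual monitoring)
    wd_port = 9000                          # this watchdog's own port (mutual monitoring)
    wd_interval = 10                        # life check interval in seconds
    wd_life_point = 3                       # retries before a life check is considered failed
    wd_lifecheck_query = 'SELECT 1'         # query used for the life check
    delegate_IP = '192.168.1.100'           # virtual IP taken over by the active pgpool
    ifconfig_path = '/sbin'                 # directory containing ifconfig
    if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
    if_down_cmd = 'ifconfig eth0:0 down'
    arping_path = '/usr/sbin'               # directory containing arping
    arping_cmd = 'arping -U $_IP_$ -w 1'
    other_pgpool_hostname0 = 'pgpool1.example.com'  # peer pgpool to monitor (server id 0)
    other_pgpool_port0 = 9999                       # pgpool port on that peer
    other_wd_port0 = 9000                           # watchdog port on that peer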
!This section describes problems and their workarounds while you are using pgpool-II.
!! Pgpool-II's health checking feature detects DB nodes failure. !
!2010-07-23 16:42:57 ERROR: pid 20031: health check failed. 1 th host foo at port 5432 is down 2010-07-23 16:42:57 LOG: pid 20031: set 1 th backend down status *************** *** 2562,2661 **** 2010-07-23 16:42:58 LOG: pid 20031: failover_handler: set new master node: 0 2010-07-23 16:42:58 LOG: pid 20031: failover done. shutdown host foo(5432)- -
- The log shows that the DB node 1 (host foo) goes down and disconnected (shutdown) - from pgpool, and then that DB node 0 becomes new master. - Check DB node 1 and remove the cause of failure. After that perform an online - recovery against DB node 1 if possible. -
!
2010-07-26 18:43:24 LOG: pid 24161: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited!
! This log indicates that the frontend program didn't disconnect properly from ! pgpool-II. ! The possible causes are: bugs of client applications, forced termination (kill) ! of a client application, or temporary network failure. ! This kind of events don't lead to a DB destruction or data consistency problem. ! It's only a warning about a protocol violation. ! It is advised that you check the applications and networks if the ! message keeps on occurring. !
!! It is possible that you get this error when pgpool-II operates in ! replication mode. !
2010-07-22 14:18:32 ERROR: pid 9966: kind mismatch among backends. Possible last query was: "FETCH ALL FROM c;" kind details are: 0[T] 1[E: cursor "c" does not exist]!
! Pgpool-II waits for responses from the DB nodes after sending an SQL command ! to them. ! This message indicates that not all DB nodes returned the same kind of response. ! You'll get the SQL statement which possibly caused the error after "Possible last query ! was:". ! Then the kind of response follows. If the response indicates an error, the ! error message from PostgreSQL is shown. ! Here you see "0[T]" displaying the DB node responses: "0[T]"(starting ! to send row description), and "1[E" indicates that DB node 1 returns an ! error with message "cursor "c" does not exist", while DB node 0 sends a row ! description. !
! Caution: You will see this error when operating in master/slave mode as ! well. ! For example, even in the master/slave mode, SET command will be basically sent ! to all DB nodes to keep all the DB nodes in the same state. !
!! Check the databases and resync them using online recovery if you find ! that they are out of sync. !
!!
! In replication mode, pgpool-II detects a different number of ! INSERT/UPDATE/DELETE rows on affected nodes. !
!
2010-07-22 11:49:28 ERROR: pid 30710: pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "update t1 set i = 1;" 2010-07-22 11:49:28 LOG: pid 30710: ReadyForQuery: Degenerate backends: 1 2010-07-22 11:49:28 LOG: pid 30710: ReadyForQuery: Affected tuples are: 0 1!
! In the example above, the returned number of updated rows by "update t1 set i = 1" ! was different among DB nodes. ! The next line indicates that DB 1 got degenerated (disconnected) as a ! consequence, and that the number of affected rows for DB node 0 was 0, while ! for DB node 1 that was 1. !
!! Stop the DB node which is suspected of having wrong data and do an online ! recovery. !
!
--- 4028,4129 ---- 2010-07-23 16:42:58 LOG: pid 20031: failover_handler: set new master node: 0 2010-07-23 16:42:58 LOG: pid 20031: failover done. shutdown host foo(5432) !
! The log shows that the DB node 1 (host foo) goes down and disconnected (shutdown) ! from pgpool, and then that DB node 0 becomes new master. ! Check DB node 1 and remove the cause of failure. After that perform an online ! recovery against DB node 1 if possible. !
! ! !2010-07-26 18:43:24 LOG: pid 24161: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited!
! This log indicates that the frontend program didn't disconnect properly from ! pgpool-II. ! The possible causes are: bugs in client applications, forced termination (kill) ! of a client application, or temporary network failure. ! This kind of event does not lead to DB destruction or a data consistency problem; ! it is only a warning about a protocol violation. ! It is advised that you check the applications and networks if the ! message keeps occurring. !
!! It is possible that you get this error when pgpool-II operates in replication mode. !
2010-07-22 14:18:32 ERROR: pid 9966: kind mismatch among backends. Possible last query was: "FETCH ALL FROM c;" kind details are: 0[T] 1[E: cursor "c" does not exist]!
! Pgpool-II waits for responses from the DB nodes after sending an SQL command to them. ! This message indicates that not all DB nodes returned the same kind of response. ! You'll get the SQL statement which possibly caused the error after "Possible last query was:". ! Then the kind of response follows. If the response indicates an error, the ! error message from PostgreSQL is shown. ! Here you see "0[T]" displaying the DB node responses: "0[T]" ! (starting to send row description), and "1[E" indicates that DB node 1 returns an error ! with message "cursor "c" does not exist", while DB node 0 sends a row description. !
!! Caution: You will see this error when operating in master/slave mode as well. ! For example, even in the master/slave mode, SET command will be basically sent to all DB nodes ! to keep all the DB nodes in the same state. !
!! Check the databases and re-sync them using online recovery if you find ! that they are out of sync. !
!! In replication mode, pgpool-II detects a different number of ! INSERT/UPDATE/DELETE rows on affected nodes. !
2010-07-22 11:49:28 ERROR: pid 30710: pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "update t1 set i = 1;" 2010-07-22 11:49:28 LOG: pid 30710: ReadyForQuery: Degenerate backends: 1 2010-07-22 11:49:28 LOG: pid 30710: ReadyForQuery: Affected tuples are: 0 1!
! In the example above, the returned number of updated rows by "update t1 set i = 1" ! was different among DB nodes. ! The next line indicates that DB 1 got degenerated (disconnected) as a consequence, ! and that the number of affected rows for DB node 0 was 0, ! while for DB node 1 that was 1. !
!! Stop the DB node which is suspected of having wrong data and do an online recovery. !
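As a sketch (host, port, user, password and node id are placeholders), detaching the suspect node and then recovering it can be done with the PCP commands described later in this document:

    $ pcp_detach_node 10 localhost 9898 postgres password 1
    $ pcp_recovery_node 86400 localhost 9898 postgres password 1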
!!
*************** *** 2663,2735 ****
! pgpool-II 2.3.2 or later supports large object replication if the backend is PostgreSQL 8.1 or later. For this, you need to enable lobj_lock_table directive in pgpool.conf. Large object replication using backend function lo_import is not supported, however.
!Creating/inserting/updating/deleting temporary tables are always executed on the master(primary). With pgpool-II 3.0 or later, SELECT on these tables is executed on master as well. ! However if the temporary table name is used as a literal in SELECT, there's no way to detect it, and the SELECT will be load balanced. ! That will trigger a "not found the table" error or will find another table having same name. To avoid the problem, use /*NO LOAD BALANCE*/ SQL comment.
!
Sample SELECT which causes a problem: SELECT 't1'::regclass::oid;!
psql's \d command uses literal table names. ! pgpool-II 3.0 or later checks if the SELECT includes any access to system catalogs and always send these queries to the master. Thus we avoid the problem.
!There is no guarantee that any data provided using a context-dependent mechanism (e.g. random number, transaction ID, OID, SERIAL, sequence), will be replicated correctly on multiple backends.
For SERIAL, enabling insert_lock will help replicating data. --- 4131,4200 ----
! pgpool-II 2.3.2 or later supports large object replication if the backend is PostgreSQL 8.1 or later. For this, you need to enable lobj_lock_table directive in pgpool.conf. Large object replication using backend function lo_import is not supported, however.
!Creating/inserting/updating/deleting temporary tables is always executed on the master (primary). With pgpool-II 3.0 or later, SELECT on these tables is executed on the master as well. ! However, if the temporary table name is used as a literal in a SELECT, there's no way to detect it, and the SELECT will be load balanced. ! That will trigger a "table not found" error or will find another table having the same name. To avoid the problem, use the /*NO LOAD BALANCE*/ SQL comment, as shown after the sample query below.
!Sample SELECT which causes a problem: SELECT 't1'::regclass::oid;!
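Applying the workaround mentioned above to that query, the comment is simply put in front of the statement so that pgpool-II sends it to the master only:

    /*NO LOAD BALANCE*/ SELECT 't1'::regclass::oid;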
psql's \d command uses literal table names. ! pgpool-II 3.0 or later checks if the SELECT includes any access to system catalogs and always send these queries to the master. Thus we avoid the problem.
!There is no guarantee that any data provided using a context-dependent mechanism (e.g. random number, transaction ID, OID, SERIAL, sequence), will be replicated correctly on multiple backends.
For SERIAL, enabling insert_lock will help replicating data. *************** *** 2739,2793 **** CURRENT_DATE, now() will be replicated correctly. INSERT/UPDATE for tables using CURRENT_TIMESTAMP, CURRENT_DATE, now() as their DEFAULT values will also be replicated correctly. ! This is done by replacing those functions by constants fetched from master at query execution time. There are a few limitations however:
CREATE TABLE rel1( d1 date DEFAULT CURRENT_DATE + 1 )! is treated the same as:
CREATE TABLE rel1( d1 date DEFAULT CURRENT_DATE )!
! Please note that if the column type is not a temporal one, rewriting is ! not performed. Such example:
foo bigint default (date_part('epoch'::text,('now'::text)::timestamp(3) with time zone) * (1000)::double precision)! !
CREATE TABLE rel1( c1 int, c2 timestamp default now() )! We can replicate
! INSERT INTO rel1(c1) VALUES(1)! since this turn into
INSERT INTO rel1(c1, c2) VALUES(1, '2009-01-01 23:59:59.123456+09')! However,
INSERT INTO rel1(c1) SELECT 1! cannot to be transformed, thus cannot be properly replicated in the current implementation. ! Values will still be inserted, with no transformation at all.
Tables created by CREATE TEMP TABLE
will be deleted at the end of
! the session by specifying DISCARD ALL in reset_query_list if you are using
PostgreSQL 8.3 or later.
--- 4204,4272 ---- CURRENT_DATE, now() will be replicated correctly. INSERT/UPDATE for tables using CURRENT_TIMESTAMP, CURRENT_DATE, now() as their DEFAULT values will also be replicated correctly. ! This is done by replacing those functions by constants fetched from master at query execution time. There are a few limitations however: +
+CREATE TABLE rel1( d1 date DEFAULT CURRENT_DATE + 1 )! is treated the same as:
CREATE TABLE rel1( d1 date DEFAULT CURRENT_DATE )!
! pgpool-II 3.1 or later handles these cases correctly. ! Thus the column "d1" will have tomorrow as the default value. ! However, this enhancement does not apply if the extended protocol (used in JDBC and ! PHP PDO, for example) or PREPARE is used. !
!! Please note that if the column type is not a temporal one, rewriting is ! not performed. Such example: !
foo bigint default (date_part('epoch'::text,('now'::text)::timestamp(3) with time zone) * (1000)::double precision)!
CREATE TABLE rel1( c1 int, c2 timestamp default now() )! ! We can replicate
! INSERT INTO rel1(c1) VALUES(1)! ! since this turn into
INSERT INTO rel1(c1, c2) VALUES(1, '2009-01-01 23:59:59.123456+09')! ! However,
INSERT INTO rel1(c1) SELECT 1! cannot be transformed, thus cannot be properly replicated in the current implementation. ! Values will still be inserted, with no transformation at all. !
Tables created by CREATE TEMP TABLE
will be deleted at the end of
! the session by specifying DISCARD ALL in reset_query_list if you are using
PostgreSQL 8.3 or later.
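As a sketch (check the reset_query_list description in your pgpool.conf reference), the corresponding pgpool.conf line would be something like:

    reset_query_list = 'ABORT; DISCARD ALL'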
***************
*** 2798,2812 ****
temporary tables by issuing DROP TABLE
, or use CREATE TEMP
TABLE ... ON COMMIT DROP
inside the transaction block.
Here are the queries which cannot be processed by pgpool-II
!You cannot use DEFAULT
with the
partitioning key column. For example, if the column x in the table t
was the partitioning key column,
INSERT INTO t(x) VALUES (DEFAULT); --- 4277,4292 ---- temporary tables by issuing!DROP TABLE
, or useCREATE TEMP TABLE ... ON COMMIT DROP
inside the transaction block. !Queries
Here are the queries which cannot be processed by pgpool-II
!INSERT (for parallel mode)
You cannot use
DEFAULT
with the partitioning key column. For example, if the column x in the table t was the partitioning key column, +INSERT INTO t(x) VALUES (DEFAULT); *************** *** 2824,2830 **** are also not supported. !!UPDATE (for parallel mode)
Data consistency between the backends may be lost if the partitioning key column values are updated. pgpool-II does not --- 4304,4310 ---- are also not supported.
!UPDATE (for parallel mode)
Data consistency between the backends may be lost if the partitioning key column values are updated. pgpool-II does not *************** *** 2840,2846 ****
SELECT ... FOR UPDATE (for parallel mode)
If a function is called in the
WHERE
clause, that query might not be executed correctly. For example: --- 4320,4326 ----
If a function is called in the WHERE
clause, that
query might not be executed correctly. For example:
***************
*** 2849,2866 ****
COPY BINARY
is not supported. Copying from files is
also not supported. Only COPY FROM STDIN
and COPY
TO STDOUT
are supported.
To update the partitioning rule, pgpool-II must be restarted in order to read them from the System DB.
!SELECT
statements executed inside a transaction block
will be executed in a separate transaction. Here is an example:
--- 4329,4346 ----
COPY BINARY
is not supported. Copying from files is
also not supported. Only COPY FROM STDIN
and COPY
TO STDOUT
are supported.
To update the partitioning rule, pgpool-II must be restarted in order to read them from the System DB.
!SELECT
statements executed inside a transaction block
will be executed in a separate transaction. Here is an example:
***************
*** 2868,2878 ****
BEGIN; INSERT INTO t(a) VALUES (1); ! SELECT * FROM t ORDER BY a; <-- INSERT above is not visible from this SELECT statement END;!
The same definition will be created on all the backends for views and rules. --- 4348,4358 ----
BEGIN; INSERT INTO t(a) VALUES (1); ! SELECT * FROM t ORDER BY a; <-- INSERT above is not visible from this SELECT statement END;!
The same definition will be created on all the backends for views and rules. *************** *** 2886,2897 **** then merged with the results returned by each backend. Views and Rules that join across the nodes cannot be created. However, to JOIN tables that access data only in the same node, a VIEW can be made. ! This VIEW has to be registered in the pgpool_catalog.dist_def table. A col_name and a dist_def_func will have to be registered too. These are used when an insert is performed on the view.
!The same definition will be created on all the backends for functions. Joining across the nodes, and accessing data on the other nodes cannot --- 4366,4377 ---- then merged with the results returned by each backend. Views and Rules that join across the nodes cannot be created. However, to JOIN tables that access data only in the same node, a VIEW can be made. ! This VIEW has to be registered in the pgpool_catalog.dist_def table. A col_name and a dist_def_func will have to be registered too. These are used when an insert is performed on the view.
!The same definition will be created on all the backends for functions. Joining across the nodes, and accessing data on the other nodes cannot *************** *** 2903,2961 **** supported. The simple query protocol must be used. This means you can't use prepared statements.
!The Natural Join is not supported. "ON join condition" or "USING (join_column)" must be used.
!The USING CLAUSE is converted to an ON CLAUSE by the query rewrite process. Therefore, when "*" is used at target list, the joined column(s) appear twice.
Here is an example:
!! =# SELECT * FROM t1 JOIN t2 USING(id); ! id | t | t ! ----+-----+------- ! 1 | 1st | first ! (1 row) !
In the rewrite process "USING" is translated into "ON". So the effective result is as follows: -
- =# SELECT * FROM t1 JOIN t2 ON t1.id = t2.id; - id | t | id | t - ----+-----+----+------- - 1 | 1st | 1 | first - (1 row) -
Notice that column "t" is duplicated.
!pgpool-II does not translate between different multi-byte characters. The encoding for the client, backend and System DB must be the same.
!pgpool-II cannot process multi-statement queries.
!Deadlocks across the backends cannot be detected. For example:
(tellers table is partitioned using the following rule) ! tid <= 10 --> node 0 ! tid >= 10 --> node 1 A) BEGIN; B) BEGIN; --- 4383,4442 ---- supported. The simple query protocol must be used. This means you can't use prepared statements. !!Natural Join (for parallel mode)
The Natural Join is not supported. "ON join condition" or "USING (join_column)" must be used.
!USING CLAUSE (for parallel mode)
The USING CLAUSE is converted to an ON CLAUSE by the query rewrite process. Therefore, when "*" is used at target list, the joined column(s) appear twice.
Here is an example:
!! =# SELECT * FROM t1 JOIN t2 USING(id); ! id | t | t ! ----+-----+------- ! 1 | 1st | first ! (1 row) !In the rewrite process "USING" is translated into "ON". So the effective result is as follows:
++ =# SELECT * FROM t1 JOIN t2 ON t1.id = t2.id; + id | t | id | t + ----+-----+----+------- + 1 | 1st | 1 | first + (1 row) ++Notice that column "t" is duplicated.
!Multi-byte Characters (for all modes)
pgpool-II does not translate between different multi-byte characters. The encoding for the client, backend and System DB must be the same.
!Multi-statement Query (for all modes)
pgpool-II cannot process multi-statement queries.
!Deadlocks (for parallel mode)
Deadlocks across the backends cannot be detected. For example:
(tellers table is partitioned using the following rule) ! tid <= 10 --> node 0 ! tid >= 10 --> node 1 A) BEGIN; B) BEGIN; *************** *** 2977,2983 **** pool_read_kind: kind does not match between master(84) slot[1] (69)!Schemas (for parallel mode)
Objects in a schema other than public must be fully qualified like:
--- 4458,4464 ---- pool_read_kind: kind does not match between master(84) slot[1] (69)
Objects in a schema other than public must be fully qualified like:
*************** *** 2994,3010 ****and the schema name is omitted in a query.
!Limitation object:Parallel mode
A table or a column name cannot start by pool_. ! When rewriting queries, these names are used by internal processing.
! !and the schema name is omitted in a query.
! !Limitation object:Parallel mode
A table or a column name cannot start by pool_. ! When rewriting queries, these names are used by internal processing.
!Currently, the query cache must be deleted manually. pgpool-II does not invalidate old query cache automatically when the data is updated.
PCP commands are UNIX commands which manipulate pgpool-II via the network. !
! * pcp_node_count - retrieves the number of nodes ! * pcp_node_info - retrieves the node information ! * pcp_proc_count - retrieves the process list ! * pcp_proc_info - retrieves the process information ! * pcp_systemdb_info - retrieves the System DB information ! * pcp_detach_node - detaches a node from pgpool-II ! * pcp_attach_node - attaches a node to pgpool-II ! * pcp_stop_pgpool - stops pgpool-II !! !
There are five arguments common to all of the PCP commands. They give information about pgpool-II and authentication. Extra --- 4500,4548 ---- version must be 3.0. Building pgpool-II with libpq version 2.0 will fail. Also, the System DB must be PostgreSQL 7.4 or later.
!Currently, the query cache must be deleted manually. pgpool-II does not invalidate old query cache automatically when the data is updated.
PCP commands are UNIX commands which manipulate pgpool-II via the network. !
pcp_node_count | !retrieves the number of nodes |
---|---|
pcp_node_info | !retrieves the node information |
pcp_proc_count | !retrieves the process list |
pcp_proc_info | !retrieves the process information |
pcp_pool_status V3.1 - | !retrieves parameters in pgpool.conf |
pcp_systemdb_info | !retrieves the System DB information |
pcp_detach_node | !detaches a node from pgpool-II |
pcp_attach_node | !attaches a node to pgpool-II |
pcp_promote_node V3.1 - | !promote a new master node to pgpool-II |
pcp_stop_pgpool | !stops pgpool-II |
There are five arguments common to all of the PCP commands. They give information about pgpool-II and authentication. Extra *************** *** 3075,3081 ****
All PCP commands display the results to the standard output.
!Format: --- 4571,4577 ----All PCP commands display the results to the standard output.
!pcp_node_count
Format: *************** *** 3083,3094 ****! Displays the total number of nodes defined in
!pgpool.conf
. It does not distinguish between nodes status, ie attached/detached. ALL nodes are counted.pcp_node_info
Format: --- 4579,4590 ----! Displays the total number of nodes defined in
!pgpool.conf
. It does not distinguish between nodes status, ie attached/detached. ALL nodes are counted.pcp_node_info
Format: *************** *** 3134,3140 ****Specifying an invalid node ID will result in an error with exit status 12, and BackendError will be displayed.
!pcp_proc_count
Format: --- 4630,4636 ----Specifying an invalid node ID will result in an error with exit status 12, and BackendError will be displayed.
!pcp_proc_count
Format: *************** *** 3144,3150 **** Displays the list of pgpool-II children process IDs. If there is more than one process, IDs will be delimited by a white space. !pcp_proc_info
Format: --- 4640,4647 ---- Displays the list of pgpool-II children process IDs. If there is more than one process, IDs will be delimited by a white space. ! !pcp_proc_info
Format: *************** *** 3206,3212 ****Specifying an invalid node ID will result in an error with exit status 12, and BackendError will be displayed.
!pcp_systemdb_info
Format: --- 4703,4736 ----Specifying an invalid node ID will result in an error with exit status 12, and BackendError will be displayed.
!pcp_pool_status V3.1 -
!! Format: ! pcp_pool_status _timeout_ _host_ _port_ _userid_ _passwd_ !!! Displays parameters in pgpool.conf. The output example is as follows: !
!! $ pcp_pool_status 10 localhost 9898 postgres hogehoge ! name : listen_addresses ! value: localhost ! desc : host name(s) or IP address(es) to listen to ! ! name : port ! value: 9999 ! desc : pgpool accepting port number ! ! name : socket_dir ! value: /tmp ! desc : pgpool socket directory ! ! name : pcp_port ! value: 9898 ! desc : PCP port # to bind !! !pcp_systemdb_info
Format: *************** *** 3253,3259 **** BackendError will be displayed. !pcp_detach_node
Format: pcp_detach_node [-g] _timeout_ _host_ _port_ _userid_ _passwd_ _nodeid_ --- 4777,4783 ---- BackendError will be displayed. !pcp_detach_node
Format: pcp_detach_node [-g] _timeout_ _host_ _port_ _userid_ _passwd_ _nodeid_ *************** *** 3265,3271 **** !pcp_attach_node
Format: --- 4789,4795 ---- !pcp_attach_node
Format: *************** *** 3275,3281 ****!pcp_stop_pgpool
Format: pcp_stop_pgpool _timeout_ _host_ _port_ _userid_ _passwd_ _mode_ --- 4799,4817 ----!pcp_promote_node V3.1 -
!!
! Format: ! pcp_promote_node [-g] _timeout_ _host_ _port_ _userid_ _passwd_ _nodeid_ ! ! Promotes the given node as new master to pgpool-II. In master/slave streaming replication only. ! If -g is given, wait until all clients are disconnected (unless ! client_idle_limit_in_recovery is -1 or recovery_timeout is expired). !! ! !pcp_stop_pgpool
Format: pcp_stop_pgpool _timeout_ _host_ _port_ _userid_ _passwd_ _mode_ *************** *** 3287,3295 ****! s - smart mode ! f - fast mode ! i - immediate modeIf the pgpool-II process does not exist, it results in error with exit --- 4823,4831 ----
! s - smart mode ! f - fast mode ! i - immediate modeIf the pgpool-II process does not exist, it results in error with exit *************** *** 3300,3344 **** mode. pgpool-II terminates all the processes whether there are clients connected to the backends or not.
!Exit Status
PCP commands exits with status 0 when everything goes well. If any error had occurred, it will exit with the following error status. !
! UNKNOWNERR 1 Unknown Error (should not occur) ! EOFERR 2 EOF Error ! NOMEMERR 3 Memory shortage ! READERR 4 Error while reading from the server ! WRITEERR 5 Error while writing to the server ! TIMEOUTERR 6 Timeout ! INVALERR 7 Argument(s) to the PCP command was invalid ! CONNERR 8 Server connection error ! NOCONNERR 9 No connection exists ! SOCKERR 10 Socket error ! HOSTERR 11 Hostname resolution error ! BACKENDERR 12 PCP process error on the server (specifying an invalid ID, etc.) ! AUTHERR 13 Authorization failure !!Inside information
! pgpool-II version 2.0.x brings extensive modifications, compared with the version 1.x Please note that what follows doesn't apply to version 1.x.
!Parallel execution engine
! The parallel execution engine is built into pgpool-II. This engine performs the same Query on each node, and drives the engine that transmits the result to the front end, depending on the nodes' answers.
!Query Rewriting
! This explains the Query rewriting that pgpool-II does in parallel mode.
In parallel mode, a query transmitted by the client goes through two stages of --- 4836,4904 ---- mode. pgpool-II terminates all the processes whether there are clients connected to the backends or not.
+pcp_recovery_node
+ ++ Format: + pcp_recovery_node _timeout_ _host_ _port_ _userid_ _passwd_ _nodeid_ + + Attaches the given backend node with recovery. +!Exit Status
PCP commands exit with status 0 when everything goes well. If any error occurs, they exit with one of the following error statuses. !
UNKNOWNERR | !1 | Unknown Error (should not occur) |
---|---|---|
EOFERR | !2 | EOF Error |
NOMEMERR | !3 | Memory shortage |
READERR | !4 | Error while reading from the server |
WRITEERR | !5 | Error while writing to the server |
TIMEOUTERR | !6 | Timeout |
INVALERR | !7 | Argument(s) to the PCP command was invalid |
CONNERR | !8 | Server connection error |
NOCONNERR | !9 | No connection exists |
SOCKERR | !10 | Socket error |
HOSTERR | !11 | Hostname resolution error |
BACKENDERR | !12 | PCP process error on the server (specifying an invalid ID, etc.) |
AUTHERR | !13 | Authorization failure |
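For example (assuming a POSIX-like shell; the node id 1 is illustrative), the numeric code can be inspected right after a PCP call:
$ pcp_detach_node 10 localhost 9898 postgres hogehoge 1
$ echo $?
A result of 0 means success; any other value corresponds to one of the codes listed above.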
!Inside information
! pgpool-II version 2.0.x brings extensive modifications compared with version 1.x. Please note that what follows doesn't apply to version 1.x.
!Parallel execution engine
! The parallel execution engine is built into pgpool-II. This engine runs the same query on each node and, depending on each node's answer, drives the transmission of the results to the frontend.
!Query Rewriting
! This section explains the query rewriting that pgpool-II does in parallel mode.
In parallel mode, a query transmitted by the client goes through two stages of *************** *** 3351,3370 ****
What follows explains these two processing steps:
!Introduction
! The retrieval query submitted by the client goes through the SQL parser. It is then analyzed ! using information stored in the system DB. Execution status of each part ! of this query is updated using this information. This execution status stores ! where this node can be treated. For instance, if a table's data is distributed ! on several nodes (as declared in the catalog's pgpool_catalog.dist_def table), it has to be retrieved from all nodes. On the other hand, data from a table registered in pgpool_catalog.replicate_def is replicated, and can therefore ! be retrieved from any node. These states are 'P' when data has to be processed by ! all nodes, 'L' when it should be processed by one node. The 'S' status is a special ! case: it means that there is another step to be performed on the data after ! retrieving it from all nodes. For example, sorting data coming from a table registered in the pgpool_catalog.dist_def table.
--- 4911,4937 ----What follows explains these two processing steps:
! !Introduction
! The retrieval query submitted by the client goes through the SQL parser. ! It is then analyzed using information stored in the system DB. The execution status of each part ! of this query is updated using this information. !
! This execution status records on which node(s) each part of the query can be processed. ! For instance, if a table's data is distributed on several nodes ! (as declared in the catalog's pgpool_catalog.dist_def table), it has to be retrieved from all nodes. On the other hand, data from a table registered in pgpool_catalog.replicate_def is replicated, and can therefore ! be retrieved from any node. !
!! These states are 'P' when data has to be processed by all nodes, ! 'L' when it should be processed by one node. ! The 'S' status is a special case: it means that there is another step to be performed ! on the data after retrieving it from all nodes. For example, sorting data coming from a table registered in the pgpool_catalog.dist_def table.
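As a hedged illustration (the table names follow this section's own P1-table / L1-table convention and are not real objects), the status assignment amounts to:
-- P1-table is registered in pgpool_catalog.dist_def (distributed), so its data must come from all nodes: status 'P'
SELECT * FROM P1-table;
-- L1-table is registered in pgpool_catalog.replicate_def (replicated), so any single node can answer: status 'L'
SELECT * FROM L1-table;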
*************** *** 3375,3411 ****! The relation between the final execution status of SELECT and the processing ! place is as follows.
Execution status | Processed place |
L | Query issued on either node. |
P | Returns data to the client by running the ! same query on all nodes and using the parallel execution engine. |
S | After processing using the system DB, data ! is returned to the client. |
! The above-mentioned rule also applies to sub-queries. In the following simple ! query, if P1-table is registered in the pgpool_catalog.dist_def table of the system ! DB (i.e. P1-table is distributed), the final execution status of the subquery becomes P, and as a result, the parent of the subquery, SELECT, also becomes P.
--- 4942,4978 ----! The relation between the final execution status of SELECT and the processing ! place is as follows.
Execution status | Processed place |
---|---|
L | Query issued on either node. |
P | Returns data to the client by running the ! same query on all nodes and using the parallel execution engine. |
S | After processing using the system DB, data ! is returned to the client. |
! The above-mentioned rule also applies to sub-queries. In the following simple ! query, if P1-table is registered in the pgpool_catalog.dist_def table of the system ! DB (i.e. P1-table is distributed), the final execution status of the subquery becomes P, and as a result, the parent of the subquery, SELECT, also becomes P.
*************** *** 3418,3526 **** Let's start with an example, to explain the FROM status. !Execution status of FROM clause
! This is a retrieval query (SELECT). The data set and its status (P, L and S) are defined according to the FROM clause. The execution status of the table is as follows: when there is only one table in the FROM clause, the execution status of the entire dataset is this table's execution status. When there are two or more tables or sub-queries in the FROM clause, the execution status is decided according to the join method and the execution ! statuses, as shown in the following table.
JOIN type | LEFT OUTER JOIN | RIGHT OUTER JOIN | FULL OUTER JOIN | Other | ||||||||
left/right | P | L | S | P | L | S | P | L | S | P | L | S |
P | S | P | S | S | S | S | S | S | S | S | P | S |
L | S | L | S | P | L | S | S | L | S | P | L | S |
S | S | S | S | S | S | S | S | S | S | S | S | S |
In the following examples, P1-table is in the P status. L1-table and L2-table are in the L status.
SELECT * FROM P1-table,L1-table,L2-table;! P1-table (left) and L1-table (right) are joined, and according to the table they get the P status. With this P status, they are joined with the L2-table, in the L status, which is now in the P status too. !
Changes in execution status because of TARGETLIST and WHERE clause
In a basic Query, the execution status is the one from the FROM clause. However, if there is a TARGETLIST, the execution status of the WHERE clause can change in the following cases.
! In these cases, the final execution status of the subquery, the execution status of ! TARGETLIST and the WHERE clause get the S status if the initial status was ! P or S.
! In the following example, when the table used by the subquery is P, the final ! execution status of the subquery gets the P status. Therefore, the execution ! status of the WHERE clause gets the S status, whatever the execution status of ! L1 is, and this query is run in the system DB.
! SELECT * FROM L1-table where L1-table.column IN (SELECT * FROM P1-table);
! The FROM clause changes to the S status when there is an aggregate function in a 'P' TARGETLIST, in order to perform the aggregate after all data has been acquired. ! Some optimization on the aggregate function is done under specific conditions.
A column that does not exist in a table can be used in a query. For instance, in the following correlated subquery:
! SELECT * FROM L1-table WHERE L1-table.col1 IN (SELECT * FROM P1-table WHERE P1-table.col = L1-table.col1);
! This subquery refers to L1-table.col1, from the L1-table. The execution status ! of the WHERE clause of the subquery is 'S'.
!Change in execution status because of GROUP BY, HAVING, ORDER BY and LIMIT/OFFSET
! The execution status of the WHERE clause is changed to 'S' when there is any GROUP BY, HAVING, or ORDER BY clause, or LIMIT/OFFSET predicate and status is 'P'. A query with no GROUP BY clause gets the execution status of the WHERE clause. ! In the same way, the execution status of the GROUP BY clause is used when there is no HAVING clause. The ORDER BY clause and the LIMIT/OFFSET predicate are also similar in this behavior.
!When UNION, EXCEPT, and INTERSECT are used
! UNION, EXCEPT, and INTERSECT queries' status depends on the final execution status of both the left and right SELECT statements. If both statements are L, the combined statement is L. If both statements are P and the query is a UNION ALL, ! the combined statement is P. For any other combination, the resulting status is ! S.
!Acquisition of the final execution status of SELECT
If everything in the SELECT has a status of L, then the final execution status is L. The same rule applies for P. For any other combination, the final status is S. --- 4985,5099 ---- Let's start with an example, to explain the FROM status.
!Execution status of FROM clause
! This is a retrieval query (SELECT). The data set and its status (P, L and S) are defined according to the FROM clause. The execution status of the table is as follows: when there is only one table in the FROM clause, the execution status of the entire dataset is this table's execution status. When there are two or more tables or sub-queries in the FROM clause, the execution status is decided according to the join method and the execution ! statuses, as shown in the following table.
JOIN type | !LEFT OUTER JOIN | !RIGHT OUTER JOIN | !FULL OUTER JOIN | !Other | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
left/right | P | L | S | P | L | S | P | L | S | P | L | S |
P | S | P | S | S | S | S | S | S | S | S | P | S |
L | S | L | S | P | L | S | S | L | S | P | L | S |
S | S | S | S | S | S | S | S | S | S | S | S | S |
In the following examples, P1-table is in the P status. L1-table and L2-table are in the L status. +
SELECT * FROM P1-table,L1-table,L2-table;!
P1-table (left) and L1-table (right) are joined and, according to the table above, the join gets the P status. With this P status, it is joined with the L2-table, which is in the L status, and the result is again in the P status.
Changes in execution status because of TARGETLIST and WHERE clause
!In a basic Query, the execution status is the one from the FROM clause. However, if there is a TARGETLIST, the execution status of the WHERE clause can change in the following cases.
! In these cases, the final execution status of the subquery, the execution status of TARGETLIST ! and the WHERE clause get the S status if the initial status was P or S. !
! In the following example, when the table used by the subquery is P, ! the final execution status of the subquery gets the P status. ! Therefore, the execution status of the WHERE clause gets the S status, ! whatever the execution status of L1 is, and this query is run in the system DB. !
! SELECT * FROM L1-table where L1-table.column IN (SELECT * FROM P1-table);
! The FROM clause changes to the S status when there is an aggregate function in a 'P' TARGETLIST, in order to perform the aggregate after all data has been acquired. ! Some optimization on the aggregate function is done under specific conditions.
A column that does not exist in a table can be used in a query. For instance, in the following correlated subquery:
! SELECT * FROM L1-table WHERE L1-table.col1 IN (SELECT * FROM P1-table WHERE P1-table.col = L1-table.col1);
! This subquery refers to L1-table.col1, from the L1-table. The execution status ! of the WHERE clause of the subquery is 'S'.
! !Change in execution status because of GROUP BY, HAVING, ORDER BY and LIMIT/OFFSET
! The execution status of the WHERE clause is changed to 'S' when there is any GROUP BY, HAVING, or ORDER BY clause, or a LIMIT/OFFSET predicate, and the status is 'P'. A query with no GROUP BY clause gets the execution status of the WHERE clause. ! In the same way, the execution status of the GROUP BY clause is used when there is no HAVING clause. The ORDER BY clause and the LIMIT/OFFSET predicate behave similarly.
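As an illustrative sketch (using this section's P1-table naming, not literal pgpool-II output), adding an ORDER BY to a query that would otherwise stay 'P' pushes it to 'S', because the final sort has to happen after the data is gathered from all nodes:
-- Without ORDER BY: status 'P', executed in parallel on every node
SELECT col FROM P1-table;
-- With ORDER BY: status 'S', the sort is done after collecting the rows via the system DB
SELECT col FROM P1-table ORDER BY col;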
!When UNION, EXCEPT, and INTERSECT are used
! UNION, EXCEPT, and INTERSECT queries' status depends on the final execution status of both the left and right SELECT statements. If both statements are L, the combined statement is L. If both statements are P and the query is a UNION ALL, ! the combined statement is P. For any other combination, the resulting status is S.
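A hedged sketch of this rule, again with the P1-table naming used throughout this section:
-- Both sides are 'P' and the operator is UNION ALL: the combined statement stays 'P'
SELECT col FROM P1-table UNION ALL SELECT col FROM P1-table;
-- A plain UNION must remove duplicates across all nodes, so the combined statement becomes 'S'
SELECT col FROM P1-table UNION SELECT col FROM P1-table;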
! !Acquisition of the final execution status of SELECT
!If everything in the SELECT has a status of L, then the final execution status is L. The same rule applies for P. For any other combination, the final status is S. *************** *** 3529,3545 **** the parallel execution engine. For S, the query rewriting presented below is done.
!The Query is rewritten by using the execution status acquired while analyzing the query. Here is an example. The P1-table has the P status, the L1-table has the L status. -
! SELECT P1-table.col, L1-table.col ! FROM P1-table,L1-table ! where P1-table.col = L1-table.col order by P1-table.col;--- 5102,5117 ---- the parallel execution engine. For S, the query rewriting presented below is done. !
The Query is rewritten by using the execution status acquired while analyzing the query. Here is an example. The P1-table has the P status, the L1-table has the L status.
! SELECT P1-table.col, L1-table.col ! FROM P1-table,L1-table ! where P1-table.col = L1-table.col order by P1-table.col;*************** *** 3551,3558 ****
SELECT P1-table.col, L1-table.col FROM ! dblink(select pool_parallel(SELECT P1-table.col, L1-table.col FROM P1-table,L1-table where P1-table.col = L1-table.col)) ! order by P1-table.col;
dblink transmits the query to pgpool-II here. --- 5123,5130 ----
SELECT P1-table.col, L1-table.col FROM ! dblink(select pool_parallel(SELECT P1-table.col, L1-table.col FROM P1-table,L1-table where P1-table.col = L1-table.col)) ! order by P1-table.col;
dblink transmits the query to pgpool-II here. *************** *** 3572,3602 ****
! In this example, the FROM and WHERE clause and the TARGETLIST are in the L status. Because the subquery is in the P status, the query itself is in the S status. The rewriting is, as a consequence, performed as follows.
! SELECT L1-table.col ! FROM dblink(SELECT loadbalance(SELECT L1-table.col ! FROM L1-table ! WHERE L1-table.col % 2 = 0 ! AND TRUE)) ! WHERE ! L1-table.col %2 = 0 AND ! L1-table.col IN ! ( ! SELECT P1-Table FROM ! dblink(select pool_parallel(SELECT P1-table FROM P1-table)) ! ) ;
! pool_loadbalance is a function responsible for dispatching queries to either node.
! For grouping queries (aggregate functions and GROUP BY), rewriting tries to reduce the load on the system DB by performing a first aggregate on each node.
--- 5144,5174 ----
! In this example, the FROM and WHERE clause and the TARGETLIST are in the L status. Because the subquery is in the P status, the query itself is in the S status. The rewriting is, as a consequence, performed as follows.
! SELECT L1-table.col ! FROM dblink(SELECT pool_loadbalance(SELECT L1-table.col ! FROM L1-table ! WHERE L1-table.col % 2 = 0 ! AND TRUE)) ! WHERE ! L1-table.col % 2 = 0 AND ! L1-table.col IN ! ( ! SELECT P1-table FROM ! dblink(select pool_parallel(SELECT P1-table FROM P1-table)) ! ) ;
! pool_loadbalance is a function responsible for dispatching queries to either node.
! For grouping queries (aggregate functions and GROUP BY), rewriting tries to reduce the load on the system DB by performing a first aggregate on each node.
*************** *** 3609,3635 ****
select count(*) from P1-table; ! -> rewrite SELECT sum(pool_c$1) as count FROM dblink(select pool_parallel('select count(*) from P1-table')) ! AS pool_$1g (pool_c$1 bigint);
Query rewriting like above is done in these conditions.
The column names and types are needed when a query is analyzed in parallel mode. Therefore, when an expression or a function is used in the TARGETLIST of --- 5181,5208 ----
select count(*) from P1-table; ! -> rewrite SELECT sum(pool_c$1) as count FROM dblink(select pool_parallel('select count(*) from P1-table')) ! AS pool_$1g (pool_c$1 bigint);
Query rewriting like the above is done under the following conditions.
The column names and types are needed when a query is analyzed in parallel mode. Therefore, when an expression or a function is used in the TARGETLIST of *************** *** 3639,3672 **** numeric. For min()/max(), if the argument is a date type, the returned data type is date; otherwise, it is assumed to be numeric. avg() is processed as sum()/count() (sum divided by count). !
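As a conceptual sketch only (this is not the literal SQL that pgpool-II generates), the avg() handling described above amounts to computing:
-- avg(col) over the distributed P1-table is evaluated as sum divided by count,
-- where the per-node sums and counts are gathered in parallel and then combined
SELECT sum(col) / count(col) FROM P1-table;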
Here is a rough estimate of the query performance versus execution status:
!
Execution status | Performance |
L | There is no performance deterioration with a ! single node, excluding the overhead of pgpool-II, because there is no parallel ! querying done at all. |
P | Parallel processing is fast, especially the ! sequential scans. It's easy to get speedups because the scan of a big table ! becomes the parallel scan of much smaller ones by being distributed on several ! nodes. ! |
S | When aggregate functions can be rewritten in a ! parallel fashion, they are fast. |
A tutorial for pgpool-II is available.
!Here is a rough estimate of the query performance versus execution status:
!Execution status | Performance |
---|---|
L | !There is no performance deterioration with a single node, ! excluding the overhead of pgpool-II, because there is no parallel querying done at all. ! |
P | !Parallel processing is fast, especially the sequential scans. ! It's easy to get speedups because the scan of a big table becomes the parallel scan ! of much smaller ones by being distributed on several nodes. ! |
S | !When aggregate functions can be rewritten in a parallel fashion, they are fast. |
A tutorial for pgpool-II is available.
! ! !