diff --git a/doc/pgpool-en.html b/doc/pgpool-en.html new file mode 100644 index 80b2931..a611fee *** a/doc/pgpool-en.html --- b/doc/pgpool-en.html *************** *** 10,15 **** --- 10,16 ---- Last modified: Fri Mar 15 18:02:22 JST 2013 +
Welcome to the pgpool-II page
*************** Last modified: Fri Mar 15 18:02:22 JST 2 *** 28,33 **** --- 29,36 ----
  • Parallel Mode
  • Client authentication
  • Setting Query cache method
  • + +
  • On memory query Cache
  • Starting/Stopping pgpool-II
  • Reloading pgpool-II configuration files
  • *************** can collect pgpool-II status, and termin *** 374,380 **** authentication by this interface. All operation modes require the pcp.conf file to be set. A $prefix/etc/pcp.conf.sample file is created during the installation of pgpool-II. Rename the file to ! pcp.conf and add your username and password to it.

    --- 377,383 ----
      authentication by this interface. All operation modes require the
      pcp.conf file to be set. A $prefix/etc/pcp.conf.sample
      file is created during the installation of pgpool-II. Rename the file to
    ! pcp.conf and add your user name and password to it.
      

    *************** cp $prefix/etc/pcp.conf.sample $prefix/e
    *** 382,388 ****
      

    An empty line or a line starting with "#" is treated as a ! comment and will be ignored. A username and its associated password must be written as one line using the following format:

    --- 385,391 ----
      

    An empty line or a line starting with "#" is treated as a ! comment and will be ignored. A user name and its associated password must be written as one line using the following format:
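As a small sketch of the parsing rules above (the file path and md5 value are illustrative; generate a real hash with pg_md5):

```shell
# pcp.conf parsing rules: empty lines and lines starting with "#"
# are ignored; each remaining line is "username:md5password".
cat > /tmp/pcp.conf <<'EOF'
# a comment line, ignored

postgres:e8a48653851e28c69d0506508fb27fc6
EOF
# keep only the effective lines, the way pgpool-II would
grep -v -e '^#' -e '^$' /tmp/pcp.conf
```

The grep prints only the single user entry; the comment and the empty line are dropped.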

    *************** max_pool*num_init_children*2 <= (max_
    *** 564,569 ****
    --- 567,573 ----
          that have not accepted any connection yet.
          

    + You need to reload pgpool.conf if you change this value.

    *************** Following parameters take effect to conn *** 1209,1215 ****

    The maximum number of cached connections in pgpool-II children processes. pgpool-II reuses the cached connection if an incoming connection is ! connecting to the same database with the same username. If not, pgpool-II creates a new connection to the backend. If the number of cached connections exceeds max_pool, the oldest connection will be discarded, and uses that slot for the new connection. --- 1213,1219 ----

    The maximum number of cached connections in pgpool-II children processes. pgpool-II reuses the cached connection if an incoming connection is ! connecting to the same database with the same user name. If not, pgpool-II creates a new connection to the backend. If the number of cached connections exceeds max_pool, the oldest connection will be discarded, and that slot will be used for the new connection. *************** black_function_list = 'nextval,setval,la *** 1387,1393 ****
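The cap mentioned earlier (max_pool * num_init_children * 2 <= max_connections in replication mode) can be sanity-checked with a quick shell sketch; all the numbers here are made-up examples, not recommended values:

```shell
# illustrative values only -- substitute the ones from your pgpool.conf
max_pool=4
num_init_children=32
max_connections=300        # PostgreSQL side

# worst case number of backend connections in replication mode
needed=$((max_pool * num_init_children * 2))
if [ "$needed" -le "$max_connections" ]; then
    echo "ok: $needed <= $max_connections"
else
    echo "raise max_connections to at least $needed"
fi
```

With these sample numbers the check passes (256 <= 300).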

    ! --- 1391,1397 ----

    SELECT is inside a transaction block Y
    ! *************** COMMIT; *** 1477,1483 **** (more precisely, access error on pg_clog, which keeps transaction status). To prevent this, PostgreSQL core developers decided to disallow row locking against sequences and this will break pgpool-II of course ! (the "fixed" version of PostgreSQL will be released as 9.0.5, 8.4.9, 8.3.16 and 8.2.22).

    --- 1481,1487 ---- (more precisely, access error on pg_clog, which keeps transaction status). To prevent this, PostgreSQL core developers decided to disallow row locking against sequences and this will break pgpool-II of course ! (the "fixed" version of PostgreSQL was released as 9.0.5, 8.4.9, 8.3.16 and 8.2.22).

*************** COMMIT; *** 1485,1490 **** --- 1489,1496 ---- table because new PostgreSQL disallows a row lock against the sequence relation. So creating the insert_lock table in all databases which are accessed via pgpool-II beforehand is required. See Creating insert_lock table for more details. + If the insert_lock table does not exist, pgpool-II locks the insert target table. This behavior is the same as in the pgpool-II 2.2 and 2.3 series. If you want to use insert_lock which is compatible with older releases, *************** end of transaction block *** 1526,1532 ****

    recovery_user

    ! This parameter specifies a PostgreSQL username for online recovery. It can be changed without restarting.

    --- 1532,1538 ----
    recovery_user

    ! This parameter specifies a PostgreSQL user name for online recovery. It can be changed without restarting.

    *************** primary node. *** 2100,2105 **** --- 2106,2112 ----

    Online recovery with Streaming Replication

    + In master/slave mode with streaming replication, online recovery can be performed. In the online recovery procedure, the primary server acts as a master server and recovers the specified standby server. Thus the recovery procedure requires that the primary server is up and running. *************** pgpool-II. The System DB's configuration *** 2258,2264 ****
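As an illustrative pgpool.conf fragment for online recovery (parameter names follow pgpool.conf.sample; the script names and values are hypothetical placeholders you provide yourself):

```
# online recovery settings (values are examples only)
recovery_user = 'postgres'                    # PostgreSQL user name for online recovery
recovery_password = 'secret'
recovery_1st_stage_command = 'basebackup.sh'  # your script, placed in the backend's $PGDATA
recovery_2nd_stage_command = ''               # not used in streaming replication mode
```

The stage commands must exist on the backend database cluster directory, since they are executed by PostgreSQL, not by pgpool-II itself.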

    system_db_user
    !

    The username to connect to the System DB.

    system_db_password
    --- 2265,2271 ----
    system_db_user
    !

    The user name to connect to the System DB.

    system_db_password
    *************** Here are the limitations of pool_hba. *** 2429,2434 **** --- 2436,2442 ----
    • "hostssl" connection type is not supported
    • + Though "hostssl" is invalid, pgpool-II 2.3 or later supports SSL. See SSL for more details.

      *************** benchs2=# show pool_processes; *** 3078,3083 **** --- 3086,3092 ----

      pool_pools V3.0 -

      +

      "SHOW pool_pools" sends back a list of pools handled by pgpool-II: their name, value, and description. Here is an excerpt of the result:

      *************** pgpoolAdmin, set "_PGPOOL2_PCP_TIMEOUT " *** 3525,3530 **** --- 3534,3540 ---- pgmgt.conf.php.

      +

      PostgreSQL version up using online recovery

      replication mode case

      *************** The procedure to update primary PostgreS *** 3650,3665 ****

      Backup

      ! To back up backend PostgreSQL servers and system DB, you can physical ! backup, logical backup (pg_dump, pg_dumpall) and PITR in the same manner as PostgreSQL. ! Please note that using logical backup and PITR should be performed directory PostgreSQL, rather than via pgpool-II to avoid errors caused by load_balance_mode and replicate_select.

      replication mode and master/slave mode

      ! If pgpool-II is operated in replication mode or master/slave mode, take a backup on one DB nodes in the cluster.

      --- 3660,3675 ----

      Backup

      ! To back up backend PostgreSQL servers and system DB, you can use physical ! backup, logical backup (pg_dump, pg_dumpall), and PITR in the same manner as PostgreSQL. ! Please note that logical backup and PITR should be performed directly with PostgreSQL, rather than via pgpool-II to avoid errors caused by load_balance_mode and replicate_select.

      replication mode and master/slave mode

      ! If pgpool-II is operating in replication mode or master/slave mode, take a backup on one DB node in the cluster.

      *************** to make those deployments and pros and c *** 3720,3729 ****

      Pgpool-II is running on a dedicated server. ! It's simple and pgpool-II is not affected by other server softwares. ! Obvious cons is you need to by more hardware. Also pgpool-II can be a single point of failure with this configuration ! (you can avoid this by using pgpool-HA described somewhere below).

      --- 3730,3739 ----

      Pgpool-II is running on a dedicated server. ! It's simple and pgpool-II is not affected by other server software. ! The obvious con is that you need to buy more hardware. Also pgpool-II can be a single point of failure with this configuration ! (you can avoid this by using pgpool-HA described below).

      *************** to make those deployments and pros and c *** 3733,3740 **** Deploying pgpool-II on a server where Apache, JBoss, Tomcat or other web server and application servers. Since communication between pgpool-II and web servers and application servers ! is within a local machine, socket communication can be faster than inter sever communication. ! Also if you are using multiple web serves or application servers, you can avoid the single point of failure problem (in this case you must have identical pgpool.conf on each pgpool-II instance). There are several things you should care about in this configuration: --- 3743,3750 ---- Deploying pgpool-II on a server where Apache, JBoss, Tomcat, or other web and application servers are running. Since communication between pgpool-II and web servers and application servers ! is within a local machine, socket communication can be faster than inter-server communication. ! Also if you are using multiple web servers or application servers, you can avoid the single point of failure problem (in this case you must have identical pgpool.conf on each pgpool-II instance). There are several things you should care about in this configuration: *************** to make those deployments and pros and c *** 3759,3766 ****

      Running pgpool-II on the server as PostgreSQL is running on. You can avoid the single point of failure problem of pgpool-II with configuration. ! And obviously you do need to buy additional dedicated server. ! Problem with this configuration is, application need to aware that which DB server they should connect to. To solve the problem you can use virtual IP with pgpool-HA.

      --- 3769,3776 ----

      Running pgpool-II on the same server as PostgreSQL. You can avoid the single point of failure problem of pgpool-II with this configuration. ! And obviously you do not need to buy an additional dedicated server. ! The problem with this configuration is that applications need to be aware of which DB server they should connect to. To solve the problem you can use virtual IP with pgpool-HA.

      *************** to make those deployments and pros and c *** 3768,3776 ****

      About pgpool-HA

      ! Pgpool-HA is a high availability software for pgpool-II using hearbeat. ! Pgpool-HA is a sub project of pgpool project as well as pgpool-II. ! Pgpool-HA can be available from pgpool development site as an open source software.

      --- 3778,3786 ----

      About pgpool-HA

      ! Pgpool-HA is high-availability software for pgpool-II using heartbeat. ! Pgpool-HA is a sub-project of the pgpool project, as is pgpool-II. ! Pgpool-HA is available from the pgpool development site as open source software.

      *************** Pgpool-HA can be available from pgpool d *** 3780,3790 ****

      Watchdog V3.2 -

      ! "Watchdog" is a sub process of pgpool-II aiming for adding high availability feature to it. Features added by watchdog include:

      Life checking of pgpool service

      --- 3790,3801 ----

      Watchdog V3.2 -

      ! "Watchdog" is a subprocess of pgpool-II to add high availability. Features added by watchdog include:

      +
      Life checking of pgpool service

      *************** Features added by watchdog include: *** 3793,3801 **** by watchdog and watchdog checks the response.

      ! Also watchdog monitors connections to up stream servers (application servers etc.) from the pgpool. ! The connection between the up stream servers and the pgpool is monitored as service of pgpool.

      --- 3804,3813 ---- by watchdog and watchdog checks the response.

      ! Also watchdog monitors connections to upstream servers (application servers etc.) from the pgpool. ! The connection between the upstream servers and pgpool is monitored as ! a service of pgpool.

      *************** Features added by watchdog include: *** 3812,3818 ****

      When a fault is detected in the pgpool service, watchdog notifies the other watchdogs of it. ! Watchdogs decide the new active pgpool if previous active pgpool is broken by voting and change active/standby state.

      --- 3824,3830 ----

      When a fault is detected in the pgpool service, watchdog notifies the other watchdogs of it. ! If the previous active pgpool is broken, the watchdogs then decide on the new active pgpool by voting and change the active/standby states.

      *************** Features added by watchdog include: *** 3822,3828 ****

      When a standby pgpool server promotes to active, the new active server brings up virtual IP interface. Meanwhile, the previous active server brings down the virtual IP interface. ! This enables the active pgpool to work using the same IP address even when servers is switched over.

      --- 3834,3840 ----

      When a standby pgpool server is promoted to active, the new active server brings up the virtual IP interface. Meanwhile, the previous active server brings down the virtual IP interface. ! This enables the active pgpool to work using the same IP address even when servers are switched.

      *************** Features added by watchdog include: *** 3832,3838 **** When broken server recovers or new server is attached, the watchdog process notifies the other watchdog process along with information of the new server, and the watchdog process receives information on the active server and other servers. ! Then, the attached server is registered as standby.

      --- 3844,3850 ---- When a broken server recovers or a new server is attached, the watchdog process notifies the other watchdog processes along with information on the new server, and receives information on the active server and other servers. ! Then, the attached server is registered as a standby.

      *************** The life checking starts after waiting f *** 3857,3863 ****

      Configuring watchdog (pgpool.conf)

      ! Watchdog configuration parameters is described in pgpool.conf. There is sample configuration in WATCHDOG section in pgpool.conf.sample file.

      --- 3869,3875 ----

      Configuring watchdog (pgpool.conf)

      ! Watchdog configuration parameters are described in pgpool.conf. There is a sample configuration in the WATCHDOG section of the pgpool.conf.sample file.
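As an illustrative fragment (parameter names follow the WATCHDOG section of pgpool.conf.sample in the 3.2 series; the host names and addresses are hypothetical):

```
use_watchdog = on
wd_hostname = 'pgpool1'          # this watchdog's own host
wd_port = 9000
delegate_IP = '192.168.0.100'    # virtual IP taken over by the active pgpool
# the peer pgpool instance watched by this watchdog
other_pgpool_hostname0 = 'pgpool2'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
```

Each pgpool instance lists its peers with the numbered other_pgpool_* parameters, so a two-node setup has a mirror-image fragment on the second server.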

      *************** Large object replication using backend f *** 4302,4307 **** --- 4314,4320 ---- Creating/inserting/updating/deleting temporary tables are always executed on the master(primary). With pgpool-II 3.0 or later, SELECT on these tables is executed on master as well. + However if the temporary table name is used as a literal in SELECT, there's no way to detect it, and the SELECT will be load balanced. That will trigger a "not found the table" error or will find another table *************** order to read them from the System DB.Transaction (for parallel mode) +

      SELECT statements executed inside a transaction block will be executed in a separate transaction. Here is an example: *************** be performed inside the functions.

      *** 4508,4519 ****

      Extended Query Protocol (for parallel mode)

      The extended query protocol used by JDBC drivers, etc. is not ! supported. The simple query protocol must be used. This means you can't use prepared statements.

      Natural Join (for parallel mode)

      The Natural Join is not supported. "ON join condition" or "USING (join_column)" must be used.

      --- 4522,4535 ----

      Extended Query Protocol (for parallel mode)

      +

      The extended query protocol used by JDBC drivers, etc. is not ! supported. The simple query protocol must be used. This means you cannot use prepared statements.

      Natural Join (for parallel mode)

      +

      The Natural Join is not supported. "ON join condition" or "USING (join_column)" must be used.

      *************** fail. Also, the System DB must be Postgr *** 4632,4638 ****

      Query Cache

      With on disk query cache, invalidation must be done manually. ! This does not apply to on memory query cache. Cacche invalidation is automatically done with on memory query cache.

      Compatibility with pgpool

      --- 4648,4654 ----

      Query Cache

      With on disk query cache, invalidation must be done manually. ! This does not apply to on memory query cache. Cache invalidation is automatically done with on memory query cache.

      Compatibility with pgpool

      *************** First argument - timeout value in sec *** 4684,4694 **** pgpool-II does not respond in this many seconds. Second argument - pgpool-II's hostname Third argument - PCP port number ! Fourth argument - PCP username Fifth argument - PCP password !

      PCP usernames and passwords must be declared in pcp.conf in $prefix/etc directory. -F option can be used when starting pgpool-II if pcp.conf is placed somewhere else. The password does --- 4700,4710 ---- pgpool-II does not respond in this many seconds. Second argument - pgpool-II's hostname Third argument - PCP port number ! Fourth argument - PCP user name Fifth argument - PCP password !

      PCP user names and passwords must be declared in pcp.conf in $prefix/etc directory. -F option can be used when starting pgpool-II if pcp.conf is placed somewhere else. The password does *************** postgres_db postgres 1150769932 11507673 *** 4787,4793 **** The result is in the following order: 1. connected database name ! 2. connected username 3. process start-up timestamp 4. connection created timestamp 5. protocol major version --- 4803,4809 ---- The result is in the following order: 1. connected database name ! 2. connected user name 3. process start-up timestamp 4. connection created timestamp 5. protocol major version *************** First, the System DB information will be *** 4879,4885 **** line. The result is in the following order: 1. hostname 2. port number ! 3. username 4. password. '' for no password. 5. schema name 6. database name --- 4895,4901 ---- line. The result is in the following order: 1. hostname 2. port number ! 3. user name 4. password. '' for no password. 5. schema name 6. database name *************** error had occurred, it will exit with th *** 5012,5018 **** !
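A result line in the field order just listed can be split with standard shell tools; the sample line below is hypothetical, modeled on the excerpt above:

```shell
# hypothetical pcp_proc_info-style result line:
# database, user name, process start-up time, connection-created time
line='postgres_db postgres 1150769932 1150767351'
# extract the connected database name and user name (fields 1 and 2)
echo "$line" | awk '{print "database="$1, "user="$2}'
```

This prints `database=postgres_db user=postgres`.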

      Inside information

      pgpool-II version 2.0.x brings extensive modifications, compared with the version 1.x Please note that what follows doesn't apply to version 1.x. --- 5028,5034 ---- !

      Internal information

      pgpool-II version 2.0.x brings extensive modifications, compared with the version 1.x Please note that what follows doesn't apply to version 1.x. diff --git a/doc/tutorial-en.html b/doc/tutorial-en.html new file mode 100644 index 815d3d2..480ea98 *** a/doc/tutorial-en.html --- b/doc/tutorial-en.html *************** *** 13,19 ****

      Welcome to the Tutorial for pgpool-II. From here, you can learn how to install, setup, and run parallel queries or do replication using pgpool-II. We assume that you already know basic PostreSQL operations, ! so please refer to the PostgreSQL document if needed.

      Table of Contents
      --- 13,19 ----

      Welcome to the Tutorial for pgpool-II. From here, you can learn how to install, set up, and run parallel queries or do replication using pgpool-II. We assume that you already know basic PostgreSQL operations, ! so please refer to the PostgreSQL documentation if necessary.

      Table of Contents
      *************** so please refer to the PostgreSQL docume *** 37,43 ****

      1. Let's Begin!

      !

      First, we must learn how to install, configure pgpool-II and database nodes before using replication or parallel query.

      1.1. Installing pgpool-II

      --- 37,43 ----

      1. Let's Begin!

      !

      First, we must learn how to install and configure pgpool-II and database nodes before using replication or parallel query.

      1.1. Installing pgpool-II

      *************** permission on the installation directory *** 65,83 **** /usr/local directory.

      Note: pgpool-II requires libpq library in PostgreSQL 7.4 ! or later (version 3 protocol). If configure script ! displays the following error message, libpq library may not be installed, or it is not of version 3.

      configure: error: libpq is not installed or libpq is old
      !

      If the library is version 3, but above message is still displayed, your libpq library is probably not recognized by ! configure script.

      !

      configure script searches for libpq library under ! /usr/local/pgsql libaray. If you have installed ! PostgreSQL to a directory other than /usr/local/pgsql, use --with-pgsql, or --with-pgsql-includedir and --with-pgsql-libdir command line options when you execute configure.

      --- 65,83 ---- /usr/local directory.

      Note: pgpool-II requires libpq library in PostgreSQL 7.4 ! or later (version 3 protocol). If the configure script ! displays the following error message, the libpq library may not be installed, or it is not of version 3.

      configure: error: libpq is not installed or libpq is old
      !

      If the library is version 3, but the above message is still displayed, your libpq library is probably not recognized by ! the configure script.

      !

      The configure script searches for the libpq library under ! /usr/local/pgsql. If you have installed ! PostgreSQL in a directory other than /usr/local/pgsql, use --with-pgsql, or --with-pgsql-includedir and --with-pgsql-libdir command line options when you execute configure.

      *************** execute configure.

      *** 85,91 ****

      1.2. Configuration Files

      pgpool-II configuration parameters are saved in ! pgpool.conf file. The file is in "parameter = value" per line format. When you install pgpool-II, pgpool.conf.sample is automatically created. We recommend copying and renaming it to pgpool.conf, and edit it as --- 85,91 ----

      1.2. Configuration Files

      pgpool-II configuration parameters are saved in ! the pgpool.conf file. The file is in "parameter = value" per line format. When you install pgpool-II, pgpool.conf.sample is automatically created. We recommend copying and renaming it to pgpool.conf, and edit it as *************** port = 9999 *** 106,117 ****

      1.3. Configuring PCP Commands

      !

      pgpool-II has an interface for administration purpose to retrieve information on database nodes, shutdown pgpool-II, etc. via network. To use PCP commands, user authentication is required. This authentication is different from PostgreSQL's user authentication. A ! username and password need to be defined in pcp.conf ! file. In the file, a username and password are listed as a pair on each line, and they are separated by a colon (:). Passwords are encrypted in md5 hash format.

      --- 106,117 ----

      1.3. Configuring PCP Commands

      !

      pgpool-II has an interface for administrative purposes to retrieve information on database nodes, shut down pgpool-II, etc. via the network. To use PCP commands, user authentication is required. This authentication is different from PostgreSQL's user authentication. A ! user name and password need to be defined in the pcp.conf ! file. In the file, a user name and password are listed as a pair on each line, and they are separated by a colon (:). Passwords are encrypted in md5 hash format.

      *************** automatically created. We recommend copy *** 123,135 ****

      $ cp /usr/local/etc/pcp.conf.sample /usr/local/etc/pcp.conf
      !

      To encrypt your password into md5 hash format, use pg_md5 command, ! which is installed as a part of pgpool-II ! executables. pg_md5 takes text as an command line argument, and displays its md5-hashed text.

      !

      For example, give "postgres" as the command line argument, at ! pg_md5 displays md5-hashed text to the standard output.

      --- 123,135 ----
        

      $ cp /usr/local/etc/pcp.conf.sample /usr/local/etc/pcp.conf
      !

      To encrypt your password into md5 hash format, use the pg_md5 command, ! which is installed as one of pgpool-II's ! executables. pg_md5 takes text as a command line argument, and displays its md5-hashed text.

      !

      For example, give "postgres" as the command line argument, and ! pg_md5 displays md5-hashed text on its standard output.
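As a sketch of what ends up in pcp.conf, coreutils md5sum can stand in for pg_md5 here (pg_md5 is the tool actually shipped with pgpool-II; the password "postgres" is just an example):

```shell
# hash the password the way pg_md5 would: md5 of the text, in hex
PASS=postgres
HASH=$(printf '%s' "$PASS" | md5sum | awk '{print $1}')
# a ready-made pcp.conf line: "username:md5password"
echo "postgres:$HASH"
```

The output is a single `user name:32-hex-digit-hash` line suitable for pasting into pcp.conf.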

      *************** SELECT queries are equally distributed a
      *** 189,195 ****
        
        

      The above command, however, prints no log messages because pgpool detaches the terminal. If you want to show pgpool log messages, you ! pass -n option to pgpool command. pgpool-II is executed as non-daemon process, and the terminal will not be detached.

      --- 189,195 ----

      The above command, however, prints no log messages because pgpool detaches the terminal. If you want to show pgpool log messages, you can ! pass the -n option to the pgpool command so that pgpool-II is executed as a non-daemon process, and the terminal will not be detached.

      *************** $ pgpool -n & *** 199,235 ****

      The log messages are printed on the terminal, so the recommended ! options to use are like the following.

      $ pgpool -n -d > /tmp/pgpool.log 2>&1 &
      !

      -d option enables debug messages to be generated.

      The above command keeps appending log messages to /tmp/pgpool.log. If you need to rotate log files, pass the logs to a external command ! which have log rotation function. ! For example, you can use rotatelogs coming with Apache2:

        $ pgpool -n 2>&1 | /usr/local/apache2/bin/rotatelogs \
          -l -f /var/log/pgpool/pgpool.log.%A 86400 &
        
      ! This will generate a log file named "pgpool.log.Thursday" everyday then ! rotate it 00:00 at midnight. Rotatelogs adds log to a file if it already exists. To delete old log files before rotation, you could use cron:
        55 23 * * * /usr/bin/find /var/log/pgpool -type f -mtime +5 -exec /bin/rm -f '{}' \;
        
      ! Please note that rotaelogs may exist as /usr/sbin/rotatelogs2 in some distributions. -f option generates a log file as soon as rotatelogs starts and is ! available apache2 2.2.9 or greater.

      Also cronolog helps you.

      --- 199,235 ----
        
        

      The log messages are printed on the terminal, so the recommended ! options to use are the following.

      $ pgpool -n -d > /tmp/pgpool.log 2>&1 &
      !

      The -d option enables debug messages to be generated.

      The above command keeps appending log messages to /tmp/pgpool.log. If you need to rotate log files, pass the logs to an external command ! which has log rotation functionality. ! For example, you can use rotatelogs from Apache2:

        $ pgpool -n 2>&1 | /usr/local/apache2/bin/rotatelogs \
          -l -f /var/log/pgpool/pgpool.log.%A 86400 &
        
      ! This will generate a log file named "pgpool.log.Thursday" and then ! rotate it at 00:00 midnight. rotatelogs appends logs to a file if it already exists. To delete old log files before rotation, you could use cron:
        55 23 * * * /usr/bin/find /var/log/pgpool -type f -mtime +5 -exec /bin/rm -f '{}' \;
        
      ! Please note that rotatelogs may exist as /usr/sbin/rotatelogs2 in some distributions. The -f option generates a log file as soon as rotatelogs starts, and is ! available in Apache2 2.2.9 or later.

      Also cronolog can be used.

      *************** $ pgpool -n 2>&1 | /usr/sbin/cronolog \
      *** 238,249 ****
          '/var/log/pgsql/%Y-%m-%d-pgpool.log' &
        
      !

      To stop pgpool-II process, execute the following command.

      $ pgpool stop
      !

      If any client is still connected, pgpool-II waits for them to ! disconnect, and then terminate itself. Execute the following command instead if you want to shutdown pgpool-II forcibly.

      $ pgpool -m fast stop
      --- 238,249 ---- '/var/log/pgsql/%Y-%m-%d-pgpool.log' &
      !

      To stop pgpool-II, execute the following command.

      $ pgpool stop
      !

      If any client is still connected, pgpool-II waits for it to ! disconnect, and then terminates itself. Run the following command instead if you want to shut down pgpool-II forcibly.

      $ pgpool -m fast stop
      *************** nodes.

      *** 256,262 ****

      In this section, we'll use three database nodes, which we have already set up in section "1. Let's Begin!", and takes you step by step to create a database replication system. Sample ! data to be replicated will be generated by pgbench benchmark program.

      2.1. Configuring Replication

      --- 256,262 ----

      In this section, we'll use the three database nodes which we have already set up in section "1. Let's Begin!", and go step by step through creating a database replication system. Sample ! data to be replicated will be generated by the pgbench benchmark program.

      2.1. Configuring Replication

      *************** href="#start-shutdown">Starting/Stopping *** 286,296 ****

      After configuring pgpool.conf and restarting pgpool-II, let's try the actual replication and see if everything is ! working OK.

      First, we need to create a database to be replicated. We will name it "bench_replication". This database needs to be created on all the ! nodes. Use createdb commands through pgpool-II, and the database will be created on all the nodes.

      $ createdb -p 9999 bench_replication
      --- 286,296 ----

      After configuring pgpool.conf and restarting pgpool-II, let's try the actual replication and see if everything is ! working.

      First, we need to create a database to be replicated. We will name it "bench_replication". This database needs to be created on all the ! nodes. Use the createdb command through pgpool-II, and the database will be created on all the nodes.

      $ createdb -p 9999 bench_replication
      *************** branches, tellers, accounts, and history *** 344,352 ****

      3. Your First Parallel Query

      ! Data within the different range is stored in two or more data base nodes in a parallel Query. This is called a partitioning. Moreover you could replicate some of tables among database nodes even ! in the parallel query mode.

      To enable parallel query in pgpool-II, you must set up another --- 344,352 ----

      3. Your First Parallel Query

      ! Data within different ranges is stored in two or more database nodes in a parallel query. This is called partitioning. Moreover, you can replicate some of the tables among database nodes even ! in parallel query mode.

      To enable parallel query in pgpool-II, you must set up another *************** pgbench again to create sample data.

      *** 368,374 ****
      parallel_mode = true
      !

      Setting paralle_mode to true does not start parallel query automatically. pgpool-II needs SystemDB and the rules to know how to distribute data to the database nodes.

      --- 368,374 ----
      parallel_mode = true
      !

      Setting parallel_mode to true does not start parallel query automatically. pgpool-II needs SystemDB and the rules to know how to distribute data to the database nodes.

      *************** configured so that pgpool-II accepts tho *** 379,391 ****
      listen_addresses = '*'

      Attention: The replication is not done for the table that does the partitioning though a parallel Query and the replication can be made effective at the same time. ! Attention: You can have both partitioned tables and replicated ! tables. However a table cannot be a partioned and replicated one at the same time. Because the data structure of partioned tables and ! replicated tables are different, "bench_replication" database created in section "2. Your First Replication" cannot be reused in parallel query mode.

      --- 379,392 ----
      listen_addresses = '*'

      + Attention: Replication is not performed for partitioned tables, although parallel query and replication can be enabled at the same time. !

      Attention: You can have both partitioned tables and replicated ! tables. However, a table cannot be both partitioned and replicated at the same time. Because the data structures of partitioned tables and ! replicated tables are different, the "bench_replication" database created in section "2. Your First Replication" cannot be reused in parallel query mode.

      *************** load_balance_mode = true *** 407,419 ****

      3.2. Configuring SystemDB

      "System database" is just an ordinaly database. The only ! requirement is, dblink functions and the dist_def table, which describes partioning rule, must be installed in the system ! database. You could have system database on a database node, or you ! could have multiple node having system database by using cascade ! configuration of pgpool-II.

      !

      In this section, we will create SystemDB within the 5432 port node. The following list is the configuration parameters for SystemDB

      --- 408,420 ----

      3.2. Configuring SystemDB

      "System database" is just an ordinary database. The only ! requirement is that the dblink functions and the dist_def table, which describes the partitioning rules, must be installed in the system ! database. You could have a system database on a database node, or you ! could have multiple nodes having system databases by using a cascade ! configuration in pgpool-II.

      !

      In this section, we will create SystemDB on the node listening on port 5432. The following list shows the configuration parameters for SystemDB.

      *************** $ createdb -p 5432 -O pgpool pgpool *** 438,444 ****

      3.2.1. Installing dblink

      Next, we must install dblink into "pgpool" database. dblink is one ! of the tools included in contrib directory in the PostgreSQL source code.

      To install dblink to your system, execute the following commands.

      --- 439,445 ----

      3.2.1. Installing dblink

      Next, we must install dblink into "pgpool" database. dblink is one ! of the tools included in the contrib directory in the PostgreSQL source code.

      To install dblink to your system, execute the following commands.

      *************** $ USE_PGXS=1 make -C contrib/dblink inst *** 449,455 ****

      After dblink has been installed into your system, we will define ! dblink functions in "pgpool" database. If PostgreSQL is installed in /usr/local/pgsql, dblink.sql (a file with function definitions) should have been installed in /usr/local/pgsql/share/contrib. Now, execute the --- 450,456 ----

      After dblink has been installed into your system, we will define ! dblink functions in the "pgpool" database. If PostgreSQL is installed in /usr/local/pgsql, dblink.sql (a file with function definitions) should have been installed in /usr/local/pgsql/share/contrib. Now, execute the *************** CREATE TABLE pgpool_catalog.dist_def ( *** 498,504 ****

      A distribution rule decides how to distribute data to a particular node. Data will be distributed depending on the value of ! "col_name" column. "dist_def_func" is a function that takes the value of "col_name" as its argument, and returns an integer which points to the appropriate database node ID where the data should be stored.

      --- 499,505 ----

      A distribution rule decides how to distribute data to a particular node. Data will be distributed depending on the value of ! the "col_name" column. "dist_def_func" is a function that takes the value of "col_name" as its argument, and returns an integer which points to the appropriate database node ID where the data should be stored.

      *************** nodes can be merged into one result.

      *** 510,516 ****

      3.2.3. Defining replicate_def table

      If you want to use replicated tables in SELECT in parallel mode, you ! need to register information of such tables(replication rule) to a table called replicate_def. The replicate_def table was already created when system_db.sql was executed to define dist_def. --- 511,517 ----

      3.2.3. Defining replicate_def table

      If you want to use replicated tables in SELECT in parallel mode, you ! need to register information about such tables (replication rule) in a table called replicate_def. The replicate_def table has already been made when making it from the system_db.sql file when dist_def is defined. *************** CREATE TABLE pgpool_catalog.replicate_de *** 530,536 **** replicate_def includes table's meta data information(dbname, schema_name, table_name, col_list, type_list).

      ! All the query analysis and query rewriting process are depending on the information (table, column and type) stored in dist_def and/or replicate_def table. If the information is not correct, analysis and query rewriting process will produce wrong results. --- 531,537 ---- replicate_def includes table's meta data information(dbname, schema_name, table_name, col_list, type_list).

      ! All the query analysis and query rewriting process are dependent on the information (table, column and type) stored in dist_def and/or replicate_def table. If the information is not correct, analysis and query rewriting process will produce wrong results. *************** sample data into three database nodes. T *** 543,549 **** by "pgbench -i -s 3" (i.e. scale factor of 3). We will create a new database called "bench_parallel" for this section.

      !

      In pgpool-II's source code, you can find dist_def_pgbench.sql file in sample directoy. We will use this sample file here to create distribution rules for pgbench. Execute the following command in extracted --- 544,550 ---- by "pgbench -i -s 3" (i.e. scale factor of 3). We will create a new database called "bench_parallel" for this section.

      !

      In pgpool-II's source code, you can find the dist_def_pgbench.sql file in the sample directory. We will use this sample file here to create distribution rules for pgbench. Execute the following command in the extracted *************** pgpool-II source code directory.

      *** 556,562 ****

      Inside dist_def_pgbench.sql, we are inserting one row into "dist_def" table. There is a distribution function for accounts table. ! For key-column, aid is defined for accounts respectively (which is primary keys)

        INSERT INTO pgpool_catalog.dist_def VALUES (
      --- 557,564 ----
        

      Inside dist_def_pgbench.sql, we are inserting one row into the "dist_def" table. There is a distribution function for the accounts table. ! For the key column of the accounts table, aid (its ! primary key) is defined.

        INSERT INTO pgpool_catalog.dist_def VALUES (
      *************** INSERT INTO pgpool_catalog.dist_def VALU
      *** 570,587 ****
        );
        
      !

      Now, we must define the distribution function for accounts table. Note that you can use the same function from different tables. Also, you can define functions using languages other than SQL (e.g. PL/pgSQL, PL/Tcl, etc.).

      The accounts table when data is initialized specifying 3 scale factor, The value of the aid is 1 to 300000. The function is defined so that data is evenly distributed to three data base nodes.

      ! SQL function will be defined as the return of the number of the data base node.

      --- 572,590 ---- );
      !

      Now, we must define the distribution function for the accounts table. Note that you can use the same function for different tables. Also, you can define functions using languages other than SQL (e.g. PL/pgSQL, PL/Tcl, etc.).

      + When the accounts table is initialized with a scale factor of 3, the value of aid ranges from 1 to 300000. The function is defined so that the data is evenly distributed among the three database nodes.

      ! An SQL function will be defined that returns the database node number.
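      The even three-way split described above can be sketched as follows (Python is used purely for illustration here; the documentation's actual function is written in SQL, and the function name below is hypothetical). The boundaries assume a scale factor of 3, i.e. aid values 1 to 300000:

```python
def dist_def_accounts(aid: int) -> int:
    """Illustrative distribution function: map an accounts.aid value
    (1..300000 at scale factor 3) to a database node ID (0, 1, or 2)."""
    if 1 <= aid <= 100000:
        return 0   # first database node
    elif aid <= 200000:
        return 1   # second database node
    else:
        return 2   # third database node

# e.g. aid 150000 maps to node 1
```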

      *************** $$ LANGUAGE sql; *** 596,605 **** --- 599,610 ----

      3.4. Defining Replication Rules

      + The replication rule decides which tables are to be replicated.

      + Here, the branches table and the tellers table created by pgbench are registered as replicated tables. As a result, the accounts table and queries that use the branches table *************** INSERT INTO pgpool_catalog.replicate_def *** 623,630 **** --- 628,637 ---- );

      + The replicate_def_pgbench.sql file is provided in the sample directory. + To define a replication rule using this file, execute the psql command as follows in the extracted source code directory.

      *************** be restarted. Please refer to section "1
      *** 638,649 ****
        href="#start-shutdown">Starting/Stopping pgpool-II".

      After configuring pgpool.conf and restarting ! pgpool-II, let's try and see if parallel query is working OK.

      First, we need to create a database to be distributed. We will name it "bench_parallel". This database needs to be created on all the ! nodes. Use createdb commands through pgpool-II, and the ! database will be created on all the nodes.

      $ createdb -p 9999 bench_parallel
      --- 645,656 ---- href="#start-shutdown">Starting/Stopping pgpool-II".

      After configuring pgpool.conf and restarting ! pgpool-II, let's try and see if parallel query is working.

      First, we need to create a database to be distributed. We will name it "bench_parallel". This database needs to be created on all the ! nodes. Use the createdb command through pgpool-II, and the ! databases will be created on all the nodes.

      $ createdb -p 9999 bench_parallel
    SELECT is inside a transaction block Y