[pgpool-hackers: 3244] Re: Some performance improvements for Pgpool-II

Muhammad Usama m.usama at gmail.com
Mon Feb 18 15:59:32 JST 2019


On Mon, Feb 18, 2019 at 9:32 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> > Hi Ishii San
> >
> > Can you have a look at the attached patch? It tries to improve
> > performance in the area of query parsing and query analysis for
> > routing decisions. Most of the gains from the changes in the patch
> > can be observed with INSERT statements carrying large data.
> >
> > The patch contains the following changes
> > ==========
> > 1-- The idea here is that Pgpool-II needs only a small amount of
> > information about a query to decide where to send it. For an INSERT
> > query, for example, only the statement type and the relation name
> > are needed. However, the parser we use in Pgpool-II is taken from
> > the PostgreSQL source and parses the complete query, including the
> > VALUES list, which Pgpool-II does not need. Parsing the VALUES part
> > is harmless for small statements, but for INSERTs with many column
> > values and large data in each value item it becomes significant.
> > So this patch adds a smaller bison grammar rule that short-circuits
> > INSERT statement parsing once it has gathered the information
> > Pgpool-II requires.
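The short-circuit idea above can be sketched in plain C. This is an illustrative stand-in, not the actual bison rule from the patch: `extract_insert_target` is a hypothetical helper that stops scanning as soon as it has the statement type and relation name, never touching the (possibly huge) VALUES list.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical sketch of the short-circuit: for routing, an INSERT
 * only needs its type and target relation, so scanning can stop
 * right after the table name instead of parsing the VALUES list.
 */
static int
extract_insert_target(const char *query, char *relname, size_t buflen)
{
    const char *p = query;
    size_t      i = 0;

    while (isspace((unsigned char) *p))
        p++;
    if (strncasecmp(p, "INSERT", 6) != 0)
        return 0;
    p += 6;
    while (isspace((unsigned char) *p))
        p++;
    if (strncasecmp(p, "INTO", 4) != 0)
        return 0;
    p += 4;
    while (isspace((unsigned char) *p))
        p++;

    /* copy the (possibly schema-qualified) relation name */
    while (*p && (isalnum((unsigned char) *p) || *p == '_' || *p == '.'))
    {
        if (i + 1 >= buflen)
            return 0;
        relname[i++] = *p++;
    }
    relname[i] = '\0';
    return i > 0;               /* the VALUES list is never examined */
}
```

However large the value list is, the cost of this scan stays proportional to the length of the statement head only.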
> >
> > 2-- The patch also re-arranges some of the if statements in the
> > pool_where_to_send() function so that pattern_compare() and
> > pool_has_function_call() are only called when absolutely necessary.
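The re-ordering above is the classic cheap-check-first pattern. The sketch below uses hypothetical stand-ins (not the real pool_where_to_send() internals); a counter makes the effect observable: because `&&` short-circuits, the expensive scan only runs for queries that pass the cheap test.

```c
#include <assert.h>
#include <stdbool.h>
#include <strings.h>

/* Counts how often the expensive check actually runs. */
static int expensive_scans = 0;

/* Cheap test: a prefix comparison on the query text. */
static bool
cheap_is_select(const char *query)
{
    return strncasecmp(query, "SELECT", 6) == 0;
}

/* Stand-in for costly work such as pattern_compare() /
 * pool_has_function_call(), which scan the whole statement. */
static bool
expensive_has_function_call(const char *query)
{
    (void) query;
    expensive_scans++;
    return false;
}

/* Cheap condition first: the && operator short-circuits, so the
 * expensive scan is skipped entirely for non-SELECT queries. */
static bool
needs_function_check(const char *query)
{
    return cheap_is_select(query) && expensive_has_function_call(query);
}
```

With the order reversed, every query, including large INSERTs, would pay for the full scan.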
> >
> > 3-- Another thing this patch does is avoid raw_parser() calls for
> > unrecognised queries. Instead of invoking the parser on the "dummy
> > read" and "dummy write" queries whenever the original query has a
> > syntax error, the patch adds functions that return pre-built parse
> > trees for these dummy queries.
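The pre-built-tree idea amounts to building the dummy parse tree once and handing out the cached copy afterwards. A minimal sketch, with a hypothetical `DummyTree` type standing in for the real Pgpool-II node structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-in for a parse-tree node; the real node types in
 * Pgpool-II are different. */
typedef struct DummyTree
{
    const char *tag;
} DummyTree;

/* Counts stand-in "parser" invocations. */
static int parser_invocations = 0;

/* Stands in for a raw_parser() call on the dummy query text. */
static DummyTree *
parse_dummy(const char *tag)
{
    DummyTree *t = malloc(sizeof(DummyTree));

    parser_invocations++;
    t->tag = tag;
    return t;
}

/* Build the "dummy read" tree on the first call only; every later
 * call returns the cached tree instead of re-parsing, so a stream of
 * syntactically broken queries costs no extra parser work. */
static DummyTree *
get_dummy_read_tree(void)
{
    static DummyTree *cached = NULL;

    if (cached == NULL)
        cached = parse_dummy("dummy_read");
    return cached;
}
```

One caveat of this design: callers must treat the returned tree as read-only, since every caller shares the same cached object.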
> >
> > 4-- The strlen() call is removed from the scanner_init() function;
> > the query length is passed in as an argument instead. The reason is
> > that in most cases we already have the query length before invoking
> > the parser, so there is no need to waste CPU cycles recomputing it.
> > Again, this becomes significant for large query strings.
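The change above is a simple interface tweak: let the caller supply the length it already knows. A sketch with a hypothetical `Scanner` struct (the real scanner_init() in Pgpool-II takes different arguments):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state for illustration only. */
typedef struct Scanner
{
    const char *buf;
    size_t      len;
} Scanner;

/* New shape (sketch): the caller passes the length it already has,
 * e.g. from the frontend protocol message, so the scanner never
 * re-walks a potentially multi-megabyte query string. */
static void
scanner_init(Scanner *s, const char *query, size_t query_len)
{
    s->buf = query;
    s->len = query_len;         /* no strlen() here */
}
```

Since strlen() is O(n) in the string length, skipping it matters exactly in the large-INSERT case the patch targets.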
> >
> > Finally, the patch removes unnecessary calls to
> > pool_is_likely_select().
> >
> > As mentioned above, the improvements in this patch mostly target
> > write queries. For testing I used an INSERT query with large binary
> > data, and I am seeing a very large performance gain with this
> > patch.
> >
> > *Current Master Branch*
> > ================
> > usama=# \i sample.sql
> >  id
> > -----
> >  104
> > (1 row)
> >
> > INSERT 0 1
> > Time: *2059.807* ms (00:02.060)
> >
> > *WITH PATCH*
> > ===============
> > usama=# \i sample.sql
> >  id
> > -----
> >  102
> > (1 row)
> >
> > INSERT 0 1
> > Time: *314.237* ms
> >
> >
> > Performance gain: *655.50 %* (roughly 6.6x faster)
> >
> >
> > Comments and suggestions?
> >
> > Please let me know if you also want the test data I used for the INSERT
> test
>
> Yes, please share it with me.
>

Sure, please find the attached sample.sql.zip file.

To run the test:

usama=# create table lsa_backend_file(content bytea, checksum text, id
serial);
usama=# \i sample.sql

Thanks
Best Regards
Muhammad Usama

>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample.sql.zip
Type: application/zip
Size: 27573 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20190218/38e15225/attachment-0001.zip>

