<div dir="ltr"><div>Hi Ishii-San</div><div><br></div><div>Thanks for looking into the patch and your observation is correct, the patch I sent would break</div><div>the timestamp overwriting functionality,</div><div><br></div><div>So here is another go at this, See the new attached patch, which uses two parsers, </div><div>1- src/parser/gram.y which is the standard parser brought in from PG code</div><div>2- src/parser/gram_minimal.y modified parser that short-circuits the INSERT and UPDATE statements</div><div><br></div><div>The idea here is when replication mode is enabled we use the standard parser and parse everything since</div><div>we need the information in case of timestamp or other things. but when we are operating in the master-slave</div><div>mode we don't need any extra information for WRITE queries, since we will always be sending them to the primary</div><div>anyways, so the newly added minimal parser which short-circuits the INSERT and UPDATE queries</div><div>(can also be extended to short-circuit all the write queries like CREATE and ALTER where ever we can)</div><div>and try to enhance the write query performance of the Pgpool-II,</div><div><br></div><div>The parser selection is made in raw_parser() function and is controlled by the "bool use_minimal" argument</div><div>and currently in the patch we always use the minimal parser whenever the replication mode is disabled in the</div><div>pgpool.conf</div><div><br></div><div>This is the very radical change in the parsing function of Pgpool-II, but it can provide us with the platform to minimise</div><div>the parsing overhead of Pgpool-II for master-slave mode.</div><div><br></div><div>Your thoughts and comments..</div><div><br></div><div>Thanks</div><div>Best Regards</div><div>Muhammad Usama</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 27, 2019 at 3:58 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Usama,<br>
<br>
I think this patch breaks replication mode because it throws away<br>
information needed for time stamp rewriting. What do you think?<br>
<br>
Best regards,<br>
--<br>
Tatsuo Ishii<br>
SRA OSS, Inc. Japan<br>
English: <a href="http://www.sraoss.co.jp/index_en.php" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br>
Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
<br>
> Hi Ishii San<br>
> <br>
> Can you have a look at the attached patch which tries to extract some<br>
> performance in the area of query parsing and query analysis for routing<br>
> decisions. Most of the performance gains from the changes in the patch can<br>
> be observed in large data INSERT statements.<br>
> <br>
> Patch contains the following changes<br>
> ==========<br>
> 1-- The idea here is since Pgpool-II only needs a very little information<br>
> about the queries especially for the insert queries to decide where it<br>
> needs to send the query,<br>
> for example: For the INSERT queries we only need the type of query and the<br>
> relation name.<br>
> But since the parser we use in Pgpool-II is taken from PostgreSQL source<br>
> which parses the complete query including the value lists ( which is not<br>
> required by Pgpool).<br>
> This parsing of value part seems very harmless in small statements but in<br>
> case of INSERTs with lots of column values and large data in each value<br>
> item, this becomes significant.<br>
> So this patch adds a smaller bison grammar rule to short circuit the INSERT<br>
> statement parsing when it gets the enough information required for<br>
> Pgpool-II.<br>
> <br>
> 2-- The patch also re-arranges some of the if statements in<br>
> pool_where_to_send() function and tries to make sure the pattern_compare<br>
> and pool_has_function_call calls should only be made when they are<br>
> absolutely necessary.<br>
> <br>
> 3--Another thing this patch does is, it tries to save the raw_parser()<br>
> calls in case of un-recognised queries. Instead of invoking the parser of<br>
> "dummy read" and "dummy write" queries in case of syntax error in original<br>
> query, the patch adds the functions to get pre-built parse_trees for these<br>
> dummy queries.<br>
> <br>
> 4-- strlen() call is removed from scanner_init() function and is passed in<br>
> as an argument to it, and the reason is we already have the query length in<br>
> most cases before invoking the parser so why waste CPU cycles on it. Again<br>
> this becomes significant in case of large query strings.<br>
> <br>
> Finally the patch tries to remove the unnecessary calls of<br>
> pool_is_likely_select()<br>
> <br>
> As mentioned above the area of improvements in this patch are mostly around<br>
> writing queries and for the testing purpose I used a INSERT query with<br>
> large binary insert data and I am getting a very huge performance gains<br>
> with this patch<br>
> <br>
> *Current Master Branch*<br>
> ================<br>
> usama=# \i sample.sql<br>
> id<br>
> -----<br>
> 104<br>
> (1 row)<br>
> <br>
> INSERT 0 1<br>
> Time: *2059.807* ms (00:02.060)<br>
> <br>
> *WITH PATCH*<br>
> ===============<br>
> usama=# \i sample.sql<br>
> id<br>
> -----<br>
> 102<br>
> (1 row)<br>
> <br>
> INSERT 0 1<br>
> Time: *314.237* ms<br>
> <br>
> <br>
> Performance gain* 655.50 %*<br>
> <br>
> <br>
> Comments and suggestions?<br>
> <br>
> Please let me know if you also want the test data I used for the INSERT test<br>
> <br>
> Thanks<br>
> Best regards<br>
> Muhammad Usama<br>
</blockquote></div></div>