[pgpool-general: 436] Re: strange load balancing issue in Solaris

Aravinth aravinth at mafiree.com
Wed May 9 19:39:30 JST 2012


Yes, the issue is with the random() function.
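
To illustrate the arithmetic with the values from the log line quoted further down, here is a minimal sketch. The helper name scale_to_weight() is hypothetical and not part of pgpool; it only mirrors the expression r = (((double)random())/RAND_MAX) * total_weight, with the divisor passed in explicitly so both cases can be compared.

```c
#include <assert.h>

/*
 * Hypothetical helper (not pgpool code): scales a raw random value into
 * the range [0, total_weight], assuming raw never exceeds divisor.
 */
static double scale_to_weight(long raw, double divisor, double total_weight)
{
    assert(divisor > 0.0);
    return ((double) raw / divisor) * total_weight;
}
```

With raw = 268356063 and total_weight = 32767, dividing by 32767 (the RAND_MAX value implied by the log) gives an r far above total_weight, so every "r >= total_weight" test in the selection loop succeeds and the last valid backend is always chosen. Dividing by 2147483647 (the 2**31 - 1 upper bound given in the Solaris man page) gives an r of about 4095, which is inside the expected range.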

It looks like I have solved the problem by using rand() instead.
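
A minimal sketch of what such a change could look like (not the exact patch). It relies on the C standard guarantee that rand() returns a value between 0 and RAND_MAX, so the ratio stays within 0.0 to 1.0 on any platform. The helper name weighted_random() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not pgpool code): returns a pseudo-random value
 * in [0, total_weight]. rand() is bounded by RAND_MAX by definition,
 * unlike random(), whose upper bound is 2**31 - 1.
 */
static double weighted_random(double total_weight)
{
    assert(total_weight >= 0.0);
    return ((double) rand() / RAND_MAX) * total_weight;
}
```

An alternative under the same assumptions would be to keep random() and divide by 2147483647.0 instead of RAND_MAX.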

Regards,
Aravinth


On Wed, May 9, 2012 at 4:02 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:

> Thanks. Apparently random() on Solaris can return a value beyond
> RAND_MAX! It's easy to fix the problem, but I would like to do it with
> respect to portability. Any idea?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
> > From Solaris 10 (x86) man page:
> >
> >
> > SYNOPSIS
> >      #include <stdlib.h>
> >
> >      long random(void);
> >
> >      void srandom(unsigned int seed);
> >
> >      char  *initstate(unsigned  int  seed,  char  *state,  size_t
> >      size);
> >
> >      char *setstate(const char *state);
> >
> > DESCRIPTION
> >      The random() function uses  a  nonlinear  additive  feedback
> >      random-number generator employing a default state array size
> >      of 31  long  integers  to  return  successive  pseudo-random
> >      numbers  in the range from 0 to 2**31 -1. The period of this
> >      random-number generator is approximately 16 x (2 **31   -1).
> >      The  size  of  the  state array determines the period of the
> >      random-number generator. Increasing  the  state  array  size
> >      increases the period.
> >
> >      The srandom() function initializes the current  state  array
> >      using the value of seed.
> >
> >
> > (...)
> >
> >
> >
> > Regards,
> > Rafal
> >
> >
> >
> > -----Original Message-----
> > From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Tatsuo Ishii
> > Sent: Wednesday, May 09, 2012 11:44 AM
> > To: caravinth at gmail.com
> > Cc: pgpool-general at pgpool.net
> > Subject: [pgpool-general: 431] Re: strange load balancing issue in Solaris
> >
> > Thanks.
> >
> > 2012-05-09 14:31:48 LOG:   pid 22459: r: 268356063.000000 total_weight: 32767.000000
> >
> > This is really weird. Here pgpool calculates this:
> >
> >       r = (((double)random())/RAND_MAX) * total_weight;
> >
> > Total weight is the same as RAND_MAX.  It seems your random() returns
> > a value bigger than RAND_MAX, which does not make sense because the
> > man page of random(3) on my Linux says:
> >
> >        The random() function uses a non-linear additive feedback random
> >        number generator employing a default table of size 31 long
> >        integers to return successive pseudo-random numbers in the range
> >        from 0 to RAND_MAX.  The period of this random number generator
> >        is very large, approximately 16 * ((2^31) - 1).
> >
> > What does your man page for random() say on your system?
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese: http://www.sraoss.co.jp
> >
> >> Sorry . I missed it.
> >>
> >> Here is the log file.
> >>
> >> --Aravinth
> >>
> >>
> >> On Wed, May 9, 2012 at 2:07 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> >>
> >>> > The code you have sent is same in child.c.
> >>>
> >>> No.
> >>>
> >>>        pool_log("r: %f total_weight: %f", r, total_weight);
> >>>
> >>> You need to add the line above to get useful information.
> >>> --
> >>> Tatsuo Ishii
> >>> SRA OSS, Inc. Japan
> >>> English: http://www.sraoss.co.jp/index_en.php
> >>> Japanese: http://www.sraoss.co.jp
> >>>
> >>>
> >>> > I have attached the log file. Please check
> >>> >
> >>> >
> >>> > --Aravinth
> >>> >
> >>> >
> >>> > On Tue, May 8, 2012 at 6:20 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> >>> >
> >>> >> I suspect there's some portability issue with the load balance code.
> >>> >> The actual source code is in select_load_balancing_node() (child.c).
> >>> >> Please modify the source code as shown below, connect to pgpool by
> >>> >> using psql, and send the log output.
> >>> >> --
> >>> >> Tatsuo Ishii
> >>> >> SRA OSS, Inc. Japan
> >>> >> English: http://www.sraoss.co.jp/index_en.php
> >>> >> Japanese: http://www.sraoss.co.jp
> >>> >>
> >>> >> int select_load_balancing_node(void)
> >>> >> {
> >>> >>        int selected_slot;
> >>> >>        double total_weight,r;
> >>> >>        int i;
> >>> >>
> >>> >>        /* choose a backend in random manner with weight */
> >>> >>        selected_slot = MASTER_NODE_ID;
> >>> >>        total_weight = 0.0;
> >>> >>
> >>> >>        for (i=0;i<NUM_BACKENDS;i++)
> >>> >>        {
> >>> >>                if (VALID_BACKEND(i))
> >>> >>                {
> >>> >>                        total_weight += BACKEND_INFO(i).backend_weight;
> >>> >>                }
> >>> >>        }
> >>> >>        r = (((double)random())/RAND_MAX) * total_weight;
> >>> >>        pool_log("r: %f total_weight: %f", r, total_weight);  /* <-- add this */
> >>> >>
> >>> >>        total_weight = 0.0;
> >>> >>        for (i=0;i<NUM_BACKENDS;i++)
> >>> >>        {
> >>> >>                if (VALID_BACKEND(i) && BACKEND_INFO(i).backend_weight > 0.0)
> >>> >>                {
> >>> >>                        if(r >= total_weight)
> >>> >>                                selected_slot = i;
> >>> >>                        else
> >>> >>                                break;
> >>> >>                        total_weight += BACKEND_INFO(i).backend_weight;
> >>> >>                 }
> >>> >>        }
> >>> >>
> >>> >>        pool_debug("select_load_balancing_node: selected backend id is %d", selected_slot);
> >>> >>         return selected_slot;
> >>> >> }
> >>> >>
> >>> >>
> >>> >> > Hi Tatsuo, Thanks for the reply.
> >>> >> >
> >>> >> > The normalized weights are 0.5 for both nodes, and the selected node
> >>> >> > is always the same node. So I guess it is srandom(), then.
> >>> >> >
> >>> >> >
> >>> >> > Any idea how to solve this srandom() issue?
> >>> >> >
> >>> >> >
> >>> >> > Thanks and Regards,
> >>> >> > Aravinth
> >>> >> >
> >>> >> >
> >>> >> > ________________________________
> >>> >> >  From: Tatsuo Ishii <ishii at postgresql.org>
> >>> >> > To: aravinth at mafiree.com
> >>> >> > Cc: pgpool-general at pgpool.net
> >>> >> > Sent: Tuesday, May 1, 2012 4:41 AM
> >>> >> > Subject: Re: [pgpool-general: 396] strange load balancing issue in Solaris
> >>> >> >
> >>> >> > First of all, please check that the "normalized" weights are as you
> >>> >> > expected. Run "show pool_status;" and see the "backend_weight0" and
> >>> >> > "backend_weight1" sections. You will see floating point numbers,
> >>> >> > which are the normalized weights between 0.0 and 1.0. If you see
> >>> >> > both are 0.5, primary and standby are given the same weight.
> >>> >> >
> >>> >> > If they are ok, I suspect the srandom() function behavior is different
> >>> >> > from other platforms. Pgpool-II chooses the load balance node by using
> >>> >> > srandom(). select_load_balancing_node() is the function which is
> >>> >> > responsible for selecting the load balance node. If you run pgpool-II
> >>> >> > with the -d (debug) option, you will see the following in the log:
> >>> >> >
> >>> >> >     pool_debug("select_load_balancing_node: selected backend id is %d", selected_slot);
> >>> >> >
> >>> >> > If backend_weight in show pool_status is fine but the line above
> >>> >> > always shows the same number, it is a sign that we have a problem
> >>> >> > with srandom().
> >>> >> > --
> >>> >> > Tatsuo Ishii
> >>> >> > SRA OSS, Inc. Japan
> >>> >> > English: http://www.sraoss.co.jp/index_en.php
> >>> >> > Japanese: http://www.sraoss.co.jp
> >>> >> >
> >>> >> >> Hi All,
> >>> >> >>
> >>> >> >> I am facing a strange issue in load balancing with replication mode
> >>> >> >> set to true in Solaris. The load balancing algorithm always selects
> >>> >> >> the same node, whatever the backend weight may be.
> >>> >> >>
> >>> >> >> Here is the scenario.
> >>> >> >>
> >>> >> >> I have pgpool installed on 1 server
> >>> >> >> 2 postgres nodes in other 2 servers
> >>> >> >> replication mode set to true and load balancing set to true
> >>> >> >> backend weight of the 2 nodes is 1.
> >>> >> >>
> >>> >> >> When I fire the queries manually using different connections or using
> >>> >> >> pgbench, all the queries hit the same node. The load balancing
> >>> >> >> algorithm always selects the same node.
> >>> >> >> Changing the backend weight has no effect. Only when I set the
> >>> >> >> backend weight to 0 do hits go to the other server.
> >>> >> >>
> >>> >> >>
> >>> >> >> I face this issue only in Solaris. The same setup on other servers
> >>> >> >> (CentOS, RHEL, Ubuntu, etc.) does the load balancing perfectly.
> >>> >> >>
> >>> >> >> I also tried various postgres versions and pgpool versions with the
> >>> >> >> same result. But every version runs fine on other servers.
> >>> >> >>
> >>> >> >> Has anyone faced this issue?
> >>> >> >>
> >>> >> >> Any information would be highly helpful.
> >>> >> >>
> >>> >> >> Regards,
> >>> >> >> Aravinth
> >>> >> _______________________________________________
> >>> >> pgpool-general mailing list
> >>> >> pgpool-general at pgpool.net
> >>> >> http://www.pgpool.net/mailman/listinfo/pgpool-general
> >>> >>
> >>>