A Question about "ConstrainEqualRandom" (Migrated from community.research.microsoft.com)

  • Question

  • Keri posted on 10-23-2009 9:09 PM

    Hi,

    I am new to Infer.NET and am trying it out.
    As documented, "ConstrainEqualRandom" is a way to attach a constraint to
    a random variable. I made a simple test, shown below, but I cannot
    explain why Infer.NET gives me this result.

    Could anyone explain how Infer.NET arrives at it? Or am I
    using this function in the wrong way?

    What I want to test is this: suppose I have prior distributions on A and B,
    and I also know C = A & B. My intuition is that if I put some constraint on
    C, I should be able to infer something about A, B, and C.

    ---------------------------------------------------------------------------------------------------------------------

        static void Main()
        {
            Variable<bool> PA = Variable.Bernoulli(0.3).Named("PA");
            Variable<bool> PB = Variable.Bernoulli(0.8).Named("PB");
            Variable<bool> PC = (PA & PB).Named("PC");

            Variable.ConstrainEqualRandom<bool, Bernoulli>(PC, new Bernoulli(0.6));

            InferenceEngine ie = new InferenceEngine();
            ie.ShowFactorGraph = true;

            Console.WriteLine("PA: " + ie.Infer(PA));
            Console.WriteLine("PB: " + ie.Infer(PB));
            Console.WriteLine("PC: " + ie.Infer(PC));
        }

    ----------------------------------------------------------------------------------------------------------------------

    Compiling model...done.
    Initialising...done.
    Iterating:
    .........|.........|.........|.........|.........| 50
    PA: Bernoulli(0.375)
    PB: Bernoulli(0.8214)
    PC: Bernoulli(0.3214)
    .

    When I change Variable.ConstrainEqualRandom<bool, Bernoulli>(PC, new Bernoulli(0.6));
    to something else, such as Variable.ConstrainEqualRandom<bool, Bernoulli>(PC, new Bernoulli(0.7));
    Infer.NET gives me a different result as well.

    Friday, June 3, 2011 5:22 PM

Answers

  • minka replied on 11-11-2009 12:13 PM

    As explained in the user guide (http://research.microsoft.com/en-us/um/cambridge/projects/infernet/docs/Attaching%20constraints%20to%20variables.aspx), ConstrainEqualRandom is just a shortcut for ConstrainEqual(Variable.Random).  So the line:

     Variable.ConstrainEqualRandom<bool, Bernoulli>(C, new Bernoulli(0.6));

    is equivalent to:

     Variable.ConstrainEqual(C, Variable.Random(new Bernoulli(0.6)));

    which is equivalent to:

      Variable<bool> D = Variable.Bernoulli(0.6);
      Variable.ConstrainEqual(C, D);

    You can interpret ConstrainEqual as "I have observed that C and D are equal."  

    To solve any probability problem, start by writing down the joint distribution of all variables.  Taking C to be the variable P from the single-variable example (prior 0.3), we know the following distributions:

    p(C=T) = 0.3
    p(D=T) = 0.6
    p(constraint | C, D) = 1 if C=D and 0 otherwise

    Therefore the joint distribution is p(C,D,constraint) = p(constraint | C,D) p(C) p(D)

    Suppose the question is to compute p(C=T | constraint).  Substitute C=T and sum out D:

    p(C=T, constraint) = sum_D p(C=T,D,constraint) = p(constraint | C=T, D=T) p(C=T) p(D=T) + p(constraint | C=T, D=F) p(C=T) p(D=F)
                                      = 1*0.3*0.6 + 0*0.3*0.4 = 0.3*0.6

    Now apply the definition of a conditional distribution:

    p(C=T | constraint) = p(C=T, constraint) / p(constraint)

    p(constraint) = sum_(D,C) p(C,D,constraint) = 0.3*0.6 + 0.7*0.4

    Thus we get p(C=T | constraint) = 0.3913.
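
    As a rough check, the sum over the joint distribution can be reproduced in plain C# without Infer.NET. This is only a sketch under stated assumptions: the names ConstraintJoint and PosteriorCTrue are made up for illustration and are not part of the Infer.NET API.

```csharp
using System;
using System.Globalization;

static class ConstraintJoint
{
    // Hypothetical helper (not an Infer.NET call).
    // Computes p(C=T | constraint) from the joint
    // p(C, D, constraint) = p(constraint | C, D) p(C) p(D),
    // where pC and pD are the Bernoulli parameters of C and D.
    public static double PosteriorCTrue(double pC, double pD)
    {
        bool[] values = { true, false };
        double jointCTrue = 0.0; // accumulates p(C=T, constraint)
        double evidence = 0.0;   // accumulates p(constraint)

        foreach (bool c in values)
        {
            foreach (bool d in values)
            {
                // p(constraint | C, D) = 1 if C == D and 0 otherwise
                double constraint = (c == d) ? 1.0 : 0.0;
                double term = constraint * (c ? pC : 1 - pC) * (d ? pD : 1 - pD);
                evidence += term;
                if (c) jointCTrue += term;
            }
        }
        return jointCTrue / evidence;
    }

    static void Main()
    {
        double posterior = PosteriorCTrue(0.3, 0.6);
        Console.WriteLine(posterior.ToString("F4", CultureInfo.InvariantCulture));
    }
}
```

    With pC = 0.3 and pD = 0.6 only the terms where C and D agree survive, so the ratio is 0.18 / 0.46, matching the 0.3913 above.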

    Friday, June 3, 2011 5:23 PM

All replies

  • John Guiver replied on 10-25-2009 4:41 AM

    Hi Keri

    Your code is correct, and the results are as expected.  If you comment out the ConstrainEqualRandom line, you will get the following expected results:

    PA: Bernoulli(0.3)
    PB: Bernoulli(0.8)
    PC: Bernoulli(0.24)

    These are the prior marginals. The 0.24 for C conflicts with the Bernoulli(0.6) attached to C by ConstrainEqualRandom (0.24 is much smaller than 0.6), so the inference adjusts our beliefs in A, B and C while staying consistent with the hard constraint PC = PA & PB: the beliefs in A and B go up, and the belief in C settles at 0.3214, above 0.24 but well below 0.6.

    John 

    Friday, June 3, 2011 5:22 PM
  • Keri replied on 10-25-2009 1:21 PM

    Thanks John,


    Now I understand the point. But what I haven't figured out is how Infer.NET reaches that result. What formula is it based on? Can I manually follow
    some algorithm to get the same result?

    Let's take an even simpler example:

        static void Main()
        {
            Variable<bool> P = Variable.Bernoulli(0.3).Named("P");
            Variable.ConstrainEqualRandom<bool, Bernoulli>(P, new Bernoulli(0.6));

            InferenceEngine ie = new InferenceEngine();
            ie.ShowFactorGraph = true;

            Console.WriteLine("P: " + ie.Infer(P));
        }

    ------------------------------------------------------------------------------------------------------------------------------------------------

    The code means (if I understand correctly): suppose I have a prior distribution on P (say 0.3). Now I
    observe from some sampling that the probability of P is 0.6, so I add a constraint saying P should have
    probability 0.6. Then I invoke Infer.NET to adjust my belief in P, and it produces a new
    distribution for P of 0.3913. The result makes sense: the constraint favours a higher probability (0.6), so the
    new P moves up from 0.3 to 0.3913.

    Compiling model...done.
    Initialising...done.
    Iterating:
    .........|.........|.........|.........|.........| 50
    P: Bernoulli(0.3913)

    --------------------------------------------------------------------------------------------------------------------------------------------------

    It's quite awesome that Infer.NET can be used to adjust the original belief. My question is how that is achieved.
    By Bayesian inference (how, since I have only one random variable?), or by the factor graph?

    Best,
    Keri.

    Friday, June 3, 2011 5:22 PM
  • jwinn replied on 10-25-2009 6:42 PM

    The subtlety here is that ConstrainEqualRandom(new Bernoulli(0.6)) does not constrain the posterior over P to be Bernoulli(0.6).  Instead it modifies the posterior so that the probability of P being true is weighted by 0.6 and the probability of P being false is weighted by 0.4 – the probabilities are then renormalized to add up to one.  So the posterior over P is 0.3*0.6 / (0.3*0.6 + 0.7*0.4) = 0.3913.
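
    The weight-and-renormalize step can be written out as a few lines of plain C#. This is a minimal sketch, not what Infer.NET runs internally: SoftConstraint and Reweight are hypothetical names chosen for illustration, and q stands for the parameter of the Bernoulli passed to ConstrainEqualRandom.

```csharp
using System;
using System.Globalization;

static class SoftConstraint
{
    // Hypothetical helper (not an Infer.NET call): weight "true" by q and
    // "false" by 1-q, then renormalize so the two cases sum to one.
    public static double Reweight(double prior, double q)
    {
        double weightedTrue = prior * q;
        double weightedFalse = (1 - prior) * (1 - q);
        return weightedTrue / (weightedTrue + weightedFalse);
    }

    static void Main()
    {
        // Try a few constraint parameters against the same prior of 0.3.
        foreach (double q in new[] { 0.5, 0.6, 0.7 })
        {
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "q={0:F1} -> P(true)={1:F4}", q, Reweight(0.3, q)));
        }
    }
}
```

    With q = 0.5 both cases are scaled equally, so the prior of 0.3 comes back unchanged; q = 0.6 gives 0.18 / 0.46, the 0.3913 quoted above.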

    To understand this in more detail, consider your first example. Without the constraint, the following states of A and B have these probabilities:
    P(A=F, B=F) = 0.7*0.2 = 0.14
    P(A=T, B=F) = 0.3*0.2 = 0.06
    P(A=F, B=T) = 0.7*0.8 = 0.56
    P(A=T, B=T) = 0.3*0.8 = 0.24
    The total is 1.0, as we would expect.

    Now supposing we constrain C to be true, which is like ConstrainEqualRandom(new Bernoulli(1.0)). The only outcome which could allow this is A and B both being true, so all three variables will be true with probability 1.0. All other configurations of A and B will have probability 0.
    P(A=F, B=F) = 0 (impossible because of the constraint)
    P(A=T, B=F) = 0 (impossible because of the constraint)
    P(A=F, B=T) = 0 (impossible because of the constraint)
    P(A=T, B=T) = 1.0 (the only valid configuration)

    But what if we constrain C to be false (ConstrainEqualRandom(new Bernoulli(0.0)))? Now there are three possibilities, corresponding to the first three configurations.  The fourth configuration where A=T, B=T is no longer possible and so has probability 0. So we renormalize the three remaining probabilities to add up to one – the sum of 0.14+0.06+0.56=0.76, so the probabilities are:
    P(A=F, B=F) = 0.14/0.76 = 0.1842...
    P(A=T, B=F) = 0.06/0.76 = 0.0789...
    P(A=F, B=T) = 0.56/0.76 = 0.7368...
    P(A=T, B=T) = 0 (impossible because of the constraint)
    These still add up to 1.

    Now for the case you present, where we constrain C using ConstrainEqualRandom(new Bernoulli(0.6)). So 0.6 of the time, we constrain C=T (so we weight the fourth case by 0.6) and the rest of the time we constrain C=F (so we weight the first three cases by 0.4). Once again we then need to normalize by the total, which I will write here as Z.
    P(A=F, B=F) = 0.14 * 0.4 / Z = 0.056 / Z
    P(A=T, B=F) = 0.06 * 0.4 / Z = 0.024 / Z
    P(A=F, B=T) = 0.56 * 0.4 / Z = 0.224 / Z
    P(A=T, B=T) = 0.24 * 0.6 / Z = 0.144 / Z

    Now, Z = 0.056+0.024+0.224+0.144 = 0.448, so:
    P(A=F, B=F) = 0.056 / Z = 0.125
    P(A=T, B=F) = 0.024 / Z = 0.0536
    P(A=F, B=T) = 0.224 / Z = 0.5
    P(A=T, B=T) = 0.144 / Z = 0.3214

    So P(C=T) = P(A=T,B=T) = 0.3214
    Then P(A=T) = P(A=T,B=F) + P(A=T,B=T) = 0.0536+0.3214 = 0.375
    Then P(B=T) = P(A=F,B=T) + P(A=T,B=T) = 0.5+0.3214 = 0.8214
    Which is exactly what Infer.NET reported!
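
    The enumeration above is small enough to replay in plain C#. The following is a hedged sketch rather than Infer.NET's actual algorithm: AndConstraint and Marginals are hypothetical names, and q is assumed to be the parameter of the Bernoulli passed to ConstrainEqualRandom on C.

```csharp
using System;
using System.Globalization;

static class AndConstraint
{
    // Hypothetical helper (not an Infer.NET call).
    // Returns { P(A=T), P(B=T), P(C=T) } for C = A & B after weighting
    // configurations with C=T by q and C=F by 1-q, then renormalizing.
    public static double[] Marginals(double pA, double pB, double q)
    {
        bool[] values = { false, true };
        double z = 0.0;                        // normalizer Z
        double a = 0.0, b = 0.0, c = 0.0;      // unnormalized marginals

        foreach (bool aVal in values)
        {
            foreach (bool bVal in values)
            {
                bool cVal = aVal && bVal;      // hard constraint C = A & B
                double prior = (aVal ? pA : 1 - pA) * (bVal ? pB : 1 - pB);
                double w = prior * (cVal ? q : 1 - q);
                z += w;
                if (aVal) a += w;
                if (bVal) b += w;
                if (cVal) c += w;
            }
        }
        return new[] { a / z, b / z, c / z };
    }

    static void Main()
    {
        // The three cases discussed: C forced true, C forced false, and the 0.6 case.
        foreach (double q in new[] { 1.0, 0.0, 0.6 })
        {
            double[] m = Marginals(0.3, 0.8, q);
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "q={0:F1}  PA={1:F4}  PB={2:F4}  PC={3:F4}", q, m[0], m[1], m[2]));
        }
    }
}
```

    For q = 0.6 this reproduces the 0.375, 0.8214 and 0.3214 worked out above; q = 1.0 and q = 0.0 correspond to the two hard-constraint cases.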

    The moral of the story is that constraints are powerful, but often counter-intuitive things because they cause the distribution to be renormalized. In other words, the probability mass for all configurations that violate the constraint is removed and the probabilities for remaining states must be rescaled so that they still add up to one. Stochastic constraints are even more confusing because they re-weight the probabilities instead of excluding some of them. But although counter-intuitive, constraints are very useful modelling components - without them we could not have observed variables or construct undirected graphical models.

    Hope that makes sense,


    Best,
    John W.


     

    Friday, June 3, 2011 5:23 PM
  • Keri replied on 10-25-2009 7:12 PM

    I really appreciate John's explanation. It is very detailed and makes sense to me.

    Friday, June 3, 2011 5:23 PM
  • Keri replied on 11-09-2009 1:16 PM


    Hi,

    Sorry to raise the question again. I am still trying to understand the mathematics behind these computations. I tried to use Bayes' formula (e.g., P(A|B) = P(B|A)P(A)/P(B)) to interpret them, but failed. For example, the computation 0.3*0.6 / (0.3*0.6 + 0.7*0.4) = 0.3913 does not look Bayesian to me; it uses "weights" to renormalize the distribution. So what is the "math" behind such a computation? Why can we do it this way?

    Actually, my central problem is this: yes, I have used ConstrainEqualRandom to attach a condition to a random variable, it works great, and I can follow John's equations to compute the results. But I don't know how to explain them mathematically. What is the logic behind this function (does it just use weights to renormalize a posterior distribution)? Could you point me to any reference?

    Thanks a lot,
    Keri

     

    Friday, June 3, 2011 5:23 PM