Answered by:
BernoulliFromBeta and logical factors
Question

Hi! I want to implement simple inference engine with factor graph inference as my research project. Simple = Bernoulli and Beta variables, BernoulliFromBeta and logical factors. I have a couple of questions about how these things are implemented in Infer.Net messagepassing algorithms.
1. I created a treestructured model based on Two Coins example. It has Bernoulli variables A, B, C, C = A & B, Beta priors for A and B and observations for C. From the generated code I've figured that only single iteration is done (because it's a tree?). Does that mean that inference is exact?
2. I looked at the sorce code of BernoulliFromBeta and And factors and found ProbTrueAverageConditional(Bernoulli, Beta) and AAverageConditional(Bernoulli, Bernoulli) to be the most sophisticated. Although there are comments describing how these messages are computed, only the final result of integration is presented. Can you point me to any books/papers where I can find explicit derivation of these?
 Edited by Alexander Fishkov Tuesday, May 1, 2012 7:00 AM
Tuesday, May 1, 2012 6:59 AM
Answers

1. Yes
2. With a few exceptions, each operator called from the generated code is a calculation that constructs an outgoing message from a factor to a variable; it takes all the messages coming into the factor and calculates the outgoing message. This local calculation will differ depending on the algorithm.
The two cases that you highlight are EP operators (shown by the 'AverageConditional' suffix). ProbTrueAverageConditional on the BernoulliFromBetaOp class is the calculation of the message from the Bernoulli factor to the 'ProbTrue' variable; it takes the incoming Beta message from the 'ProbTrue' variable, and the incoming Bernoulli message from the 'Sample' variable. AAverageConditional on the BooleanAndOp class is the calculation of the message from the 'And' factor to the A variable (the fact that your variable is called A is a coincidence  A just refers to the first variable attached to this factor).
In general, an EP operation (a) multiplies the factor with all incoming messages, (b) Integrates out all variables except the one we are outputting a message to, (c) projects the resulting distribution to the marginal distribution family of that variable, and (d) divides by the incoming message.
For me, the best reference for a general understanding of the operations needed in EP and VMP is Tom Minka's Divergence Measures and Message Passing report (http://research.microsoft.com/enus/um/people/minka/papers/messagepassing/).
John
 Edited by John GuiverMicrosoft employee, Owner Thursday, May 3, 2012 10:51 AM
 Marked as answer by Alexander Fishkov Friday, May 4, 2012 9:45 PM
Thursday, May 3, 2012 10:50 AMOwner 
1. If sample takes on values 0 and 1, then factor(sample,probTrue) = probTrue^sample * (1probTrue)^(1sample). The is the usual formula for a Bernoulli distribution.
2. Since And is a deterministic factor, its formula involves a Dirac delta function: factor(A,B,And) = delta(And  (A & B)). That is, the factor value is 1 when (And == A&B) and 0 otherwise.
 Marked as answer by Alexander Fishkov Friday, May 4, 2012 9:45 PM
Friday, May 4, 2012 5:01 PMOwner
All replies

1. Yes
2. With a few exceptions, each operator called from the generated code is a calculation that constructs an outgoing message from a factor to a variable; it takes all the messages coming into the factor and calculates the outgoing message. This local calculation will differ depending on the algorithm.
The two cases that you highlight are EP operators (shown by the 'AverageConditional' suffix). ProbTrueAverageConditional on the BernoulliFromBetaOp class is the calculation of the message from the Bernoulli factor to the 'ProbTrue' variable; it takes the incoming Beta message from the 'ProbTrue' variable, and the incoming Bernoulli message from the 'Sample' variable. AAverageConditional on the BooleanAndOp class is the calculation of the message from the 'And' factor to the A variable (the fact that your variable is called A is a coincidence  A just refers to the first variable attached to this factor).
In general, an EP operation (a) multiplies the factor with all incoming messages, (b) Integrates out all variables except the one we are outputting a message to, (c) projects the resulting distribution to the marginal distribution family of that variable, and (d) divides by the incoming message.
For me, the best reference for a general understanding of the operations needed in EP and VMP is Tom Minka's Divergence Measures and Message Passing report (http://research.microsoft.com/enus/um/people/minka/papers/messagepassing/).
John
 Edited by John GuiverMicrosoft employee, Owner Thursday, May 3, 2012 10:51 AM
 Marked as answer by Alexander Fishkov Friday, May 4, 2012 9:45 PM
Thursday, May 3, 2012 10:50 AMOwner 
Thanks for the answer! Let me formulate my question more precisely: how exactly do the functions of the aforementioned factors look like?
 BernoulliFromBeta. Comment before ProbTrueAverageConditional says: "proj[p(probTrue) sum_(sample) p(sample) factor(sample,probTrue)]/p(probTrue)". What is the exact formula for factor(sample,probTrue) ?
 BooleanAndOp. It uses socalled 'gating' function deeper in implementation. How is this related to integrating (summing) a product of factors and messages? Is it possible to express factor function explicitly? like factor(A, B, And)?
Maybe it's some simple math but I can't figure this out. I just want to fully understand the derivation.
Thursday, May 3, 2012 12:20 PM 
1. If sample takes on values 0 and 1, then factor(sample,probTrue) = probTrue^sample * (1probTrue)^(1sample). The is the usual formula for a Bernoulli distribution.
2. Since And is a deterministic factor, its formula involves a Dirac delta function: factor(A,B,And) = delta(And  (A & B)). That is, the factor value is 1 when (And == A&B) and 0 otherwise.
 Marked as answer by Alexander Fishkov Friday, May 4, 2012 9:45 PM
Friday, May 4, 2012 5:01 PMOwner 
Thanks! I came to nearly same formulas after representing the model as Bayes Net. After reading the paper suggested by John, I realised that source of my confusion was projection operation.
Friday, May 4, 2012 9:38 PM