Details of Conjugate prior for Gaussian
Question

I'm new to Infer.Net, but I have some experience with probabilistic modelling, including conjugate priors and variational Bayes.
I'm trying to understand the constraints in Infer.Net which seem to require conjugate priors in places. For this email, let us use the good old Gaussian as an example.
The conjugate prior for a Gaussian, with unknown mean and precision, is Gaussian-Gamma. This makes the mean dependent on the precision (in the prior). This gives a closed-form solution to the posterior (again Gaussian-Gamma) and a closed-form solution for the predictive distribution (Student's t). (This is all explained e.g. in Bishop's PRML.)
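For concreteness, the closed-form update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `NormalGamma`, `posterior` and `predictive_student_t` are made up for this sketch (they are not Infer.Net classes), the Gamma is parameterised by shape `a` and rate `b`, and the prior is mean | precision ~ N(mu, 1/(kappa * precision)), precision ~ Gamma(a, b).

```python
from dataclasses import dataclass


@dataclass
class NormalGamma:
    """Hypothetical container for Gaussian-Gamma parameters (not an Infer.Net type)."""
    mu: float     # prior mean of the Gaussian mean
    kappa: float  # scales the precision of the mean: mean | tau ~ N(mu, 1/(kappa*tau))
    a: float      # Gamma shape for the precision tau
    b: float      # Gamma rate for the precision tau


def posterior(prior, xs):
    """Exact conjugate update of a Gaussian-Gamma prior given observations xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations from the sample mean
    kappa_n = prior.kappa + n
    mu_n = (prior.kappa * prior.mu + n * xbar) / kappa_n
    a_n = prior.a + n / 2
    b_n = (prior.b + 0.5 * ss
           + prior.kappa * n * (xbar - prior.mu) ** 2 / (2 * kappa_n))
    return NormalGamma(mu_n, kappa_n, a_n, b_n)


def predictive_student_t(p):
    """Student's t predictive as (degrees of freedom, location, squared scale)."""
    return (2 * p.a, p.mu, p.b * (p.kappa + 1) / (p.a * p.kappa))
```

Because the posterior is again Gaussian-Gamma, updating on the data in one batch or one observation at a time should give the same parameters.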
In the examples in Infer.Net however, the mean and precision are given independent Gaussian and Gamma priors. Each of these priors would be conjugate if the other parameter had been fixed. But when both mean and precision are unknown, the independent prior is not conjugate.
I do understand that VB and EP can handle models for which no closed-form solutions exist. For example, if we have a hierarchical model with three (continuous) unknowns, A -> B -> C, then even if P(A) is a conjugate prior for P(B|A) and P(B|A) is a conjugate prior for P(C|B), then we often have that the marginal P(B) is no longer conjugate to P(C|B). For example let P(C|B) be Gaussian, with mean B and fixed precision. Let P(B|A) be Gaussian with fixed mean and precision A and let A be Gamma. We have conjugacy associated with each step (each ->) in the hierarchy, but the marginal P(B) is Student's t and not conjugate to the Gaussian P(C|B), so that the marginal P(C) does not have a closed form. I know that VB can be used to provide approximate solutions for P(C) in this kind of model. (If we introduce an additional arc A -> C, so that A is the precision for C, then P(A,B) is Gaussian-Gamma and conjugate to the Gaussian P(C|B,A) and then P(C) and P(A,B|C) do have the closed forms mentioned above.)
So my question is: Although VB and EP can be used for models where the whole model does not have a closed-form solution, there are nevertheless local constraints on conjugacy. Can someone please help to explain more clearly what these local constraints on conjugacy are, especially for distributions with multiple unknown parameters.
Thanks
Sunday, April 7, 2013 7:20 AM
All replies

The VMP constraint is that the approximate marginals need to be conjugate with respect to the attached factors, when all other arguments are known. In your example, B's true marginal is Student T but this is irrelevant since the approximate marginal is Gaussian. To make your example more precise wrt factor graphs, let the unnormalized joint distribution be f(A,B)*g(B,C). If we are taking q(B) to be Gaussian, then this must be conjugate to f(A,B) (with A fixed) and also conjugate to g(B,C) (with C fixed). This constraint is explained in detail in the VMP papers. EP has no such constraint, but conjugacy makes it more efficient so Infer.NET tends to only provide implementations for the conjugate case.
Edited by Tom Minka (Microsoft employee, Owner), Sunday, April 7, 2013 8:59 AM
Sunday, April 7, 2013 8:56 AM, Owner
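To illustrate the constraint described in the reply above, here is a rough Python sketch of the factorised (mean-field) updates for a univariate Gaussian with independent Gaussian and Gamma priors. It is not Infer.NET code and not its implementation. The function name `vb_gaussian`, its parameters and the default prior values are assumptions made for this sketch, and the Gamma uses a shape/rate parameterisation. The point is that q(mean) stays Gaussian and q(precision) stays Gamma, because each is conjugate to the likelihood when the other argument is replaced by its expectations.

```python
def vb_gaussian(xs, m0=0.0, p0=1e-2, a0=1e-2, b0=1e-2, iters=50):
    """Mean-field VB for x_i ~ N(mean, 1/tau) with independent priors
    mean ~ N(m0, 1/p0) and tau ~ Gamma(shape=a0, rate=b0).

    Returns ((m, p), (a, b)): q(mean) = N(m, 1/p), q(tau) = Gamma(a, rate=b).
    """
    n = len(xs)
    sx = sum(xs)
    a = a0 + n / 2       # the shape update does not depend on q(mean)
    e_tau = a0 / b0      # initial guess for E[tau], taken from the prior
    for _ in range(iters):
        # Update q(mean): Gaussian prior times Gaussian likelihood with tau -> E[tau].
        p = p0 + n * e_tau
        m = (p0 * m0 + e_tau * sx) / p
        # Update q(tau): Gamma prior times likelihood with E[(x_i - mean)^2] under q(mean).
        expected_sq = sum((x - m) ** 2 for x in xs) + n / p
        b = b0 + 0.5 * expected_sq
        e_tau = a / b
    return (m, p), (a, b)
```

The result is a product of a Gaussian and a Gamma, so any posterior dependence between the mean and the precision is lost. That is the difference from the exact joint Gaussian-Gamma posterior.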
Thanks for the prompt reply (mine is an order of magnitude slower)! By looking at John Winn's 2004 JMLR VMP paper, which has a univariate Gaussian example, I now understand this better.
So if one wanted an exact solution in this simple case with Infer.Net, I guess one could define a new distribution class (as an extension to Infer.Net), namely the GaussianGamma distribution. And I guess that would then force the inferred posterior distribution to be a joint distribution for the mean and precision.
Wednesday, April 10, 2013 1:21 PM