Error computing COMPoisson distribution for inference in Poisson mixture model
Question

I'm trying to make some simple predictions using a mixture of Poisson distributions with Gamma priors on the means. However, when I set the priors to the posterior that I learned from the model, I get a very strange error when trying to do inference in the model.
I've isolated some code that does this, as the following:
// Infer.NET 2.x namespaces (assumed; the original post omitted its using directives)
using System;
using System.Linq;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;
using MicrosoftResearch.Infer.Utils;

class PoissonMixtureTest
{
    static void Main()
    {
        int components = 3;
        Range k = new Range(components).Named("k");

        // Gamma priors on the Poisson mean of each mixture component
        VariableArray<Gamma> meanPriors = Variable.Array<Gamma>(k).Named("meanPriors");
        VariableArray<double> means = Variable.Array<double>(k).Named("means");
        means[k] = Variable<double>.Random(meanPriors[k]);

        // Symmetric Dirichlet prior on the mixture weights
        double[] priorWeights = Enumerable.Repeat<double>(1, components).ToArray();
        Variable<Vector> weights = Variable.Dirichlet(k, priorWeights).Named("weights");

        Variable<int> num = Variable.New<int>().Named("num");
        Range n = new Range(num).Named("n");
        VariableArray<int> z = Variable.Array<int>(n).Named("z");
        VariableArray<int> numSessions = Variable.Array<int>(n).Named("numSessions");

        using (Variable.ForEach(n))
        {
            z[n] = Variable.Discrete(weights);
            using (Variable.Switch(z[n]))
            {
                numSessions[n] = Variable.Poisson(means[z[n]]);
            }
        }

        // Set observed priors
        // (1) Uninformative priors originally used for training: inference works
        meanPriors.ObservedValue = Util.ArrayInit(components, t => new Gamma(100, 0.01));
        // (2) Posteriors learned from training: this assignment triggers the error
        meanPriors.ObservedValue = new[]
        {
            new Gamma(5.363e+04, 0.001202),
            new Gamma(5.583e+04, 0.0002394),
            new Gamma(8.217e+04, 2.045e-05)
        };

        // Do some inference
        num.ObservedValue = 1;
        Variable.ConstrainTrue(numSessions[n] >= 5);
        var engine = new InferenceEngine();

        // Print out weight vector and count
        Console.WriteLine(engine.Infer(z));
        Console.WriteLine(engine.Infer(numSessions));
    }
}
When meanPriors is set to the uninformative prior that was originally used to train the model (the first assignment), inference works. However, if I set it to the distributions that I learned from training the model (the second assignment), the following error appears:
This happens when Poisson.FromMeanAndMeanLogFactorial(38.91, 115.79) is called (and also with other combinations of values under other priors).
Is there something wrong with this model, that the Gamma parameters are too extreme for the priors? Is there a way to set this up to work around this error?
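To put numbers on how concentrated the learned priors are, here is a minimal sketch in plain C# (no Infer.NET dependency). It assumes the Gamma(a, b) arguments are shape and scale, so that the mean is a*b and the standard deviation is sqrt(a)*b, and it reads the third scale as 2.045e-05, which is also an assumption. The class and method names are made up for illustration. This only summarizes the priors; it does not diagnose the error.

```csharp
using System;

static class GammaPriorSummary
{
    // Mean and standard deviation of a Gamma distribution given (shape, scale).
    public static (double Mean, double StdDev) Moments(double shape, double scale)
        => (shape * scale, Math.Sqrt(shape) * scale);

    static void Main()
    {
        // (shape, scale) pairs from the question; the last scale is assumed to be 2.045e-05.
        var priors = new[]
        {
            (Shape: 100.0,     Scale: 0.01),       // prior originally used for training
            (Shape: 5.363e+04, Scale: 0.001202),   // learned component 1
            (Shape: 5.583e+04, Scale: 0.0002394),  // learned component 2
            (Shape: 8.217e+04, Scale: 2.045e-05),  // learned component 3 (assumed exponent)
        };

        foreach (var p in priors)
        {
            var (mean, sd) = Moments(p.Shape, p.Scale);
            // sd / mean equals 1 / sqrt(shape), independent of the scale
            Console.WriteLine(
                $"Gamma({p.Shape}, {p.Scale}): mean = {mean:G4}, sd = {sd:G3}, sd/mean = {sd / mean:G3}");
        }
    }
}
```

Since sd/mean is 1/sqrt(shape), the learned priors have a relative spread of roughly 0.3 to 0.4 percent, against 10 percent for Gamma(100, 0.01).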
Thanks in advance.
Thursday, December 11, 2014 5:42 AM
Answers

This is a bug. It will only work if you change the Switch block to read:
using (Variable.Switch(z[n]))
{
    numSessions[n] = Variable.Poisson(means[z[n]]);
    Variable.ConstrainTrue(numSessions[n] >= 5);
    numSessions.AddAttribute(new DoNotInfer());
}
and you do not infer numSessions. Note that even without the bug you would not get good results on this kind of model, since it is using a COMPoisson approximation while the posterior over numSessions is actually a truncated Poisson.
Marked as answer by Andrew Mao Sunday, December 14, 2014 7:00 AM
Thursday, December 11, 2014 6:07 PM Owner
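For reference, the truncated Poisson mentioned above can be written out for a single component with a known rate. The sketch below is plain C# outside Infer.NET; the class name, method names, and the rate 1.68 are hypothetical. It uses the identity E[X | X >= c] = lambda * P(X >= c-1) / P(X >= c), which follows from k * lambda^k / k! = lambda * lambda^(k-1) / (k-1)!.

```csharp
using System;

static class TruncatedPoisson
{
    // P(X = k) for X ~ Poisson(lambda), accumulated in log space.
    public static double Pmf(int k, double lambda)
    {
        double logP = -lambda;
        for (int i = 1; i <= k; i++) logP += Math.Log(lambda / i);
        return Math.Exp(logP);
    }

    // P(X >= c) = 1 - sum_{k < c} P(X = k). Fine for small c such as 5;
    // it loses precision when the tail probability is extremely small.
    public static double Tail(int c, double lambda)
    {
        double below = 0.0;
        for (int k = 0; k < c; k++) below += Pmf(k, lambda);
        return 1.0 - below;
    }

    // P(X = k | X >= c): the Poisson pmf renormalized over k >= c.
    public static double PmfGivenAtLeast(int k, int c, double lambda)
        => k < c ? 0.0 : Pmf(k, lambda) / Tail(c, lambda);

    // E[X | X >= c] = lambda * P(X >= c - 1) / P(X >= c).
    public static double MeanGivenAtLeast(int c, double lambda)
        => lambda * Tail(c - 1, lambda) / Tail(c, lambda);

    static void Main()
    {
        double lambda = 1.68; // hypothetical rate, chosen only for illustration
        int c = 5;            // lower bound from ConstrainTrue(numSessions[n] >= 5)
        Console.WriteLine($"P(X >= {c}) = {Tail(c, lambda):G4}");
        Console.WriteLine($"E[X] = {lambda}, E[X | X >= {c}] = {MeanGivenAtLeast(c, lambda):G4}");
    }
}
```

When the rate is well below the cutoff, the conditional distribution has all of its mass at or above 5, so its mean is far from the untruncated mean.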
All replies

Hi Tom,
Thanks for the reply. I understand that the model is simplistic and could be improved; here I was just trying to make some simple predictions from the data.
I'm slightly confused about your response. I tried your code and it does indeed allow for predictions to be made for just the z discrete distribution. However, what if I wanted to make predictions about the conditional distribution of numSessions given that it is above some value, like in the Truncated Gaussian example? Is there some workaround that I could use to do this, or do I need to use a different model altogether?
It also seems that the posterior over numSessions would be a mixture of truncated Poisson distributions (or a truncated mixture of ...?). Suppose we just wanted the mean of this distribution: is the "right" way to do that to infer an appropriate approximating distribution and compute its mean, or is there a more accurate way to get the moment directly?
 Edited by Andrew Mao Thursday, December 11, 2014 8:15 PM
Thursday, December 11, 2014 8:13 PM 
The right way to do it is to infer an appropriate approximating distribution. In this case, a truncated Poisson distribution (which does not yet exist in Infer.NET).
Friday, December 12, 2014 6:13 PM Owner
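As a rough cross-check outside Infer.NET, the conditional mean of the mixture can be computed directly if the component rates and weights are treated as fixed numbers. The sketch below is a plug-in calculation, not the inference the reply describes: it ignores the uncertainty in the means, uses equal weights (the mean of the Dirichlet(1, 1, 1) prior), and takes the rates to be the prior means shape*scale from the question, with the third scale read as 2.045e-05. All of these are assumptions, and the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class TruncatedPoissonMixture
{
    // P(X = k) for X ~ Poisson(lambda), accumulated in log space.
    static double Pmf(int k, double lambda)
    {
        double logP = -lambda;
        for (int i = 1; i <= k; i++) logP += Math.Log(lambda / i);
        return Math.Exp(logP);
    }

    // P(X >= c) for X ~ Poisson(lambda); adequate for small c.
    static double Tail(int c, double lambda)
    {
        double below = 0.0;
        for (int k = 0; k < c; k++) below += Pmf(k, lambda);
        return 1.0 - below;
    }

    // p(z = j | X >= c) is proportional to w_j * P(X >= c | lambda_j).
    public static double[] ComponentPosterior(double[] weights, double[] lambdas, int c)
    {
        double[] unnormalized = weights.Zip(lambdas, (w, l) => w * Tail(c, l)).ToArray();
        double total = unnormalized.Sum();
        return unnormalized.Select(u => u / total).ToArray();
    }

    // E[X | X >= c] = sum_j w_j * lambda_j * P(X >= c-1 | lambda_j)
    //               / sum_j w_j * P(X >= c | lambda_j).
    public static double MeanGivenAtLeast(double[] weights, double[] lambdas, int c)
    {
        double numerator = weights.Zip(lambdas, (w, l) => w * l * Tail(c - 1, l)).Sum();
        double denominator = weights.Zip(lambdas, (w, l) => w * Tail(c, l)).Sum();
        return numerator / denominator;
    }

    static void Main()
    {
        double[] weights = { 1.0 / 3, 1.0 / 3, 1.0 / 3 }; // assumed: Dirichlet(1, 1, 1) prior mean
        double[] lambdas = { 64.46, 13.37, 1.68 };        // assumed: prior means, shape * scale
        int c = 5;                                        // numSessions >= 5

        Console.WriteLine("p(z | X >= 5) = " +
            string.Join(", ", ComponentPosterior(weights, lambdas, c).Select(p => p.ToString("G4"))));
        Console.WriteLine($"E[X | X >= 5] = {MeanGivenAtLeast(weights, lambdas, c):G5}");
    }
}
```

Conditioning on the event first reweights the components and then truncates each one, so the result is a mixture of truncated Poissons whose weights differ from the prior weights.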