# Gaussian mixture model (univariate+EP)

• ### Question

• Hi

I want to implement a univariate Gaussian mixture model. I just used the code from infer.net to see if it works using Expectation Propagation. I change the code to get unvariate Gaussian. But I get a strange error "Parameter is not valid". Can you please help me in understanding what is wrong here?

Actually I want to learn a weight matrix, and I want to prevent overfitting and get a sparse number of non-zero elements. I need to put a mixture of two diestributions for each w element (one with mean zero and the other non-zero mean - my goal is to implement Gibbs variable selection in paper "A Review of Bayesian Variable Selection
Methods: What, How and Which : By R.B. O'Hara and M. J. Sillanpaa, 2009"). I have no idea if it is going to work for Expectation Propagation. Would you please give me some advices in this case?

Thanks a lot.

static void Main(string[] args)
{
Program p = new Program();
p.Run();
}

public void Run()
{
// Define a range for the number of mixture components
Range k = new Range(2).Named("k");

// Mixture component means
VariableArray<double> means = Variable.Array<double>(k).Named("means");
//means[k] = Variable.GaussianFromMeanAndVariance(0.0,2).ForEach(k);
means[k] = Variable.GaussianFromMeanAndVariance(0.5, 0.08).ForEach(k);

// Mixture component precisions
VariableArray<double> precs = Variable.Array<double>(k).Named("precs");
// precs[k] = Variable.GammaFromShapeAndRate(1, 2).ForEach(k);
precs[k] = Variable.GammaFromShapeAndRate(1, 1).ForEach(k);
/*Creates a PositiveDefiniteMatrix random variable with a Wishart distribution with the specified shape parameter and scale matrix.*/

// Mixture weights
Variable<Vector> weights = Variable.Dirichlet(k, new double[] { 1, 1 }).Named("weights");
/*In probability and statistics, the Dirichlet distribution (after Peter Gustav Lejeune Dirichlet), often denoted
* \operatorname{Dir}(\boldsymbol\alpha), is a family of continuous multivariate probability distributions parametrized
* by a vector \boldsymbol\alpha of positive reals. It is the multivariate generalization of the beta distribution.[1]
* Dirichlet distributions are very often used as prior distributions in Bayesian statistics, and in fact the Dirichlet
* distribution is the conjugate prior of the categorical distribution and multinomial distribution.*/

// Create a variable array which will hold the data
Range n = new Range(300).Named("n");
VariableArray<double> data = Variable.Array<double>(n).Named("x");
// Create latent indicator variable for each data point
VariableArray<int> z = Variable.Array<int>(n).Named("z");

// The mixture of Gaussians model
using (Variable.ForEach(n))
{
z[n] = Variable.Discrete(weights);
using (Variable.Switch(z[n]))
{
data[n] = Variable.GaussianFromMeanAndPrecision(means[z[n]], precs[z[n]]);
}
}
// each z[n] is now a discrete random variable : an int discrete random variable

// Attach some generated data
data.ObservedValue = GenerateData(n.SizeAsInt);

// Initialise messages randomly so as to break symmetry
Discrete[] zinit = new Discrete[n.SizeAsInt];
for (int i = 0; i < zinit.Length; i++)
zinit[i] = Discrete.PointMass(Rand.Int(k.SizeAsInt), k.SizeAsInt);
z.InitialiseTo(Distribution<int>.Array(zinit));
//Console.WriteLine("******************");
//for (int i = 0; i < 300; i++)
//{
//    Console.WriteLine("zinit[" + i + "]=" + zinit[i]);
//}
//The inference
InferenceEngine ie = new InferenceEngine(new ExpectationPropagation());
ie.ShowFactorGraph = true;
if ((ie.Algorithm is ExpectationPropagation))
{
Console.WriteLine("Dist over pi=" + ie.Infer(weights));
Console.WriteLine("Dist over means=\n" + ie.Infer(means));
Console.WriteLine("Dist over precs=\n" + ie.Infer(precs));
}
else
Console.WriteLine("This example is not supported by Expectation Propagation");
}

/// <summary>
/// Generates a data set from a particular true model.
/// </summary>
public double[] GenerateData(int nData)
{
double trueM1 = 0.5;
double trueM2 = 0.5;
double trueP1 = 0.1;
double trueP2 = 0.1;
Gaussian trueVG1 = Gaussian.FromMeanAndPrecision(trueM1, trueP1);
Gaussian trueVG2 = Gaussian.FromMeanAndPrecision(trueM2, trueP2);
double truePi = 0.6;
Bernoulli trueB = new Bernoulli(truePi);
// Restart the infer.NET random number generator
Rand.Restart(12347);
double[] data = new double[nData];
for (int j = 0; j < nData; j++)
{
bool bSamp = trueB.Sample();
data[j] = bSamp ? trueVG1.Sample() : trueVG2.Sample();
}
return data;
}

Monday, October 6, 2014 2:12 PM

• That is an issue with drawing the factor graph.  Set ShowFactorGraph = false and it works fine.
• Marked as answer by Monday, October 6, 2014 3:11 PM
Monday, October 6, 2014 3:05 PM
• When I run it, the weights are different.  Maybe you changed something else?  Also keep in mind your prior on the means is very narrow.
• Marked as answer by Monday, October 6, 2014 3:48 PM
Monday, October 6, 2014 3:35 PM

### All replies

• That is an issue with drawing the factor graph.  Set ShowFactorGraph = false and it works fine.
• Marked as answer by Monday, October 6, 2014 3:11 PM
Monday, October 6, 2014 3:05 PM
• Thank you Tom. Just one more thing: I get equal weights for mixture components. I don't understand why is it like this. I have changed the trueMeans and True precisions to different values as follows:

            double trueM1 = 10;            double trueM2 = 1;            double trueP1 = 5;            double trueP2 = 1;            Gaussian trueVG1 = Gaussian.FromMeanAndPrecision(trueM1, trueP1);            Gaussian trueVG2 = Gaussian.FromMeanAndPrecision(trueM2, trueP2);            double truePi = 0.2;

Still I get (Dirichlet distribution still has the pseudocount initially set):

Dist over pi=Dirichlet(1 1)
Dist over means=
[0] Gaussian(1.553, 0.04997)
[1] Gaussian(1.553, 0.04997)
Dist over precs=
[0] Gamma(57.19, 0.001147)[mean=0.06561]
[1] Gamma(57.19, 0.001147)[mean=0.06561]

Seems it works with VMP but still strange answers: (Is there any way to get it work using EP?)

Dist over pi=Dirichlet(212.6 89.41)
Dist over means=
[0] Gaussian(1.091, 0.003303)
[1] Gaussian(1.295, 0.07056)
Dist over precs=
[0] Gamma(106.8, 0.01284)[mean=1.372]
[1] Gamma(45.2, 0.0004183)[mean=0.01891]

• Edited by Monday, October 6, 2014 3:36 PM more complete
Monday, October 6, 2014 3:25 PM
• When I run it, the weights are different.  Maybe you changed something else?  Also keep in mind your prior on the means is very narrow.
• Marked as answer by Monday, October 6, 2014 3:48 PM
Monday, October 6, 2014 3:35 PM
• Thank you. I think there was a problem with my computer or visual studio. After clean and build it is different. Thank you so much :)
Monday, October 6, 2014 3:48 PM