Gaussian 2D mixture - comparison with VIBES (Migrated from community.research.microsoft.com)

  • Question

  • pmckeigue posted on 01-06-2009 5:25 AM

    With the Gaussian 2D mixture example given in the tutorial, the results I obtain are nowhere near those given in the user documentation:

    Dist over pi=Dirichlet(3.924 98.08)
    Dist over means=
    [0] VectorGaussian(1.878 1.109, 0.2143    -0.005538)
                                    -0.005538 0.2014   
    [1] VectorGaussian(4.316 4.006, 0.06767 0.02269)
                                    0.02269 0.01116)

    As an exercise, I've tried adapting this script to run the Gaussian 2D mixture dataset provided with VIBES (9 components, modelled with K=20).  VIBES easily infers the 9 components with correct means, but Infer.NET doesn't seem to distinguish the clusters.

    Is there any way to output the lower bound for VMP iterations, as in VIBES?

    Friday, June 3, 2011 4:42 PM

Answers

  • John Guiver replied on 01-12-2009 9:05 AM

    The Gaussian 2-D mixture in the Infer.NET Beta 2 download now matches the documentation.

     John G.

    Friday, June 3, 2011 4:42 PM

All replies

  • John Guiver replied on 01-06-2009 5:47 AM

    The discrepancy between the user documentation and the tutorial code in the release arises because the release example code does not set reasonable priors for the mixture component means and precisions. This had been fixed by the time the tutorial documentation was written, and will be fixed in the example when we next update the release. In the meantime, here is the correct Infer.NET code, which matches the documentation:

    // Define a range for the number of mixture components
    Range k = new Range(2).Named("k");

    // Mixture component means
    VariableArray<Vector> means = Variable.Array<Vector>(k).Named("means");
    means[k] = Variable.VectorGaussianFromMeanAndPrecision(
        new Vector(0.0, 0.0),
        PositiveDefiniteMatrix.IdentityScaledBy(2, 0.01)).ForEach(k);

    // Mixture component precisions
    VariableArray<PositiveDefiniteMatrix> precs = Variable.Array<PositiveDefiniteMatrix>(k).Named("precs");
    precs[k] = Variable.WishartFromShapeAndScale(
        100.0, PositiveDefiniteMatrix.IdentityScaledBy(2, 0.01)).ForEach(k);

    // Mixture weights
    Variable<Vector> weights = Variable.Dirichlet(k, new double[] { 1, 1 }).Named("weights");

    // Create a variable array which will hold the data
    Range n = new Range(300).Named("n");
    VariableArray<Vector> data = Variable.Array<Vector>(n).Named("x");

    // Create latent indicator variable for each data point
    VariableArray<int> z = Variable.Array<int>(n).Named("z");

    // The mixture of Gaussians model
    using (Variable.ForEach(n))
    {
        z[n] = Variable.Discrete(weights);
        using (Variable.Switch(z[n]))
        {
            data[n] = Variable.VectorGaussianFromMeanAndPrecision(means[z[n]], precs[z[n]]);
        }
    }

    // Attach some generated data
    data.ObservedValue = GenerateData(n.SizeAsInt);

    // Initialise messages randomly so as to break symmetry
    Discrete[] zinit = new Discrete[n.SizeAsInt];
    for (int i = 0; i < zinit.Length; i++)
        zinit[i] = Discrete.PointMass(Rand.Int(k.SizeAsInt), k.SizeAsInt);
    z.InitialiseTo(Distribution<int>.Array(zinit));

    // The inference
    InferenceEngine ie = new InferenceEngine(new VariationalMessagePassing());
    Console.WriteLine("Dist over pi=" + ie.Infer(weights));
    Console.WriteLine("Dist over means=\n" + ie.Infer(means));
    Console.WriteLine("Dist over precs=\n" + ie.Infer(precs));

    Friday, June 3, 2011 4:42 PM
  • jwinn replied on 01-06-2009 6:16 AM

    The results of applying VMP to this model depend significantly on how the messages are initialised.  VIBES uses a fixed initialisation strategy that gives reasonable results in many cases.  With Infer.NET we chose instead to have a user-specified initialisation, such as the one given in the tutorial code for zinit.  This initialisation randomly hard-assigns each data point to one of the mixture components.  There are other, less drastic possibilities, e.g. setting zinit to a slightly perturbed uniform distribution, or initialising the means instead.  To see what initialisation VIBES is using, load your network, press the Init button and look at the distributions at each node; they will show the initialised posteriors for each variable.  If you reproduce this initialisation in Infer.NET, you should get the same results as VIBES does.
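
    The "slightly perturbed uniform" option mentioned above could be sketched roughly as follows. This is only an illustration: the class PerturbedInit, the method PerturbedUniform and the noise parameter are hypothetical names, not part of Infer.NET. The helper uses plain C# to build one probability vector per data point, each close to uniform but not exactly equal, so that symmetry between components is still broken.

```csharp
using System;

// Hypothetical helper (not part of Infer.NET): builds soft initial
// assignments that are close to uniform over k components.
static class PerturbedInit
{
    // Returns `count` probability vectors of length `k`. Each entry starts
    // at 1, receives a random bump in [0, noise), and the vector is then
    // normalised so that it sums to 1.
    public static double[][] PerturbedUniform(int count, int k, double noise, Random rng)
    {
        var result = new double[count][];
        for (int i = 0; i < count; i++)
        {
            var probs = new double[k];
            double sum = 0.0;
            for (int j = 0; j < k; j++)
            {
                probs[j] = 1.0 + noise * rng.NextDouble();
                sum += probs[j];
            }
            for (int j = 0; j < k; j++)
            {
                probs[j] /= sum;
            }
            result[i] = probs;
        }
        return result;
    }
}
```

    In the tutorial code, each row could then replace the point mass, for example zinit[i] = new Discrete(rows[i]), assuming the Discrete constructor that accepts an array of probabilities. A small noise value such as 0.1 keeps the start nearly uniform; larger values move it towards the hard assignment used in the tutorial.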

    To find the lower bound in Infer.NET, you need to compute the model evidence as described in this page:
    http://research.microsoft.com/en-us/um/cambridge/projects/infernet/docs/Computing%20model%20evidence%20for%20model%20selection.aspx

    If you want to monitor the bound as inference progresses, then you will have to control the inference yourself using the CompiledAlgorithm as described on this page:
    http://research.microsoft.com/en-us/um/cambridge/projects/infernet/docs/Controlling%20how%20inference%20is%20performed.aspx
    More precisely, you would call Reset() and Initialise() on the CompiledAlgorithm, then retrieve the marginal distribution for your evidence variable in between calls to Update().
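
    As a rough sketch of the evidence approach described above, the following wraps a model in an evidence block and prints the bound for increasing iteration counts. Several things here are assumptions: the model is a hypothetical toy (a single Gaussian with unknown mean and precision, not the mixture from this thread), the namespaces are those of the MicrosoftResearch.Infer releases (later releases use Microsoft.ML.Probabilistic), and instead of driving the CompiledAlgorithm by hand it simply raises NumberOfIterations and calls Infer again, which is simpler but may repeat work.

```csharp
using System;
using MicrosoftResearch.Infer;               // assumed namespaces for the
using MicrosoftResearch.Infer.Models;        // 2009-era release; newer ones
using MicrosoftResearch.Infer.Distributions; // use Microsoft.ML.Probabilistic

class BoundMonitor
{
    static void Main()
    {
        // Evidence variable: the whole model lives inside If(evidence).
        Variable<bool> evidence = Variable.Bernoulli(0.5).Named("evidence");
        IfBlock block = Variable.If(evidence);

        // Hypothetical toy model: Gaussian data with unknown mean and precision.
        Variable<double> mean = Variable.GaussianFromMeanAndPrecision(0.0, 0.01).Named("mean");
        Variable<double> prec = Variable.GammaFromShapeAndScale(1.0, 1.0).Named("prec");
        Range n = new Range(5).Named("n");
        VariableArray<double> x = Variable.Array<double>(n).Named("x");
        x[n] = Variable.GaussianFromMeanAndPrecision(mean, prec).ForEach(n);

        block.CloseBlock();

        // Made-up observations, for illustration only.
        x.ObservedValue = new double[] { 1.2, 0.8, 1.1, 0.9, 1.0 };

        InferenceEngine ie = new InferenceEngine(new VariationalMessagePassing());
        ie.ShowProgress = false;

        // With VMP, the log odds of the evidence variable is the lower bound
        // on the log evidence after the requested number of iterations.
        for (int iters = 1; iters <= 10; iters++)
        {
            ie.NumberOfIterations = iters;
            double logBound = ie.Infer<Bernoulli>(evidence).LogOdds;
            Console.WriteLine("iterations=" + iters + "  lower bound=" + logBound);
        }
    }
}
```

    For a large model such as the K=20 mixture, the CompiledAlgorithm route described above avoids repeating earlier iterations and is likely the better choice.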

    Best,
    John W.

    Friday, June 3, 2011 4:42 PM