A question about gaussian mixture models (Migrated from community.research.microsoft.com)

  • Question

  • euclid posted on 10-07-2009 6:37 AM

    Hi All,

    I am completely new to both Infer.NET and Gaussian Mixture Models (GMMs). Infer.NET looks extensive; thanks for the effort on it. I am going to use it in a novel application.

    At a glance, I thought that a GMM for a set of (random) observations could be obtained within Infer.NET. I obtained these random observations from a system, but when I started going through the API, I felt quite lost.

    Is there an easy starting point to obtain a GMM for random observations with unknown distribution?

    Thanks!

     

    Friday, June 3, 2011 5:19 PM

Answers

  • euclid replied on 10-20-2009 10:26 AM

    Hi John,

    Thanks for the final part; it really helped. The thing about the observed data is that it may or may not follow a particular distribution; in either case, the GMM will not be aware of that distribution. So the question probably comes down to the distribution of the means as defined.

    Yes, it's true that one component often predominates. I ran the model multiple times and made a statistical comparison of the observed data vs. samples from the final GMM, to see the effect of the final weight posterior:

    Dirichlet(2.94 3.003 2.906 37.76 6.297 3.092)     Significance 0.03
    Dirichlet(21.89 11.48 8.917 5.183 3.586 4.941)    Significance 0.113
    Dirichlet(9.025 8.973 9.429 9.197 9.601 9.776)    Significance 0.8
    Dirichlet(2.495 2.475 2.469 2.464 2.466 43.63)    Significance 0.195
    Dirichlet(9.451 9.58 8.918 9.858 9.06 9.134)      Significance 0.4
    Dirichlet(4.28 9.327 6.86 6.677 5.09 23.77)       Significance 0.45

    When these significance values are greater than 0.05, there is no significant difference between the distribution of the observations and that of the samples.
    The more uniform the weight values (as in the third one), the more successful the GMM was in matching the distribution of the observations.
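    For illustration (the original post does not name the statistical test used), one common choice for an observed-vs-sampled comparison is the two-sample Kolmogorov-Smirnov test. A minimal, library-free sketch of its statistic (the p-value step is omitted):

    ```python
    import bisect

    def ks_statistic(xs, ys):
        """Two-sample KS statistic: largest gap between the empirical CDFs."""
        xs, ys = sorted(xs), sorted(ys)
        d = 0.0
        for p in xs + ys:  # the ECDFs only change at sample points
            fx = bisect.bisect_right(xs, p) / len(xs)
            fy = bisect.bisect_right(ys, p) / len(ys)
            d = max(d, abs(fx - fy))
        return d
    ```

    A small statistic means the two empirical distributions are close; the significance values quoted above would come from the corresponding p-value.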
    All in all, this should be it now. Thanks for all the support. 
    Can

     


All replies

  • jwinn replied on 10-07-2009 10:18 AM

    If you work through the tutorials, then you will get an introduction to the Infer.NET API and as a bonus, tutorial number 6 is a multivariate Gaussian Mixture Model.

    If you want to skip straight there, the link is:
    http://research.microsoft.com/en-us/um/cambridge/projects/infernet/docs/Mixture%20of%20Gaussians%20tutorial.aspx

    I would recommend working through the tutorials as these should provide a gentle introduction to the API and Infer.NET concepts.  If the tutorials are confusing in places please let us know where and we will try to improve them.

    Best,
    John W.

     

  • euclid replied on 10-09-2009 9:32 AM

    Hi John,

    Thank you for your quick response. I have gone through the tutorials (particularly the 3rd, the 6th, and Branching). I have a feeling that what I want to do can be done very practically with Infer.NET.

    The story is this: assume I have a function f = (2x^3) + t, where t takes random values with an unknown distribution. Therefore, for a given x value, the function can take different values. What I would like to do is replace t with a GMM, and then store/retrieve/use this GMM to generate new realisations of t when required.
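    As a toy sketch of that setup (in Python, not Infer.NET; here t is simply mocked as standard normal, mirroring what the C# code below does):

    ```python
    import random

    rng = random.Random(0)

    def f(x):
        # f = 2x^3 + t, where t stands in for the unknown noise distribution
        # (mocked here as standard normal)
        t = rng.gauss(0, 1)
        return 2 * x ** 3 + t

    # For a fixed x, repeated calls give different values because of t:
    samples = [f(1.0) for _ in range(5)]
    ```

    The goal in the thread is to learn a GMM for t from such observations, so that new realisations can be drawn from the fitted mixture instead of the original system.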

    Although I receive an error at the last bit, I have come this far:

    private void getGMM_Click(object sender, EventArgs e)
    {
        // RETRIEVE OBSERVATIONS
        double[] observedSet = new double[50];   // I can have a max of 50 observations of z
        for (int i = 0; i < observedSet.Length; i++)
            observedSet[i] = Rand.Normal(0, 1);  // mock for z: more or less random realisations

        Range dataRange = new Range(observedSet.Length);  // data range
        VariableArray<double> data = Variable.Array<double>(dataRange);
        data.ObservedValue = observedSet;  // pass as observation

        // DESIGN MIXTURE MODEL
        Range noMixComp = new Range(6);  // number of mixture components (how many core distributions will form the mixed model)

        // Create an array for the means of our mixture components and give them priors.
        // An array of double variables of size noMixComp for the means:
        VariableArray<double> means = Variable.Array<double>(noMixComp).Named("means");
        // Assign a univariate Gaussian prior to each element of the array:
        means[noMixComp] = Variable.GaussianFromMeanAndVariance(0.5, 0.08).ForEach(noMixComp);  // mock mean/var

        // An array of double precisions of size noMixComp:
        VariableArray<double> precs = Variable.Array<double>(noMixComp).Named("precs");
        precs[noMixComp] = Variable.GaussianFromMeanAndPrecision(0, 1).ForEach(noMixComp);  // mock mean/prec

        // TO DO: constrain the sum of the weights of the individual models to <= 1

        // Mixture weights with a Dirichlet prior (the size of the weight vector is given by the range noMixComp):
        Variable<Vector> weights = Variable.Dirichlet(noMixComp, new double[] { 1, 1 }).Named("weights");

        // MIXING TOGETHER
        // Create a range over the data:
        Range n = new Range(data.Range.SizeAsInt);  // or use dataRange

        // For each data point, we need a variable indicating which mixture component it came from:
        VariableArray<int> z = Variable.Array<int>(dataRange);

        // CONSTRUCT THE MIXTURE MODEL
        using (Variable.ForEach(dataRange))
        {
            // Loop over z and give each element of z a prior given by the weights
            z[dataRange] = Variable.Discrete(weights);
            using (Variable.Switch(z[dataRange]))
            {
                // Each data point is drawn from the Gaussian whose mean/prec is selected by z
                data[dataRange] = Variable.GaussianFromMeanAndPrecision(means[z[dataRange]], precs[z[dataRange]]);
            }
        }

        // TO DO: initialisation of z, constraining

        // The inference
        InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
        //Console.WriteLine("Dist over pi=" + engine.Infer(weights));
        Console.WriteLine("mean=\n" + engine.Infer(means));
        Console.WriteLine("precs=\n" + engine.Infer(precs));
        //engine.ShowFactorGraph = true;
    }

    What do you think is the best way to proceed from here? When I run it, it gives the error "Attempt to redefine array lengths at depth 0." at the line engine.Infer(means). I will also constrain the weights so that, in the final model, the weights of the different Gaussians sum to 1.

    All the best,

    Can

     

     

  • John Guiver replied on 10-18-2009 8:46 AM

    Hi Can

    There are two problems in your code.

    First, the line 

         Variable<Vector> weights = Variable.Dirichlet(noMixComp, new double[] { 1, 1 }).Named("weights");

    should be replaced by

         Variable<Vector> weights = Variable.DirichletUniform(noMixComp).Named("weights");

    The 'new double[] { 1, 1 }' was giving two pseudo-counts when you need six; that is, you were attempting to redefine array lengths.

    Secondly, you need the precisions to be drawn from a Gamma distribution. For example:

        precs[noMixComp] = Variable.GammaFromShapeAndRate(1, 1).ForEach(noMixComp);

    Also, as commented in your code, you will need to provide initialisation to break symmetry.

    John G.

  • euclid replied on 10-19-2009 9:37 AM

    Thanks John,

    I made the corrections you suggested. Since Dirichlet has a sum constraint, I didn't need to constrain the weights anymore. This was very useful.

    I had a look at the weights by outputting engine.Infer(z) in the code; it appears that the weight for one mixture component is quite large relative to the rest (e.g. 0.01116  0.01116  0.01118  0.01116  0.9442  0.01117), even though I used DirichletUniform. Is this a problem, or general behaviour?

    The rest below may sound a bit naive, but I couldn't figure out how to use the results:

    I wonder how I will know which of these weights actually fits my data best, and whether the inferred weights are optimised in accordance with the inferred mixture components?

    Given that the model works, I'll have to draw random realisations from the final mixture model. I did this by splitting the means/precs of each component into a string[] by stripping out the \n\t, then used the resulting numbers to return random samples from the mixture components (the weights were an issue, as I asked above).

    Thanks,

    Can

  • John Guiver replied on 10-20-2009 4:17 AM

    Hi Can

    It's difficult to interpret your weights without knowing what your data was. If your data matches the model - i.e. is generated from the same mixture structure that you are trying to infer, then the z posteriors will reflect which mixture component each datum belongs to. The more data you have, the more pronounced this will be. If your model does not match the data - for example if your data comes from a single Gaussian, and you have several mixture components - then the picture is less clear. For example, when I run your model with 50 random data points generated from Norm(0,1), I get the following - the fifth component predominates for all data points, but all the mixture components contribute.

    Dist over pi=Dirichlet(8.217 6.847 10.25 6.113 18.12 6.457)
    Dist over z=
    [0] Discrete(0.1539 0.1237 0.2026 0.1076 0.2971 0.1151)
    [1] Discrete(0.1377 0.1219 0.1458 0.1118 0.366 0.1167)
    [2] Discrete(0.1553 0.1275 0.1966 0.1123 0.2889 0.1194)
    [3] Discrete(0.1273 0.1015 0.1624 0.08796 0.4266 0.0943)
    [4] Discrete(0.1443 0.1252 0.1609 0.1135 0.337 0.1191)
    ...
    mean=
    [0] Gaussian(0.3594, 0.04957)
    [1] Gaussian(0.3844, 0.05376)
    [2] Gaussian(0.3238, 0.0436)
    [3] Gaussian(0.3984, 0.05628)
    [4] Gaussian(0.1878, 0.04042)
    [5] Gaussian(0.3918, 0.05508)
    precs=
    [0] Gamma(4.653, 0.2262)
    [1] Gamma(3.969, 0.259)
    [2] Gamma(5.641, 0.1999)
    [3] Gamma(3.598, 0.2821)
    [4] Gamma(9.366, 0.07861)
    [5] Gamma(3.772, 0.2706)

    To use the model, you first need to recover the posteriors as the correct types:

    Dirichlet weightPosteriors = engine.Infer<Dirichlet>(weights);
    Gaussian[] meanPosteriors = engine.Infer<Gaussian[]>(means);
    Gamma[] precPosteriors = engine.Infer<Gamma[]>(precs);

    Then you can sample from these to give you samples from the overall model as follows:

    double sampleFromModel(Dirichlet weightPosteriors, Gaussian[] meanPosteriors, Gamma[] precPosteriors)
    {
        // Sample from the Dirichlet
        Vector v = weightPosteriors.Sample();
        // Sample from the Discrete
        int component = (new Discrete(v)).Sample();
        // Sample from the mean posterior
        double mean = meanPosteriors[component].Sample();
        // Sample from the precision posterior
        double prec = precPosteriors[component].Sample();
        // Sample from the component model
        return (Gaussian.FromMeanAndPrecision(mean, prec)).Sample();
    }

    Alternatively you can just use the means of the posteriors - i.e. v, mean, and prec in the above code are just the means of the corresponding posteriors rather than samples from the posterior distributions.
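    For illustration, a Python sketch of that posterior-mean plug-in (not Infer.NET; the numbers are copied from the inference output above, and reading the printed Gamma parameters as shape and scale, so that the mean precision is shape * scale, is an assumption about the print format):

    ```python
    import math
    import random

    rng = random.Random(1)

    # Posterior parameters copied from the inference output above (6 components).
    alpha = [8.217, 6.847, 10.25, 6.113, 18.12, 6.457]        # Dirichlet pseudo-counts
    mu = [0.3594, 0.3844, 0.3238, 0.3984, 0.1878, 0.3918]     # Gaussian posterior means
    gam = [(4.653, 0.2262), (3.969, 0.259), (5.641, 0.1999),  # (shape, scale), assumed
           (3.598, 0.2821), (9.366, 0.07861), (3.772, 0.2706)]

    weights = [a / sum(alpha) for a in alpha]  # mean of the Dirichlet posterior

    def sample_from_model():
        # Plug the posterior means in, instead of sampling each posterior
        k = rng.choices(range(6), weights=weights)[0]
        prec = gam[k][0] * gam[k][1]  # mean of the Gamma, under the (shape, scale) reading
        return rng.gauss(mu[k], 1.0 / math.sqrt(prec))

    samples = [sample_from_model() for _ in range(1000)]
    ```

    Sampling the posteriors themselves, as in the C# code above, instead propagates the posterior uncertainty into the generated samples.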

    John G.
