Why is the posterior not truly updated by data observation?

  • Question

  • Hi, I have code for feature selection based on a spike and slab distribution on the feature weights. I use a variable called gamma to choose between the spike and the slab for each feature (I followed the clinical trial example).

    I have a variable called output, which is the class attribute (+1/-1 in the data file; in my model I changed it to true/false).

    When I run the model, all gamma posteriors are almost the same, whereas gamma should act as a selector between the spike and the slab, like in the clinical trial example. I have no idea why this happens. It seems I have missed something, but I think the factor graph is OK.

    Could anyone have a look at this, please? Any help is appreciated.

    My data file can be downloaded from this link:

    https://www.dropbox.com/s/8t1uq75xpuyrw8q/train_v2?dl=0

            static int n = 51;
        static int d;
            // Range j = y.Range.Named("person");
    
            static void Main(string[] args)
            {
                readData();
            // Rho is the parameter of the Bernoulli prior on gamma - formula 4 of the paper.
            // (sparsity can be controlled through Rho)
                double Rho = 0.003; 
                Range dRange = new Range(d).Named("dimensionRange");
    
                VariableArray<bool> output = Variable.Observed(outputData).Named("output");
                Range nRange = output.Range.Named("nRange");
                VariableArray<VariableArray<double>, double[][]> input = Variable.Array(Variable.Array<double>(dRange), nRange).Named("input");
                VariableArray<bool> gamma = Variable.Array<bool>(dRange).Named("gamma");
                gamma[dRange] = Variable.Bernoulli(Rho).ForEach(dRange);
    
                VariableArray<VariableArray<double>, double[][]> w = Variable.Array(Variable.Array<double>(dRange), nRange).Named("w");
                VariableArray<Gaussian> spike = Variable.Array<Gaussian>(dRange).Named("spike");
                //VariableArray<Gaussian> fixedSlab = Variable.Array<Gaussian>(dRange).Named("slab");
    
                Variable<Gaussian> slabMeanPrior = Gaussian.FromMeanAndPrecision(0, 3);
                VariableArray<double> slabMean = Variable.Array<double>(dRange).Named("slabMean");
                slabMean[dRange] = Variable<double>.Random(slabMeanPrior).ForEach(dRange);
                            
                using (Variable.ForEach(dRange)) spike[dRange] = Gaussian.FromMeanAndVariance(0, 0.000001);
    
                // implementing formula 3. 
                // defining w based on gamma 
                // do it for all w elements (for all data rows)
                using (Variable.ForEach(dRange))
                {
                    using (Variable.If(gamma[dRange]))
                    {
                        using (Variable.ForEach(nRange))
                        {
                            // slab is the true distribution for w
                            //variance is set to one and mean is a variable 
                            w[nRange][dRange] = Variable<double>.GaussianFromMeanAndPrecision(slabMean[dRange], 1); // Variable<double>.Random(fixedSlab[dRange]);
                        }
                    }
                    using (Variable.IfNot(gamma[dRange]))
                    {
                        using (Variable.ForEach(nRange))
                        {
                            // spike is the true distribution for w
                            // spike is just a delta function at origin and does not have any parameter to be learned from the data
                            w[nRange][dRange] = Variable<double>.Random(spike[dRange]);
                        }
                    }
                }
    
    
                // Now for all data rows, formula 1 should be applied 
                using (Variable.ForEach(nRange))
                {
                VariableArray<double> tmp = Variable.Array<double>(dRange);
                tmp[dRange] = w[nRange][dRange] * input[nRange][dRange];
                Variable<double> weightedSum = Variable.Sum(tmp).Named("WeightedSum_formula1");
                    
                    // noise 
                    double noise = 0.1;
    
                    output[nRange] = Variable.GaussianFromMeanAndVariance(weightedSum, noise) >= 0;
    
            }  // nRange
    
    
                //GenerateData(n);
                input.ObservedValue = inputData;
    
    
                InferenceEngine ie = new InferenceEngine();
                ie.ShowFactorGraph = true;
                Bernoulli[] gammaPosterior = ie.Infer<Bernoulli[]>(gamma);
                PointMass<Gaussian[]> spikePosterior = ie.Infer<PointMass<Gaussian[]>>(spike);
                Gaussian[] slabMeanPosterior = ie.Infer<Gaussian[]>(slabMean);
    
                System.IO.StreamWriter file = new System.IO.StreamWriter("result.txt");
                
    
                for (int i = 0; i < gammaPosterior.Length; i++)
                {
                    file.WriteLine("gammaPosterior" + "(" + i + ")=" + gammaPosterior[i]);
                    file.WriteLine("spikePosterior" + "(" + i + ")=" + spikePosterior.Point[i]);
                    file.WriteLine("slabMeanPosterior" + "(" + i + ")=" + slabMeanPosterior[i]);
                    file.WriteLine();
                }
    
                file.Close();
    
                Console.ReadLine();
            }
    
            public static void writeArr(Gaussian[] array, string name/*, StreamWriter resultFile*/)
            {
                // Console.WriteLine();
                for (int i = 0; i < array.Length; i++)
                {
                    Console.WriteLine(name + "(" + i + ")=" + array[i]);
                }
            }
    
            public static void writeArr(Bernoulli[] array, string name/*, StreamWriter resultFile*/)
            {
                // Console.WriteLine();
                for (int i = 0; i < array.Length; i++)
                {
                    Console.WriteLine(name + "(" + i + ")=" + array[i]);
                }
            }
    
            static double[][] inputData = new double[n][];
            static bool[] outputData = new bool[n];
    
            public static void readData()
            {
                int counter = 0;
                string line;
                char[] separators = new char[] { ',', '\t' };
    
                
    
                // Read the file and display it line by line.
                System.IO.StreamReader file =
                   new System.IO.StreamReader("train_v2");
               d = file.ReadLine().Split(separators).Length; // the first line is read only to count the columns (note: this count includes the label column)
                
                while((line = file.ReadLine()) != null)
                {
                   string[] elements = line.Split(separators);
                   // Console.WriteLine(elements.Length);
                   inputData[counter] = new double[d];
                   for (int i = 0; i < elements.Length-1; i++ ) {
                       inputData[counter][i] = Convert.ToDouble(elements[i]);
                   }
               outputData[counter] = Convert.ToInt16(elements[elements.Length - 1]) == 1;
                   counter++;
                }
    
                file.Close();
                
            }

    • Edited by Capli19 Wednesday, November 19, 2014 11:16 AM
    Thursday, October 30, 2014 3:28 PM

Answers

  • I don't see any problem in the implementation or the inference.  The issue is that there are only 51 data points and you are trying to learn 10k parameters.  Observing this data reduces the uncertainty by a tiny amount, hence the posterior is nearly equal to the prior.
    • Marked as answer by Capli19 Wednesday, November 5, 2014 2:55 PM
    Wednesday, November 5, 2014 2:30 PM
    Owner
  • Inference should work either with a fixed mean or with a learned one.
    • Marked as answer by Capli19 Wednesday, November 12, 2014 12:13 PM
    Tuesday, November 11, 2014 2:17 PM
    Owner

All replies

  • Have you checked that the inference is converging?
    Friday, October 31, 2014 1:14 PM
    Owner
  • I think it has converged, because increasing the number of iterations from 50 to 100 does not change the result at all.

    I was thinking maybe I implemented it the wrong way. One thing I read in the reference paper is that with this data I should use:

    double Rho = 0.003; // instead of 0.3 which is already there

    and the variance in the main paper is set to 1:

    w[nRange][dRange] = Variable<double>.GaussianFromMeanAndPrecision(slabMean[dRange], 1); // it is already 10 (prec 0.1)

    But even with these values, the results are not good.

    Is this the right way to implement a spike and slab prior at all? It seems that I don't get global sparsity: the gamma posterior should be high for only a few of the w elements, while in this code all gamma posteriors come out around the same low value (which amounts to overfitting, the very problem I was trying to avoid with the spike and slab prior). I don't know what the reason could be.


    • Edited by Capli19 Wednesday, November 5, 2014 11:33 AM
    Wednesday, November 5, 2014 11:01 AM
  • I don't see any problem in the implementation or the inference.  The issue is that there are only 51 data points and you are trying to learn 10k parameters.  Observing this data reduces the uncertainty by a tiny amount, hence the posterior is nearly equal to the prior.
    • Marked as answer by Capli19 Wednesday, November 5, 2014 2:55 PM
    Wednesday, November 5, 2014 2:30 PM
    Owner
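    To put a number on this, here is a minimal sketch (plain C#, just Bayes' rule on a single gamma, not Infer.NET) of why weak evidence leaves the posterior near the prior; the likelihood ratio below is made up purely for illustration:

        // Hypothetical illustration: posterior odds = prior odds * likelihood ratio.
        double rho = 0.003;                                 // prior P(gamma = true)
        double likelihoodRatio = 1.2;                       // assumed: 51 rows only weakly favour the slab
        double priorOdds = rho / (1 - rho);
        double posteriorOdds = priorOdds * likelihoodRatio;
        double posterior = posteriorOdds / (1 + posteriorOdds);
        Console.WriteLine(posterior);                       // ~0.0036 -- still almost the prior 0.003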
  • Thank you, Tom. There is R code implemented by the authors for the same model, and it behaves differently: on the same data it gives a sparse posterior. I will go through that code; maybe I will find something new about what they have done.
    Wednesday, November 5, 2014 2:54 PM
  • Hi. I needed to come back to this post again.

    I wanted to ask: when we implement this spike and slab model, should the mean of the slab be a fixed double, or should it also be learned? For example, here I have used a prior distribution for it and I infer it at the end:

                Variable<Gaussian> slabMeanPrior = Gaussian.FromMeanAndPrecision(0, 3);
                VariableArray<double> slabMean = Variable.Array<double>(dRange).Named("slabMean");
                slabMean[dRange] = Variable<double>.Random(slabMeanPrior).ForEach(dRange);

    Thank you.

    Tuesday, November 11, 2014 2:03 PM
  • Inference should work either with a fixed mean or with a learned one.
    • Marked as answer by Capli19 Wednesday, November 12, 2014 12:13 PM
    Tuesday, November 11, 2014 2:17 PM
    Owner
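    For reference, the two variants would look like this inside the gamma gate, in the notation of the question (a sketch, untested; the precision of 1 follows the paper as discussed above):

        // Option A: fixed slab mean (a plain double)
        w[nRange][dRange] = Variable.GaussianFromMeanAndPrecision(0.0, 1);

        // Option B: learned slab mean (the random slabMean variable from the question)
        w[nRange][dRange] = Variable.GaussianFromMeanAndPrecision(slabMean[dRange], 1);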
  • I found out what my problem was. I have a gamma variable array and a w jagged array.

    For each attribute I have one gamma, but I should also have exactly one w element. I had defined w as a 2D array, which is not correct.

    For each attribute I should have defined one corresponding w.

    So instead of:

    VariableArray<VariableArray<double>, double[][]> w = Variable.Array(Variable.Array<double>(dRange), nRange).Named("w");

    I should have had:

    VariableArray<double> w = Variable.Array<double>(dRange).Named("w");
    And it works this way.
    Thursday, November 13, 2014 4:31 PM
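    For readers landing here later, a minimal sketch of how the weight section might look after this change, reusing the names and parameters from the question (untested):

        // One weight per feature, shared across all data rows.
        VariableArray<double> w = Variable.Array<double>(dRange).Named("w");
        using (Variable.ForEach(dRange))
        {
            using (Variable.If(gamma[dRange]))
            {
                // slab: a broad Gaussian whose mean may itself be learned
                w[dRange] = Variable.GaussianFromMeanAndPrecision(slabMean[dRange], 1);
            }
            using (Variable.IfNot(gamma[dRange]))
            {
                // spike: a near-delta Gaussian at the origin
                w[dRange] = Variable.GaussianFromMeanAndVariance(0, 0.000001);
            }
        }

        // The likelihood then indexes the shared weights:
        using (Variable.ForEach(nRange))
        {
            VariableArray<double> tmp = Variable.Array<double>(dRange);
            tmp[dRange] = w[dRange] * input[nRange][dRange];
            Variable<double> weightedSum = Variable.Sum(tmp).Named("WeightedSum_formula1");
            output[nRange] = Variable.GaussianFromMeanAndVariance(weightedSum, 0.1) >= 0;
        }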