Logit Regression

  • Question

  • Hi

    I want to implement logit regression using EP.

    I have a class variable Y and n features. My goal is to select a sparse set of features (for example, using a spike-and-slab or a Laplace prior).

    Do you have any idea how I should implement it? In particular, I don't know how to implement the logit regression itself.

    Thanks

    Sunday, March 27, 2016 4:44 PM

All replies

  • Some basic code has been posted here.  For more scalable and robust code, see the Bayes Point Machine classifiers.
    • Marked as answer by Capli19 Wednesday, March 30, 2016 6:35 PM
    Tuesday, March 29, 2016 9:18 AM
    Owner
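
For readers who only need the core model, a minimal sketch of logit regression in Infer.NET might look like the following. It is not the code linked in the reply above. It assumes the current `Microsoft.ML.Probabilistic` package (older releases used the `MicrosoftResearch.Infer` namespaces), plain Gaussian priors rather than a sparse prior, and a hypothetical class name `LogitRegressionSketch`.

```csharp
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;
using Microsoft.ML.Probabilistic.Models;

public static class LogitRegressionSketch
{
    // Infers Gaussian posteriors over the weights and bias from binary labels.
    public static void InferCoefficients(Vector[] xObs, bool[] yObs,
                                         out VectorGaussian wPost, out Gaussian biasPost)
    {
        int D = xObs[0].Count;
        Range n = new Range(xObs.Length).Named("n");

        // Assumed priors: zero-mean Gaussians (no sparsity-inducing prior here).
        Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
            Vector.Zero(D), PositiveDefiniteMatrix.Identity(D)).Named("w");
        Variable<double> bias = Variable.GaussianFromMeanAndPrecision(0, 0.1).Named("bias");

        VariableArray<Vector> x = Variable.Observed(xObs, n).Named("x");
        VariableArray<bool> y = Variable.Observed(yObs, n).Named("y");

        // Logit link: P(y = true) = 1 / (1 + exp(-(x.w + bias))).
        y[n] = Variable.BernoulliFromLogOdds(Variable.InnerProduct(x[n], w) + bias);

        // A default InferenceEngine uses Expectation Propagation.
        InferenceEngine engine = new InferenceEngine();
        wPost = engine.Infer<VectorGaussian>(w);
        biasPost = engine.Infer<Gaussian>(bias);
    }
}
```

Here the labels stay random given the linear predictor, so the inner product is never tied to a fixed value.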
  • Note though that the BPM code implements probit regression (which in practice will probably give very similar results to the logit one).

    -Y-

    Wednesday, March 30, 2016 9:38 AM
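
The similarity between the two links can be made concrete. Writing \sigma for the logistic function and \Phi for the standard normal CDF, matching the slopes of the two curves at zero gives a commonly used approximation (the symbol \lambda is introduced here for illustration only):

```latex
\sigma(a) = \frac{1}{1 + e^{-a}} \approx \Phi(\lambda a),
\qquad
\sigma'(0) = \frac{1}{4} = \frac{\lambda}{\sqrt{2\pi}}
\;\Longrightarrow\;
\lambda = \sqrt{\frac{\pi}{8}} \approx 0.627 .
```

Under this approximation, weights learned by a probit model correspond roughly to logit weights multiplied by \lambda, so the two models rank and classify inputs in much the same way.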
  • When I observe the probabilities themselves (rather than the final Bernoulli variable), the code below does not work. Do you have any idea why?

    The error is:

    A InnerProduct factor with fixed output is not yet implemented for Expectation Propagation.

    public static void Run() {
                int numData = 3000;
                int numDim = 3;
                double bias = 2.5;
                Random rand = new Random();
                Vector[] x = new Vector[numData];
                bool[] y = new bool[numData];
                double[] probs = new double[numData];
                double[] w = new double[]{0.2 , 1, 3};
                for (int i = 0; i < numData; i++) { 
                    //
                    double[] local = new double[numDim];
                    double weightedSum = 0;
                    for(int j=0; j<local.Length; j++){
                        local[j] = rand.Next(10)-5;
                        //Console.Write(local[j] + "   ");
                        weightedSum = weightedSum + w[j] * local[j];
                        x[i] = Vector.FromArray(local);
                    }
                    //Console.WriteLine("weighted sum = " + weightedSum);
                    probs[i] = 1/(1+Math.Exp(-1 * (weightedSum+bias)));
                    y[i] = Bernoulli.Sample(probs[i]);
                }
                VectorGaussian wPost;
                Gaussian biasPost;
                InferCoefficients(x, y, probs, out wPost, out biasPost);
                Console.ReadLine();
            }
    
            public static void InferCoefficients(Vector[] xObs, bool[] yObs, double[] probsObs,
                                           out VectorGaussian wPost, out Gaussian biasPost)
            {
    
                Variable<int> dataCount = Variable.Observed(xObs.Length).Named("dataCount");
                int D = xObs[0].Count;
                Range n = new Range(dataCount).Named("n");
    
                Vector meanW = Vector.Zero(D);
                Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
                    meanW, PositiveDefiniteMatrix.Identity(D)).Named("W");
    
                // bias term is normally distributed.
                Variable<double> bias = Variable.GaussianFromMeanAndPrecision(0, 0.1)
                    .Named("bias");
    
                VariableArray<Vector> x = Variable.Observed<Vector>(xObs, n).Named("X");
                VariableArray<double> logisticArgs = Variable.Array<double>(n).Named("logist_arg");
                logisticArgs[n] = Variable.InnerProduct(x[n], w).Named("inner") + bias;
    
                VariableArray<bool> yData = Variable.Observed<bool>(yObs, n).Named("Y");
                VariableArray<double> probsData = Variable.Observed<double>(probsObs, n).Named("probsData");
                VariableArray<double> probs = Variable.Array<double>(n).Named("probs");
                probs[n] = Variable.Logistic(logisticArgs[n]).Named("logistic");
                //yData[n] = Variable.Bernoulli(probs[n]);
                probsData[n] = Variable.GaussianFromMeanAndPrecision(probs[n], 1);
                // set observed values
                x.ObservedValue = xObs;
                //yData.ObservedValue = yObs;
                probs.ObservedValue = probsObs;
    
                InferenceEngine engine = new InferenceEngine();
    
                wPost = engine.Infer<VectorGaussian>(w);
                biasPost = engine.Infer<Gaussian>(bias);
                Console.WriteLine(biasPost);
                Console.WriteLine(wPost);
            }

    Wednesday, March 30, 2016 6:54 PM
  • The problem is that there is no noise in the model.  You need to add noise to logisticArgs before the Logistic.
    Thursday, March 31, 2016 10:26 AM
    Owner
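
One way to act on this advice, sketched under stated assumptions: if noise is added to the logistic argument and the Logistic output is observed as p, the noisy argument is pinned to log(p / (1 - p)). The model can therefore be rewritten with the observation on the log-odds scale and Gaussian noise around the linear predictor. This is a reformulation rather than the poster's code; `noisePrecision`, the clamp `eps`, and the class name are hypothetical, and the namespaces assume the current `Microsoft.ML.Probabilistic` package.

```csharp
using System.Linq;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;
using Microsoft.ML.Probabilistic.Models;

public static class NoisyLogOddsSketch
{
    public static void InferCoefficients(Vector[] xObs, double[] probsObs, double noisePrecision,
                                         out VectorGaussian wPost, out Gaussian biasPost)
    {
        // Map each observed probability to log-odds; clamp to avoid log(0) or division by zero.
        const double eps = 1e-12; // assumed tolerance
        double[] logOddsObs = probsObs
            .Select(p => System.Math.Min(System.Math.Max(p, eps), 1 - eps))
            .Select(p => System.Math.Log(p / (1 - p)))
            .ToArray();

        int D = xObs[0].Count;
        Range n = new Range(xObs.Length).Named("n");

        Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
            Vector.Zero(D), PositiveDefiniteMatrix.Identity(D)).Named("w");
        Variable<double> bias = Variable.GaussianFromMeanAndPrecision(0, 0.1).Named("bias");

        VariableArray<Vector> x = Variable.Observed(xObs, n).Named("x");
        VariableArray<double> logOdds = Variable.Observed(logOddsObs, n).Named("logOdds");

        // The Gaussian noise sits between the linear predictor and the observed value,
        // so the inner product itself is never constrained to a fixed output.
        logOdds[n] = Variable.GaussianFromMeanAndPrecision(
            Variable.InnerProduct(x[n], w) + bias, noisePrecision);

        InferenceEngine engine = new InferenceEngine();
        wPost = engine.Infer<VectorGaussian>(w);
        biasPost = engine.Infer<Gaussian>(bias);
    }
}
```

A larger `noisePrecision` trusts the observed probabilities more; the value is a modelling choice and is not given in the thread.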
  • I guess the problem was due to the very large number of training data points. The noise is already accounted for by GaussianFromMeanAndPrecision.

    Setting the number of data points to 10 or 100 solves the problem.

    Thursday, March 31, 2016 1:41 PM