Infer.NET: Bayesian logistic regression (Migrated from community.research.microsoft.com)

  • Question

  • Ferrum posted on 05-14-2009 9:23 AM

    Does Infer.NET have factors for Bayesian logistic regression? If there are some in-house factors for VMP, do they use Tommi Jaakkola's bound with auxiliary variational parameters?


Answers

  • minka replied on 08-12-2009 11:35 AM

    You want to use Variable.DiscreteFromLogProbs here, not Variable.Softmax.  See the documentation for both functions.


All replies

  • John Guiver replied on 05-18-2009 12:28 PM

    Hi Ferrum

    Apologies for the delay in replying. Infer.NET does not have built-in factors for logistic regression. However, it does support probit regression: you can build up a linear model, either using the InnerProduct factor or, with plates (variable arrays), product and summation factors, and then feed the output of the linear model through an IsPositive factor. The IsPositive factor uses EP rather than VMP (so the answer to the VMP question is no). We have, in the past, implemented an EP operator for a logistic link function (as I recall, it took binomial rather than Bernoulli observations); this never made it into Infer.NET, as we have tended to make do with probit regression for internal applications. We also have an in-house version of a deterministic logistic factor whose message types are Gaussian on input and Beta on output (again EP), but this is probably not what you were looking for. If there are special reasons for needing a VMP treatment, or a logistic rather than a probit link, then we can discuss further.
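    For concreteness, this construction might look roughly as follows (a sketch following the Bayes Point Machine tutorial pattern; the data, dimensions, and noise variance are illustrative):

    // Probit regression sketch: a linear model fed through IsPositive.
    int dim = 2;
    Vector[] xData = { Vector.FromArray(1.0, 2.0), Vector.FromArray(-1.0, 0.5) };
    bool[] yData = { true, false };

    Range n = new Range(xData.Length);
    VariableArray<Vector> x = Variable.Array<Vector>(n);
    x.ObservedValue = xData;

    // Prior over the weight vector.
    Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
        Vector.Zero(dim), PositiveDefiniteMatrix.Identity(dim));

    VariableArray<bool> y = Variable.Array<bool>(n);
    using (Variable.ForEach(n))
    {
        // Linear model plus noise, thresholded at zero via the IsPositive factor.
        Variable<double> score = Variable.InnerProduct(w, x[n]);
        y[n] = Variable.IsPositive(Variable.GaussianFromMeanAndVariance(score, 1.0));
    }
    y.ObservedValue = yData;

    InferenceEngine engine = new InferenceEngine(new ExpectationPropagation());
    VectorGaussian wPosterior = engine.Infer<VectorGaussian>(w);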

    Thanks

    John G.

  • Ferrum replied on 05-23-2009 6:30 PM

    Hi John, thank you for the reply.

  • Vincent Tan replied on 06-04-2009 12:52 PM

    Hi John,

    Hope all is well at MSR-C.

    In fact, multinomial logistic regression (i.e., softmax) would be really useful because it naturally handles multiclass problems. I need to do logistic regression very soon. A potentially dumb question: is it really not possible to build such a model by constructing a linear model, applying the logistic transform (using the Variable.Exp factor), and tying the result to the (0-1) labels? I would think this might work for binary classification.

    Thanks,

    Vincent


  • Vincent Tan replied on 06-04-2009 1:46 PM

    Hi Ferrum and John G,

    If I'm not mistaken, the probit regression model you mentioned in this post is exactly the same as the Bayes Point Machine model/example that can be found in the Tutorials. So that would be a good place to start to build the model.

    Best regards,

    Vincent

  • John Guiver replied on 06-07-2009 7:31 AM

    Hi Vincent

    In theory you might be able to do something like that, and this is certainly in the spirit of Infer.NET. However, you cannot do it with the existing set of operators: (a) the current Exp operator projects to a Gamma, (b) the ratio operator only supports Gaussians and does not support a random denominator, and (c) you would need to project the ratio to a Beta.

    John


  • John Guiver replied on 06-07-2009 7:37 AM

    Vincent

    Yes, thanks for pointing that out. The Bayes Point Machine tutorial shows probit regression, using the IsPositive constraint on a Gaussian-distributed variable. You can also use IsPositive and IsBetween to do ordinal regression.
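    For example, a hedged sketch of the ordinal case (three levels, with fixed, illustrative cut points; the cut points could themselves be random variables):

    // Ordinal regression sketch: the observed label pins the latent score to a band.
    Variable<double> score = Variable.GaussianFromMeanAndPrecision(0, 1); // e.g. <w,x> plus noise
    int label = 1; // observed middle level of {0, 1, 2}, cut points at -1 and 1
    if (label == 0)
        Variable.ConstrainTrue(Variable.IsBetween(score, Double.NegativeInfinity, -1.0));
    else if (label == 1)
        Variable.ConstrainTrue(Variable.IsBetween(score, -1.0, 1.0));
    else
        Variable.ConstrainTrue(Variable.IsBetween(score, 1.0, Double.PositiveInfinity));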

    You can do multi-class probit regression using http://research.microsoft.com/en-us/um/cambridge/projects/infernet/docs/Multi-class%20classification.aspx.

    John

  • Ferrum replied on 06-08-2009 6:27 AM


    Hi John and Vincent,

    In my view, one practical motivation for implementing VMP for logistic regression (in addition to the existing EP for probit regression) is the better convergence properties of VMP (possibly at the cost of under-estimating the variance). For some datasets, it may be rather awkward to handle the improper messages arising in EP (see my thread "Infer.NET: improper distribution exceptions during EP inference", and the link there to John's and Tom's excellent NIPS workshop presentation). Annealing alpha in power EP may be a way forward, though I can see that adding it to Infer.NET and working out the details may take some time. In the meantime, could it be useful to include an arguably more stable (VMP?) solution for a GLM classifier?

  • Vincent Tan replied on 06-08-2009 5:46 PM

    Hi Ferrum and John G,

    Agreed. I also encountered the "improper distribution exception during EP inference" error when running a standard BPM on a dataset.

    One quick fix that I have found useful is the following: add Gaussian noise (with small precision, i.e., high uncertainty) to the inner product <w,x> when constructing the model. The precision of the noise has to be fixed. Ideally one would hope that it could be inferred, but I ran into the same improper-distribution exception when I tried to put a Gamma prior on the precision. So I simply set the precision to a fixed double, say 0.1 (sometimes larger precisions don't work), and then everything seems to work fine. Is there a scientific explanation for this phenomenon? Perhaps the intuition is that one needs to incorporate more uncertainty when dealing with noisy datasets.
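    In code, the fix looks something like this (a sketch; the names and the precision value are illustrative):

    // Add fixed-precision Gaussian noise to the score before thresholding.
    // A precision of 0.1 (variance 10) worked here; larger precisions may not.
    Variable<double> score = Variable.InnerProduct(w, x);
    Variable<double> noisyScore = Variable.GaussianFromMeanAndPrecision(score, 0.1);
    Variable<bool> y = Variable.IsPositive(noisyScore);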

    Vincent

  • John Guiver replied on 06-09-2009 8:16 AM

    Tom has now written a VMP operator for logistic regression which uses the Jaakkola and Jordan bound. It will be available in the next beta release (tentatively scheduled for the second week of July). Meanwhile, the code is small enough that I can paste it in here - it uses the BernoulliFromLogOdds factor, which already exists in the current beta. I have renamed the operator class below to TempBernoulliFromLogOddsOp so as not to conflict with the largely unimplemented operator class in the current beta. You can put this code in your own assembly and Infer.NET should pick it up, though somewhere in that assembly you will need the following annotation, which tells Infer.NET that the assembly contains message functions:

    [assembly: MicrosoftResearch.Infer.Factors.HasMessageFunctions]

    Here is the code for the operator. I will put example usage code in a follow-up post.

    // (C) Copyright 2009 Microsoft Research Cambridge

    using System;
    using System.Collections.Generic;
    using System.Text;
    using MicrosoftResearch.Infer.Distributions;
    using MicrosoftResearch.Infer.Maths;

    namespace MicrosoftResearch.Infer.Factors
    {
        /// <summary>
        /// Provides outgoing messages for <see cref="Factor.BernoulliFromLogOdds"/>, given random arguments to the function.
        /// </summary>
        [FactorMethod(typeof(Factor), "BernoulliFromLogOdds")]
        public static class TempBernoulliFromLogOddsOp
        {
            /// <summary>Evidence message for VMP.</summary>
            public static double AverageLogFactor(bool sample, Gaussian logOdds)
            {
                double m, v;
                logOdds.GetMeanAndVariance(out m, out v);
                double t = Double.IsPositiveInfinity(v) ? 2.4 : Math.Sqrt(m * m + v);
                double a = Math.Tanh(t / 2) / (2 * t);
                double s = sample ? 1 : -1;
                double m2 = m * m + v;
                return MMath.LogisticLn(t) + (s * m - t) / 2 - a / 2 * (m2 - t * t);
            }

            /// <summary>VMP message to 'logOdds'.</summary>
            public static Gaussian LogOddsAverageLogarithm(bool sample, Gaussian logOdds)
            {
                double m, v;
                logOdds.GetMeanAndVariance(out m, out v);
                double t = Double.IsPositiveInfinity(v) ? 2.4 : Math.Sqrt(m * m + v);
                double a = Math.Tanh(t / 2) / (2 * t);
                double s = sample ? 1 : -1;
                return Gaussian.FromMeanAndPrecision(s / (2 * a), a);
            }
        }
    }
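    For reference, the operator implements the Jaakkola-Jordan bound on the logistic function: for a label $s \in \{-1, +1\}$ and log-odds $x$,

    $$\sigma(s x) \;\ge\; \sigma(t)\,\exp\!\left(\frac{s x - t}{2} - \frac{a(t)}{2}\,\bigl(x^2 - t^2\bigr)\right), \qquad a(t) = \frac{\tanh(t/2)}{2t}.$$

    Taking expectations under a Gaussian $q(x)$ with mean $m$ and variance $v$, and choosing $t = \sqrt{m^2 + v} = \sqrt{\mathbb{E}[x^2]}$, gives the AverageLogFactor expression above; the $x$-dependent part $\exp(s x / 2 - (a/2) x^2)$ is a Gaussian with precision $a$ and mean $s/(2a)$, which is exactly the LogOddsAverageLogarithm message.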


  • John Guiver replied on 06-09-2009 8:31 AM

    Here's the usage for the previous post. Simplest example for a single observation:

    Variable<double> w = Variable.GaussianFromMeanAndPrecision(1.2, 0.4);
    Variable<bool> y = Variable.BernoulliFromLogOdds(w);
    InferenceEngine ie = new InferenceEngine(new VariationalMessagePassing());
    y.ObservedValue = true;
    Gaussian wPosterior = ie.Infer<Gaussian>(w);

    If you want to compute evidence, you can put in an If block as usual:

    Variable<bool> evidence = Variable.Bernoulli(0.5);
    IfBlock block = Variable.If(evidence);
    Variable<double> w = Variable.GaussianFromMeanAndPrecision(1.2, 0.4);
    Variable<bool> y = Variable.BernoulliFromLogOdds(w);
    block.CloseBlock();
    InferenceEngine ie = new InferenceEngine(new VariationalMessagePassing());
    y.ObservedValue = true;
    Gaussian wPosterior = ie.Infer<Gaussian>(w);
    Bernoulli e = ie.Infer<Bernoulli>(evidence);

    Let us know how it goes if you decide to use these.

    John G

  • Ferrum replied on 06-09-2009 9:18 AM

    Excellent, thanks!

  • minka replied on 06-10-2009 8:17 AM

    Yes, the Gaussian noise here is important since it encodes how much noise is expected in the dataset.  If you use zero noise, then you are implying that the classes are perfectly separable by a line.  If this is not the case, then EP will usually crash since the posterior distribution is empty.

  • Alvin Kaule replied on 06-10-2009 2:55 PM

    Hi John,

    Will you add support for binomial observations in Bayesian logistic regression?

    I would like to express the following BUGS model using Infer.NET:

    model {
        for (i in 1:N)
        {
            y[i] ~ dbin(p[i], n[i])
            logit(p[i]) <- inprod(x[i,], w[])
        }
        for (j in 1:m)
        {
            w[j] ~ dnorm(0, tau[j])
            tau[j] ~ dgamma(0.001, 0.001)
        }
    }
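    For what it's worth, once the binomial and logistic factors mentioned below are available, the model might be sketched in Infer.NET roughly as follows (untested; all names are illustrative, and this assumes the Variable.Binomial and Variable.Logistic factors from version 2.3 onward):

    // Rough Infer.NET sketch of the BUGS model above.
    Range i = new Range(N);
    Range j = new Range(m);
    VariableArray<double> tau = Variable.Array<double>(j);
    tau[j] = Variable.GammaFromShapeAndRate(0.001, 0.001).ForEach(j); // tau[j] ~ dgamma(0.001, 0.001)
    VariableArray<double> w = Variable.Array<double>(j);
    w[j] = Variable.GaussianFromMeanAndPrecision(0.0, tau[j]);        // w[j] ~ dnorm(0, tau[j])
    Variable<Vector> wVec = Variable.Vector(w);                       // w[] as a vector
    VariableArray<Vector> x = Variable.Array<Vector>(i);              // observed rows x[i,]
    VariableArray<int> n = Variable.Array<int>(i);                    // observed trial counts
    VariableArray<int> y = Variable.Array<int>(i);                    // observed successes
    using (Variable.ForEach(i))
    {
        Variable<double> logOdds = Variable.InnerProduct(wVec, x[i]); // inprod(x[i,], w[])
        y[i] = Variable.Binomial(n[i], Variable.Logistic(logOdds));   // y[i] ~ dbin(p[i], n[i])
    }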

    Best regards   

    Alvin

  • Vincent Tan replied on 06-11-2009 1:14 PM

    Hi John G,

    This is not really an Infer.NET question but rather a question about the multiclass BPM model in the link you posted.

    Am I right to say that in the multiclass probit model there is no indeterminacy in the weights w_1, ..., w_K, because we impose a VectorGaussian(0, I) prior on each weight vector? I ask because the multiclass model does not seem to reduce to the binary classification model: there are two sets of weights in the multiclass model (when K = 2), whereas the BPM model in the tutorials has only one set of weights.

    Thanks once again, Vincent.

  • minka replied on 06-11-2009 1:48 PM

    Yes, this will be supported in the next beta release.

  • minka replied on 06-17-2009 10:10 AM

    Yes, there is no indeterminacy in the weights when you impose a proper prior.

  • Vincent Tan replied on 07-15-2009 6:47 PM

    Hi John G,

    Thanks for the BernoulliFromLogOdds factor and the example. The Bayesian logistic regression model seems to work very well on a binary classification task. The weights inferred make sense.

    I was wondering whether it's straightforward to extend the model to a softmax or multiclass logistic regression model. I would imagine that instead of doing Variable.BernoulliFromLogOdds(w), I would have to impose pairwise constraints on a set of weights and the training vectors.

    Any help (including code snippets) would be greatly appreciated.

    Thanks,

    Vincent

  • minka replied on 07-16-2009 8:00 AM

    You can do multiclass logistic regression in a very similar way to the multiclass Bayes point machine.  If there are C classes, you have C weight vectors.  Given an input vector, take inner products to get C scores.  During training, if the correct class is 1, you create C-1 booleans indicating whether score 1 is greater than score j for all j > 1.  In the Bayes point machine, these are explicit constraints of the form score[1] > score[j].  For logistic regression, they are soft constraints defined by BernoulliFromLogOdds(score[1] - score[j]).  These booleans are all observed to be true.  Mathematically, this model is similar to a softmax likelihood but not exactly the same.  To do exact softmax, you would need the new factor that we are adding in the next version.
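    A hedged sketch of this construction for a single training example (names are illustrative; it uses the explicit factor syntax noted in a follow-up below, since BernoulliFromLogOdds is not completely integrated in the current beta):

    // Pairwise soft constraints for one training example.
    // score is a VariableArray<double> of length nClass holding the C inner products.
    int correctClass = 0; // index of the observed class for this example
    for (int j = 0; j < nClass; j++)
    {
        if (j == correctClass) continue;
        Variable<double> diff = score[correctClass] - score[j];
        Variable<bool> pref = Variable<bool>.Factor(Factor.BernoulliFromLogOdds, diff);
        pref.ObservedValue = true; // soft constraint: correct class beats class j
    }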

  • minka replied on 07-29-2009 4:35 AM

    By the way, the code that John Guiver sent out for using the BernoulliFromLogOdds factor is not quite right.  Because this factor is not completely integrated, what you need to say is:

    Variable<bool> y = Variable<bool>.Factor(Factor.BernoulliFromLogOdds,w);

    instead of:

    Variable<bool> y = Variable.BernoulliFromLogOdds(w);

    Tom

  • minka replied on 08-07-2009 11:20 AM

    Infer.NET 2.3 now has built-in support for logistic/softmax regression and binomial/multinomial observations under VMP.

  • mean1010 replied on 08-07-2009 7:39 PM

    I’ll be sure to keep everyone posted.

  • Vincent Tan replied on 08-10-2009 12:09 PM

    Hi Tom,

    Thanks for the softmax regression factor. I tried to use it, but somehow I get improper-message exceptions or syntax problems. Could you please look at the relevant parts of my code? Thanks!

    Vincent


    // The model
    for (int c = 0; c < nClass; c++)
    {
        nItem[c] = Variable.New<int>().Named("nItem_" + c);
        item[c] = new Range(nItem[c]).Named("item_" + c);
        xValues[c] = Variable.Array<Vector>(item[c]).Named("xValues_" + c);
        using (Variable.ForEach(item[c]))
        {
            score = ComputeClassScores(weights, xValues[c][item[c]], rC);
            noisyScore = AddNoiseToScore(score, vPi, rC);
            ConstrainForLogisticRegression(c, noisyScore, rC);
        }
    }

    private void ConstrainForLogisticRegression(int argmax, VariableArray<double> noisyScore, Range rC)
    {
        Vector v = new Vector(rC.SizeAsInt);
        for (int i = 0; i < rC.SizeAsInt; i++) v[i] = 1.0;
        Variable<Vector> softProbs = Variable.Dirichlet(v);
        softProbs = Variable.Softmax(noisyScore);
        Vector hardVector = new Vector(rC.SizeAsInt);
        for (int i = 0; i < rC.SizeAsInt; i++) if (i == argmax) hardVector[i] = 1.0;

        Variable.ConstrainEqual(hardVector, softProbs); // If I do this, I get complaints that the softmax factor needs a Dirichlet instead of a vector
        //softProbs.ObservedValue = hardVector; // If I do this, I get improper message exceptions
    }

    private VariableArray<double> ComputeClassScores(VariableArray<Vector> w, Variable<Vector> xValues, Range rC)
    {
        VariableArray<double> score = Variable.Array<double>(rC);
        score[rC] = Variable.InnerProduct(w[rC], xValues);
        return score;
    }

    private VariableArray<double> AddNoiseToScore(VariableArray<double> score, Variable<double> prec, Range rC)
    {
        VariableArray<double> noisyScore = Variable.Array<double>(rC);
        noisyScore[rC] = Variable.GaussianFromMeanAndPrecision(score[rC], prec);
        return noisyScore;
    }


  • minka replied on 08-12-2009 11:35 AM

    You want to use Variable.DiscreteFromLogProbs here, not Variable.Softmax.  See the documentation for both functions.
