Bayes Point Machine - 'Will Buy' vs. Multi-Class Examples (Migrated from community.research.microsoft.com)

  • Question

  • Dweezil posted on 08-13-2010 11:47 PM

    I'd like to understand the difference between the 'WillBuy' (and Image Classification) example and the Multi-class Machines. As an exercise, I tried to translate the 'will buy' example into a 'two class' example from the BPM code*. I changed the data to include the 'will buy' sample data, making the 'will buy' class '0' and 'won't buy' class '1'.

    Example (income: 58, age: 36, from the BayesPointMachineExample):

    [0] Bernoulli(0.9833)

    Example (from the BPM example with the same training data):

    Discrete(0.7897 0.2103)

    I assume the BPM values are the probabilities that the test point is in each class; in this case there is about a 79% chance it's in the 'will buy' class. I'm just not sure how to relate that to the original 98% probability from the 'will buy' example.

    To help level set, I have about two hours of experience with Infer.Net and less than that with the math, so any help would be great.

    Thanks,

    Chris

     * I think there is an 'off by one' bug in the DataFromFile.cs:Read method, and should be:

    x[i] = Double.Parse(pieces[i+1]);


    Friday, June 3, 2011 6:06 PM

All replies

  • John Guiver replied on 08-16-2010 3:43 AM

    Thanks for bringing that bug to our attention. It should be corrected as:

    for (int i = 1; i <= x.Length; i++)
    {
        x[i - 1] = Double.Parse(pieces[i]);
    }

    i.e. change the upper limit rather than the indexing. This will be fixed in the next release.

    Regarding the discrepancy, please see the post http://community.research.microsoft.com/forums/t/5351.aspx, which discusses fixing one of the weight priors to a point mass to use up a degree of freedom. For the BPM class, for example, this changes the training method to look as follows:

    for (int c = 0; c < nClass; c++)
    {
        // Pin class 0's weights to zero; every other class gets a zero-mean, identity-precision Gaussian prior.
        trainModel.wInit[c].ObservedValue = (c == 0)
            ? VectorGaussian.PointMass(Vector.Zero(nFeatures))
            : VectorGaussian.FromMeanAndPrecision(Vector.Zero(nFeatures), PositiveDefiniteMatrix.Identity(nFeatures));
    }
    return InferW(xValuesData);

    Also, make sure you have a bias term in your data set. So your data file should look as follows:

    1 99 63 38 1
    0 0 16 23 1
    1 21 28 40 1
    1 46 55 27 1
    0 63 22 18 1
    0 80 20 40 1

    John

     

  • Dweezil replied on 08-16-2010 10:45 PM

    Hi John,

    In your example data, is the appended '1' the bias? Is the second number in the row a confidence for that class?  So in the first row, we are 99% sure it's in class '1'?

    Chris

     

  • Dweezil replied on 08-16-2010 11:07 PM

    Does fixing one of the weights affect correctness, or just performance?

     

  • John Guiver replied on 08-17-2010 3:31 AM

    It's a different model, so it will give a different solution. I think what's happening if we don't fix one of the weight vectors is that there is a continuum of equivalent solutions, all equally likely once you factor out the prior. When you then look at the marginal weights for any particular class, you just get a meaningless average over those solutions. In other words, the solution is correct given the model, but the model itself is ineffective.
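
    The continuum of equivalent solutions can be illustrated without Infer.NET. The sketch below assumes the multi-class model predicts the class whose score (weight vector dotted with the features) is largest, with the noise term left out; that is an assumption about the example, not something stated in this thread. All names and weight values are hypothetical. Adding the same vector to every class's weights shifts every score by the same amount, so the prediction cannot change and the data cannot tell the shifted weight sets apart. Shifting by the negative of class 0's weights gives an equivalent solution in which class 0 is pinned at zero.

```csharp
using System;
using System.Linq;

public static class ShiftInvariance
{
    // Score for one class: dot product of its weight vector with the features.
    public static double Score(double[] w, double[] x) =>
        w.Zip(x, (wi, xi) => wi * xi).Sum();

    // Index of the class with the largest score (noise omitted, by assumption).
    public static int Predict(double[][] weights, double[] x)
    {
        double[] scores = weights.Select(w => Score(w, x)).ToArray();
        return Array.IndexOf(scores, scores.Max());
    }

    // Adds the same vector to every class's weights.
    public static double[][] Shift(double[][] weights, double[] delta) =>
        weights.Select(w => w.Zip(delta, (wi, di) => wi + di).ToArray()).ToArray();

    public static void Main()
    {
        // Made-up weights for two classes over (income, age, bias); not fitted values.
        double[][] weights =
        {
            new[] { 0.5, -0.25, 1.0 },
            new[] { -0.5, 0.75, 0.0 },
        };
        double[] x = { 58, 36, 1 };

        // Shift by the negative of class 0's weights, so class 0 becomes all zeros.
        double[] delta = weights[0].Select(v => -v).ToArray();
        double[][] pinned = Shift(weights, delta);

        // The score gap between the classes is 23 in both cases, so the prediction agrees.
        Console.WriteLine(Predict(weights, x) == Predict(pinned, x));
    }
}
```

    Because any such shift is equally consistent with the data, averaging over all of them tells you nothing about an individual class's weights; fixing one class at zero picks out a single representative.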

  • John Guiver replied on 08-17-2010 3:35 AM

    The first number is the class, the next three numbers are (expenses,income,age). The final number is bias.
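
    A minimal sketch of reading one row in this layout, assuming whitespace-separated fields. `RowParser` and `ParseRow` are hypothetical names for illustration and are not part of the Infer.NET example code. The loop uses the corrected bounds given earlier in this thread, so the label in `pieces[0]` is skipped and every remaining field, including the bias, lands in the feature array.

```csharp
using System;
using System.Globalization;

public static class RowParser
{
    // Hypothetical helper: splits "class f1 f2 ... bias" into a label and a feature array.
    public static (int Label, double[] Features) ParseRow(string line)
    {
        string[] pieces = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        if (pieces.Length < 2)
            throw new FormatException("Expected a class label followed by at least one feature.");

        int label = int.Parse(pieces[0], CultureInfo.InvariantCulture);
        double[] x = new double[pieces.Length - 1];

        // pieces[0] is the class, so feature i - 1 comes from pieces[i].
        for (int i = 1; i <= x.Length; i++)
        {
            x[i - 1] = double.Parse(pieces[i], CultureInfo.InvariantCulture);
        }
        return (label, x);
    }

    public static void Main()
    {
        var (label, x) = ParseRow("1 99 63 38 1");
        // Fields after the label: expenses, income, age, bias.
        Console.WriteLine($"class={label} expenses={x[0]} income={x[1]} age={x[2]} bias={x[3]}");
    }
}
```

    Keeping the bias as an ordinary trailing feature means the model needs no special handling for an intercept; it is just one more weight.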

  • Dweezil replied on 08-18-2010 8:44 PM

    Wow, after I fixed one weight, I was able to use the original 'will buy' data, and reproduce the same results. Good stuff.

    Thanks,

    Chris

     

  • Dweezil replied on 08-18-2010 11:08 PM

    In an application with more classes, why/when/can I fix n-1 of them in a PointMass?

    Sorry if this is a remedial question, I am working my way through Russell/Norvig now, trying to catch up.

    Chris

     

  • minka replied on 08-23-2010 1:34 PM

    You should only ever fix one of them.

  • JaredBroad replied on 10-21-2010 4:07 PM

    Hey! 

    This is exactly the example / problem I was trying to replicate as well - would you mind posting the source you used for it?

    Thank you

    Jared

  • JaredBroad replied on 10-21-2010 5:56 PM

    Also, I have searched the whole API but can't seem to find "BPMUtils". Is this example out of date, or packaged in some other library?

    Thank you for your help,

  • John Guiver replied on 11-01-2010 8:54 AM

    The example source is in the download. A new release was posted a couple of days ago, so it is best to download that release. It adds Start menu links to the Bayes Point Machine Visual Studio example solution.

     

    John

    Friday, June 3, 2011 6:07 PM
  • JaredBroad replied on 11-01-2010 11:07 AM

    Thank you-
