Bayes Point Machine - 'Will Buy' vs. MultiClass Examples (Migrated from community.research.microsoft.com)
Question

Dweezil posted on 08-13-2010 11:47 PM
I'd like to understand the difference between the 'WillBuy' (and Image Classification) example and the multiclass Bayes Point Machine. As an exercise, I tried to translate the 'will buy' example into a two-class example using the BPM code*. I changed the data to include the 'will buy' sample data, making 'will buy' class '0' and 'won't buy' class '1'.
Example (income: 58, age: 36, from the BayesPointMachineExample):
[0] Bernoulli(0.9833)
Example (from the BPM example with the same training data):
Discrete(0.7897 0.2103)
I assume the BPM values are the probabilities that the test point is in each class, so in this case there is about a 79% chance it's in the 'will buy' class. I'm just not sure how to relate this to the original 98% probability from the 'will buy' example.
To help level set, I have about two hours of experience with Infer.NET and less than that with the math, so any help would be great.
Thanks,
Chris
* I think there is an off-by-one bug in the Read method in DataFromFile.cs; the assignment should be:
    x[i] = Double.Parse(pieces[i+1]);
Friday, June 3, 2011 6:06 PM
All replies

John Guiver replied on 08-16-2010 3:43 AM
Thanks for bringing that bug to our attention. It should be corrected as:
for (int i = 1; i <= x.Length; i++)
{
    x[i - 1] = Double.Parse(pieces[i]);
}
i.e. change the upper limit rather than the indexing. This will be fixed in the next release.
As regards the discrepancy, please see the post http://community.research.microsoft.com/forums/t/5351.aspx, which discusses fixing one of the weight priors to a point mass to use up a degree of freedom. For the BPM class, this changes the training method to look as follows:
for (int c = 0; c < nClass; c++)
{
    trainModel.wInit[c].ObservedValue = (c == 0)
        ? VectorGaussian.PointMass(Vector.Zero(nFeatures))
        : VectorGaussian.FromMeanAndPrecision(Vector.Zero(nFeatures), PositiveDefiniteMatrix.Identity(nFeatures));
}
return InferW(xValuesData);
Also, make sure you have a bias term in your data set, so your data file should look as follows:
1 99 63 38 1
0 0 16 23 1
1 21 28 40 1
1 46 55 27 1
0 63 22 18 1
0 80 20 40 1
John
Friday, June 3, 2011 6:06 PM 
Dweezil replied on 08-16-2010 10:45 PM
Hi John,
In your example data, is the appended '1' the bias? Is the second number in the row a confidence for that class? So in the first row, we are 99% sure it's in class '1'?
Chris
Friday, June 3, 2011 6:06 PM 
Dweezil replied on 08-16-2010 11:07 PM
Does fixing one of the weights affect correctness, or just performance?
Friday, June 3, 2011 6:06 PM 
John Guiver replied on 08-17-2010 3:31 AM
It's a different model, so it will give a different solution. I think what's happening if we don't fix one of the weight vectors is that there is a continuum of equivalent solutions which are all equally likely if you factor out the prior. So when you look at the marginal weights for any particular class, you just get a meaningless average over those solutions, i.e. the solution is correct given the model, but the model itself is ineffective.
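One way to see the "continuum of equivalent solutions" is the minimal sketch below. It assumes a simplified, noise-free version of the multiclass model in which the predicted class is the one with the largest linear score w_c · x. The class name ShiftInvarianceSketch, its helper methods, and the weight values are made up for illustration and are not part of Infer.NET. Adding the same vector to every class's weights adds the same amount to every score, so the prediction cannot change; subtracting class 0's weights from all classes therefore gives an equivalent solution with class 0 pinned at zero.

```csharp
using System;

public static class ShiftInvarianceSketch
{
    // Linear score of one class: dot product of its weights with the features.
    public static double Score(double[] w, double[] x)
    {
        double s = 0.0;
        for (int i = 0; i < x.Length; i++) s += w[i] * x[i];
        return s;
    }

    // Index of the class with the largest score (the first one wins on ties).
    public static int ArgMax(double[][] weights, double[] x)
    {
        int best = 0;
        for (int c = 1; c < weights.Length; c++)
            if (Score(weights[c], x) > Score(weights[best], x)) best = c;
        return best;
    }

    // Returns a copy of the weights with the same vector v added to every class.
    public static double[][] Shift(double[][] weights, double[] v)
    {
        var shifted = new double[weights.Length][];
        for (int c = 0; c < weights.Length; c++)
        {
            shifted[c] = new double[v.Length];
            for (int i = 0; i < v.Length; i++) shifted[c][i] = weights[c][i] + v[i];
        }
        return shifted;
    }

    public static void Main()
    {
        // Hypothetical weights for two classes over (income, age, bias).
        double[][] w = { new[] { 0.5, -0.25, 1.0 }, new[] { -0.5, 0.75, 0.0 } };
        double[] x = { 58.0, 36.0, 1.0 };

        // Shift every class by -w[0], which pins class 0 at the zero vector.
        double[] minusW0 = new double[w[0].Length];
        for (int i = 0; i < minusW0.Length; i++) minusW0[i] = -w[0][i];
        double[][] pinned = Shift(w, minusW0);

        // Both weight sets predict the same class for x.
        Console.WriteLine(ArgMax(w, x) == ArgMax(pinned, x)); // prints "True"
    }
}
```

Because any shift gives the same predictions, the data alone cannot choose among the shifted solutions, which is consistent with the marginal weights averaging out to something uninformative.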
Friday, June 3, 2011 6:06 PM 
John Guiver replied on 08-17-2010 3:35 AM
The first number is the class, the next three numbers are (expenses, income, age). The final number is bias.
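The row layout described above can be sketched as follows. This is a minimal illustration, not the DataFromFile.cs source: the class DataRowSketch and the method ParseRow are hypothetical names, and splitting on spaces or tabs is an assumption based on the sample rows in this thread. The loop uses the corrected indexing from earlier in the thread, so pieces[0] is the class label and everything after it, including the trailing bias, goes into the feature vector.

```csharp
using System;
using System.Globalization;

public static class DataRowSketch
{
    // Splits one whitespace-separated row into a class label and a feature vector.
    // pieces[0] is the label; pieces[1..] are the features, ending with the bias.
    public static void ParseRow(string line, out int label, out double[] x)
    {
        string[] pieces = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        label = int.Parse(pieces[0], CultureInfo.InvariantCulture);
        x = new double[pieces.Length - 1];
        for (int i = 1; i <= x.Length; i++)
        {
            // Feature i of the file becomes element i - 1 of the vector.
            x[i - 1] = double.Parse(pieces[i], CultureInfo.InvariantCulture);
        }
    }

    public static void Main()
    {
        // First sample row from the thread: class 1, then expenses, income, age, bias.
        ParseRow("1 99 63 38 1", out int label, out double[] x);
        Console.WriteLine("class " + label + ", " + x.Length + " features, bias " + x[x.Length - 1]);
    }
}
```

Keeping the bias as an ordinary constant feature means the model needs no special intercept handling; the weight learned for that column plays the role of the intercept.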
Friday, June 3, 2011 6:06 PM 
Dweezil replied on 08-18-2010 8:44 PM
Wow, after I fixed one weight, I was able to use the original 'will buy' data, and reproduce the same results. Good stuff.
Thanks,
Chris
Friday, June 3, 2011 6:06 PM 
Dweezil replied on 08-18-2010 11:08 PM
In an application with more classes, can I fix n-1 of them to a PointMass, and if so, why and when?
Sorry if this is a remedial question, I am working my way through Russell/Norvig now, trying to catch up.
Chris
Friday, June 3, 2011 6:06 PM 
minka replied on 08-23-2010 1:34 PM
You should only ever fix one of them.
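One possible reading of this advice, sketched under the same simplifying assumption as before (the predicted class is the one with the largest noise-free linear score w_c · x): pinning one class at zero only removes the redundant shift, but pinning a second class at zero forces the two pinned classes to have identical scores for every input, so they can no longer be told apart. The class name PinOnlyOneSketch and all weight values are hypothetical.

```csharp
using System;

public static class PinOnlyOneSketch
{
    // Linear score of one class: dot product of its weights with the features.
    public static double Score(double[] w, double[] x)
    {
        double s = 0.0;
        for (int i = 0; i < x.Length; i++) s += w[i] * x[i];
        return s;
    }

    // Index of the class with the largest score (the first one wins on ties).
    public static int ArgMax(double[][] weights, double[] x)
    {
        int best = 0;
        for (int c = 1; c < weights.Length; c++)
            if (Score(weights[c], x) > Score(weights[best], x)) best = c;
        return best;
    }

    public static void Main()
    {
        double[] x = { 1.0, 3.0 };

        // Made-up three-class weights with only class 0 pinned at zero.
        double[][] onePinned =
        {
            new[] { 0.0, 0.0 }, new[] { -1.0, 1.0 }, new[] { -2.0, -1.0 }
        };

        // The same weights with class 1 forced to zero as well.
        double[][] twoPinned =
        {
            new[] { 0.0, 0.0 }, new[] { 0.0, 0.0 }, new[] { -2.0, -1.0 }
        };

        Console.WriteLine(ArgMax(onePinned, x)); // prints 1
        Console.WriteLine(ArgMax(twoPinned, x)); // prints 0: classes 0 and 1 now always tie
    }
}
```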
Friday, June 3, 2011 6:06 PM 
JaredBroad replied on 10-21-2010 4:07 PM
Hey!
This is exactly the example / problem I was trying to replicate as well. Would you mind posting the source you used for it?
Thank you
Jared
Friday, June 3, 2011 6:06 PM 
JaredBroad replied on 10-21-2010 5:56 PM
Also, I have searched the whole API but can't seem to find "BPMUtils". Is this example out of date, or packaged in some other library?
Thank you for your help,
Friday, June 3, 2011 6:06 PM 
John Guiver replied on 11-01-2010 8:54 AM
The example source is in the download. A new release was posted a couple of days ago, so it is best to download that release. It adds links in the Start menu to the Bayes Point Machine Visual Studio example solution.
John
Friday, June 3, 2011 6:07 PM 
JaredBroad replied on 11-01-2010 11:07 AM
Thank you
 Marked as answer by Microsoft Research Friday, June 3, 2011 6:07 PM
Friday, June 3, 2011 6:07 PM