Multi-Class Labeling using the Bayes Point Machine

Question
-
Hi, I am having trouble adapting the Bayes Point Machine provided in the examples to work with my data set. I am trying to classify the first two values on each line (a data pair) into one of 4 possible classes (0 to 3). Below is a small sample of the data; I have over 10,000 points in total.
138,81,1
150,98,2
200,102,3
152,90,2
175,110,3
110,65,0
183,125,3
Could someone please advise me on what the Main for the example should look like to classify this data? I have about 6 different versions at the moment and none seem to work...
Thanks!
Thursday, February 7, 2013 2:29 AM
Answers
-
Please see http://social.microsoft.com/Forums/en-US/infer.net/thread/4ceaf7ef-1110-43b8-8037-48ca8ceddf01. Let me know if you still have questions. By the way, remember to add a constant bias input to your feature vector.
- Edited by John Guiver (Microsoft employee, Owner) Thursday, February 7, 2013 3:22 PM
- Marked as answer by NikolaLoncar Thursday, February 7, 2013 5:12 PM
Thursday, February 7, 2013 3:22 PM (Owner)
All replies
-
Thank you very much John! Just to double check: does the value of the constant bias have any impact on the inference? That is, would the result be different if the bias constant were 0 rather than 1?
Thursday, February 7, 2013 5:08 PM
-
The bias constant has to be non-zero. 1 is a good value for it.
Your other features should be standardised, i.e. each feature scaled and offset to lie roughly between 0 and 1 or between -1 and 1. If they are not, you will need to think much more carefully about the priors.
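A minimal sketch of this preprocessing, written in plain Python rather than with the Infer.NET API: it parses the sample rows from the question, standardises each feature column, and appends a constant bias of 1. The helper names (`parse_rows`, `standardise`, `add_bias`) are hypothetical, and zero-mean, unit-variance scaling is an assumption; it is one common way to bring features roughly into the -1 to 1 range.

```python
from statistics import mean, pstdev

# Sample rows from the question: feature 1, feature 2, class label (0 to 3).
RAW_ROWS = """138,81,1
150,98,2
200,102,3
152,90,2
175,110,3
110,65,0
183,125,3"""


def parse_rows(text):
    """Split each 'a,b,label' line into a feature list and an integer label."""
    features, labels = [], []
    for line in text.strip().splitlines():
        *values, label = line.split(",")
        features.append([float(v) for v in values])
        labels.append(int(label))
    return features, labels


def standardise(features):
    """Shift and scale each column to zero mean and unit variance (assumed scheme)."""
    columns = list(zip(*features))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 so a constant column does not cause division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in features]


def add_bias(features, bias=1.0):
    """Append a constant, non-zero bias input to every feature vector."""
    return [row + [bias] for row in features]


features, labels = parse_rows(RAW_ROWS)
inputs = add_bias(standardise(features))
```

Each row of `inputs` then holds two standardised features followed by the bias. When scoring new data, the means and standard deviations computed from the training set should be reused rather than recomputed.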
Thursday, February 7, 2013 5:44 PM (Owner)
-
Would this approach have to be used for the Sparse BPM as well? I ask because I have written a class with a SparseBPM and a standard BPM, supplied both with the same data, and ensured (as far as I can see) that the settings are the same on both (e.g. inference algorithm, noise, etc.), yet I get two different results. The standard BPM is doing fine, but the SparseBPM leans significantly towards one class. Is there an inherent difference between the two BPMs?
*The data supplied is not sparse, to ensure that the standard BPM performs correctly.
Sunday, April 7, 2013 11:32 AM