Using a discrete variable in the BayesPointMachineExample
Question

The inputs to the model in the BayesPointMachineExample are incomes and ages. How can I change one of these to be a discrete input, e.g. SalesPerson: {0, 1, 2}?
I know I can declare it as follows but I'm not sure how to include it in the BPM:
Variable<int> salesPerson = Variable.DiscreteUniform(3);
Thanks
Sunday, February 10, 2013 5:29 AM
All replies

In many applications, the inputs to a BPM represent discrete values. Even if you have continuous values, it is still worthwhile to represent them as discrete (there are several ways to do this), especially if you have lots of data. The advantage is that you can learn a nonlinear mapping between inputs and outputs, whereas feeding in the continuous value directly forces a linear relationship.
You should keep the BPM as is, but write some code that maps from the raw values to the input vector. Just represent a three-valued discrete variable as a 1-of-3 code: (1, 0, 0), (0, 1, 0), (0, 0, 1).
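As a rough illustration of the 1-of-3 code described above, here is a minimal sketch in plain C#. The class and method names (OneHotSketch, Encode) are made up for this example and are not part of Infer.NET; the resulting array could be wrapped with Vector.FromArray as in the code elsewhere in this thread.

```csharp
using System;

public static class OneHotSketch
{
    // Hypothetical helper (not an Infer.NET API): returns a 1-of-K code
    // for a category index in the range [0, numCategories).
    public static double[] Encode(int category, int numCategories)
    {
        if (category < 0 || category >= numCategories)
            throw new ArgumentOutOfRangeException(nameof(category));
        double[] code = new double[numCategories]; // initialised to all zeros
        code[category] = 1.0;                      // single nonzero entry
        return code;
    }

    public static void Main()
    {
        // SalesPerson takes the values 0, 1, 2, so K = 3.
        for (int c = 0; c < 3; c++)
            Console.WriteLine("SalesPerson " + c + " -> (" + string.Join(", ", Encode(c, 3)) + ")");
    }
}
```

Note that the weight vector w has to have the same length as the input vector, so replacing one scalar input with a 1-of-3 code adds two dimensions to the prior over w.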
John
Monday, February 11, 2013 9:38 AM Owner
I've changed the BayesPointMachineExample to use a discrete value (category instead of income) and to output a Gaussian representing a price instead of willBuy. It does seem to force a linear relationship: as I increase the category value the output also increases, even for a category that did not exist in the input data.
Here's the code:
public void Run()
{
    // data
    double[] category = { 2, 1, 0, 2, 1, 0 };
    double[] ages = { 38, 23, 40, 27, 18, 40 };
    double[] price = { 33, 50, 22, 19, 44, 19 };

    // Create target y
    VariableArray<double> y = Variable.Observed(price).Named("y");
    Variable<Vector> w = Variable.Random(new VectorGaussian(Vector.Zero(3), PositiveDefiniteMatrix.Identity(3))).Named("w");
    BayesPointMachine(category, ages, w, y);

    InferenceEngine engine = new InferenceEngine();
    if (!(engine.Algorithm is GibbsSampling))
    {
        VectorGaussian wPosterior = engine.Infer<VectorGaussian>(w);
        Console.WriteLine("Dist over w=\n" + wPosterior);

        double[] incomesTest = { 0, 1, 2, 3 };
        double[] agesTest = { 24, 24, 24, 24 };
        VariableArray<double> ytest = Variable.Array<double>(new Range(agesTest.Length)).Named("ytest");
        BayesPointMachine(incomesTest, agesTest, Variable.Random(wPosterior).Named("w"), ytest);
        Console.WriteLine("output=\n" + engine.Infer(ytest));
    }
    else
        Console.WriteLine("This model has a nonconjugate factor, and therefore cannot use Gibbs sampling");
}

public void BayesPointMachine(double[] incomes, double[] ages, Variable<Vector> w, VariableArray<double> y)
{
    // Create x vector, augmented by 1
    Range j = y.Range.Named("person");
    Vector[] xdata = new Vector[incomes.Length];
    for (int i = 0; i < xdata.Length; i++)
        xdata[i] = Vector.FromArray(incomes[i], ages[i], 1);
    VariableArray<Vector> x = Variable.Observed(xdata, j).Named("x");

    // Bayes Point Machine
    double noise = 0.1;
    y[j] = Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(w, x[j]).Named("innerProduct"), noise);
}
So if I want to map the discrete values to a vector instead, would I do something like this in the BayesPointMachine() method?
for (int i = 0; i < xdata.Length; i++)
{
    Vector category = Vector.FromArray(new double[] { 0, 0, 1, 0 }); // Category 2 (Category is 0 based)
    Vector age = Vector.FromArray(new double[] { 0, 0, 0, 0, 0, 0, 0, 0, 1 }); // Age = 9
    xdata[i] = category + age;
}
Thanks
 Edited by PeterTurner87 Monday, February 11, 2013 11:26 AM
Monday, February 11, 2013 11:25 AM 
You could do that. But it is more efficient to create a zero Vector of the correct length (Vector x = Vector.Zero(9)) and then set the individual nonzero values (x[2] = 1; x[8] = 1).
If you are building an application here rather than just looking at toy problems, you should build up some feature infrastructure that maps from your raw records to feature vectors.
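To make the "zero vector, then set the nonzero entries" idea concrete, here is a small sketch of such a mapping in plain C#. Everything in it is an assumption for illustration: the FeatureSketch class, the vector layout (three category slots, then age buckets, then a constant 1) and the age bucket boundaries are invented and do not come from Infer.NET or the original example. The returned array could be passed to Vector.FromArray.

```csharp
using System;

public static class FeatureSketch
{
    // Assumed layout: [category 0..2 | age buckets | bias]
    const int NumCategories = 3;

    // Hypothetical bucket boundaries: <20, 20-29, 30-39, 40 and over.
    static readonly double[] AgeBounds = { 20, 30, 40 };

    // Number of age buckets is one more than the number of boundaries.
    public static int Length => NumCategories + AgeBounds.Length + 1 + 1;

    // Index of the age bucket that contains the given age.
    public static int AgeBucket(double age)
    {
        int b = 0;
        while (b < AgeBounds.Length && age >= AgeBounds[b]) b++;
        return b;
    }

    // Maps one raw record (category, age) to a feature vector.
    public static double[] Build(int category, double age)
    {
        if (category < 0 || category >= NumCategories)
            throw new ArgumentOutOfRangeException(nameof(category));
        double[] x = new double[Length];          // start from all zeros
        x[category] = 1.0;                        // 1-of-3 category code
        x[NumCategories + AgeBucket(age)] = 1.0;  // 1-of-4 age bucket code
        x[Length - 1] = 1.0;                      // constant term, as in the original "augmented by 1"
        return x;
    }

    public static void Main()
    {
        // Raw data from the code posted earlier in this thread.
        int[] category = { 2, 1, 0, 2, 1, 0 };
        double[] ages = { 38, 23, 40, 27, 18, 40 };
        for (int i = 0; i < category.Length; i++)
            Console.WriteLine(string.Join(" ", Build(category[i], ages[i])));
    }
}
```

With this layout each record has exactly three nonzero entries, and the prior over w would need dimension 8 rather than 3.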
If you have a lot of features and/or feature buckets, you are better off using the sparse version of the BPM.
Tuesday, February 12, 2013 9:05 AM Owner