Bayesian Linear Regression (Migrated from community.research.microsoft.com)

jlopes posted on 05-17-2009 10:07 PM
Hi,
I've been trying to do some Bayesian linear regression as a first trial, but without much success:
double[,] data = new double[,] { {1,3}, {1,2.1}, {1,1.3}, {1,0.5}, {1,1.2}, {1,3.3}, {1,4.4}, {1,5.5} };
Range rows= new Range(data.GetLength(0));
Range columns = new Range(data.GetLength(1));
Variable<Matrix> x = Variable.Constant<Matrix>(new Matrix(data)).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(new Vector(new double[]{0,0}),PositiveDefiniteMatrix.Identity(2)).Named("w");
Variable<Vector> yVar = Variable.MatrixTimesVector(x, w).Named("y");
yVar.ObservedValue = new Vector(30, 45, 40, 80, 70, 100, 130, 110);
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
I suspect that it doesn't work because Variable.MatrixTimesVector is not implemented yet. What would be the best way to solve this problem? Implement MatrixTimesVector, quit because it won't work at all because of..., or try to find a way around?
Thanks,
Joao

minka replied on 05-19-2009 1:31 AM
The problem here is not with Infer.NET, but with the Variational Message Passing algorithm and the particular model being used here. This is not really a standard linear regression model, since normally you would add Gaussian noise before observing the product. Here there is no noise, and that is the source of the problem. You are directly observing the product of two variables, which VMP cannot support. It is not a case of Infer.NET being incomplete. VMP simply does not handle the case when a derived variable is observed. You will run into this limitation no matter how you rewrite the model. So, you should either use EP or change the model to have some additional noise.
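To make the distinction concrete, the two models can be written out as follows. This is only a sketch in notation that does not appear in the thread: $x_i$ is the i-th input row, $w$ the weight vector, and $\sigma^2$ an assumed noise variance.

```latex
% Model as posted: the product is observed directly, with no noise term
y_i = w^\top x_i, \qquad w \sim \mathcal{N}(0, I)

% Standard Bayesian linear regression: Gaussian noise is added before observing
y_i = w^\top x_i + \varepsilon_i, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2), \qquad
w \sim \mathcal{N}(0, I)
```

In the second form the observed variable is a Gaussian with a random mean rather than a derived (deterministic) variable, which is the change suggested above.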

minka replied on 05-19-2009 1:38 AM
You can get some insight into why VMP breaks down here by reading my paper "Divergence measures and message passing" (http://research.microsoft.com/enus/um/people/minka/papers/messagepassing/). As shown there, VMP will not represent the posterior distribution but simply pick one possible solution and put all probability mass there. This happens due to the zero-forcing nature of the divergence being minimized. Rather than have VMP return degenerate solutions, we opted to have Infer.NET throw an exception in these cases.
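For readers unfamiliar with the term, the divergence in question is the exclusive KL divergence. The symbols below are chosen for this note rather than taken from the thread: $q$ is the approximating distribution and $p(w \mid y)$ the true posterior.

```latex
\mathrm{KL}(q \,\|\, p) = \int q(w) \, \log \frac{q(w)}{p(w \mid y)} \, dw
```

The integrand is infinite wherever $q(w) > 0$ but $p(w \mid y) = 0$, so minimizing this quantity forces $q$ to zero in those regions. That is the "zero-forcing" behaviour referred to above.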

jwinn replied on 05-19-2009 10:40 AM
So to build on Tom's answer, here is a solution using Vectors and InnerProduct which adds Gaussian noise:
Vector[] data = new Vector[] { new Vector( 1.0, 3 ), new Vector( 1.0, 2.1 ), new Vector( 1.0, 1.3 ), new Vector ( 1.0, 0.5 ), new Vector( 1.0, 1.2 ), new Vector( 1.0, 3.3 ), new Vector( 1.0, 4.4 ), new Vector( 1.0, 5.5 ) };
Range rows= new Range(data.Length);
VariableArray<Vector> x = Variable.Constant(data, rows).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(new Vector(new double[] { 0, 0 }), PositiveDefiniteMatrix.Identity(2)).Named("w");
VariableArray<double> y = Variable.Array<double>(rows);
y[rows] = Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(x[rows], w),1.0);
y.ObservedValue = new double[] { 30, 45, 40, 80, 70, 100, 130, 110 };
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
Console.WriteLine("Posterior over the weights: "+Environment.NewLine+postW);
Best,
John W.
 Proposed as answer by Harsh Vathsangam Tuesday, August 23, 2011 11:37 PM
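As a sanity check on the numbers, this model is conjugate, so the posterior over the weights has a closed form: with prior w ~ N(0, I) and unit noise variance, the posterior precision is I + XᵀX and the posterior mean solves (I + XᵀX) m = Xᵀy. The sketch below computes that mean in plain C# without Infer.NET. The class and method names are invented for this note, and it assumes the same two-element feature layout (1, x) as the example above. The mean reported by the inference engine should agree closely with it.

```csharp
using System;

public static class ClosedFormCheck
{
    // Posterior mean of w = (bias, slope) for the model
    // y_i ~ N(w0 + w1 * x_i, 1) with prior w ~ N(0, I).
    public static double[] PosteriorMean(double[] x, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("x and y must have the same length.");

        // Posterior precision A = I + X^T X, where each row of X is (1, x_i).
        double a00 = 1.0 + x.Length, a01 = 0.0, a11 = 1.0;
        // Right-hand side b = X^T y.
        double b0 = 0.0, b1 = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            a01 += x[i];
            a11 += x[i] * x[i];
            b0 += y[i];
            b1 += x[i] * y[i];
        }

        // Solve A m = b using the explicit inverse of a 2x2 matrix.
        double det = a00 * a11 - a01 * a01;
        return new[]
        {
            (a11 * b0 - a01 * b1) / det,
            (a00 * b1 - a01 * b0) / det
        };
    }

    public static void Main()
    {
        double[] x = { 3, 2.1, 1.3, 0.5, 1.2, 3.3, 4.4, 5.5 };
        double[] y = { 30, 45, 40, 80, 70, 100, 130, 110 };
        double[] m = PosteriorMean(x, y);
        Console.WriteLine("bias = " + m[0] + ", slope = " + m[1]);
    }
}
```

The explicit 2x2 inverse keeps the check dependency-free; for more than two weights a linear solver would be the better choice.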

jlopes replied on 05-19-2009 6:58 PM
Many thanks. That's what I meant to do in the first place but I misconceived the model.
 Marked as answer by Microsoft ResearchOwner Friday, June 03, 2011 4:54 PM

Thanks for that example. I've been trying to figure out an elegant way to extend the data initialization to inputs with a very large number of dimensions. My motivation is that I have a CSV file which I load into an array of dimension > N x 4000. Manually entering these high-dimensional data points would take a lot of time. I was not able to figure out how to use Vector[] for this. I tried to extend the factor analysis and Bayes point machine examples but couldn't make much progress. Any help would be greatly appreciated! :)

Not sure exactly what question you are asking here; this seems like a C# question, which we would not typically address on this forum. But something like this should do the trick:
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Read one comma-separated row per line and convert each row to a Vector.
List<Vector> dataList = new List<Vector>();
using (StreamReader sr = new StreamReader("myFileName.csv"))
{
    string str;
    while ((str = sr.ReadLine()) != null)
    {
        double[] arr = str.Split(',').Select(s => double.Parse(s)).ToArray();
        dataList.Add(Vector.FromArray(arr));
    }
}
Vector[] data = dataList.ToArray();
John
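If the CSV has already been loaded into a rectangular double[,] array, as described in the question, a row-by-row conversion could look like the sketch below. `ToRows` is a hypothetical helper written for this note, not part of Infer.NET; it only splits the table into one double[] per row.

```csharp
using System;

public static class ArrayRows
{
    // Copies each row of a rectangular array into its own double[].
    public static double[][] ToRows(double[,] table)
    {
        int n = table.GetLength(0), d = table.GetLength(1);
        var rows = new double[n][];
        for (int i = 0; i < n; i++)
        {
            rows[i] = new double[d];
            for (int j = 0; j < d; j++)
                rows[i][j] = table[i, j];
        }
        return rows;
    }
}
```

Each row can then be passed to Vector.FromArray, as in the snippet above, for example `Vector[] data = Array.ConvertAll(ArrayRows.ToRows(table), r => Vector.FromArray(r));`.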


Would you tell me how this translates to the latest infer.net?
I tried this:
double[] input = { 3, 2.1, 1.3, 0.5, 1.2, 3.3, 4.4, 5.5};
Vector[] data = new Vector[input.Length];
for (int i = 0; i < data.Length; i++)
{
    data[i] = Vector.FromArray(input[i]);
}
Range rows = new Range(data.Length);
VariableArray<Vector> x = Variable.Constant(data, rows).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(Vector.Zero(2), PositiveDefiniteMatrix.Identity(2)).Named("w");
VariableArray<double> y = Variable.Array<double>(rows);
y[rows] = Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(x[rows], w), 1.0);
y.ObservedValue = new double[] { 30, 45, 40, 80, 70, 100, 130, 110 };
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
Console.WriteLine("Posterior over the weights: " + Environment.NewLine + postW);
For me this results in: 'MicrosoftResearch.Infer.Utils.AssertFailedException' occurred in Infer.Runtime.dll
Many thanks,
Mirko

Hi Mirko
Your weights are a Vector variable of length 2, but your inputs are vectors of length 1; this mismatch is what throws the runtime exception. For example:
data[i] = Vector.FromArray(input[i], 1.0);
will add a bias input to your feature vectors whose length will then match the weights.
John
 Edited by John Guiver (Microsoft employee, Owner) Monday, October 03, 2011 2:22 PM
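A mismatch like this can also be caught before inference runs. The sketch below works on plain arrays and uses two hypothetical helpers, `WithBias` and `CheckLengths`, whose names are invented for this note and are not part of Infer.NET. The first mirrors the bias fix suggested above; the second fails early with a readable message if a feature row does not match the weight dimension.

```csharp
using System;

public static class FeatureRows
{
    // Appends a constant 1.0 bias input to each scalar input,
    // mirroring Vector.FromArray(input[i], 1.0) from the reply above.
    public static double[][] WithBias(double[] input)
    {
        var rows = new double[input.Length][];
        for (int i = 0; i < input.Length; i++)
            rows[i] = new[] { input[i], 1.0 };
        return rows;
    }

    // Throws if any feature row does not match the weight dimension.
    public static void CheckLengths(double[][] rows, int weightDim)
    {
        for (int i = 0; i < rows.Length; i++)
        {
            if (rows[i].Length != weightDim)
                throw new ArgumentException(
                    "Row " + i + " has length " + rows[i].Length +
                    " but the weight vector has length " + weightDim + ".");
        }
    }
}
```

Calling `FeatureRows.CheckLengths(rows, 2)` before converting the rows to Vectors would have flagged the length-1 inputs in the code posted above.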



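A VectorGaussian stores a full covariance matrix, so it can represent correlations between the weights. A VariableArray of scalar weights does not: its posterior is a separate Gaussian for each element. So you will get better results with the VectorGaussian.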