# Bayesian Linear Regression (Migrated from community.research.microsoft.com)

### Question

• jlopes posted on 05-17-2009 10:07 PM

Hi,

I've been trying to do some Bayesian linear regression as a first trial, but without much success:

double[,] data = new double[,] { {1,-3}, {1,-2.1}, {1,-1.3}, {1,0.5}, {1,1.2}, {1,3.3}, {1,4.4}, {1,5.5} };
Range rows= new Range(data.GetLength(0));
Range columns = new Range(data.GetLength(1));
Variable<Matrix> x = Variable.Constant<Matrix>(new Matrix(data)).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(new Vector(new double[]{0,0}),PositiveDefiniteMatrix.Identity(2)).Named("w");
Variable<Vector> yVar = Variable.MatrixTimesVector(x, w).Named("y");
yVar.ObservedValue = new Vector(30, 45, 40, 80, 70, 100, 130, 110);
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);

I suspect that it doesn't work because Variable.MatrixTimesVector is not implemented yet. What would be the best way to solve this problem: implement MatrixTimesVector, quit because it won't work at all because of..., or try to find a way around?

Thanks,

Joao

Friday, June 3, 2011 4:54 PM

### All replies

• jlopes replied on 05-18-2009 7:31 AM

I tried to do it as a series of inner products of vectors, but that is also not supported. :)

Guess I will have no choice?

Friday, June 3, 2011 4:54 PM
• minka replied on 05-19-2009 1:31 AM

The problem here is not with Infer.NET, but with the Variational Message Passing algorithm and the particular model being used here.  This is not really a standard linear regression model, since normally you would add Gaussian noise before observing the product.  Here there is no noise, and that is the source of the problem.  You are directly observing the product of two variables, which VMP cannot support. It is not a case of Infer.NET being incomplete.  VMP simply does not handle the case when a derived variable is observed.  You will run into this limitation no matter how you rewrite the model.  So, you should either use EP or change the model to have some additional noise.
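To make the distinction concrete, the two models can be written out as below. The notation is introduced here for illustration and does not appear in the original posts: $x_i$ is the $i$-th row of the data matrix, $y_i$ the $i$-th observation, and $\sigma^2$ an assumed noise variance.

$$
\text{model as posted:}\quad w \sim \mathcal{N}(0, I), \qquad y_i = w^\top x_i \ \text{(observed exactly)}
$$

$$
\text{standard regression:}\quad w \sim \mathcal{N}(0, I), \qquad y_i \sim \mathcal{N}(w^\top x_i,\ \sigma^2)
$$

In the first form every observation is a hard constraint on $w$. With eight observations and two weights the constraints are overdetermined, and for this particular data set the points are not collinear, so no single $w$ satisfies all of them. The noise term in the second form is what lets the observations be reconciled.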

Friday, June 3, 2011 4:54 PM
• minka replied on 05-19-2009 1:38 AM

You can get some insight into why VMP breaks down here by reading my paper "Divergence measures and message passing" (http://research.microsoft.com/en-us/um/people/minka/papers/message-passing/).  As shown there, VMP will not represent the posterior distribution but simply pick one possible solution and put all probability mass there.  This happens due to the zero-forcing nature of the divergence being minimized.  Rather than have VMP return degenerate solutions, we opted to have Infer.NET throw an exception in these cases.
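As a rough sketch of the "zero-forcing" remark, in notation assumed here rather than quoted from the paper: VMP chooses the approximation $q$ by minimizing

$$
\mathrm{KL}(q \,\|\, p) = \int q(w)\, \log \frac{q(w)}{p(w \mid y)}\, dw .
$$

This quantity is infinite unless $q(w) = 0$ wherever $p(w \mid y) = 0$. When the product is observed without noise, the posterior has no mass outside the set of $w$ consistent with the observations, so $q$ is pushed to collapse onto that set rather than spread over it.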

Friday, June 3, 2011 4:54 PM
• jwinn replied on 05-19-2009 10:40 AM

So to build on Tom's answer, here is a solution using Vectors and InnerProduct which adds Gaussian noise:

Vector[] data = new Vector[] { new Vector( 1.0, -3 ), new Vector( 1.0, -2.1 ), new Vector( 1.0, -1.3 ), new Vector ( 1.0, 0.5 ), new Vector( 1.0, 1.2 ), new Vector( 1.0, 3.3 ), new Vector( 1.0, 4.4 ), new Vector( 1.0, 5.5 ) };
Range rows= new Range(data.Length);
VariableArray<Vector> x = Variable.Constant(data, rows).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(new Vector(new double[] { 0, 0 }), PositiveDefiniteMatrix.Identity(2)).Named("w");
VariableArray<double> y = Variable.Array<double>(rows);
y[rows] =
Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(x[rows], w),1.0);
y.ObservedValue = new double[] { 30, 45, 40, 80, 70, 100, 130, 110 };
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
Console.WriteLine("Posterior over the weights: "+Environment.NewLine+postW);
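
Because the noise variance is fixed and the prior is Gaussian, this model also has a closed-form posterior, which gives a way to sanity-check the output. The following is a minimal sketch in plain C# with no Infer.NET dependency. The class `ClosedFormCheck` and its `Posterior` method are hypothetical names made up for this illustration; they are not part of Infer.NET. It assumes feature vectors of the form (1, t), an identity prior precision, and a known noise variance.

```csharp
using System;

public static class ClosedFormCheck
{
    // Hypothetical helper (not an Infer.NET API).
    // Returns { mean0, mean1, var00, var01, var11 } of the Gaussian posterior
    // over w for features (1, t[i]), prior N(0, I) and known noise variance.
    public static double[] Posterior(double[] t, double[] y, double noiseVariance)
    {
        // Posterior precision starts at the prior precision (identity).
        double p00 = 1.0, p01 = 0.0, p11 = 1.0;
        // b accumulates X^T y / noiseVariance.
        double b0 = 0.0, b1 = 0.0;
        for (int i = 0; i < t.Length; i++)
        {
            p00 += 1.0 / noiseVariance;
            p01 += t[i] / noiseVariance;
            p11 += t[i] * t[i] / noiseVariance;
            b0 += y[i] / noiseVariance;
            b1 += t[i] * y[i] / noiseVariance;
        }
        // Invert the 2x2 precision matrix to get the covariance.
        double det = p00 * p11 - p01 * p01;
        double v00 = p11 / det, v01 = -p01 / det, v11 = p00 / det;
        // Posterior mean = covariance * b.
        double m0 = v00 * b0 + v01 * b1;
        double m1 = v01 * b0 + v11 * b1;
        return new[] { m0, m1, v00, v01, v11 };
    }

    public static void Main()
    {
        double[] t = { -3, -2.1, -1.3, 0.5, 1.2, 3.3, 4.4, 5.5 };
        double[] y = { 30, 45, 40, 80, 70, 100, 130, 110 };
        double[] p = Posterior(t, y, 1.0);
        Console.WriteLine("mean = ({0:F3}, {1:F3})", p[0], p[1]);
        Console.WriteLine("covariance = [[{0:F4}, {1:F4}], [{1:F4}, {2:F4}]]", p[2], p[3], p[4]);
    }
}
```

For the data in this thread the mean comes out at roughly (56.2, 11.7). Since `w` is the only unobserved variable in the model, the posterior reported by the inference engine would be expected to be close to these values.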

Best,
John W.

Friday, June 3, 2011 4:54 PM
• jlopes replied on 05-19-2009 6:58 PM

Many thanks. That's what I meant to do in the first place, but I misconceived the model.

Friday, June 3, 2011 4:54 PM
• Thanks for that example. I've been trying to figure out how to extend the data initialization, in an elegant fashion, to cases where the input dimension is very large. My motivation is that I have a CSV file which I load into an array of dimension > Nx4000. Manually entering these high-dimensional data points would take a lot of time. I was not able to figure out how to use Vector[] for this. I tried to extend the factor analysis and Bayes point machine examples but couldn't make much progress. Any help would be greatly appreciated! :)
Tuesday, August 23, 2011 11:37 PM
• Not sure exactly what question you are asking here. This seems like a C# question, which we would not typically address on this forum. But something like this should do the trick:

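```
// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
// and using MicrosoftResearch.Infer.Maths; (for Vector).
List<Vector> dataList = new List<Vector>();
// "data.csv" is a placeholder path; substitute your own file.
using (StreamReader sr = new StreamReader("data.csv"))
{
    string str;
    while ((str = sr.ReadLine()) != null)
    {
        // Parse one comma-separated row into a feature vector.
        double[] arr = str.Split(',').Select(s => double.Parse(s)).ToArray();
        dataList.Add(Vector.FromArray(arr));
    }
}
Vector[] data = dataList.ToArray();
```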

John

Wednesday, August 31, 2011 3:22 PM
• Many thanks for this, it worked. Sorry for posting a rather basic question.
Thursday, September 8, 2011 11:20 PM
• Would you tell me how this translates to the latest Infer.NET?

I tried this:

```
double[] input = { -3, -2.1, -1.3, 0.5, 1.2, 3.3, 4.4, 5.5};
Vector[] data = new Vector[input.Length];
for (int i = 0; i < data.Length; i++)
{
data[i] = Vector.FromArray(input[i]);
}
Range rows = new Range(data.Length);
VariableArray<Vector> x = Variable.Constant(data, rows).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(Vector.Zero(2), PositiveDefiniteMatrix.Identity(2)).Named("w");
VariableArray<double> y = Variable.Array<double>(rows);
y[rows] = Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(x[rows], w), 1.0);
y.ObservedValue = new double[] { 30, 45, 40, 80, 70, 100, 130, 110 };
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
Console.WriteLine("Posterior over the weights: " + Environment.NewLine + postW);
```

For me this results in: 'MicrosoftResearch.Infer.Utils.AssertFailedException' occurred in Infer.Runtime.dll

Many thanks,

Mirko

Monday, October 3, 2011 2:00 PM
• Hi Mirko

Your weights are Vector variables of length 2, but your inputs are vectors of length one, and this throws the runtime exception. For example:

data[i] = Vector.FromArray(input[i], 1.0);

will add a bias input to your feature vectors whose length will then match the weights.
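
Written out, with $t_i$ standing for `input[i]` (a symbol introduced here for illustration), the feature vector becomes $x_i = (t_i,\ 1)$ and the inner product is

$$
w^\top x_i = w_1\, t_i + w_2 \cdot 1 ,
$$

so with this ordering $w_1$ acts as the slope and $w_2$ as the intercept. The earlier example in this thread used the ordering $(1,\ t_i)$, in which case the roles of the two weights are swapped.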

John

Monday, October 3, 2011 2:20 PM
• Thank you very much, John. The exception is gone and I've realised I haven't quite understood the code just yet :-)
Monday, October 3, 2011 2:42 PM
• Would anyone please explain why a multivariate posterior is defined for w?

What if w is defined using a VariableArray, with each element having a Gaussian distribution?

(then using Variable.Sum(w * data[i]) instead of innerproduct)

Tuesday, January 20, 2015 5:18 PM
• A VectorGaussian stores a full covariance matrix.  A VariableArray does not.  So you will get better results with the VectorGaussian.
Tuesday, January 20, 2015 5:42 PM
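
One way to picture the difference, assuming the noisy model with fixed noise variance so that the exact posterior over $w$ is Gaussian with mean $m$ and precision matrix $\Lambda$ (symbols introduced here for illustration):

$$
q_{\text{full}}(w) = \mathcal{N}\!\left(w;\ m,\ \Lambda^{-1}\right), \qquad
q_{\text{fact}}(w) = \prod_j \mathcal{N}\!\left(w_j;\ m_j,\ 1/\Lambda_{jj}\right)
$$

The factorized form is what a mean-field method such as VMP would be expected to converge to when each weight is a separate Gaussian variable. The means agree, but the correlations between weights are dropped, and each variance $1/\Lambda_{jj}$ is never larger than the exact marginal variance $(\Lambda^{-1})_{jj}$, so the factorized posterior tends to look more certain than it should.
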
• Hi All,

I am new to Bayesian inference, although I do have some basic understanding of probabilistic graphical models.

Can anyone please share references to a tutorial or introductory paper on the Bayesian linear regression being discussed in this post?
