# Specifying large Conditional Probability Tables of discrete events (Migrated from community.research.microsoft.com)

### Question

• freddycct posted on 08-11-2009 9:41 AM

Suppose I have 3 random variables x, y and z, each of which can take 10 possible states. Let x depend on y and z, so that I need to specify the conditional probability table P(x | y, z).

If I specify the conditional probabilities using the "using Variable.if(case) .... " statements, then I need to type a lot of statements.

I understand there is an ifblock to use, but there's no simple example to illustrate how to use it.

Is there any help on this, or a simple workaround?

Friday, June 3, 2011 5:13 PM

• jwinn replied on 08-12-2009 5:57 AM

You should be aware that some care is needed when working with large tabular conditional probabilities such as the one you are proposing. This table involves 10x10x10 = 1000 parameters. Learning such a large set of parameters will require a huge amount of data: you will need to see every combination of the states of x, y and z multiple times to get a good estimate of the probability P(x|y,z).

The problem is that you are assuming nothing about the underlying relationship between x, y and z, so that P(x|y=1,z=1) could be utterly different from P(x|y=2,z=1) or P(x|y=1,z=2). In real applications the conditionals are rarely that unrelated. I don't know your application, but suppose x, y and z are discretisations of continuous variables into ten bins. You would then expect P(x|y,z) to vary smoothly across x and also to vary smoothly as y and z change. This smoothness is lost if you use a tabular conditional probability. Infer.NET provides lots of other kinds of factor which can be used to represent more specific relationships between x, y and z. For example, if x, y and z are discretisations of continuous variables which are linearly related, then you can directly represent them as continuous variables and specify arithmetic relationships between them, e.g. x = Ay + Bz + C, and learn A, B and C - just three parameters instead of 1000.
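
To make the counting concrete, here is a short worked comparison. It is only a sketch: it assumes all three variables have 10 states, and it writes |X|, |Y|, |Z| for the number of states of each variable (notation introduced here, not used elsewhere in the thread). The table has 1000 entries, of which somewhat fewer are free, since each conditional distribution over x must sum to 1.

```latex
\begin{align*}
\text{table entries}   &= |Y|\,|Z|\,|X|       = 10 \cdot 10 \cdot 10 = 1000 \\
\text{free parameters} &= |Y|\,|Z|\,(|X| - 1) = 10 \cdot 10 \cdot 9  = 900  \\
\text{linear model } x = Ay + Bz + C &: \quad 3 \text{ parameters } (A,\ B,\ C)
\end{align*}
```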

I hope this makes sense and explains why we have many other kinds of relationships available in Infer.NET as well as conditional probability tables.

Best,

John W.

Friday, June 3, 2011 5:13 PM

### All replies

• John Guiver replied on 08-11-2009 12:29 PM

Useful question. I don't think we have example code in the user guide, but here is a succinct way to do it using the Variable.Switch statement. I have shown this for 2x2x2, but the approach is the same for any number of states:

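// Assumes the Infer.NET modelling and maths namespaces are imported;
// their names differ between releases.
// cpt[y, z] holds the distribution P(x | y, z).
// Note: newer releases construct vectors with Vector.FromArray(0.1, 0.9).
Vector[,] cpt =
{
    { new Vector(0.1, 0.9), new Vector(0.3, 0.7) },
    { new Vector(0.5, 0.5), new Vector(0.4, 0.6) }
};

var x = Variable.New<int>();
Range yRange = new Range(cpt.GetLength(0));
Range zRange = new Range(cpt.GetLength(1));

// Uniform priors over the parents y and z.
var y = Variable.DiscreteUniform(yRange);
var z = Variable.DiscreteUniform(zRange);

// The table is supplied as an observed 2-D array of probability vectors.
var probs = Variable.Array<Vector>(yRange, zRange);
probs.ObservedValue = cpt;

// Switch on y and z so that x is drawn from the row selected by their values.
using (Variable.Switch(y))
{
    using (Variable.Switch(z))
    {
        x.SetTo(Variable.Discrete(probs[y, z]));
    }
}

var engine = new InferenceEngine();
var xpost = engine.Infer(x);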
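
For a larger table, the cpt array above need not be typed as nested literals. The following is a minimal sketch in plain C#, not part of Infer.NET: it assumes the probabilities are already available as one flat array ordered by y, then z, then x (an assumed layout), and the names CptBuilder and FromFlat are hypothetical helpers introduced only for illustration. It reshapes the flat array into one row per (y, z) pair and checks that each row sums to 1.

```csharp
using System;

public static class CptBuilder
{
    // Hypothetical helper: reshape a flat list of probabilities, ordered by
    // y, then z, then x, into rows indexed by [y, z].
    // Each row holds P(x | y, z) over xCount states and must sum to 1.
    public static double[,][] FromFlat(double[] flat, int yCount, int zCount, int xCount)
    {
        if (flat.Length != yCount * zCount * xCount)
            throw new ArgumentException("Unexpected number of probabilities.");

        var rows = new double[yCount, zCount][];
        for (int y = 0; y < yCount; y++)
        {
            for (int z = 0; z < zCount; z++)
            {
                // Copy the xCount entries belonging to this (y, z) pair.
                var row = new double[xCount];
                Array.Copy(flat, (y * zCount + z) * xCount, row, 0, xCount);

                // Reject rows that are not valid probability distributions.
                double sum = 0;
                foreach (double p in row) sum += p;
                if (Math.Abs(sum - 1.0) > 1e-9)
                    throw new ArgumentException($"Row [{y},{z}] sums to {sum}, not 1.");

                rows[y, z] = row;
            }
        }
        return rows;
    }
}
```

Each row could then be wrapped in a Vector (for example with Vector.FromArray(rows[y, z]) in newer releases) inside a similar double loop to fill the Vector[,] that is assigned to probs.ObservedValue.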

Friday, June 3, 2011 5:13 PM
• freddycct replied on 08-12-2009 6:48 AM

Thanks for all the replies. I exaggerated the problem a bit. The conditional table I have is P(x | y, z), where x and y have 5 states and z has 2 states. So I have a total of 5x5x2 = 50 parameters, which I think is still a hassle to type. I am still exploring Bayesian networks, so I am modeling my application using discrete variables instead of continuous variables.

Friday, June 3, 2011 5:13 PM