# Modelling a Bayesian Net - Help, please! (Migrated from community.research.microsoft.com)

### Question

• Valmir Meneses posted on 06-14-2010 4:00 PM

Greetings,

I am trying to model the example below, but have not been successful. I cannot figure out which kind of inference the model needs.

I have tried defining Ontime, MajorChange and MajorChangeOntime as in the two coins sample, but it does not work. Can anyone provide assistance, please?

A numerical example

Assume our project manager has historical data on projects that have been carried out within the organisation. On analyzing the data, the PM finds that 60% of all projects finished on time. This implies:

$P(T) = 0.6$……(13),

and, writing $T'$ for a project that did not finish on time,

$P(T') = 0.4$……(14).

Let us assume that our organisation also tracks major changes made to projects in progress. Say 50% of all historical projects are found to have major changes. This implies:

$P(C) = 0.5$……(15).

Finally, let us assume that our project manager has access to detailed data on successful projects, and that an analysis of this data shows that 30% of on-time projects have undergone at least one major scope change. This gives:

$P(C|T) = 0.3$……(16).

Equations (13) through (16) give us the numbers we need to calculate $P(T|C)$ using Bayes' theorem. Plugging the numbers into equation (11), we get:

$P(T|C)=\displaystyle\frac{0.3 \times 0.6}{0.5}=0.36$……(17)
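Equation (11) is not reproduced in this post. Assuming it is the standard statement of Bayes' theorem, the substitution above corresponds to the following, where $T'$ (project not on time) is notation assumed here:

$$P(T|C) = \frac{P(C|T)\,P(T)}{P(C)}, \qquad P(C) = P(C|T)\,P(T) + P(C|T')\,P(T')$$

With $P(C|T) = 0.3$, $P(T) = 0.6$ and $P(C) = 0.5$, the first expression reduces to $0.18 / 0.5 = 0.36$.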

Friday, June 3, 2011 5:48 PM

### All replies

• DavidKnowles replied on 06-16-2010 4:17 AM

The example you gave isn't in a very straightforward form for Infer.NET to solve because P(C|T') is not specified. However, we can calculate it easily using the numbers you do give:

$P(C) = P(C|T)P(T) + P(C|T')P(T')$

$\Rightarrow P(C|T') = \displaystyle\frac{P(C) - P(C|T)P(T)}{P(T')} = \frac{0.5 - 0.3 \times 0.6}{0.4} = 0.8$

Now we can solve this problem using Infer.NET:

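    // Namespaces: MicrosoftResearch.Infer and MicrosoftResearch.Infer.Models
    // (Microsoft.ML.Probabilistic.* in the later open-source releases).
    var t = Variable.Bernoulli(.6);            // prior P(T) = 0.6
    Variable<bool> c = Variable.New<bool>();
    using (Variable.If(t))
    {
        c.SetTo(Variable.Bernoulli(.3));       // P(C|T) = 0.3
    }
    using (Variable.IfNot(t))
    {
        c.SetTo(Variable.Bernoulli(.8));       // P(C|T') = 0.8
    }
    c.ObservedValue = true;                    // condition on a major change
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(T|C)=" + ie.Infer(t));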
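As a sanity check on the Infer.NET result, the same posterior can be computed by direct enumeration in plain C#, with no inference library involved. This is only a sketch: the names `TwoNodeCheck` and `PosteriorT` are hypothetical and not part of Infer.NET, and the numbers are those quoted in the thread.

```csharp
using System;

static class TwoNodeCheck
{
    // Posterior P(T | C = true) for a two-node net T -> C, by enumerating T.
    public static double PosteriorT(double pT, double pCGivenT, double pCGivenNotT)
    {
        double jointT = pCGivenT * pT;                   // P(C, T)
        double jointNotT = pCGivenNotT * (1.0 - pT);     // P(C, T')
        return jointT / (jointT + jointNotT);            // normalise by P(C)
    }

    static void Main()
    {
        // Thread values: P(T) = 0.6, P(C|T) = 0.3, P(C|T') = 0.8.
        Console.WriteLine(PosteriorT(0.6, 0.3, 0.8));
    }
}
```

The printed value should agree with the 0.36 obtained from Bayes' theorem in the question.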

• John Guiver replied on 06-16-2010 6:10 AM

Hi Valmir

When you define branches in Infer.NET that are conditioned on a random variable (in your case T), you must give code for all values of the condition. Here the condition is a boolean random variable, so you must define both the If and IfNot paths, i.e. P(C|T) and P(C|not T). You have given P(C|T) and P(C), but P(C|not T) follows easily from your data, or from the identity P(C) = P(C|T)P(T) + P(C|not T)P(not T), which gives P(C|not T) = 0.8. The model then looks as follows and gives the correct answer.

    var T = Variable.Bernoulli(0.6);
    var C = Variable.New<bool>();
    using (Variable.If(T))
        C.SetTo(Variable.Bernoulli(0.3));
    using (Variable.IfNot(T))
        C.SetTo(Variable.Bernoulli(0.8));
    C.ObservedValue = true;
    var engine = new InferenceEngine();
    Console.WriteLine(engine.Infer<Bernoulli>(T).GetProbTrue());

John

• Valmir Meneses replied on 06-17-2010 1:52 PM

Thank you very much.

I was looking at the endpoint (Infer.NET) when I should have been looking at the starting point (Bayes' theorem).

I have implemented the code and it looks fine.

I will probably need some help with a two random variables model, but I will give it a try first.

Thanks,

• Valmir Meneses replied on 06-17-2010 2:30 PM

The following article describes a model with two random variables.
How can I model this? Where does the AND clause from the "two coins" example apply here?

Khodakarami, V., Fenton, N., Neil, M.
"Project Scheduling: Improved Approach to Incorporate Uncertainty Using Bayesian Networks."
Project Management Journal, 38(2): 39-49, Page 42
Available at: http://www.agenarisk.com/resources/technology_articles/vahid.pdf
pages 10-11 from the above PDF

Suppose that, in addition to the sub-contract delay, the project manager has noticed that the ‘staff quality’ also has a direct influence on the task’s duration and therefore on its delay. Now there are two independent variables that influence another variable.

‘Sub-contract’ and ‘Staff Quality’ are nodes that are parents of ‘Delay in Task’. Each node has a set of possible states (e.g. ‘on time’ and ‘late’ for the sub-contract node). Attached to each node, there is a ‘Node Probability Table’ (NPT). The NPT can be a prior probability (e.g. ‘Staff Quality’) or a conditional probability given the states of its parents (e.g. ‘Delay in Task’). The NPT values can be assessed by prior knowledge (subjective estimation or expert judgment), empirical data, or a combination of both.

Sub-Contract (SC)
On Time  Probability=0.95
not On Time Probability=0.05 (1-On Time Probability)

Staff Quality (SQ)
Good   Probability=0.70
Poor  Probability=0.30 (1-Good Probability)

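Delay in Task (D), given Sub-contract (SC) and Staff Quality (SQ):

| Delay in Task | SC on time, SQ good | SC on time, SQ poor | SC late, SQ good | SC late, SQ poor |
|---|---|---|---|---|
| False | 0.95 | 0.7 | 0.7 | 0.01 |
| True | 0.05 | 0.3 | 0.3 | 0.99 |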

Notation:

D : Delay in Task
(d1 : D is 'false', d2 : D is 'true')

SC : Sub-contract
(sc1 : SC is 'late', sc2 : SC is 'on time')

SQ : Staff Quality
(sq1 : SQ is 'good', sq2 : SQ is 'poor')

According to the chain rule the joint probability distribution is:

P(D, SC, SQ) = P(D | SC, SQ) * P(SC) * P(SQ)

P(Delay in Task is true) = 0.95 × 0.7 × 0.95 + 0.7 × 0.3 × 0.95 + 0.7 × 0.7 × 0.05 + 0.01 × 0.3 × 0.05
P(Delay in Task is true) = 0.8559

P(Delay in Task is true | Sub-contract is late):
P(D, SC, SQ | sc2) = P(D | sc2, SQ) * P(sc2) * P(SQ)
P(D is true | SC is late) = 0.7 × 0.7 + 0.01 × 0.3
P(D is true | SC is late) = 0.493

• DavidKnowles replied on 06-21-2010 4:31 AM

Hi Valmir,

I think your calculation for P(Delay in Task is true) is incorrect. You should be calculating

P(D) = Sum_SC Sum_SQ P(D | SC, SQ) * P(SC) * P(SQ)

as shown in this table:

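P(D=True|SC,SQ), with the prior of each parent state in parentheses:

| | Subcontract T (0.95) | Subcontract F (0.05) |
|---|---|---|
| Quality T (0.7) | 0.05 | 0.3 |
| Quality F (0.3) | 0.3 | 0.99 |

Each entry multiplied by its P(SC) and P(SQ):

| | Subcontract T | Subcontract F |
|---|---|---|
| Quality T | 0.03325 | 0.0105 |
| Quality F | 0.0855 | 0.01485 |

P(D=True) = 0.1441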
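Written out term by term, the double sum over the four parent combinations is as follows (a worked restatement of the numbers in this reply, with the factors ordered as $P(D|SC,SQ) \times P(SC) \times P(SQ)$):

$$\begin{aligned}
P(D=\text{true}) &= 0.05 \times 0.95 \times 0.7 + 0.3 \times 0.95 \times 0.3 + 0.3 \times 0.05 \times 0.7 + 0.99 \times 0.05 \times 0.3 \\
&= 0.03325 + 0.0855 + 0.0105 + 0.01485 \\
&= 0.1441
\end{aligned}$$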

You can achieve this calculation in Infer.NET as follows:

    var SubContractOnTime = Variable.Bernoulli(.95);   // P(SC on time)
    var StaffQualityIsGood = Variable.Bernoulli(.7);   // P(SQ good)
    var Delay = Variable.New<bool>();
    // One branch per parent combination; each sets P(Delay = true | SC, SQ).
    using (Variable.If(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.05));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
    }
    using (Variable.IfNot(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.99));
    }
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(Delay)=" + ie.Infer(Delay));

This gives the correct answer. By my calculations, P(D is true | SC is late) = 0.507, which you can also calculate in Infer.NET using the above code but setting

    SubContractOnTime.ObservedValue = false;

before inference.
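Both figures in this reply can be cross-checked without Infer.NET by enumerating the four parent states directly. The following is a minimal sketch in plain C#; `DelayCheck` and its methods are hypothetical names introduced for illustration, and the probabilities are copied from the model above.

```csharp
using System;

static class DelayCheck
{
    // P(Delay = true | SubContractOnTime, StaffQualityIsGood), as in the model above.
    public static double PDelay(bool onTime, bool good)
    {
        if (onTime) return good ? 0.05 : 0.3;
        return good ? 0.3 : 0.99;
    }

    // Marginal P(Delay = true): sum over both parents, weighted by their priors.
    public static double MarginalDelay(double pOnTime, double pGood)
    {
        double total = 0.0;
        foreach (bool onTime in new[] { true, false })
            foreach (bool good in new[] { true, false })
                total += PDelay(onTime, good)
                         * (onTime ? pOnTime : 1.0 - pOnTime)
                         * (good ? pGood : 1.0 - pGood);
        return total;
    }

    // P(Delay = true | sub-contract late): fix onTime = false and sum over
    // staff quality only, since the two parents are independent.
    public static double DelayGivenLate(double pGood)
    {
        return PDelay(false, true) * pGood + PDelay(false, false) * (1.0 - pGood);
    }

    static void Main()
    {
        Console.WriteLine(MarginalDelay(0.95, 0.7));   // compare with 0.1441
        Console.WriteLine(DelayGivenLate(0.7));        // compare with 0.507
    }
}
```

Fixing `onTime = false` in the enumeration plays the same role as setting `SubContractOnTime.ObservedValue = false` in the Infer.NET model.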

• Valmir Meneses replied on 06-21-2010 9:41 AM

Greetings David,

I am thankful for your help. I cannot judge whether the formula is wrong; I was following the article as-is. The article does mention the sum formula, which I did not copy into the forum post, but it is available in the article itself.
