Modelling a Bayesian Net - Help, please! (Migrated from community.research.microsoft.com)
Question

Valmir Meneses posted on 06-14-2010 4:00 PM
Greetings,
I am trying to model the example below, but have not been successful. I cannot figure out which kind of inference the model needs.
I have tried defining Ontime, MajorChange and MajorChangeOntime as in the two coins sample, but it does not work. Can anyone provide assistance, please?
example ( http://eight2late.wordpress.com/2010/03/11/bayestheoremforprojectmanagers/)
A numerical example
Assume our project manager has historical data on projects that have been carried out within the organisation. On analyzing the data, the PM finds that 60% of all projects finished on time. This implies:
P(T) = 0.6 ……(13),
and
P(T') = 0.4 ……(14),
Let us assume that our organisation also tracks major changes made to projects in progress. Say 50% of all historical projects are found to have major changes. This implies:
P(C) = 0.5 ……(15).
Finally, let us assume that our project manager has access to detailed data on successful projects, and that an analysis of this data shows that 30% of on-time projects have undergone at least one major scope change. This gives:
P(C|T) = 0.3 ……(16).
Equations (13) through (16) give us the numbers we need to calculate P(T|C) using Bayes' Theorem. Plugging the numbers into equation (11), we get:
P(T|C) = P(C|T) P(T) / P(C) = (0.3 × 0.6) / 0.5 = 0.36 ……(17)
Friday, June 3, 2011 5:48 PM
All replies

DavidKnowles replied on 06-16-2010 4:17 AM
The example you gave isn't in a very straightforward form for Infer.NET to solve because P(C|T') is not specified. However, we can calculate it easily using the numbers you do give:
P(C) = P(C|T)P(T) + P(C|T')P(T')
=> P(C|T') = (P(C) - P(C|T)P(T)) / P(T') = (.5 - .3*.6) / .4 = .8
Now we can solve this problem using Infer.NET:
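    // Assumes the Infer.NET namespaces for Variable, InferenceEngine and ExpectationPropagation are imported.
    var t = Variable.Bernoulli(.6);            // P(T) = 0.6
    Variable<bool> c = Variable.New<bool>();
    using (Variable.If(t))
    {
        c.SetTo(Variable.Bernoulli(.3));       // P(C|T) = 0.3
    }
    using (Variable.IfNot(t))
    {
        c.SetTo(Variable.Bernoulli(.8));       // P(C|T') = 0.8
    }
    c.ObservedValue = true;                    // condition on a major change having occurred
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(T|C)=" + ie.Infer(t));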
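As a hand check on what the model above should report, the posterior can be worked out directly from the two branch probabilities. This is a sketch of the arithmetic only, not Infer.NET output:

```latex
\begin{align*}
P(T \mid C) &= \frac{P(C \mid T)\,P(T)}{P(C \mid T)\,P(T) + P(C \mid T')\,P(T')} \\
            &= \frac{0.3 \times 0.6}{0.3 \times 0.6 + 0.8 \times 0.4}
             = \frac{0.18}{0.50} = 0.36
\end{align*}
```

So the distribution inferred for t should have a probability of true close to 0.36.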
John Guiver replied on 06-16-2010 6:10 AM
Hi Valmir
When you are defining branches in Infer.NET which are conditioned on a random variable (in your case T), you must give code for all values of the condition. In your case the condition is a boolean random variable, and you must define both the If and IfNot paths, i.e. P(C|T) and P(C|not T). You have given P(C|T) and P(C), but P(C|not T) easily follows from your data, or from the knowledge that P(C) = P(C|T)P(T) + P(C|not T)P(not T); this gives P(C|not T) = 0.8. The model then looks as follows, which gives the correct answer.
    var T = Variable.Bernoulli(0.6);
    var C = Variable.New<bool>();
    using (Variable.If(T))
        C.SetTo(Variable.Bernoulli(0.3));
    using (Variable.IfNot(T))
        C.SetTo(Variable.Bernoulli(0.8));
    C.ObservedValue = true;
    var engine = new InferenceEngine();
    Console.WriteLine(engine.Infer<Bernoulli>(T).GetProbTrue());

John
Valmir Meneses replied on 06-17-2010 1:52 PM
Thank you very much.
I was looking at the endpoint (Infer.NET) when I should have been looking at the starting point (Bayes' Theorem).
I have implemented the code and it looks fine.
I will probably need some help with a model with two random variables, but I will give it a try first.
Thanks,
Valmir Meneses replied on 06-17-2010 2:30 PM
The following article depicts a model with two random variables.
How can I model this? Where does the AND clause from the "two coins" example apply here?

Khodakarami, V., Fenton, N., Neil, M.
"Project Scheduling: Improved Approach to Incorporate Uncertainty Using Bayesian Networks."
Project Management Journal, 38(2): 39-49, page 42
Available at: http://www.agenarisk.com/resources/technology_articles/vahid.pdf

From pages 10-11 of the above PDF:

Suppose in addition to the subcontract delay, the project manager has noticed that the ‘staff quality’ also has a direct influence on the task’s duration and therefore on its delay. Now there are two independent variables that influence another variable.

‘Subcontract’ and ‘Staff Quality’ are nodes that are parents of ‘Delay in Task’. Each node has a set of possible states (e.g. ‘on time’ and ‘late’ for subcontract node). Attached to each node, there is a ‘Node Probability Table’ (NPT). The NPT can be a prior probability (e.g. ‘Staff Quality’) or a conditional probability given the states of its parents (e.g. ‘Delay in Task’). The NPT values can be assessed by prior knowledge (subjective estimation or expert judgment), empirical data, or a combination of both.

SubContract (SC)
On Time Probability = 0.95
not On Time Probability = 0.05 (1 - On Time Probability)

Staff Quality (SQ)
Good Probability = 0.70
Poor Probability = 0.30 (1 - Good Probability)

Delay in Task (D)
Subcontract      On time         Late
Staff Quality    Good    Poor    Good    Poor
False            0.95    0.7     0.7     0.01
True             0.05    0.3     0.3     0.99

Notation:
D : Delay in Task
(d1 : D is 'false', d2 : D is 'true')
SC : Sub-contract
(sc1 : SC is 'late', sc2 : SC is 'on time')
SQ : Staff Quality
(sq1 : SQ is 'good', sq2 : SQ is 'poor')
According to the chain rule the joint probability distribution is:
P(D, SC, SQ) = P(D | SC, SQ) * P(SC) * P(SQ)
P(Delay in Task is true) = 0.95 × 0.7 × 0.95 + 0.7 × 0.3 × 0.95 + 0.7 × 0.7 × 0.05 + 0.01 × 0.3 × 0.05
P(Delay in Task is true) = 0.8559
P(Delay in Task is true | Sub-cont. is late)
P(D, SC, SQ | sc2) = P(D | sc2, SQ) * P(sc2) * P(SQ)
P(D is true | SC is late) = 0.7 × 0.7 + 0.01 × 0.3
P(D is true | SC is late) = 0.493
DavidKnowles replied on 06-21-2010 4:31 AM
Hi Valmir,
I think your calculation for P(Delay in Task is true) is incorrect. You should be calculating
P(D) = Sum_SC Sum_SQ P(D | SC, SQ) * P(SC) * P(SQ)
as shown in this table:
P(D=True|SC,SQ)
                    Subcontract
                    T (0.95)    F (0.05)
Quality  T (0.7)    0.05        0.3
         F (0.3)    0.3         0.99

P(D=True|SC,SQ) * P(SC) * P(SQ)
                    Subcontract
                    T           F
Quality  T          0.03325     0.0105
         F          0.0855      0.01485

P(D=True) = 0.1441
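For comparison, the sum posted in the question multiplies the priors by the 'False' row of the Delay in Task table, so it appears to be P(D = false) rather than P(D = true). A worked check under that reading:

```latex
\begin{align*}
P(D=\text{false}) &= 0.95 \times 0.7 \times 0.95 + 0.7 \times 0.3 \times 0.95
                   + 0.7 \times 0.7 \times 0.05 + 0.01 \times 0.3 \times 0.05 = 0.8559 \\
P(D=\text{true})  &= 0.05 \times 0.7 \times 0.95 + 0.3 \times 0.3 \times 0.95
                   + 0.3 \times 0.7 \times 0.05 + 0.99 \times 0.3 \times 0.05 = 0.1441 \\
P(D=\text{true}) + P(D=\text{false}) &= 0.1441 + 0.8559 = 1
\end{align*}
```

The two results are complements of each other, which is consistent with the posted figure being computed from the wrong row rather than from a wrong formula.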
You can achieve this calculation in Infer.NET as follows:
    var SubContractOnTime = Variable.Bernoulli(.95);   // P(SC = on time)
    var StaffQualityIsGood = Variable.Bernoulli(.7);   // P(SQ = good)
    var Delay = Variable.New<bool>();
    // One branch per parent combination, each giving P(D = true | SC, SQ).
    using (Variable.If(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.05));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
    }
    using (Variable.IfNot(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.99));
    }
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(Delay)=" + ie.Infer(Delay));
This gives the correct answer. By my calculations, P(D is true | SC is late) = 0.507, which you can also calculate in Infer.NET using the above code but setting
    SubContractOnTime.ObservedValue = false;
before inference.
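Both figures can also be cross-checked without Infer.NET by enumerating the four parent combinations by hand. The following is a minimal sketch in plain C#; the class name DelayCheck and all variable names are hypothetical, and the probabilities are copied from the node probability tables quoted earlier in the thread:

```csharp
using System;
using System.Globalization;

// Hypothetical stand-alone check; not part of the Infer.NET model above.
class DelayCheck
{
    static void Main()
    {
        // Priors from the quoted node probability tables.
        double pOnTime = 0.95;   // P(SC = on time)
        double pGood = 0.70;     // P(SQ = good)

        // P(D = true | SC, SQ), indexed as [onTime ? 1 : 0, good ? 1 : 0].
        double[,] pDelay = new double[2, 2];
        pDelay[1, 1] = 0.05;     // on time, good
        pDelay[1, 0] = 0.30;     // on time, poor
        pDelay[0, 1] = 0.30;     // late, good
        pDelay[0, 0] = 0.99;     // late, poor

        double pD = 0.0;         // accumulates P(D = true)
        double pDAndLate = 0.0;  // accumulates P(D = true, SC = late)

        foreach (bool onTime in new[] { true, false })
        {
            foreach (bool good in new[] { true, false })
            {
                double pSC = onTime ? pOnTime : 1 - pOnTime;
                double pSQ = good ? pGood : 1 - pGood;
                double joint = pDelay[onTime ? 1 : 0, good ? 1 : 0] * pSC * pSQ;
                pD += joint;
                if (!onTime) pDAndLate += joint;
            }
        }

        // Condition on SC = late by dividing by P(SC = late).
        double pDGivenLate = pDAndLate / (1 - pOnTime);

        var inv = CultureInfo.InvariantCulture;
        Console.WriteLine(string.Format(inv, "P(D=true)           = {0:F4}", pD));
        Console.WriteLine(string.Format(inv, "P(D=true | SC late) = {0:F4}", pDGivenLate));
    }
}
```

Rounded to four decimals, the two printed values should agree with the 0.1441 and 0.507 quoted above. Exact enumeration is only practical here because the network has two binary parents; for larger models the Infer.NET formulation is the more natural route.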
Valmir Meneses replied on 06-21-2010 9:41 AM
Greetings David,
I am thankful for your help. I cannot evaluate whether the formula is wrong; I was following the article "as is". The article does mention the sum of formulas, which I did not copy into the text in the forum, but it is available in the article itself.
 Marked as answer by Microsoft Research Friday, June 3, 2011 5:49 PM