Modelling a Bayesian Net - Help, please! (Migrated from community.research.microsoft.com)

  • Question

  • Valmir Meneses posted on 06-14-2010 4:00 PM

    Greetings,

    I am trying to model the example below, but have not been successful. I cannot figure out which kind of inference the model needs.

    I have tried defining Ontime, MajorChange and MajorChangeOntime as in the two coins sample, but it does not work. Can anyone provide assistance, please?

    example ( http://eight2late.wordpress.com/2010/03/11/bayes-theorem-for-project-managers/)

    A numerical example

    Assume our project manager has historical data on projects that have been carried out within  the organisation.  On analyzing the data, the PM  finds that 60% of all projects finished on time. This implies:

    P(T) = 0.6……(13),

    and

    P(T') = 0.4……(14),

    Let us assume that our organisation also tracks major changes made to projects in progress.  Say 50% of all historical projects are found to have major changes. This implies:

    P(C) = 0.5……(15).

    Finally, let us assume that our project manager has access to detailed data on successful projects, and that an analysis of this data shows that 30% of on-time projects have undergone at least one major scope change. This gives:

    P(C|T) = 0.3……(16).

    Equations (13) through (16) give us the numbers we need to calculate P(T|C) using Bayes' Theorem.  Plugging the numbers into equation (11), we get:

    P(T|C) = (0.3 * 0.6) / 0.5 = 0.36 ……(17)

     

    Friday, June 3, 2011 5:48 PM

All replies

  • DavidKnowles replied on 06-16-2010 4:17 AM

    The example you gave isn't in a very straightforward form for Infer.NET to solve because P(C|T') is not specified. However, we can calculate it easily using the numbers you do give:

    P(C)=P(C|T)P(T)+P(C|T')P(T')

    => P(C|T')= ( P(C)-P(C|T)P(T) )  / P(T')  = (.5 -.3*.6) / .4 = .8

    Now we can solve this problem using Infer.NET: 

    var t = Variable.Bernoulli(.6);            // prior P(T) = 0.6
    Variable<bool> c = Variable.New<bool>();
    using (Variable.If(t))
    {
        c.SetTo(Variable.Bernoulli(.3));       // P(C|T) = 0.3
    }
    using (Variable.IfNot(t))
    {
        c.SetTo(Variable.Bernoulli(.8));       // P(C|T') = 0.8
    }
    c.ObservedValue = true;                    // condition on C being true
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(T|C)=" + ie.Infer(t));
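
    As a cross-check that does not depend on Infer.NET, the same two steps (total probability to get P(C|T'), then Bayes' rule to get P(T|C)) can be written in plain C#. This is only a minimal sketch: the class and method names (BayesCheck, CGivenNotT, TGivenC) are made up for illustration and are not part of any library. With the numbers from this thread it should reproduce P(C|T') = 0.8 and P(T|C) = 0.36, up to floating-point rounding.

```csharp
using System;

// Hypothetical helper class for illustration only; not part of Infer.NET.
static class BayesCheck
{
    // Law of total probability, P(C) = P(C|T)P(T) + P(C|T')P(T'),
    // solved for P(C|T').
    public static double CGivenNotT(double pT, double pC, double pCGivenT) =>
        (pC - pCGivenT * pT) / (1.0 - pT);

    // Bayes' rule: enumerate both values of T, then normalise.
    public static double TGivenC(double pT, double pCGivenT, double pCGivenNotT)
    {
        double jointT = pCGivenT * pT;                  // P(C, T)
        double jointNotT = pCGivenNotT * (1.0 - pT);    // P(C, T')
        return jointT / (jointT + jointNotT);
    }

    public static void Main()
    {
        double pCGivenNotT = CGivenNotT(pT: 0.6, pC: 0.5, pCGivenT: 0.3);
        double posterior = TGivenC(pT: 0.6, pCGivenT: 0.3, pCGivenNotT: pCGivenNotT);
        Console.WriteLine("P(C|T') = " + pCGivenNotT);
        Console.WriteLine("P(T|C)  = " + posterior);
    }
}
```

    The denominator in TGivenC is just P(C) again, so the result agrees with the 0.3 * 0.6 / 0.5 calculation from the original article.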

     

    Friday, June 3, 2011 5:48 PM
  • John Guiver replied on 06-16-2010 6:10 AM

    Hi Valmir

    When you are defining branches in Infer.NET which are conditioned on a random variable (in your case T), you must give code for all values of the condition. In your case the condition is a boolean random variable, and you must define both the If and IfNot paths - i.e. P(C|T) and P(C|not T). You have given P(C|T) and P(C), but P(C|not T) easily follows from your data, or from the knowledge that P(C) =  P(C|T)P(T) + P(C| not T)P(not T) - this gives P(C|not T) = 0.8. The model then looks as follows which gives the correct answer.

    var T = Variable.Bernoulli(0.6);
    var C = Variable.New<bool>();
    using (Variable.If(T))
      C.SetTo(Variable.Bernoulli(0.3));
    using (Variable.IfNot(T))
      C.SetTo(Variable.Bernoulli(0.8));
    C.ObservedValue = true;
    var engine = new InferenceEngine();
    Console.WriteLine(engine.Infer<Bernoulli>(T).GetProbTrue());

    John

    Friday, June 3, 2011 5:48 PM
  • Valmir Meneses replied on 06-17-2010 1:52 PM

    Thank you, very much.

    I was looking at the endpoint (Infer.NET) when I should have been looking at the starting point (Bayes' Theorem).

    I have implemented the code and it looks fine.

    I will probably need some help with a two random variables model, but I will give it a try first.

    Thanks,

     

    Friday, June 3, 2011 5:48 PM
  • Valmir Meneses replied on 06-17-2010 2:30 PM

    The following article describes a model with two random variables.
    How can I model this? Where does the AND clause from the "two coins" example apply here?

    Khodakarami, V., Fenton, N., Neil, M.
    "Project Scheduling: Improved Approach to Incorporate Uncertainty Using Bayesian Networks."
    Project Management Journal, 38(2): 39-49, Page 42
    Available at: http://www.agenarisk.com/resources/technology_articles/vahid.pdf
    pages 10-11 from the above PDF

    Suppose in addition to the sub-contract delay,
    the project manager has noticed that the ‘staff quality’
    also has a direct influence on the task’s duration and therefore on its delay.
    Now there are two independent variables that influence another variable.

    ‘Sub-contract’ and ‘Staff Quality’ are nodes that are parents of ‘Delay in Task’.
    Each node has a set of possible states (e.g. ‘on time’ and ‘late’ for sub-contract node).
    Attached to each node, there is a ‘Node Probability Table’ (NPT).
    The NPT can be a prior probability (e.g. ‘Staff Quality’) or a
    conditional probability given the states of its parents (e.g. ‘Delay in Task’).
    The NPT values can be assessed by prior knowledge (subjective estimation or expert judgment),
    empirical data, or a combination of both.

    Sub-Contract (SC)
     On Time      Probability=0.95
     Not On Time  Probability=0.05 (1 - On Time Probability)

    Staff Quality (SQ)
     Good  Probability=0.70
     Poor  Probability=0.30 (1 - Good Probability)

    Delay in Task (D)
     Sub-contract     On time          Late
     Staff Quality    Good    Poor     Good    Poor
     False            0.95    0.7      0.7     0.01
     True             0.05    0.3      0.3     0.99

    Notation:

    D : Delay in Task
    (d1 : D is 'false', d2 : D is 'true')

    SC : Sub-contract
    (sc1 : SC is 'late', sc2 : SC is 'on time')

    SQ : Staff Quality
    (sq1 : SQ is 'good', sq2 : SQ is 'poor')
     

    According to the chain rule the joint probability distribution is:

    P(D, SC, SQ) = P(D | SC, SQ) * P(SC) * P(SQ)

    P(Delay in Task is true) = 0.95 × 0.7 × 0.95 + 0.7 × 0.3 × 0.95 +0.7 × 0.7 × 0.05 +0.01× 0.3 × 0.05
    P(Delay in Task is true) = 0.8559
     

    P(Delay in Task is true | Sub - cont. is late)
    P(D, SC, SQ | sc2 ) = P(D | sc2 , SQ) * P(sc2 ) * P(SQ)
    P(D is true | SC is late) = 0.7 × 0.7 + 0.01× 0.3
    P(D is true | SC is late) = 0.493

     

    Friday, June 3, 2011 5:48 PM
  • DavidKnowles replied on 06-21-2010 4:31 AM

    Hi Valmir,

    I think your calculation for P(Delay in Task is true) is incorrect. You should be calculating

    P(D) = Sum_SC Sum_SQ P(D | SC, SQ) * P(SC) * P(SQ)

    as shown in this table:

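    P(D=True|SC,SQ), with the priors of each parent state in brackets:

                             Subcontract
                             T (0.95)    F (0.05)
     Quality    T (0.7)      0.05        0.3
                F (0.3)      0.3         0.99

    P(D=True|SC,SQ) * P(SC) * P(SQ):

                             Subcontract
                             T           F
     Quality    T            0.03325     0.0105
                F            0.0855      0.01485

    P(D=True) = 0.03325 + 0.0105 + 0.0855 + 0.01485 = 0.1441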

     You can achieve this calculation in Infer.NET as follows:

     

    var SubContractOnTime = Variable.Bernoulli(.95);
    var StaffQualityIsGood = Variable.Bernoulli(.7);
    var Delay = Variable.New<bool>();
    // One branch per combination of parent states; each gives P(D=True|SC,SQ).
    using (Variable.If(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.05));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
    }
    using (Variable.IfNot(SubContractOnTime))
    {
        using (Variable.If(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.3));
        using (Variable.IfNot(StaffQualityIsGood))
            Delay.SetTo(Variable.Bernoulli(.99));
    }
    var ie = new InferenceEngine(new ExpectationPropagation());
    Console.WriteLine("P(Delay)=" + ie.Infer(Delay));

     

     

     This gives the correct answer. By my calculations, P(D is true | SC is late) = 0.507, which you can also calculate in Infer.NET using the above code but setting

    SubContractOnTime.ObservedValue = false;

    before inference.
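
    For anyone who wants to check these numbers by hand, here is a minimal plain C# sketch that sums over the parent states directly, without Infer.NET. The names (DelayNetCheck, DelayGiven, ProbDelay) are invented for this illustration. It assumes, as in the article, that sub-contract and staff quality are independent a priori, so observing the sub-contract state leaves the staff quality prior unchanged. With the table values above it should give roughly 0.1441 for P(D=True) and 0.507 for P(D=True | SC is late).

```csharp
using System;

// Hypothetical helper class for illustration only; not part of Infer.NET.
static class DelayNetCheck
{
    const double PScOnTime = 0.95;  // P(sub-contract on time)
    const double PSqGood = 0.70;    // P(staff quality good)

    // P(D = true | SC, SQ): the "True" row of the node probability table.
    public static double DelayGiven(bool scOnTime, bool sqGood)
    {
        if (scOnTime) return sqGood ? 0.05 : 0.30;
        return sqGood ? 0.30 : 0.99;
    }

    // Returns P(D = true) when scObserved is null,
    // otherwise P(D = true | SC = scObserved).
    public static double ProbDelay(bool? scObserved = null)
    {
        double total = 0.0;
        foreach (bool sc in new[] { true, false })
        {
            // Skip sub-contract states ruled out by the observation.
            if (scObserved.HasValue && sc != scObserved.Value) continue;
            double pSc = scObserved.HasValue ? 1.0 : (sc ? PScOnTime : 1.0 - PScOnTime);
            foreach (bool sq in new[] { true, false })
            {
                double pSq = sq ? PSqGood : 1.0 - PSqGood;
                total += DelayGiven(sc, sq) * pSc * pSq;
            }
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine("P(D=true)           = " + ProbDelay());
        Console.WriteLine("P(D=true | SC late) = " + ProbDelay(scObserved: false));
    }
}
```

    Note that 1 - ProbDelay() comes out at about 0.8559 and 1 - ProbDelay(false) at about 0.493, which are the figures obtained when the "False" row of the table is used instead of the "True" row.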

    Friday, June 3, 2011 5:48 PM
  • Valmir Meneses replied on 06-21-2010 9:41 AM

    Greetings David,

    I am thankful for your help. I cannot evaluate whether the formula is wrong. I was following the article "as-is". The article does mention the sum of formulas, which I did not copy into the forum post, but it is available in the article itself.

    Friday, June 3, 2011 5:48 PM