Learning Parameters When Some Data is Missing - Sprinkler Example

  • Question

  • Noob question please :) In the sprinkler example packaged with Infer.NET, the 2nd part of the program attempts to learn the parameters of the model given data for C, S, R, W (Cloudy, Sprinkler, Rain, Wet Grass). What if I cannot observe C, but would still like to infer ProbCloudyPosterior?

    Thanks a lot in advance

    Friday, April 6, 2012 5:09 PM

All replies

  • In LearnParameters you will see the line

      Cloudy.ObservedValue = cloudy;

    If you change this to:  

      Cloudy.ClearObservedValue();

    then Cloudy will be left unobserved.  See ProbRain for an example of this.

    • Marked as answer by exx Wednesday, April 18, 2012 7:20 PM
    Saturday, April 7, 2012 10:37 AM
    Owner
  • Thanks a lot Tom!
    Monday, April 9, 2012 4:41 PM
  • Hi Tom,

    It seems that if I do not provide any observed values to Cloudy (i.e. Cloudy.ClearObservedValue()), the model has a very hard time learning the probabilities. The learned probabilities, even when I generate 10,000 sample data points, are:

        Prob. Cloudy:                               Ground truth: 0.30, Inferred: 0.50
        Prob. Sprinkler | Cloudy:                   Ground truth: 0.10, Inferred: 0.38
        Prob. Sprinkler | Not Cloudy:               Ground truth: 0.50, Inferred: 0.38
        Prob. Rain | Cloudy:                        Ground truth: 0.80, Inferred: 0.38
        Prob. Rain | Not Cloudy:                    Ground truth: 0.20, Inferred: 0.38
        Prob. Wet Grass | Sprinkler, Rain:          Ground truth: 0.99, Inferred: 0.99
        Prob. Wet Grass | Sprinkler, Not Rain:      Ground truth: 0.90, Inferred: 0.89
        Prob. Wet Grass | Not Sprinkler, Rain:      Ground truth: 0.90, Inferred: 0.90
        Prob. Wet Grass | Not Sprinkler, Not Rain:  Ground truth: 0.00, Inferred: 0.00

    (note: I changed the actual prob. cloudy from (0.5, 0.5) in the original code to (0.3, 0.7)).

    Any help is greatly appreciated.


    • Edited by exx Tuesday, April 17, 2012 10:35 PM
    Tuesday, April 17, 2012 10:34 PM
  • By the looks of it, the parameters have come out symmetric.  When you don't observe Cloudy, the model effectively becomes a mixture model.  So you have to be careful about breaking symmetry as explained in the Mixture of Gaussians example.
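
To illustrate the symmetry problem outside Infer.NET, here is a minimal plain-Python sketch. It is an analogy only: it runs textbook EM on the exact (Sprinkler, Rain) frequencies implied by the ground-truth values quoted above, not the message-passing inference that Infer.NET performs, and the helper names `bern`, `joint_sr` and `em` are made up for this example. A start in which the Cloudy and Not Cloudy branches are identical never separates them, while a slightly asymmetric start does.

```python
def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x (0 or 1)."""
    return p if x else 1.0 - p


def joint_sr(p_c, p_s_c, p_s_nc, p_r_c, p_r_nc):
    """p(S=s, R=r) with the hidden Cloudy variable summed out."""
    return {
        (s, r): p_c * bern(p_s_c, s) * bern(p_r_c, r)
        + (1.0 - p_c) * bern(p_s_nc, s) * bern(p_r_nc, r)
        for s in (0, 1)
        for r in (0, 1)
    }


def em(data, params, iterations=2000):
    """Textbook EM for the two-component mixture over (S, R) cells."""
    p_c, p_s_c, p_s_nc, p_r_c, p_r_nc = params
    for _ in range(iterations):
        # E-step: posterior probability of Cloudy for each (s, r) cell.
        resp = {}
        for (s, r) in data:
            on = p_c * bern(p_s_c, s) * bern(p_r_c, r)
            off = (1.0 - p_c) * bern(p_s_nc, s) * bern(p_r_nc, r)
            resp[(s, r)] = on / (on + off)
        # M-step: weighted frequencies under each value of Cloudy.
        w_on = sum(data[k] * resp[k] for k in data)
        w_off = sum(data[k] * (1.0 - resp[k]) for k in data)
        p_s_c = sum(data[(s, r)] * resp[(s, r)] * s for (s, r) in data) / w_on
        p_r_c = sum(data[(s, r)] * resp[(s, r)] * r for (s, r) in data) / w_on
        p_s_nc = sum(data[(s, r)] * (1.0 - resp[(s, r)]) * s for (s, r) in data) / w_off
        p_r_nc = sum(data[(s, r)] * (1.0 - resp[(s, r)]) * r for (s, r) in data) / w_off
        p_c = w_on
    return (p_c, p_s_c, p_s_nc, p_r_c, p_r_nc)


# Exact cell frequencies under the ground truth quoted in the thread.
data = joint_sr(0.3, 0.1, 0.5, 0.8, 0.2)

# Identical branches: every responsibility equals p_c, so nothing can separate.
symmetric = em(data, (0.5, 0.5, 0.5, 0.5, 0.5))

# Slightly different branches (arbitrary illustrative values): symmetry is broken.
perturbed = em(data, (0.5, 0.3, 0.6, 0.6, 0.3))

print("symmetric start:", [round(v, 2) for v in symmetric])
print("perturbed start:", [round(v, 2) for v in perturbed])
```

With the symmetric start, both conditional probabilities for Sprinkler (and for Rain) collapse to the same marginal value, which is the pattern in the table above. The perturbed start is the plain-Python counterpart of initialising the hidden variable randomly.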


    • Edited by Tom Minka (Microsoft employee, Owner) Wednesday, April 18, 2012 10:47 AM
    • Marked as answer by exx Wednesday, April 18, 2012 7:20 PM
    Wednesday, April 18, 2012 10:46 AM
    Owner
  • Thanks so much Tom. I did as you suggested, and now P(S | C) and P(R | C) are estimated better. But it seems P(C) still can't be inferred.

    Prob. Cloudy:                               Ground truth: 0.30, Inferred: 0.50
    Prob. Sprinkler | Cloudy:                   Ground truth: 0.10, Inferred: 0.16
    Prob. Sprinkler | Not Cloudy:               Ground truth: 0.50, Inferred: 0.60
    Prob. Rain | Cloudy:                        Ground truth: 0.80, Inferred: 0.60
    Prob. Rain | Not Cloudy:                    Ground truth: 0.20, Inferred: 0.16
    Prob. Wet Grass | Sprinkler, Rain:          Ground truth: 0.99, Inferred: 0.99
    Prob. Wet Grass | Sprinkler, Not Rain:      Ground truth: 0.90, Inferred: 0.90
    Prob. Wet Grass | Not Sprinkler, Rain:      Ground truth: 0.90, Inferred: 0.90
    Prob. Wet Grass | Not Sprinkler, Not Rain:  Ground truth: 0.00, Inferred: 0.00


    • Edited by exx Wednesday, April 18, 2012 7:26 PM
    Wednesday, April 18, 2012 6:54 PM
  • Yes, it isn't possible to recover the true parameters exactly when cloudy is not observed.  This is because the model has too many parameters: 1 for cloudy, 2 for sprinkler, and 2 for rain, totalling 5 that together determine p(sprinkler,rain).  But you only need 3 parameters to describe any joint distribution of (sprinkler,rain), so many different settings of the 5 parameters explain the data equally well.
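
The parameter-counting argument can be checked numerically with a short plain-Python sketch (not Infer.NET code; `bern` and `joint_sr` are hypothetical helpers written for this example). It assumes, as in the sprinkler network, that Wet Grass depends only on Sprinkler and Rain, so the Cloudy-related parameters affect the observed data only through p(sprinkler,rain). Starting from the ground-truth values quoted in the thread, it builds a different parameter set with P(Cloudy) = 0.5 that produces exactly the same joint distribution.

```python
import math


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x (0 or 1)."""
    return p if x else 1.0 - p


def joint_sr(p_c, p_s_c, p_s_nc, p_r_c, p_r_nc):
    """p(S=s, R=r) with the hidden Cloudy variable summed out."""
    return {
        (s, r): p_c * bern(p_s_c, s) * bern(p_r_c, r)
        + (1.0 - p_c) * bern(p_s_nc, s) * bern(p_r_nc, r)
        for s in (0, 1)
        for r in (0, 1)
    }


# Ground truth from the thread: P(C), P(S|C), P(S|not C), P(R|C), P(R|not C).
truth_params = (0.3, 0.1, 0.5, 0.8, 0.2)
truth = joint_sr(*truth_params)

# The joint over two binary variables has only 3 free numbers:
# the two marginals and the covariance.
m_s = truth[(1, 0)] + truth[(1, 1)]
m_r = truth[(0, 1)] + truth[(1, 1)]
cov = truth[(1, 1)] - m_s * m_r  # negative for these values

# Pick a different P(C) and solve for branch gaps that keep all 3 numbers fixed.
# Covariance of the mixture is p_c * (1 - p_c) * gap_s * gap_r; here we choose
# gap_s = -d and gap_r = +d (one of many valid choices).
p_c = 0.5
d = math.sqrt(-cov / (p_c * (1.0 - p_c)))
alt_params = (
    p_c,
    m_s - (1.0 - p_c) * d,  # P(S | C)
    m_s + p_c * d,          # P(S | not C)
    m_r + (1.0 - p_c) * d,  # P(R | C)
    m_r - p_c * d,          # P(R | not C)
)
alt = joint_sr(*alt_params)

# Relabelling Cloudy <-> Not Cloudy is another equivalent setting.
swapped = joint_sr(0.7, 0.5, 0.1, 0.2, 0.8)

for cell in sorted(truth):
    print(cell, round(truth[cell], 4), round(alt[cell], 4), round(swapped[cell], 4))
print("alternative parameters:", [round(v, 2) for v in alt_params])
```

All three parameter sets give the same four cell probabilities, so no amount of (sprinkler, rain, wet grass) data can tell them apart. The alternative set rounds to 0.50, 0.16, 0.60, 0.60, 0.16, which is close to the inferred values reported above.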
    Thursday, April 19, 2012 12:47 PM
    Owner