Unobserving a value (Migrated from community.research.microsoft.com)

  • Question

  • harr posted on 03-05-2009 8:55 PM

    Hi guys,

    I have a model where I'd like to be able to set some variables to be observed, run inference, then go back and unset some of the variables and rerun inference. I can't seem to find a way in the API to "unset" variables once they've been observed. Any tips? Thanks.

    Friday, June 3, 2011 6:34 PM

Answers

  • Pepe replied on 03-16-2011 3:50 AM

    Hi,

    Sorry for re-opening an old thread; however, I tried this approach, and while it worked for the simple coin case, it seems to be failing for larger models. Specifically, I have a 2D double array which I'd like to observe (at training time), and infer over (at test time). I defined two variables:

    Range rangeA = new Range(countA); //countA is observed

    VariableArray<int> countB = Variable.Observed<int>(new int[0], rangeA); //countB is observed

    Range rangeB = new Range(countB[rangeA]);

    VariableArray<VariableArray<double>, double[][]> X = Variable.Array(Variable.Array<double>(rangeB), rangeA).Named("X"); //X is never observed

    VariableArray<VariableArray<double>, double[][]> Y = Variable.Observed<double>(new double[][]{}, rangeA, rangeB).Named("Y"); //Y is observed

    And the associated If block:

    using (Variable.If(this.isTraining)){

       Variable.ConstrainEqual(X,Y);

    }

    The values are set in a loop like:

    using (Variable.ForEach(rangeA))

       for (int i = 1; i < MAX; i++) //Manually unroll an HMM-like structure

          using (Variable.If (moreElements)) {

             this.reviewsMarkedHelpful[rangeA][i] = Variable.GaussianFromMeanAndPrecision(mean,prec);

          }

    The code runs perfectly if isTraining is observed as true; the code also works if the constraint is deterministically set or not set (i.e. the If block is omitted). However, when the If block is in place, I get an exception after model compilation when isTraining is set to false:

    MicrosoftResearch.Infer.Factors.ImproperMessageException has been thrown "Improper distribution during inference (Gaussian.Uniform). Cannot perform inference on this model."

    Am I doing something wrong? I would greatly appreciate any help you could offer.

    Thanks!

    Friday, June 3, 2011 6:35 PM

All replies

  • harr replied on 03-06-2009 12:20 AM

    To follow up on my own question, I'm wondering if this is a plausible way to do it. I have another boolean observed variable, which I set to one if the value is observed, and zero if it is random. Basically I would use an If block to control this choice. Does this sound plausible?

    Friday, June 3, 2011 6:34 PM
  • laura replied on 03-06-2009 4:20 AM

    I guess you want to have some observed and some missing data.

    You can have a latent variable (f) that you tie to the training data (fTrain) wherever a boolean variable (fTrainSet) indicates that the value is present. Where fTrainSet is false, the entry is treated as missing and stays random.

     

    It might not be the best solution, but the following should work (sorry for the F# syntax, but I hope you get the idea):

    range1 and range2 are ranges over the jagged array.

     

        let fTrainSetArray =[| [|false;false;true;true;true;false;false|];
                            [|true;false;false;false;false;true;false;false;false;false|];
                            [|true;false;false;false;true;false;false|];
                            [|false;false;false;false;false;false;false|];
                            [|false;false;false;false;false;false|] |]
        let fTrainArray =[| [|0;1;1;1;1;0;0|];
                            [|0;0;0;0;1;1;1;0;1;1|];
                            [|1;1;3;3;3;1;1|];
                            [|0;1;1;1;1;0;0|];
                            [|1;0;0;0;0;2|] |]

        let fTrainSet = (Variable.Array<_>(Variable.Array<bool>(range2), range1)).Named("fTrainSetArray")
        do fTrainSet.IsReadOnly <- false
        do fTrainSet.ObservedValue <- fTrainSetArray
        let fTrain = (Variable.Array<_>(Variable.Array<int>(range2), range1)).Named("fTrainArray")
        do fTrain.IsReadOnly <- false
        do fTrain.ObservedValue <- fTrainArray

        let f = (Variable.Array<_>(Variable.Array<int>(range2), range1)).Named("f")
        f.[range1].[range2] <- Variable.Discrete(someDistribution).Attrib(new ValueRange(fRange))

        using (Variable.If(fTrainSet.[range1].[range2])) (fun _ ->
            Variable.ConstrainEqual(f.[range1].[range2], fTrain.[range1].[range2])
        )

     

    Friday, June 3, 2011 6:35 PM
  • John Guiver replied on 03-06-2009 5:24 AM

    The 'ClearObservedValue()' method on a variable sets the variable to be non-observed - but it will trigger a recompilation of the model the next time you run inference.
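
    A minimal sketch of the ClearObservedValue route on the two coins model. It assumes the MicrosoftResearch.Infer namespaces of the release discussed in this thread (the later open-source release uses Microsoft.ML.Probabilistic instead), and the class name is made up for illustration:

```csharp
using System;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models;

// Hypothetical class name, used only for this sketch.
class ClearObservedSketch
{
    static void Main()
    {
        Variable<bool> firstCoin = Variable.Bernoulli(0.5).Named("firstCoin");
        Variable<bool> secondCoin = Variable.Bernoulli(0.5).Named("secondCoin");
        Variable<bool> bothHeads = (firstCoin & secondCoin).Named("bothHeads");
        InferenceEngine engine = new InferenceEngine();

        // Observe bothHeads, then infer one of the coins.
        bothHeads.ObservedValue = false;
        Console.WriteLine("firstCoin given bothHeads=false: " + engine.Infer(firstCoin));

        // Remove the observation so bothHeads is random again.
        // The next Infer call triggers a recompilation of the model.
        bothHeads.ClearObservedValue();
        Console.WriteLine("bothHeads, unobserved: " + engine.Infer(bothHeads));
    }
}
```

    This keeps the model definition small, at the cost of one recompilation each time a variable switches between observed and random.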

    John G.

    Friday, June 3, 2011 6:35 PM
  • John Guiver replied on 03-06-2009 6:30 AM

    Laura's post deals with the problem of missing data. If I understand Harr's original question, you still want the variable to be part of your model, but random, rather than missing from the model. For example, in the two coins example, we can either observe 'bothHeads' and infer the other variables, or leave it unobserved and infer it. The following bit of code shows the two coins example, where we switch on and off whether the 'bothHeads' variable is observed. Unlike with the ClearObservedValue method, no recompilation occurs. This makes use of the If block idea that both Harr and Laura have suggested:

    public void Run()
    {
        // Observed switches: whether bothHeads is observed, and the value it takes if so.
        Variable<bool> bothHeadsIsObserved = Variable.Observed<bool>(false);
        Variable<bool> bothHeadsObservedValue = Variable.Observed<bool>(false);
        Variable<bool> firstCoin = Variable.Bernoulli(0.5).Named("firstCoin");
        Variable<bool> secondCoin = Variable.Bernoulli(0.5).Named("secondCoin");
        Variable<bool> bothHeads = (firstCoin & secondCoin).Named("bothHeads");
        // The constraint only applies while bothHeadsIsObserved is true.
        using (Variable.If(bothHeadsIsObserved))
            Variable.ConstrainEqual(bothHeads, bothHeadsObservedValue);
        InferenceEngine ie = new InferenceEngine();

        // Not observed: infer bothHeads.
        Console.WriteLine("Probability both coins are heads: " + ie.Infer(bothHeads));

        // Observed as false, then as true: infer firstCoin each time.
        bothHeadsIsObserved.ObservedValue = true;
        bothHeadsObservedValue.ObservedValue = false;
        Console.WriteLine("Probability distribution over firstCoin: " + ie.Infer(firstCoin));
        bothHeadsObservedValue.ObservedValue = true;
        Console.WriteLine("Probability distribution over firstCoin: " + ie.Infer(firstCoin));

        // Not observed again: infer bothHeads.
        bothHeadsIsObserved.ObservedValue = false;
        Console.WriteLine("Probability both coins are heads: " + ie.Infer(bothHeads));
    }
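
    As a sanity check on what the four WriteLine calls above should report, the exact posteriors can be worked out by hand, assuming two fair, independent coins as in the code. The Bernoulli parameters printed by the engine should agree with these values, up to whatever approximation the inference algorithm makes:

```latex
\begin{align*}
P(\text{bothHeads}) &= 0.5 \times 0.5 = 0.25 \\
P(\text{firstCoin} \mid \text{bothHeads} = \text{false})
  &= \frac{P(\text{firstCoin}, \neg\,\text{secondCoin})}{1 - P(\text{bothHeads})}
   = \frac{0.25}{0.75} = \tfrac{1}{3} \\
P(\text{firstCoin} \mid \text{bothHeads} = \text{true}) &= 1
\end{align*}
```

    The last call runs with the observation switched off again, so it should return to the first value, 0.25.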

    John G.

    Friday, June 3, 2011 6:35 PM
  • harr replied on 03-06-2009 9:41 AM

    Thanks John and Laura. I actually had tried implementing a solution similar to what John suggested at the end; however, I now occasionally receive AllZeroExceptions when I constrain some certain subsets of my variables. Is this a sign of an underflow issue perhaps? The probabilities shouldn't actually be all zero.

    Friday, June 3, 2011 6:35 PM
  • minka replied on 03-16-2011 4:50 AM

    The If approach described above is only intended for missing data situations, not for turning off all observations completely.  If you turn off all observations, then the message passing schedule needs to change significantly and so a recompilation is necessary.  For practical machine learning problems, the best approach is to compile separate train and test models, as we do in our examples.
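
    A minimal sketch of the separate train and test models idea, not taken from the Infer.NET examples themselves. It assumes a toy model with a single Gaussian mean, made-up training data, and the MicrosoftResearch.Infer namespaces of the release discussed in this thread. All class and variable names are hypothetical:

```csharp
using System;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models;
using MicrosoftResearch.Infer.Distributions;

// Hypothetical class name, used only for this sketch.
class TrainTestSketch
{
    static void Main()
    {
        // Training model: the data array is observed and the mean is learned.
        double[] trainData = { 1.1, 0.9, 1.3, 0.8 };  // made-up values
        Range n = new Range(trainData.Length);
        Variable<double> trainMean =
            Variable.GaussianFromMeanAndPrecision(0.0, 0.01).Named("trainMean");
        VariableArray<double> x = Variable.Array<double>(n).Named("x");
        x[n] = Variable.GaussianFromMeanAndPrecision(trainMean, 1.0).ForEach(n);
        x.ObservedValue = trainData;
        InferenceEngine trainEngine = new InferenceEngine();
        Gaussian meanPosterior = trainEngine.Infer<Gaussian>(trainMean);

        // Test model: a separate model in which the data variable is random.
        // The training posterior is passed in as the prior over the mean.
        Variable<Gaussian> meanPrior = Variable.Observed(meanPosterior).Named("meanPrior");
        Variable<double> testMean = Variable<double>.Random(meanPrior).Named("testMean");
        Variable<double> xNew =
            Variable.GaussianFromMeanAndPrecision(testMean, 1.0).Named("xNew");
        InferenceEngine testEngine = new InferenceEngine();
        Console.WriteLine("Predictive distribution: " + testEngine.Infer<Gaussian>(xNew));
    }
}
```

    Each model is compiled once with a schedule suited to it, and because the prior is an observed variable, new posteriors can be plugged into the test model without recompiling it.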

    Friday, June 3, 2011 6:35 PM