Updating the Factor Graph across Inferences (Migrated from community.research.microsoft.com)

  • Question

  • arunchaganty posted on 01-09-2011 2:23 PM

    Hello,

         I am working on an application where the factor graph, or graphical model, needs to be revised in the course of the algorithm. When using Infer.NET, I create a number of variables (I am not using a VariableArray - yet), and appropriate factors. I instantiate an InferenceEngine (using the default Expectation Propagation algorithm), and call InferAll on the variables (I have also tried calling Infer on every variable one at a time). I then add some more variables + factors and try to repeat this procedure. I end up getting the _same_ marginals as I had earlier.

    When I create a new InferenceEngine, and call InferAll with it, the marginals are different (and my algorithm gives reasonable results). Is it possible to "clear" the cached marginals and force it to recompute? Is it possible to model my scenario this way using Infer.NET?

    Cheers,

    Arun.

    Friday, June 3, 2011 6:14 PM

Answers

  • arunchaganty replied on 01-19-2011 1:17 PM

    Hello,

    minka:

    This usage is not supported.  You must observe the whole array or none of it.

    I have finally found the solution to my problem(s) - using Subarrays (Variable.Subarray), I have been able to observe parts of the array, and proceed with the inference. Thanks a lot for all the help!

     

    Cheers,

    Arun.

    Friday, June 3, 2011 6:15 PM

All replies

  • John Guiver replied on 01-10-2011 5:00 AM

    Hi Arun

    First, I'd just like to confirm what you mean by 'the graphical model needs to be revised in the course of the algorithm'. I assume you are referring to the algorithm that you are implementing, which constructs a graphical model in a loop and performs inference on it.

    In general I would advise creating your model in a way that the structure can be configured by means of observed variables (for example using variable arrays indexed by observed index arrays whose sizes are themselves observed random variables). This way there will be no recompilation step per iteration, which will greatly speed up your algorithm. But first things first - let's try to understand what's happening with your current models.

    To help diagnose this, can you post a very simple example where this happens? If you are adding new variables and inferring them, this should trigger a recompilation of the model.

    John

    Friday, June 3, 2011 6:14 PM
  • arunchaganty replied on 01-15-2011 1:06 AM

    Hello John,

    I apologise for the late reply. An update of Infer.NET (from 2.3 to 2.4) seems to solve the recompilation problem I was facing. I must say you guys are doing an amazing job with the Infer.NET releases!

    Coming to the "revising the graphical models" query, the algorithm I'm implementing lazily adds factors to the factor graph - these factors are not known beforehand; I don't know how to deal with such a scenario with the variable array trick. However, I do know the "structure" of the factors, i.e. I know my factors will be of the form "(and (or v1 v2) v3)" - v1, v2 and v3 being some random variables. Would the following be a valid approach?

    i) Create 4 variable arrays - factors, v1s, v2s and v3s. Note that random variables in v1s could also be aliased in v2s / v3s. Then factors can be appropriately assigned with a VariableArray.ForEach.

    ii) Whenever a new factor of this structure needs to be added, extend the length of the variable arrays, and assign values to the new indexes appropriately.

    A problem I had faced earlier when trying this approach was that whenever I tried to "observe" a factor in the variable array, it would throw an exception, stating that the variable could not be observed. I will try to reproduce that error and paste it here, but is there anything obvious that I am doing wrong here?

     

    Cheers,

    Arun.

    Friday, June 3, 2011 6:15 PM
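
The bookkeeping behind step ii) can be kept outside the model. The following is a minimal sketch in plain F# (no Infer.NET calls), assuming each factor of the form "(and (or v1 v2) v3)" is stored as three indices into a single array of variables. The `Factor` record, the `toIndexArrays` function and the index values are hypothetical names and data chosen for illustration; they do not come from the thread.

```fsharp
// Hypothetical record: one factor of the form (v[A] || v[B]) && v[C],
// stored as indices into the array of variables.
type Factor = { A: int; B: int; C: int }

// Split a list of factors into three parallel index arrays.
let toIndexArrays (factors: Factor list) =
    let fs = List.toArray factors
    let firsts = fs |> Array.map (fun f -> f.A)
    let seconds = fs |> Array.map (fun f -> f.B)
    let thirds = fs |> Array.map (fun f -> f.C)
    firsts, seconds, thirds

// Factors known at the start of the algorithm (illustrative indices).
let initialFactors = [ { A = 0; B = 2; C = 1 }; { A = 1; B = 3; C = 2 } ]

// A factor discovered later is simply appended; nothing else is rebuilt.
let extendedFactors = initialFactors @ [ { A = 2; B = 4; C = 3 } ]

let ind1, ind2, ind3 = toIndexArrays extendedFactors
printfn "%A %A %A" ind1 ind2 ind3
```

Under this assumption, adding a factor only lengthens three integer arrays, which is the kind of data that could be supplied to a model as observed values rather than as new model structure.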
  • arunchaganty replied on 01-15-2011 4:11 AM

    Hello,

         As a follow-up, the error I get when assigning an observed value to an element in a VariableArray is:

    "vbool[]0[vint23] is an array element.  To set the value of the array, use vbool[]0.ObservedValue"

    The code I am using is:

    arr.[ Variable.Constant( 0 ) ].ObservedValue <- true

     

    Cheers,

    Arun.

    Friday, June 3, 2011 6:15 PM
  • arunchaganty replied on 01-15-2011 5:54 AM

    Hello,

          I've coded a simplified use-case for what I am trying to do. In the code, I start out with 3 variables, perform some inference on them, and then try to do the following:

    i) add a new variable

    ii) Observe an existing variable

    iii) Add a new factor.

     

    When I try to observe a variable, I get the following error: "No ObservedValue has been set on vbool[]0"

    When I comment out the code to observe a variable, it gives me an error in the InferAll stage: "Index was outside the bounds of the array."

    What can I do to avoid these errors and get the same functionality in "arrayModel" as I do in "simpleModel"?

     

    // Learn more about F# at http://fsharp.net

    open MicrosoftResearch.Infer
    open MicrosoftResearch.Infer.Distributions
    open MicrosoftResearch.Infer.Factors
    open MicrosoftResearch.Infer.FSharp
    open MicrosoftResearch.Infer.Models

    // Simple implementation not using VariableArrays
    let simpleModel () =
        // Create 3 Bernoulli variables, with the same prior (0.8)
        let mutable n = 3
        let mutable ps = List.replicate n (Variable.Observed( 0.8 ) )
        let mutable vs = ps |> List.map (fun p -> Variable.Bernoulli( p ) )
        // Create a couple of factors - nothing special here
        let mutable fs = [
            vs.[ 0 ] &&& vs.[ 1 ];
            vs.[ 1 ] ||| vs.[ 2 ];
            ~~~( vs.[ 0 ] &&& vs.[ 2 ])
            ]
        fs |> List.iter (fun f -> Variable.ConstrainEqualRandom( f, Bernoulli( 0.8 ) ) )

        // Infer!
        let ie = new InferenceEngine()
        ie.ShowTimings <- true
        ie.InferAll( Seq.cast<IVariable> vs )
        ie.ShowTimings <- false
        let dist = vs |> List.map (fun v -> ie.Infer<Bernoulli>(v))
        List.zip vs dist |> List.iter (fun (v,b) -> printfn "%A: %A" v b )

        // Update variable priors with inferred posteriors
        List.iter2 (fun (v:Variable<float>) (b:Bernoulli) -> v.ObservedValue <- b.GetProbTrue() ) ps dist

        // Now another variable - with the same 0.8 prior
        n <- n + 1
        ps <- ps @ [ (Variable.Observed(0.8)) ]
        vs <- vs @ [(Variable.Bernoulli( ps.[ 3 ] ))]

        // One of the variables has now been observed.
        vs.[0].ObservedValue <- true

        // Also, add a new factor on the new variable.
        fs <- fs @ [(vs.[2] ||| vs.[3])]

        // Infer again!
        ie.ShowTimings <- true
        ie.InferAll( Seq.cast<IVariable> vs )
        ie.ShowTimings <- false

        let dist = vs |> List.map (fun v -> ie.Infer<Bernoulli>(v))
        List.zip vs dist |> List.iter (fun (v,b) -> printfn "%A: %A" v b )
        ()

    // The same thing using variable arrays
    let arrayModel () =
        // Create a variable array (initially size = 3); Bernoulli variables, with the same prior (0.8)
        let n = Variable.Observed( 3 )
        let rng = Range( n )
        let ps = Variable.ArrayInit rng (fun i -> Variable.Observed( 0.8 ) )
        let vs = Variable.ArrayInit rng (fun i -> Variable.Bernoulli( ps.[ i ] ) )

        // Create some factors
        let mutable fs = [
            vs.[ 0 ] &&& vs.[ 1 ];
            vs.[ 1 ] ||| vs.[ 2 ];
            ~~~( vs.[ 0 ] &&& vs.[ 2 ])
            ]
        fs |> List.iter (fun f -> Variable.ConstrainEqualRandom( f, Bernoulli( 0.8 ) ) )

        // Infer! (this works alright)
        let ie = new InferenceEngine()
        ie.ShowTimings <- true
        ie.InferAll( vs )
        ie.ShowTimings <- false
        let dist = ie.Infer<Bernoulli[]>( vs )
        printfn "%A : %A" vs dist

        // Update priors
        ps.ObservedValue <- Array.map (fun (b:Bernoulli) -> b.GetProbTrue() ) dist

        // Add a new variable
        n.ObservedValue <- 4
        // Fails here : "No ObservedValue has been set on vbool[]0"
        vs.ObservedValue.[ 0 ] <- true
        fs <- fs @ [(vs.[2] ||| vs.[3])]

        // Infer again!
        ie.ShowTimings <- true
        // Fails here: "Index was outside the bounds of the array."
        ie.InferAll( vs )
        ie.ShowTimings <- false

        printfn "%A : %A" vs (ie.Infer<Bernoulli[]>( vs ) )

        ()

    simpleModel()
    arrayModel()

    System.Console.ReadKey()

    Friday, June 3, 2011 6:15 PM
  • minka replied on 01-15-2011 9:41 AM

    This line is incorrect:
       vs.ObservedValue.[ 0 ] <- true
    It tries to get vs.ObservedValue and then set the first element to true, but vs does not have an ObservedValue yet. 

    The index is out of bounds because vs has length 3 and you are accessing vs.[3] in this line:
      fs <- fs @ [(vs.[2] ||| vs.[3])]

    Friday, June 3, 2011 6:15 PM
  • minka replied on 01-15-2011 9:42 AM

    This usage is not supported.  You must observe the whole array or none of it.

    Friday, June 3, 2011 6:15 PM
  • arunchaganty replied on 01-15-2011 10:21 AM

    Hello Tom,

         Ah ok, I thought this might be the case; thanks for confirming it.

     

    Cheers,

    Arun.

    Friday, June 3, 2011 6:15 PM
  • arunchaganty replied on 01-15-2011 2:07 PM

    Hello,

         As an alternative approach, I have tried to segregate observed and unobserved variables in every iteration, creating new VariableArrays, and factors. Is there any handle to free the memory used by the variable arrays of previous iterations?

    As an example, the following code (it has only 3-4 variables) slowly but steadily leaks memory (at about 1 MB/s):

    let stressTest () =
      
        let stressTest' () =
            // Explicitly Garbage collect just in case
            System.GC.Collect()
            // Create a variable array (initially size = 3); Bernoulli variables, with the same prior (0.8)
            let mutable n = Variable.Observed( 3 )
            let mutable rng = Range( n )
            let ps = ref ( Variable.ArrayInit rng (fun i -> Variable.Observed( 0.8 ) ) )
            let vs = ref ( Variable.ArrayInit rng (fun i -> Variable.Bernoulli( (!ps).[ i ] ) ) )
       
            // Create some factors
            let mutable fs = [
                (!vs).[ 0 ] &&& (!vs).[ 1 ];
                (!vs).[ 1 ] ||| (!vs).[ 2 ];
                ~~~( (!vs).[ 0 ] &&& (!vs).[ 2 ])
                ]
            fs |> List.iter (fun f -> Variable.ConstrainEqualRandom( f, Bernoulli( 0.8 ) ) )

            // Infer! (this works alright)
            let ie = new InferenceEngine()
            ie.ShowTimings <- true
            //ie.InferAll( !vs )
            ie.ShowTimings <- false

            let dist = ie.Infer<Bernoulli[]>( !vs )
            printfn "%A : %A" (!vs) dist
        while true do
            stressTest' ()

    I have also run a version with the InferenceEngine being created outside the loop.

     

    Cheers,

    Arun.

    Friday, June 3, 2011 6:15 PM
  • John Guiver replied on 01-17-2011 4:51 AM

    Hi Arun

    The reason you are seeing the apparent memory leak is that the model is getting compiled at each iteration and a new assembly loaded for the compiled model - the executable will keep these assemblies loaded and the GC cannot release them. You can see that this is happening both by the Console output, and by looking at the GeneratedSource folder under the executable folder.

    The reason the model is always recompiled is that the model variables are all scoped to be local to your inner method stressTest', and so, from Infer.NET's perspective, these are different variables each time. In general you should strive to have a single model, and use observed variables to represent the changing structure of the model (I will address this in another post). In your simple stress test, just define the variables outside the inner function:

    open MicrosoftResearch.Infer
    open MicrosoftResearch.Infer.Distributions
    open MicrosoftResearch.Infer.Factors
    open MicrosoftResearch.Infer.FSharp
    open MicrosoftResearch.Infer.Models

    let stressTest () =
        // Create a variable array (initially size = 3); Bernoulli variables, with the same prior (0.8)
        let mutable n = Variable.Observed( 3 )
        let mutable rng = Range( n )
        let ps = ref ( Variable.ArrayInit rng (fun i -> Variable.Observed( 0.8 ) ) )
        let vs = ref ( Variable.ArrayInit rng (fun i -> Variable.Bernoulli( (!ps).[ i ] ) ) )

        // Create some factors
        let mutable fs = [
            (!vs).[ 0 ] &&& (!vs).[ 1 ];
            (!vs).[ 1 ] ||| (!vs).[ 2 ];
            ~~~( (!vs).[ 0 ] &&& (!vs).[ 2 ])
            ]
        fs |> List.iter (fun f -> Variable.ConstrainEqualRandom( f, Bernoulli( 0.8 ) ) )

        // Infer! (this works alright)
        let ie = new InferenceEngine()

        let stressTest' () =
            let dist = ie.Infer<Bernoulli[]>( !vs )
            printfn "%A : %A" (!vs) dist

        while true do
            stressTest' ()

    Friday, June 3, 2011 6:15 PM
  • John Guiver replied on 01-17-2011 5:43 AM

    Hi Arun

    Here is an example (in C#, I'm afraid) of how you can put together your model. It may not be exactly what you want, but you should be able to extrapolate.

    using MicrosoftResearch.Infer;
    using MicrosoftResearch.Infer.Distributions;
    using MicrosoftResearch.Infer.Models;

    public class Model
    {
        Variable<int> N; // Number of variables
        Variable<int> K; // Number of factors
        VariableArray<double> p;
        VariableArray<bool> v;
        VariableArray<bool> v1;
        VariableArray<bool> v2;
        VariableArray<bool> v3;
        VariableArray<bool> v4;
        VariableArray<int> index1;
        VariableArray<int> index2;
        VariableArray<int> index3;
        InferenceEngine engine = new InferenceEngine();

        public Model()
        {
            N = Variable.New<int>();
            K = Variable.New<int>();
            Range n = new Range(N);
            Range k = new Range(K);
            p = Variable.Array<double>(n);
            v = Variable.Array<bool>(n);
            v[n] = Variable.Bernoulli(p[n]);
            index1 = Variable.Array<int>(k);
            index2 = Variable.Array<int>(k);
            index3 = Variable.Array<int>(k);

            // Use GetItems rather than Subarray if indices may be repeated
            v1 = Variable.Subarray(v, index1);
            v2 = Variable.Subarray(v, index2);
            v3 = Variable.Subarray(v, index3);
            v4 = Variable.Array<bool>(k);
            v4[k] = (v1[k] | v2[k]) & v3[k];
        }

        // The model structure is supplied entirely through observed values,
        // so repeated calls do not change the model definition.
        public Bernoulli[] Run(double[] probs, int[] ind1, int[] ind2, int[] ind3)
        {
            N.ObservedValue = probs.Length;
            p.ObservedValue = probs;
            K.ObservedValue = ind1.Length;
            index1.ObservedValue = ind1;
            index2.ObservedValue = ind2;
            index3.ObservedValue = ind3;
            return engine.Infer<Bernoulli[]>(v4);
        }
    }

    Then you can call this as follows:

    using System;
    using MicrosoftResearch.Infer.Distributions;

    class Program
    {
        static void Main(string[] args)
        {
            Model m = new Model();
            Bernoulli[] result = m.Run(
                new double[] {0.5, 0.5, 0.8, 0.8, 0.7},
                new int[] {0, 1, 2}, new int[] {2, 3, 4}, new int[] {1, 2, 3});
            foreach (Bernoulli b in result) Console.WriteLine(b.GetProbTrue());
        }
    }

    Friday, June 3, 2011 6:15 PM
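
For this small example, the exact marginals of v4 can be computed by hand, which gives a reference value to compare any inference output against. The following is a minimal sketch in plain F# (no Infer.NET calls) that enumerates all 2^5 joint assignments, assuming the variables in v are independent Bernoulli variables with the given probabilities and that nothing else is observed. The names `marginalOfFactor` and `exactMarginals` are hypothetical, and since Infer.NET performs approximate inference, its output is not guaranteed to agree to every digit.

```fsharp
// Inputs taken from the call to m.Run in the post above.
let probs = [| 0.5; 0.5; 0.8; 0.8; 0.7 |]
let ind1 = [| 0; 1; 2 |]
let ind2 = [| 2; 3; 4 |]
let ind3 = [| 1; 2; 3 |]

// Exact P(v4[k] = true), where
// v4[k] = (v[ind1[k]] || v[ind2[k]]) && v[ind3[k]].
let marginalOfFactor (k: int) : float =
    let n = probs.Length
    // Enumerate all 2^n assignments; bit i of mask is the value of v[i].
    seq { 0 .. (1 <<< n) - 1 }
    |> Seq.sumBy (fun mask ->
        let value (i: int) = ((mask >>> i) &&& 1) = 1
        // Prior probability of this joint assignment (independent priors).
        let weight =
            probs
            |> Array.mapi (fun i p -> if value i then p else 1.0 - p)
            |> Array.fold (fun acc w -> acc * w) 1.0
        // Count the assignment only if the factor evaluates to true.
        if (value (ind1.[k]) || value (ind2.[k])) && value (ind3.[k])
        then weight
        else 0.0)

let exactMarginals = Array.init ind1.Length marginalOfFactor

// Expected values: 0.450, 0.720 and 0.752.
exactMarginals |> Array.iteri (fun k p -> printfn "v4[%d]: %.3f" k p)
```

For instance, the first factor is (v[0] | v[2]) & v[1], so its probability is (1 - 0.5 * 0.2) * 0.5 = 0.45.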