locked
Exception 'x is nan' during inference RRS feed

  • Question

  • Hello everyone,

    I am implementing a model with Infer.NET in C# and when I am running the program, the model is compiling but during the inference, it throws the exception

    An unhandled exception of type 'System.Exception' occurred in Infer.Runtime.dll
    Additional information: x is nan

    My model is, so far, a Markov chain, where for each subject and for each timestep k:

    • D[k+1] = wDD*D[k] + R[K] + <g class="gr_ gr_27 gr-alert gr_spell gr_inline_cards gr_run_anim ContextualSpelling ins-del" data-gr-id="27" id="27">noiseD</g>
    • R ~ Bernoulli(p)

    D can be seen as a noisy moving average of R.

    D also generates two (potentially) observed variables <g class="gr_ gr_39 gr-alert gr_spell gr_inline_cards gr_run_anim ContextualSpelling ins-del multiReplace" data-gr-id="39" id="39">smin</g> and <g class="gr_ gr_40 gr-alert gr_spell gr_inline_cards gr_run_anim ContextualSpelling" data-gr-id="40" id="40">smax</g>, which are integers ranging from 0 to 10. More specifically, there is a set of thresholds with the constraint that D is between the <g class="gr_ gr_41 gr-alert gr_spell gr_inline_cards gr_run_anim ContextualSpelling ins-del multiReplace" data-gr-id="41" id="41">smin</g>-<g class="gr_ gr_42 gr-alert gr_spell gr_inline_cards gr_run_anim ContextualSpelling ins-del multiReplace" data-gr-id="42" id="42">th</g> and the (smax+1)-th thresholds.

    Because of this constraint, I am using an Expectation propagation algorithm and give priority to the Gaussian product as suggested in a previous thread.

    This is my code:

    static void Main(string[] args)
            {
                // ------------------------------------------------------------------------------------------------------
                // Upload data
                // ------------------------------------------------------------------------------------------------------
    
                //string filePath = ...
                int[,] data = ReadCSV(filePath);
    
                int[] split2PID = new[] {3,4};
                int[] dataPID = SelectColumnRectArray(data, 0);
                Tuple<int[,], int[,]> splitCV = SplitTrainTest(data, dataPID, split2PID);
    
                //var dataTrain = splitCV.Item1;
                //var dataTest = splitCV.Item2;
                var dataTest = splitCV.Item1;
                var dataTrain = splitCV.Item2;
    
    
                int[] patientDataTrain = SelectColumnRectArray(dataTrain, 0);
                int[] PIDTrain = Unique(patientDataTrain);
                int nPatients2 = PIDTrain.Length;
    
                // ------------------------------------------------------------------------------------------------------
                // Define counts and range
                // ------------------------------------------------------------------------------------------------------
    
                // observations
                var nObs = Variable.New<int>();
                Range observation = new Range(nObs);
    
                // patients
                var nPatients = Variable.New<int>();
                Range patient = new Range(nPatients);
                // days
                var nDays = Variable.Array<int>(patient);
                Range day = new Range(nDays[patient]);
    
                // severity and threshold
                int maxS = 10; // Maximum value of severity
                int nSev = maxS + 1;
                int nThresh = maxS + 2;
                Range sevSmin = new Range(nSev);
                Range sevSmax = new Range(nSev); // +2?
                Range thresh = new Range(nThresh);
    
                // ------------------------------------------------------------------------------------------------------
                // Define priors
                // ------------------------------------------------------------------------------------------------------
    
                VariableArray<Beta> RProbPrior;
    
                Variable<Gaussian> wDDPrior;
                Variable<Gamma> noiseDPrecisionPrior;
    
                VariableArray<Gaussian> thresholdMeanPrior;
                VariableArray<Gamma> thresholdPrecisionPrior;
    
                // ------------------------------------------------------------------------------------------------------
                // Set priors
                // ------------------------------------------------------------------------------------------------------
    
                RProbPrior = Variable.Array<Beta>(patient);
                RProbPrior.ObservedValue = Util.ArrayInit(nPatients2, t => Beta.Uniform());
    
                wDDPrior = Variable.New<Gaussian>();
                wDDPrior.ObservedValue = Gaussian.FromMeanAndPrecision(0.0, 1.0);
    
                noiseDPrecisionPrior = Variable.New<Gamma>();
                noiseDPrecisionPrior.ObservedValue = Gamma.FromShapeAndScale(1.0, 1.0);
    
                thresholdMeanPrior = Variable.Array<Gaussian>(thresh);
                thresholdMeanPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gaussian.FromMeanAndPrecision(0.0, 1.0));
                thresholdPrecisionPrior = Variable.Array<Gamma>(thresh);
                thresholdPrecisionPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gamma.FromShapeAndScale(1.0, 1.0));
    
                // ------------------------------------------------------------------------------------------------------
                // Define latent variables
                // ------------------------------------------------------------------------------------------------------
    
                // Define P, B, R and D (to store linear combination)
                var R = Variable.Array(Variable.Array<bool>(day), patient);
                var D = Variable.Array(Variable.Array<double>(day), patient);
    
                // Define (potentially) observed variables
                // S
                var Smin = Variable.Array(Variable.Array<int>(day), patient);
                Smin[patient][day] = Variable.DiscreteUniform(nSev).ForEach(patient, day);
                var Smax = Variable.Array(Variable.Array<int>(day), patient);
                Smax[patient][day] = Variable.DiscreteUniform(nSev).ForEach(patient, day);
    
                // R
                var RProb = Variable.Array<double>(patient);
                RProb[patient] = Variable<double>.Random(RProbPrior[patient]);
                R[patient][day] = Variable.Bernoulli(RProb[patient]).ForEach(day);
    
                // Equation D
                // wDD
                Variable<double> wDD = Variable<double>.Random(wDDPrior);
                // noiseD
                Variable<double> noiseDPrecision = Variable<double>.Random(noiseDPrecisionPrior);
                // thresholds
                var thresholdMean = Variable.Array<double>(thresh);
                thresholdMean[thresh] = Variable<double>.Random(thresholdMeanPrior[thresh]);
                var thresholdPrecision = Variable.Array<double>(thresh);
                thresholdPrecision[thresh] = Variable<double>.Random(thresholdPrecisionPrior[thresh]);
                var threshold = Variable.Array(Variable.Array<double>(thresh), patient);
                threshold[patient][0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).ForEach(patient);
                for (int i = 1; i < nSev; i++)
                {
                    threshold[patient][i] = Variable.GaussianFromMeanAndPrecision(thresholdMean[i], thresholdPrecision[i]).ForEach(patient);
                }
                threshold[patient][nSev] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).ForEach(patient);
    
    
    
                // ------------------------------------------------------------------------------------------------------
                // Loop
                // ------------------------------------------------------------------------------------------------------
    
                using (Variable.ForEach(patient))
                {
                    using (var block = Variable.ForEach(day))
                    {
                        var t = block.Index;
    
                        // Initialisation
                        using (Variable.If(t == 0))
                        {
                            D[patient][t] = Variable.GaussianFromMeanAndPrecision(0, noiseDPrecision);
                        }
    
                        // Equations
                        using (Variable.If(t > 0))
                        {
                            using (Variable.If(R[patient][t - 1]))
                            {
                                D[patient][t].SetTo(Variable.GaussianFromMeanAndPrecision(
                                wDD * D[patient][t - 1] + 1,noiseDPrecision));
                            }
                            using (Variable.IfNot(R[patient][t - 1]))
                            {
                                D[patient][t].SetTo(Variable.GaussianFromMeanAndPrecision(
                                    wDD * D[patient][t - 1],noiseDPrecision));
                            }
                        }
                    }
                }
    
                // ------------------------------------------------------------------------------------------------------
                // Declare training data variables and connect to latent variables
                // ------------------------------------------------------------------------------------------------------
    
                var patientData = Variable.Array<int>(observation);
                var dayData = Variable.Array<int>(observation);
                //var cData = Variable.Array<int>(observation);
                var SminData = Variable.Array<int>(observation);
                var SmaxData = Variable.Array<int>(observation);
    
                // Connect to latent variables
                int idxPatient = new int();
                using (var block = Variable.ForEach(observation))
                {
                    var i = block.Index;
    
                    idxPatient = Array.IndexOf(PIDTrain, patientData[i]); //patientData doesn't necessarily starts at 0
    
                    using (Variable.If(SminData[i] > -1)) // -1 missing value, Smin and Smax ordinal scores between 0 and 10.
                    {
                        Variable.ConstrainEqual(Smin[idxPatient][dayData[i]], SminData[i]);
                    }
    
                    using (Variable.If(SmaxData[i] > -1))
                    {
                        Variable.ConstrainEqual(Smax[idxPatient][dayData[i]], SmaxData[i]);
                    }
    
                }
    
                // ------------------------------------------------------------------------------------------------------
                // Constrains D between Smin and Smax
                // ------------------------------------------------------------------------------------------------------
    
                Smin.SetValueRange(sevSmin);
                Smax.SetValueRange(sevSmax);
                using (Variable.ForEach(patient))
                {
                    using (Variable.ForEach(day))
                    {
                        using (Variable.Switch(Smin[patient][day]))
                        {
                            using (Variable.Switch(Smax[patient][day]))
                            {
                                Variable.ConstrainBetween(D[patient][day],
                                    threshold[patient][Smin[patient][day] + 0],
                                    threshold[patient][Smax[patient][day] + 1]);
                            }
                        }
                    }
                }
    
    
                // ------------------------------------------------------------------------------------------------------
                // Training data
                // ------------------------------------------------------------------------------------------------------
    
                // Observe counts
                nObs.ObservedValue = patientDataTrain.Length;
                nPatients.ObservedValue = PIDTrain.Length;
                nDays.ObservedValue = ComputeDays(patientDataTrain);
    
                int[] dayDataTrain = SelectColumnRectArray(dataTrain, 1); // Make dayData starts at 0 instead of 1
                for (int i = 0; i < dayDataTrain.Length; i++)
                    dayDataTrain[i] = dayDataTrain[i] - 1;
    
                // Observe data
                patientData.ObservedValue = patientDataTrain;
                dayData.ObservedValue = dayDataTrain;
                //cData.ObservedValue = SelectColumnRectArray(dataTrain, 2);
                SminData.ObservedValue = SelectColumnRectArray(dataTrain, 3);
                SmaxData.ObservedValue = SelectColumnRectArray(dataTrain, 4);
    
                // ------------------------------------------------------------------------------------------------------
                // Run inference
                // ------------------------------------------------------------------------------------------------------
    
                InferenceEngine ie = new InferenceEngine();
                ie.Algorithm = new ExpectationPropagation();
                ie.Compiler.GivePriorityTo(typeof(GaussianProductOp_SHG09)); // For multiplying two gaussians
                //InferenceEngine.DefaultEngine.ShowFactorGraph = true;
    
                // Compute posterior
                var RPosterior = ie.Infer(R);
                
                var wDDPosterior = ie.Infer(wDD);
                var noiseDPrecisionPosterior = ie.Infer(noiseDPrecision);
    
                var thresholdMeanPosterior = ie.Infer(thresholdMean);
                var thresholdPrecisionPosterior = ie.Infer(thresholdPrecision);
    
                }
    
            public static int[,] ReadCSV(string filePath)
            {
                string[][] data = File.ReadLines(filePath).Select(x => x.Split(',')).ToArray();
    
                int r = data.GetLength(0);
    
                int c = data[0].GetLength(0);
                int[,] res = new int[r, c];
    
                for (int j = 0; j < r; j++)
                {
                    for (int i = 0; i < c; i++)
                    {
                        int number;
                        bool ok = int.TryParse(data[j][i], out number);
                        if (ok)
                        {
                            res[j, i] = number;
                        }
                        else
                        {
                            res[j, i] = -1; // Missing value
                        }
                    }
                }
                return (res);
            }
    
            public static T[] SelectColumnRectArray<T>(T[,] arr, int idx)
            {
                T[] res = new T[arr.GetLength(0)];
                for (int i = 0; i < res.GetLength(0); i++)
                {
                    res[i] = arr[i, idx];
                }
                return (res);
            }
    
            public static Tuple<T[,], T[,]> SplitTrainTest<T>(T[,] old, int[] PID, int[] PIDTest)
            {
                Array.Sort(PIDTest);
                int nTest = 0;
                int cpt = 0;
                // Count the number of rows for train
                for (int i = 0; i < old.GetLength(0); i++)
                {
                    if (PIDTest[cpt] < PID[i] & cpt < PIDTest.Length - 1)
                        cpt++;
    
                    if (PID[i] == PIDTest[cpt])
                        nTest++;
                }
    
                T[,] test = new T[nTest, old.GetLength(1)];
                T[,] train = new T[old.GetLength(0) - nTest, old.GetLength(1)];
    
                int iTrain = 0;
                int iTest = 0;
                cpt = 0;
                for (int i = 0; i < old.GetLength(0); i++)
                {
                    if (PIDTest[cpt] < PID[i] & cpt < PIDTest.Length - 1)
                        cpt++;
                    if (PID[i] == PIDTest[cpt])
                    {
                        for (int j = 0; j < old.GetLength(1); j++)
                            test[iTest, j] = old[i, j];
                        iTest++;
                    }
                    else
                    {
                        for (int j = 0; j < old.GetLength(1); j++)
                            train[iTrain, j] = old[i, j];
                        iTrain++;
                    }
    
                }
                return (Tuple.Create(train, test));
            }
    
            public static int[] Unique(int[] inp)
            {
                Array.Sort(inp);
                List<int> res = new List<int>();
                res.Add(inp[0]);
                for (int i = 0; i < inp.Length; i++)
                {
                    if (res.Last() != inp[i])
                    {
                        res.Add(inp[i]);
                    }
                }
                return (res.ToArray());
            }
    
            public static int[] ComputeDays(int[] patientData)
            {
                int[] nDays = new int[Unique(patientData).Length];
    
                int PID = patientData[0];
                int indexPID = 0;
                int count = 0;
                for (int i = 0; i < patientData.Length; i++)
                {
                    if (patientData[i] != PID)
                    {
                        nDays[indexPID] = count;
                        indexPID++;
                        PID = patientData[i];
                        count = 0;
                    }
                    count++;
                }
                nDays[indexPID] = count;
    
                return (nDays);
    
            }
    
        }
    }

    Do you have ideas on how to solve my problem?

    It occurs in the following line of the generated code, which I believe is when D[t] is computed from D[t-1] and R[t-1] (vdouble____0 should be D, index1 the range “patient” and index2 the range “day”).

    // Message to 'vdouble____0_index1_index2__0_' from Gaussian factor
    						vdouble____0_index1_index2__0__F[index1][index2] = GaussianOp.SampleAverageConditional(this.vdouble____0_uses_B_index1__index2__toDef[index1][index2], vdouble82_F[index1][index2], vdouble6_rep2_rep_F[index1][index2], this.vdouble6__0__B[index1][index2]);

    Thanks for the help.


    • Edited by ghurault Friday, January 26, 2018 2:16 PM
    Friday, January 26, 2018 2:13 PM

Answers

  • Part of the problem is the way that you have defined the model.  I don't think that you intended to average over all possible ConstrainBetween constraints when Smin and Smax are not observed; instead there should not be any constraints at all.  Also, you cannot use Array.IndexOf to get the patient number in the observation loop.  You need patientData to be observed as the mapped indices.  I got inference to run with the following replacement for the observation and constraint loop:

                D[patient][day].InitialiseTo(new Gaussian(0, 1));
    
                using (var block = Variable.ForEach(observation))
                {
                    var i = block.Index;
    
                    var patientOfObs = patientData[i];
                    var dayOfObs = dayData[i];
    
                    using (Variable.If(SminData[i] > -1 & SmaxData[i] > -1)) // -1 missing value, Smin and Smax ordinal scores between 0 and 10.
                    {
                        Variable.ConstrainBetween(D[patientOfObs][dayOfObs],
                            threshold[patientOfObs][SminData[i] + 0],
                            threshold[patientOfObs][SmaxData[i] + 1]);
                    }
                }
    
                patientData.ObservedValue = patientDataTrain.Select(i => i - 1).ToArray();

    It is also important to initialize D in this model.


    Tuesday, February 6, 2018 7:46 PM
    Owner

All replies

  • Try making your Gamma priors sharper.  Also, if you name your variables, you will get more readable generated code.
    Saturday, January 27, 2018 2:55 PM
    Owner
  • Thank you Tom, I really appreciate your help. Apparently the prior was indeed the problem.

    However, I am having a similar issue a bit later, but with the additional information that it is 'Not converging for n=29,x=NaN'

    This time it is related to the following line in my code, when I want to check that D is between the thresholds corresponding to smin and smax+1.

    Variable.ConstrainBetween(D[patient][day],
                                    threshold[patient][Smin[patient][day]+0],
                                    threshold[patient][Smax[patient][day]+1]);

    I am wondering if I can do anything at all or if I need to change algorithm.

    I tried Gibbs sampling, but the model does not compile with the error in mscorlib.dll: 'An item with the same key has already been added.' I am a bit confused, what do you think?



    • Edited by ghurault Monday, January 29, 2018 12:26 PM
    Monday, January 29, 2018 12:20 PM
  • How sharp are the priors in this second problem?
    Monday, January 29, 2018 12:35 PM
    Owner
  • Thanks for the quick answer.

    I tried several priors but none of them seems to be working. Currently they are:

                thresholdMeanPrior = Variable.Array<Gaussian>(thresh).Named("thresholdMeanPrior");
                thresholdMeanPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gaussian.FromMeanAndPrecision(0.0, 0.1));
                thresholdPrecisionPrior = Variable.Array<Gamma>(thresh).Named("thresholdPrecisionPrior");
                thresholdPrecisionPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gamma.FromShapeAndScale(2.0, 1.0));

    Do you think it is not sharp enough? I don't have a good prior on what the distributions should be.

    Monday, January 29, 2018 12:55 PM
  • Try making the thresholdPrecisionPrior sharper.  Note there is a method Gamma.FromMeanAndVariance.
    Monday, January 29, 2018 1:42 PM
    Owner
  • Unfortunately it doesn't seem to help: no matter how sharp it is, I am still having the same error message...

    My data is already constrained to have smax>= smin, but I am wondering if C# really takes the +1 into account in threshold[patient][Smax[patient][day]+1] (see below).

    For example, removing the +0 in threshold[patient][Smin[patient][day]+0] produces an error that I can't index threshold by Smin.

                Smin.SetValueRange(sevSmin);
                Smax.SetValueRange(sevSmax);
                using (Variable.ForEach(patient))
                {
                    using (Variable.ForEach(day))
                    {
                        using (Variable.Switch(Smin[patient][day]))
                        {
                            using (Variable.Switch(Smax[patient][day]))
                            {
                                Variable.ConstrainBetween(D[patient][day],
                                    threshold[patient][Smin[patient][day]+0],
                                    threshold[patient][Smax[patient][day]+1]);
                            }
                        }
                    }
                }

    Monday, January 29, 2018 2:31 PM
  • Can you send us a dataset that reproduces the problem?
    Monday, January 29, 2018 2:37 PM
    Owner
  • Sure, please find a download link to a toy dataset below:

    https://we.tl/gacngkIN38

    And here is the latest version of my code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using System.IO;
    using MicrosoftResearch.Infer;
    using MicrosoftResearch.Infer.Models;
    using MicrosoftResearch.Infer.Maths;
    using MicrosoftResearch.Infer.Distributions;
    using MicrosoftResearch.Infer.Utils;
    using MicrosoftResearch.Infer.Factors;
    
    namespace SimplifiedModelEqD
    {
        class Program
        {
            static void Main(string[] args)
            {
                List<string> res = new List<string>(); // For saving results
    
                // ------------------------------------------------------------------------------------------------------
                // Upload data
                // ------------------------------------------------------------------------------------------------------
    
                string filePath = @"...\toydataset.csv";
                int[,] data = ReadCSV(filePath);
    
                int[] split2PID = new[] {1,2};
                int[] dataPID = SelectColumnRectArray(data, 0);
                Tuple<int[,], int[,]> splitCV = SplitTrainTest(data, dataPID, split2PID);
    
                //var dataTrain = splitCV.Item1;
                //var dataTest = splitCV.Item2;
                var dataTest = splitCV.Item1;
                var dataTrain = splitCV.Item2;
    
    
                int[] patientDataTrain = SelectColumnRectArray(dataTrain, 0);
                int[] PIDTrain = Unique(patientDataTrain);
                int nPatients2 = PIDTrain.Length;
    
                // Need nPatients2 for setting prior on R
                // Need PIDTrain for observing S
    
                // ------------------------------------------------------------------------------------------------------
                // Define counts and range
                // ------------------------------------------------------------------------------------------------------
    
                // observations
                var nObs = Variable.New<int>().Named("nObs"); // Naming variable for the generated code
                Range observation = new Range(nObs).Named("observation");
    
                // patients
                var nPatients = Variable.New<int>().Named("nPatients");
                Range patient = new Range(nPatients).Named("patient");
                // days
                var nDays = Variable.Array<int>(patient).Named("nDays");
                Range day = new Range(nDays[patient]).Named("day");
    
                // severity and threshold
                int maxS = 10; // Maximum value of severity
                int nSev = maxS + 1;
                int nThresh = maxS + 2;
                Range sevSmin = new Range(nSev).Named("sevSmin");
                Range sevSmax = new Range(nSev).Named("sevSmax"); // +2?
                Range thresh = new Range(nThresh).Named("thresh");
    
                // ------------------------------------------------------------------------------------------------------
                // Define priors
                // ------------------------------------------------------------------------------------------------------
    
                VariableArray<Beta> RProbPrior;
    
                Variable<Gaussian> wDDPrior;
                Variable<Gamma> noiseDPrecisionPrior;
    
                VariableArray<Gaussian> thresholdMeanPrior;
                VariableArray<Gamma> thresholdPrecisionPrior;
    
                // ------------------------------------------------------------------------------------------------------
                // Set priors
                // ------------------------------------------------------------------------------------------------------
    
                RProbPrior = Variable.Array<Beta>(patient).Named("RProbPrior");
                RProbPrior.ObservedValue = Util.ArrayInit(nPatients2, t => Beta.Uniform());
    
                wDDPrior = Variable.New<Gaussian>().Named("wDDPrior");
                wDDPrior.ObservedValue = Gaussian.FromMeanAndPrecision(0.0, 0.1);
    
                noiseDPrecisionPrior = Variable.New<Gamma>().Named("noiseDPrecisionPrior");
                noiseDPrecisionPrior.ObservedValue = Gamma.FromShapeAndScale(2.0, 1.0);
    
                thresholdMeanPrior = Variable.Array<Gaussian>(thresh).Named("thresholdMeanPrior");
                thresholdMeanPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gaussian.FromMeanAndPrecision(0.0, 0.1));
                thresholdPrecisionPrior = Variable.Array<Gamma>(thresh).Named("thresholdPrecisionPrior");
                thresholdPrecisionPrior.ObservedValue = Util.ArrayInit(nThresh, t => Gamma.FromMeanAndVariance(2.0, 0.0001));
    
                // ------------------------------------------------------------------------------------------------------
                // Define latent variables
                // ------------------------------------------------------------------------------------------------------
    
                // Define P, B, R and D (to store linear combination)
                var R = Variable.Array(Variable.Array<bool>(day), patient).Named("R");
                var D = Variable.Array(Variable.Array<double>(day), patient).Named("D");
    
                // Define (potentially) observed variables
                // S
                var Smin = Variable.Array(Variable.Array<int>(day), patient).Named("Smin");
                Smin[patient][day] = Variable.DiscreteUniform(nSev).ForEach(patient, day);
                var Smax = Variable.Array(Variable.Array<int>(day), patient).Named("Smax");
                Smax[patient][day] = Variable.DiscreteUniform(nSev).ForEach(patient, day);
    
                // R
                var RProb = Variable.Array<double>(patient).Named("RProb");
                RProb[patient] = Variable<double>.Random(RProbPrior[patient]);
                R[patient][day] = Variable.Bernoulli(RProb[patient]).ForEach(day);
    
                // Equation D
                // wDD
                Variable<double> wDD = Variable<double>.Random(wDDPrior).Named("wDD");
                // noiseD
                Variable<double> noiseDPrecision = Variable<double>.Random(noiseDPrecisionPrior).Named("noiseDPrecision");
                // thresholds
                var thresholdMean = Variable.Array<double>(thresh).Named("thresholdMean");
                thresholdMean[thresh] = Variable<double>.Random(thresholdMeanPrior[thresh]);
                var thresholdPrecision = Variable.Array<double>(thresh).Named("thresholdPrecision");
                thresholdPrecision[thresh] = Variable<double>.Random(thresholdPrecisionPrior[thresh]);
                var threshold = Variable.Array(Variable.Array<double>(thresh), patient).Named("threshold");
                threshold[patient][0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).ForEach(patient);
                for (int i = 1; i < nSev; i++)
                {
                    threshold[patient][i] = Variable.GaussianFromMeanAndPrecision(thresholdMean[i], thresholdPrecision[i]).ForEach(patient);
                }
                threshold[patient][nSev] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).ForEach(patient);
    
    
    
                // ------------------------------------------------------------------------------------------------------
                // Loop
                // ------------------------------------------------------------------------------------------------------
    
                using (Variable.ForEach(patient))
                {
                    using (var block = Variable.ForEach(day))
                    {
                        var t = block.Index;
    
                        // Initialisation
                        using (Variable.If(t == 0))
                        {
                            D[patient][t] = Variable.GaussianFromMeanAndPrecision(0, noiseDPrecision);
                        }
    
                        // Equations
                        using (Variable.If(t > 0))
                        {
                            using (Variable.If(R[patient][t - 1]))
                            {
                                D[patient][t].SetTo(Variable.GaussianFromMeanAndPrecision(
                                wDD * D[patient][t - 1] + 1,noiseDPrecision));
                            }
                            using (Variable.IfNot(R[patient][t - 1]))
                            {
                                D[patient][t].SetTo(Variable.GaussianFromMeanAndPrecision(
                                    wDD * D[patient][t - 1],noiseDPrecision));
                            }
                        }
                    }
                }
    
                // ------------------------------------------------------------------------------------------------------
                // Declare training data variables and connect to latent variables
                // ------------------------------------------------------------------------------------------------------
    
                var patientData = Variable.Array<int>(observation).Named("patientData");
                var dayData = Variable.Array<int>(observation).Named("dayData");
                //var cData = Variable.Array<int>(observation).Named("cData");
                var SminData = Variable.Array<int>(observation).Named("SminData");
                var SmaxData = Variable.Array<int>(observation).Named("SmaxData");
    
                // Connect to latent variables
                int idxPatient = new int();
                using (var block = Variable.ForEach(observation))
                {
                    var i = block.Index;
    
                    idxPatient = Array.IndexOf(PIDTrain, patientData[i]); //patientData doesn't necessarily starts at 0
    
                    using (Variable.If(SminData[i] > -1)) // -1 missing value, Smin and Smax ordinal scores between 0 and 10.
                    {
                        Variable.ConstrainEqual(Smin[idxPatient][dayData[i]], SminData[i]);
                    }
    
                    using (Variable.If(SmaxData[i] > -1))
                    {
                        Variable.ConstrainEqual(Smax[idxPatient][dayData[i]], SmaxData[i]);
                    }
    
                }
    
                // ------------------------------------------------------------------------------------------------------
                // Constrains D between Smin and Smax
                // ------------------------------------------------------------------------------------------------------
    
                Smin.SetValueRange(sevSmin);
                Smax.SetValueRange(sevSmax);
                using (Variable.ForEach(patient))
                {
                    using (Variable.ForEach(day))
                    {
                        using (Variable.Switch(Smin[patient][day]))
                        {
                            using (Variable.Switch(Smax[patient][day]))
                            {
                                Variable.ConstrainBetween(D[patient][day],
                                    threshold[patient][Smin[patient][day]+0],
                                    threshold[patient][Smax[patient][day]+1]);
                                
                            }
                        }
                    }
                }
    
    
                // ------------------------------------------------------------------------------------------------------
                // Training data
                // ------------------------------------------------------------------------------------------------------
    
                // Observe counts
                nObs.ObservedValue = patientDataTrain.Length;
                nPatients.ObservedValue = PIDTrain.Length;
                nDays.ObservedValue = ComputeDays(patientDataTrain);
    
                int[] dayDataTrain = SelectColumnRectArray(dataTrain, 1); // Make dayData starts at 0 instead of 1
                for (int i = 0; i < dayDataTrain.Length; i++)
                    dayDataTrain[i] = dayDataTrain[i] - 1;
    
                // Observe data
                patientData.ObservedValue = patientDataTrain;
                dayData.ObservedValue = dayDataTrain;
                //cData.ObservedValue = SelectColumnRectArray(dataTrain, 2);
                SminData.ObservedValue = SelectColumnRectArray(dataTrain, 3);
                SmaxData.ObservedValue = SelectColumnRectArray(dataTrain, 4);
    
                // ------------------------------------------------------------------------------------------------------
                // Run inference
                // ------------------------------------------------------------------------------------------------------
                
                InferenceEngine ie = new InferenceEngine();
                ie.Algorithm = new ExpectationPropagation();
                ie.Compiler.GivePriorityTo(typeof(GaussianProductOp_SHG09)); // For multiplying two gaussians
                //ie.NumberOfIterations = 50;
                //InferenceEngine.DefaultEngine.ShowFactorGraph = true;
    
                // Compute posterior
                var RPosterior = ie.Infer(R);
                
                var wDDPosterior = ie.Infer(wDD);
                var noiseDPrecisionPosterior = ie.Infer(noiseDPrecision);
    
                var thresholdMeanPosterior = ie.Infer(thresholdMean);
                var thresholdPrecisionPosterior = ie.Infer(thresholdPrecision);
    
                res.Add("R Posterior:" + RPosterior);
                res.Add("wDD Posterior:" + wDDPosterior);
                res.Add("noise D Precision Posterior:" + noiseDPrecisionPosterior);
                res.Add("Thresholds Mean Posterior:" + thresholdMeanPosterior);
                res.Add("Thresholds Precision Posterior:" + thresholdPrecisionPosterior);
    
                // ------------------------------------------------------------------------------------------------------
                // Make prediction
                // ------------------------------------------------------------------------------------------------------
    
    
                // ------------------------------------------------------------------------------------------------------
                // Save results
                // ------------------------------------------------------------------------------------------------------
    
                string savePath = @"...";
                string fileName = savePath + "results.txt";
    
                System.IO.File.WriteAllLines(fileName, res.ToArray()); // Save results to text
    
                Console.ReadLine();
    
            }
    
            public static int[,] ReadCSV(string filePath)
            {
                string[][] data = File.ReadLines(filePath).Select(x => x.Split(',')).ToArray();
    
                int r = data.GetLength(0);
    
                int c = data[0].GetLength(0);
                int[,] res = new int[r, c];
    
                for (int j = 0; j < r; j++)
                {
                    for (int i = 0; i < c; i++)
                    {
                        int number;
                        bool ok = int.TryParse(data[j][i], out number);
                        if (ok)
                        {
                            res[j, i] = number;
                        }
                        else
                        {
                            res[j, i] = -1; // Missing value
                        }
                    }
                }
                return (res);
            }
    
            public static T[] SelectColumnRectArray<T>(T[,] arr, int idx)
            {
                T[] res = new T[arr.GetLength(0)];
                for (int i = 0; i < res.GetLength(0); i++)
                {
                    res[i] = arr[i, idx];
                }
                return (res);
            }
    
            public static Tuple<T[,], T[,]> SplitTrainTest<T>(T[,] old, int[] PID, int[] PIDTest)
            {
                Array.Sort(PIDTest);
                int nTest = 0;
                int cpt = 0;
                // Count the number of rows for train
                for (int i = 0; i < old.GetLength(0); i++)
                {
                    if (PIDTest[cpt] < PID[i] & cpt < PIDTest.Length - 1)
                        cpt++;
    
                    if (PID[i] == PIDTest[cpt])
                        nTest++;
                }
    
                T[,] test = new T[nTest, old.GetLength(1)];
                T[,] train = new T[old.GetLength(0) - nTest, old.GetLength(1)];
    
                int iTrain = 0;
                int iTest = 0;
                cpt = 0;
                for (int i = 0; i < old.GetLength(0); i++)
                {
                    if (PIDTest[cpt] < PID[i] & cpt < PIDTest.Length - 1)
                        cpt++;
                    if (PID[i] == PIDTest[cpt])
                    {
                        for (int j = 0; j < old.GetLength(1); j++)
                            test[iTest, j] = old[i, j];
                        iTest++;
                    }
                    else
                    {
                        for (int j = 0; j < old.GetLength(1); j++)
                            train[iTrain, j] = old[i, j];
                        iTrain++;
                    }
    
                }
                return (Tuple.Create(train, test));
            }
    
            public static int[] Unique(int[] inp)
            {
                Array.Sort(inp);
                List<int> res = new List<int>();
                res.Add(inp[0]);
                for (int i = 0; i < inp.Length; i++)
                {
                    if (res.Last() != inp[i])
                    {
                        res.Add(inp[i]);
                    }
                }
                return (res.ToArray());
            }
    
            public static int[] ComputeDays(int[] patientData)
            {
                int[] nDays = new int[Unique(patientData).Length];
    
                int PID = patientData[0];
                int indexPID = 0;
                int count = 0;
                for (int i = 0; i < patientData.Length; i++)
                {
                    if (patientData[i] != PID)
                    {
                        nDays[indexPID] = count;
                        indexPID++;
                        PID = patientData[i];
                        count = 0;
                    }
                    count++;
                }
                nDays[indexPID] = count;
    
                return (nDays);
    
            }
    
        }
    }
    

    Thanks again for taking the time to help me.

    Monday, January 29, 2018 3:19 PM
  • Part of the problem is the way that you have defined the model.  I don't think that you intended to average over all possible ConstrainBetween constraints when Smin and Smax are not observed; instead there should not be any constraints at all.  Also, you cannot use Array.IndexOf to get the patient number in the observation loop.  You need patientData to be observed as the mapped indices.  I got inference to run with the following replacement for the observation and constraint loop:

                D[patient][day].InitialiseTo(new Gaussian(0, 1));
    
                using (var block = Variable.ForEach(observation))
                {
                    var i = block.Index;
    
                    var patientOfObs = patientData[i];
                    var dayOfObs = dayData[i];
    
                    using (Variable.If(SminData[i] > -1 & SmaxData[i] > -1)) // -1 missing value, Smin and Smax ordinal scores between 0 and 10.
                    {
                        Variable.ConstrainBetween(D[patientOfObs][dayOfObs],
                            threshold[patientOfObs][SminData[i] + 0],
                            threshold[patientOfObs][SmaxData[i] + 1]);
                    }
                }
    
                patientData.ObservedValue = patientDataTrain.Select(i => i - 1).ToArray();

    It is also important to initialize D in this model.


    Tuesday, February 6, 2018 7:46 PM
    Owner
  • I was looking into this problem and trying to figure it out! I was wondering about those constraints but it didn't come to my mind that these should not be set.

    By the way, what is the name of the model (1st order markov chain?) and what is the application? This is perhaps a question to author... Anyways, I learned a lot from the model & code.

    UPDATE: I managed the inference running too. The problem was that patientData was observed twice in the code. I need to keep only your version.

    • Edited by usptact Wednesday, February 7, 2018 1:17 AM
    Wednesday, February 7, 2018 12:25 AM
  • When I replace the original block of code for observations and constraints, and use your block of code, I get an error message about index out of bounds. Did I miss anything? Maybe something else was modified in the code?

    Compiling model...done.
    
    Unhandled Exception:
    System.IndexOutOfRangeException: Index was outside the bounds of the array.
      at MicrosoftResearch.Infer.Models.User.Model_EP.Changed_numberOfIterationsDecreased_nObs_patientData_nPatients_dayData_nDays_noiseDPrecisionPrior_wD5 (System.Int32 numberOfIterations) [0x010c3] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.Models.User.Model_EP.Execute (System.Int32 numberOfIterations, System.Boolean initialise) [0x000be] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.Models.User.Model_EP.Execute (System.Int32 numberOfIterations) [0x00001] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.InferenceEngine.Execute (MicrosoftResearch.Infer.IGeneratedAlgorithm ca) [0x00058] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at MicrosoftResearch.Infer.InferenceEngine.InferAll (System.Boolean inferOnlySpecifiedVars, MicrosoftResearch.Infer.Models.IVariable var) [0x00009] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at MicrosoftResearch.Infer.InferenceEngine.Infer[TReturn] (MicrosoftResearch.Infer.Models.IVariable var) [0x00000] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at Patients.Program.Main (System.String[] args) [0x00837] in <c57da25561ea43f8a6cf6ac44f0560db>:0 
    [ERROR] FATAL UNHANDLED EXCEPTION: System.IndexOutOfRangeException: Index was outside the bounds of the array.
      at MicrosoftResearch.Infer.Models.User.Model_EP.Changed_numberOfIterationsDecreased_nObs_patientData_nPatients_dayData_nDays_noiseDPrecisionPrior_wD5 (System.Int32 numberOfIterations) [0x010c3] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.Models.User.Model_EP.Execute (System.Int32 numberOfIterations, System.Boolean initialise) [0x000be] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.Models.User.Model_EP.Execute (System.Int32 numberOfIterations) [0x00001] in <9c21a1186c6f4d798ff56e26e1f58cf1>:0 
      at MicrosoftResearch.Infer.InferenceEngine.Execute (MicrosoftResearch.Infer.IGeneratedAlgorithm ca) [0x00058] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at MicrosoftResearch.Infer.InferenceEngine.InferAll (System.Boolean inferOnlySpecifiedVars, MicrosoftResearch.Infer.Models.IVariable var) [0x00009] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at MicrosoftResearch.Infer.InferenceEngine.Infer[TReturn] (MicrosoftResearch.Infer.Models.IVariable var) [0x00000] in <1c09b726d63a41118a8cecdae81e3110>:0 
      at Patients.Program.Main (System.String[] args) [0x00837] in <c57da25561ea43f8a6cf6ac44f0560db>:0 


    Wednesday, February 7, 2018 1:00 AM
  • Thank you Tom, this reformulation of the problem makes a lot of sense!

    I finally managed to run the inference on the toy dataset, however, I am still having some issues when scaling up to my actual data (I have approximately 300 observations per patient). I can reproduce the problem when I increase the number of observations in the toy dataset (here is a link to a bigger dataset https://we.tl/YKUepPdFdb). The model is compiling but the inference is crashing at ConstrainBetween constraints with the error message “result is nan” or “model has zero probability” depending on my prior for thresholdPrecision (for example, for the latter, Gamma.FromMeanAndVariance(1.0, 0.001)).

    Is there anything more I could try?

    By the way, I am glad to see your interest in this model usptact! This is still ongoing research so I cannot go into a lot of details but I am trying to model the dynamic of eczema. For the moment, I am merely computing an average D of R with a forgetting rate wDD.

    D is a latent symptom score and R corresponds to an event (like a new symptom). The observed variables Smin and Smax refer to two very similar, self-assessed, ordinal symptom scores which, I assume, are derived from D. When this part works, I would like to extend this model in order to predict R with two other variables and set of equations.

    Hope that helps!

    Monday, February 12, 2018 11:30 AM
  • Try different priors.  The bigger dataset works for me with:

                noiseDPrecisionPrior.ObservedValue = Gamma.FromMeanAndVariance(1.0, 1e-1);
    

    Monday, February 12, 2018 1:15 PM
    Owner
  • You are right, a sharper prior on noiseDPrecision also works for me.

    Thanks for everything!

    Monday, February 12, 2018 3:13 PM
  • By the way, instead of changing the prior, you can also change the initialization, like so:

                noiseDPrecision.InitialiseTo(Gamma.FromMeanAndVariance(1.0, 1e-1));
    

    Infer.NET uses the prior as the initialization, which is why these two approaches give similar results.
    Monday, February 12, 2018 3:46 PM
    Owner
  • ghurault,

    Thanks for the explanations on the model. I was wondering what kind of application is it and whether this is a some kind of standard model that you adapted for your research problem. If you based the model on some published model, I would be glad to read about it more.

    My interest stems from curiosity what kinds of probabilistic models people use with Infer.NET to solve practical problems. I work in the field of natural language processing (NLP) and have found probabilistic models to be useful when dealing with noisy crowd annotations. I am constantly looking for more.

    Thanks

    Monday, February 12, 2018 6:43 PM
  • I see, but are the two approaches, changing the prior or changing the initialization, exactly equivalent or is one better than the other?

    usptact, this model is indeed based on a previous one (DOI: 10.1016/j.jaci.2016.10.026), that we adapted and refined so that we can fit our data and use it in a Bayesian framework.

    Tuesday, February 13, 2018 2:56 PM
  • They are not equivalent.  Changing the prior changes your assumptions, so the true posterior will be different.  Changing the initialization does not affect the true posterior, though it may affect the approximation.
    Tuesday, February 13, 2018 3:06 PM
    Owner