Asked by:
Recommended method for implementing timebased models
Question

The Infer.NET FAQ states that HMMs (and presumably, other discretetime models such as DBNs) need to be unrolled and cannot use a single set of variables to represent all time steps.
Suppose we'd like to create a discrete time model that has Markov (or semiMarkov) properties and where the observations for each datum in each time step depend on some of the past state, and that there are absorbing states (so the number of observations is "jagged" across observations.)
Is the only way to estimate this model with Infer.NET to create a different variable for each datum and each time step where there is an observation? It appears we can't even use VariableArrays for the time steps because there is a varying number of steps for each piece of data. If this is the case, for what orders of magnitude of n (number of data) and m (average number of timesteps) will Infer.NET be able to estimate this model in a reasonable amount of time?
Am I missing a more efficient way to structure this model? I would love any pointers if something better exists.
Thanks!
Wednesday, May 7, 2014 5:00 AM
All replies

You can use the method described in How to represent large irregular graphs to create any wiring between elements of VariableArrays, though you will not get a customized message schedule. I could give more help if you explain a bit more how you want the variables in your model to be connected.Wednesday, May 7, 2014 5:57 PMOwner

Hi Tom,
I have some data that corresponds to observations of user behavior over time. Each observation consists of a user's "session" which is a sequence of observed activities. I'd like to model each activity in this sequence as a random variable that we observed, generated from parent variables that can either represent latent types or latent states  in the simplest case, just a Markov dependency on the past state. Each sequence has a different length, and we'd like to learn what these latent values are for each user. Is there a way to connect this together without making a separate variable for each time step and for each user?
I can phrase this question in a more abstract way: let's say we wanted to estimate an HMM for observation sequences of different lengths  i.e., there is an absorbing state. The FAQ already suggests we need to unroll the HMM for each time step. However, if each observation has a different number of observed time steps, do we now need to make a separate set of variables for each observation?
If the answer to the above question is "no", then I think I can generalize whatever approach that fixes it to the more complicated timedependent models.
Thanks,
Andrew
Thursday, May 8, 2014 3:41 AM 
Hi Tom,
I figured out that I could create this model by basically unrolling an HMM to have the maximum number of timesteps that I observed over my data. When I input the data, I put in arbitrary values for users for which I don't observe the full sequence of data, which should be ignored in the blocks below. The model looks like the following:
// Set up the HMM states k = new Range(numStates).Named("k"); initialProbs = Variable.DirichletUniform(k).Named("initialProbs"); transitionProbs = Variable.Array<Vector>(k).Named("transitionProbs"); using (Variable.ForEach(k)) { transitionProbs[k] = Variable.DirichletUniform(k); } taskCountDist = Variable.Array<double>(k).Named("taskCountDist"); returnProbDist = Variable.Array<double>(k).Named("returnProbDist"); using (Variable.ForEach(k)) { taskCountDist[k] = Variable.GammaFromShapeAndScale(1000, 0.01); returnProbDist[k] = Variable.Beta(1, 1); } numInputs = Variable.New<int>().Named("numInputs"); d = new Range(numInputs).Named("d"); state = new VariableArray<int>[maxSteps]; tasksDone = new VariableArray<int>[maxSteps]; returned = new VariableArray<bool>[maxSteps]; // Set up the unrolled HMM steps for (int n = 0; n < maxSteps; n++) { state[n] = Variable.Array<int>(d); tasksDone[n] = Variable.Array<int>(d); returned[n] = Variable.Array<bool>(d); using (Variable.ForEach(d)) { if (n == 0) { // First state is drawn from initial probs state[n][d] = Variable.Discrete(initialProbs); using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } else { // State is drawn from previous state, if user returned using (Variable.If(returned[n  1][d])) { state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]); using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } } } }
However, when I try to do inference on this model, I'm getting the following error:
Additional information: Cannot index vDirichlet1 by vint__0[d] since vDirichlet1 has an implicit dependency on k. Try making the dependency explicit by putting vDirichlet1 into an array indexed by k
I have a feeling that the following line is causing this:
state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]);
This line is basically trying to model the Markov transition probability of the HMM. However because `state[n1][d]` is not in a with block, I'm guessing I cannot index into `transitionProbs` with it. Do you have any suggestions for how I should approach this or about the way I've set up the model in genera?
Any help with this would be really appreciated.
Thanks,
Andrew
Friday, May 16, 2014 6:04 AM 
Defining the transition probabilities a different way seems to have eliminated that error. However, it's not clear to me how this is different than the using(Variable.ForEach(k)) method.
transitionProbs = Variable.Array<Vector>(k).Named("transitionProbs"); transitionProbs[k] = Variable.DirichletUniform(k).ForEach(k);
Now, I'm getting a different error, as shown below. Clearly this has to do with the fact that the state in the next timestep is only defined if the user has returned in the previous timestep. Since not all data is observed in all timesteps for all users, do I still need to define the value of the state for when users have gone?
Exception:Thrown: "GateTransform failed with 14 error(s) and 0 warning(s): Error 0: 'state_1' is not defined in all cases. It is only defined for (returned_0[d]=true) in { Dirichlet vDirichlet0; vDirichlet0 = Dirichlet.Uniform(3); Vector initialProbs; initialProbs = Factor.Random<Vector>(vDirichlet0); Vector[] transitionProbs = new Vector[3]; Dirichlet vDirichlet1; vDirichlet1 = Dirichlet.Uniform(3); for(int k = 0; k<3; k++) { transitionProbs[k] = Factor.Random<Vector>(vDirichlet1); } double[] taskCountDist = new double[3]; for(int k = 0; k<3; k++) { taskCountDist[k] = Gamma.Sample(1000.0, 0.01); } double[] returnProbDist = new double[3]; Beta vBeta0 = Beta.Uniform(); for(int k = 0; k<3; k++) { returnProbDist[k] = Factor.Random<double>(vBeta0); } InferNet.Infer(numInputs); int[] state_0 = new int[numInputs]; InferNet.Infer(tasksDone_0); InferNet.Infer(returned_0); for(int d = 0; d<numInputs; d++) { state_0[d] = Factor.Discrete(initialProbs); } for(int d = 0; d<numInputs; d++) { for(int k =" (MicrosoftResearch.Transforms.TransformFailedException) A MicrosoftResearch.Transforms.TransformFailedException was thrown: "GateTransform failed with 14 error(s) and 0 warning(s): Error 0: 'state_1' is not defined in all cases. It is only defined for (returned_0[d]=true) in { Dirichlet vDirichlet0; vDirichlet0 = Dirichlet.Uniform(3); Vector initialProbs; initialProbs = Factor.Random<Vector>(vDirichlet0); Vector[] transitionProbs = new Vector[3]; Dirichlet vDirichlet1; vDirichlet1 = Dirichlet.Uniform(3); for(int k = 0; k<3; k++) { transitionProbs[k] = Factor.Random<Vector>(vDirichlet1); } double[] taskCountDist = new double[3]; for(int k = 0; k<3; k++) { taskCountDist[k] = Gamma.Sample(1000.0, 0.01); } double[] returnProbDist = new double[3]; Beta vBeta0 = Beta.Uniform(); for(int k = 0; k<3; k++) { returnProbDist[k] = Factor.Random<double>(vBeta0); } InferNet.Infer(numInputs); int[] state_0 = new int[numInputs]; InferNet.Infer(tasksDone_0); InferNet.Infer(returned_0); for(int d = 0; d<numInputs; d++) { state_0[d] = Factor.Discrete(initialProbs); } for(int d = 0; d<numInputs; d++) { for(int k =" Time: 5/16/2014 2:17:14 AM Thread:<No Name>[11376]
Friday, May 16, 2014 6:23 AM 
In an attempt to appease the compiler, I tried the following modification:
// State is drawn from previous state, if user returned using (Variable.If(returned[n  1][d])) { state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]); using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } // Do some crap to shut up the model using (Variable.IfNot(returned[n  1][d])) { state[n][d] = Variable.Constant(0); tasksDone[n][d] = Variable.Constant(0); returned[n][d] = Variable.Constant(false); }
This results in the error
Exception:Thrown: "IndexingTransform failed with 14 error(s) and 0 warning(s): Error 0: Indexing by a random variable 'state_0'. You must wrap this statement with Variable.Switch(state_0[d]) in transitionProbs[state_0[d]] Error 1: Indexing by a random variable 'state_1'. You must wrap this statement with Variable.Switch(state_1[d]) in transitionProbs[state_1[d]] Error 2: Indexing by a random variable 'state_2'. You must wrap this statement with Variable.Switch(state_2[d]) in transitionProbs[state_2[d]] Error 3: Indexing by a random variable 'state_3'. You must wrap this statement with Variable.Switch(state_3[d]) in transitionProbs[state_3[d]] Error 4: Indexing by a random variable 'state_4'. You must wrap this statement with Variable.Switch(state_4[d]) in transitionProbs[state_4[d]] Error 5: Indexing by a random variable 'state_5'. You must wrap this statement with Variable.Switch(state_5[d]) in transitionProbs[state_5[d]] Error 6: Indexing by a random variable 'state_6'. You must wrap this statement with" Time: 5/16/2014 2:29:55 AM Thread:<No Name>[3280]
However, it seems impossible to use a Variable.Switch(state[n1][d]) block, as it results in the following error:
Exception:Thrown: "Range 'k' is already open in a ForEach or Switch block" (System.InvalidOperationException) A System.InvalidOperationException was thrown: "Range 'k' is already open in a ForEach or Switch block" Time: 5/16/2014 2:27:41 AM Thread:<No Name>[10876]
Will I need to unroll the range K for each stage of the HMM as well?
It seems like I am just shooting in the dark here. I'd appreciate any direction about structuring any of the above issues.
Friday, May 16, 2014 6:33 AM 
Using a prior thread, I believe I've figured out how to wire up the HMM in an acceptable way. However, the absorbing state is still causing some issues for me. I'll ask a question about that in a separate thread.
// State is drawn from previous state, if user returned; figured this out from // http://social.microsoft.com/Forums/enUS/cd249a2372f54fe7bfcb79601998ca2d/aninfluencemodelmigratedfromcommunityresearchmicrosoftcom?forum=infer.net using (Variable.If(returned[n1][d])) { using (Variable.Switch(state[n1][d])) { state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]); } using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } // Do some crap to shut up the model using (Variable.IfNot(returned[n1][d])) { state[n][d] = Variable.Constant(0); tasksDone[n][d] = Variable.Constant(0); returned[n][d] = Variable.Constant(false); }
Friday, May 16, 2014 4:51 PM 
This is the final model I've come up with that seems to compile and work, at least for small amounts of data. Specifically, the unrolled HMM part is below.
My dataset has d = 55,000 and n = 640, and the model takes an exceedingly long time to compile (it's been going for an hour so far.) Is there any way to rewire the code so that this can happen faster? Specifically, I feel that the way the absorbing state is modeled is inefficient and has been mainly done to get the compiler to stop complaining. There must be a better way to do this.
Thanks much!
k = new Range(numStates).Named("k"); initialProbs = Variable.DirichletUniform(k).Named("initialProbs"); transitionProbs = Variable.Array<Vector>(k).Named("transitionProbs"); transitionProbs[k] = Variable.DirichletUniform(k).ForEach(k); taskCountDist = Variable.Array<double>(k).Named("taskCountDist"); taskCountDist[k] = Variable.GammaFromShapeAndScale(1000, 0.01).ForEach(k); returnProbDist = Variable.Array<double>(k).Named("returnProbDist"); returnProbDist[k] = Variable.Beta(1, 1).ForEach(k); numInputs = Variable.New<int>().Named("numInputs"); d = new Range(numInputs).Named("d"); state = new VariableArray<int>[maxSteps]; tasksDone = new VariableArray<int>[maxSteps]; returned = new VariableArray<bool>[maxSteps]; var noTasks = Variable.Constant(0); var absorbed = Variable.Constant(false); // Set up the unrolled HMM steps for (int n = 0; n < maxSteps; n++) { state[n] = Variable.Array<int>(d).Named("state_" + n); tasksDone[n] = Variable.Array<int>(d).Named("tasksDone_" + n); returned[n] = Variable.Array<bool>(d).Named("returned_" + n); using (Variable.ForEach(d)) { if (n == 0) { // First state is drawn from initial probs state[n][d] = Variable.Discrete(initialProbs); using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } else { // State is drawn from previous state, if user returned; figured this out from // http://social.microsoft.com/Forums/enUS/cd249a2372f54fe7bfcb79601998ca2d/aninfluencemodelmigratedfromcommunityresearchmicrosoftcom?forum=infer.net using (Variable.If(returned[n1][d])) { using (Variable.Switch(state[n1][d])) { state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]); } using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } // Do some crap to shut up the model using (Variable.IfNot(returned[n1][d])) { state[n][d] = state[n1][d]; // Otherwise null pointer exception tasksDone[n][d] = noTasks; returned[n][d] = absorbed; } } } }
Friday, May 16, 2014 6:21 PM 
With d = 1000 and n=64, the model compiles, but inference terminates abnormally. Because there are only 64 stages of VariableArrays, this seems like it should be pretty efficient. Could you give a sense of the order of magnitude for which this model would work?
I am able to compile and estimate the model for d=50000 and n=30, so it is clearly the unrolling of the Markov process that is causing scalability issues. However, because the way the model is set up instantiates a lot of dummy variables even for users for which we don't have data, is it possible to eliminate the `Variable.IfNot` component shown above? Or do you have any other suggestions for making the code more efficient?
Thanks,
Andrew
 Edited by Andrew Mao Friday, May 16, 2014 8:09 PM
Friday, May 16, 2014 7:09 PM 
To handle that many observations, you will need to use batching with SharedVariables. Below I've got you started with a running example.
int numStates = 2; int maxSteps = 50; var k = new Range(numStates).Named("k"); #if useShared int batchCount = 1; var probPrior = Dirichlet.Uniform(numStates); SharedVariable<Vector> initialProbsShared = SharedVariable<Vector>.Random(probPrior).Named("initialProbsShared"); SharedVariableArray<Vector> transitionProbsShared = SharedVariable<Vector>.Random(k, new DistributionRefArray<Dirichlet,Vector>(numStates, i => probPrior)).Named("transitionProbsShared"); var taskCountPrior = Gamma.FromShapeAndScale(1000, 0.01); SharedVariableArray<double> taskCountDistShared = SharedVariable<double>.Random(k, new DistributionStructArray<Gamma, double>(numStates, i => taskCountPrior)); SharedVariableArray<double> returnProbDistShared = SharedVariable<double>.Random(k, new DistributionStructArray<Beta, double>(numStates, i => Beta.Uniform())); Model model = new Model(batchCount); var initialProbs = initialProbsShared.GetCopyFor(model).Named("initialProbs"); initialProbs.SetValueRange(k); var transitionProbs = transitionProbsShared.GetCopyFor(model).Named("transitionProbs"); transitionProbs.SetValueRange(k); var taskCountDist = taskCountDistShared.GetCopyFor(model); var returnProbDist = returnProbDistShared.GetCopyFor(model); #else var initialProbs = Variable.DirichletUniform(k).Named("initialProbs"); var transitionProbs = Variable.Array<Vector>(k).Named("transitionProbs"); transitionProbs[k] = Variable.DirichletUniform(k).ForEach(k); var taskCountDist = Variable.Array<double>(k).Named("taskCountDist"); taskCountDist[k] = Variable.GammaFromShapeAndScale(1000, 0.01).ForEach(k); var returnProbDist = Variable.Array<double>(k).Named("returnProbDist"); returnProbDist[k] = Variable.Beta(1, 1).ForEach(k); #endif var numInputs = Variable.New<int>().Named("numInputs"); var d = new Range(numInputs).Named("d"); var state = new VariableArray<int>[maxSteps]; var tasksDone = new VariableArray<int>[maxSteps]; var returned = new VariableArray<bool>[maxSteps]; var noTasks = Variable.Constant(0); var absorbed = Variable.Constant(false); var stateInit = new VariableArray<Discrete>[maxSteps]; // Set up the unrolled HMM steps for (int n = 0; n < maxSteps; n++) { state[n] = Variable.Array<int>(d).Named("state_" + n); tasksDone[n] = Variable.Array<int>(d).Named("tasksDone_" + n); returned[n] = Variable.Array<bool>(d).Named("returned_" + n); stateInit[n] = Variable.Array<Discrete>(d).Named("stateInit_" + n); state[n][d].InitialiseTo(stateInit[n][d]); using (Variable.ForEach(d)) { if (n == 0) { // First state is drawn from initial probs state[n][d] = Variable.Discrete(initialProbs); using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } else { // State is drawn from previous state, if user returned; figured this out from // http://social.microsoft.com/Forums/enUS/cd249a2372f54fe7bfcb79601998ca2d/aninfluencemodelmigratedfromcommunityresearchmicrosoftcom?forum=infer.net using (Variable.If(returned[n  1][d])) { using (Variable.Switch(state[n  1][d])) { state[n][d] = Variable.Discrete(transitionProbs[state[n  1][d]]); } using (Variable.Switch(state[n][d])) { tasksDone[n][d] = Variable.Poisson(taskCountDist[state[n][d]]); returned[n][d] = Variable.Bernoulli(returnProbDist[state[n][d]]); } } using (Variable.IfNot(returned[n  1][d])) { state[n][d] = Variable.Copy(state[n  1][d]); // Otherwise null pointer exception tasksDone[n][d] = noTasks; returned[n][d] = absorbed; } } } } int numUsers = 50; numInputs.ObservedValue = numUsers; for (int i = 0; i < maxSteps; i++) { tasksDone[i].ObservedValue = Util.ArrayInit(numUsers, j => j); returned[i].ObservedValue = Util.ArrayInit(numUsers, j => (j < 50)); stateInit[i].ObservedValue = Util.ArrayInit(numUsers, u => Discrete.PointMass(Rand.Int(numStates), numStates)); } Vector initialProbsTrue = Vector.FromArray(0.7, 0.3); Vector[] transitionProbsTrue = new Vector[] { Vector.FromArray(0.8, 0.2), Vector.FromArray(0.2, 0.8) }; double[] taskCountDistTrue = new double[] { 5, 10 }; for (int u = 0; u < numUsers; u++) { int st = Rand.Sample(initialProbsTrue); for (int i = 0; i < maxSteps; i++) { if (i > 0) st = Rand.Sample(transitionProbsTrue[st]); tasksDone[i].ObservedValue[u] = Poisson.Sample(taskCountDistTrue[st]); } } InferenceEngine engine = new InferenceEngine(); engine.Algorithm = new VariationalMessagePassing(); #if useShared for (int pass = 0; pass < 1; pass++) { // Run the inference on each data set for (int c = 0; c < batchCount; c++) { // set numInputs, stateInit, etc. here model.InferShared(engine, c); } } var initialProbsPost = initialProbsShared.Marginal<Dirichlet>(); var transitionProbsPost = transitionProbsShared.Marginal<IList<Dirichlet>>(); var taskCountPost = taskCountDistShared.Marginal<IList<Gamma>>(); #else var initialProbsPost = engine.Infer<Dirichlet>(initialProbs); var transitionProbsPost = engine.Infer<IList<Dirichlet>>(transitionProbs); var taskCountPost = engine.Infer(taskCountDist); #endif Console.WriteLine(StringUtil.JoinColumns("initialProbs = ", initialProbsPost.GetMean(), " truth = ", StringUtil.ToString(initialProbsTrue))); Console.WriteLine(StringUtil.JoinColumns("transitionProbs = ", StringUtil.ToString(transitionProbsPost.Select(p => p.GetMean())), " truth = ", StringUtil.ToString(transitionProbsTrue))); Console.WriteLine(StringUtil.JoinColumns("taskCountDist = ", taskCountPost, " truth = ", StringUtil.ToString(taskCountDistTrue)));
Saturday, May 17, 2014 2:48 PMOwner 
Did you compile your application for a 64bit architecture? Just verifying if you made use of all available memory...Sunday, May 18, 2014 10:22 PM

Hi everyone,
Thanks for your responses.
@Yordan  I am running on 64bit. I have verified that the app uses over 4GB of RAM. However, I imagine the stack overflow exception is not necessarily related to RAM usage.
@Tom  thanks for the pointer to use SharedVariables to improve performance, and I'll try to understand what they do and put them to use. Sorry that you had to figure out the boilerplate code to set up the model yourself.
I have some other questions, if you would please give a quick pointer:
 For the cases where the Markov process is absorbed and there are no further observations, is `state[n][d] = Variable.Copy(state[n  1][d])` the optimal way to express this? What is the purpose of the stateInit variable array?
 When I ran this code, I copied a practice from elsewhere that the Discrete distributions corresponding to state were initialized randomly "to break symmetry", which I imagine is necessary for one of the inference algorithms. It seems you haven't done this  when is this necessary?
Tuesday, May 20, 2014 4:31 PM 
In my code, symmetry breaking is done by stateInit. The way that you model state in the "not returned" case shouldn't matter. I'm not sure what you mean by "optimal" here.Tuesday, May 20, 2014 4:42 PMOwner

I meant optimal as the best way to make use of available memory when doing inference, and not instantiating more variables that would slow down the algorithm.
My ultimate goal with this model is to be able to train it on a group of observations, and then make predictions about a user's future behavior given some truncated (partial) observations, even moving away from the strictly Markov case. I could compute this manually for now using the properties of a Markov chain, but ideally I'd like to be able to use the model to do it.
 Does the above configuration allow for this, or will I need to do it a different way?
 Will I need to create dummy sets of variables to allow for these types of queries?
For example, if I set the observedValues of some variables earlier in the chain, and ask for the distribution of a user's state later in the chain, it seems that the way that the absorbing state is modeled is hard to use right now. A simple query would be "the probability that the user is still present at X sessions after the last observation". Is there a better way to structure the model for this?
Tuesday, May 20, 2014 4:53 PM 
You will probably want to make a separate model set up specifically for prediction. But otherwise I don't see any problems with what you propose.Tuesday, May 27, 2014 11:10 AMOwner

Is the idea to set up the priors on the separate model to the approximated posterior distributions in the estimated model, then modify the observed values to make one prediction at a time?Saturday, June 21, 2014 5:52 AM

Yes.Monday, June 23, 2014 8:39 AMOwner