Partially observed values

Question

Hi All,
I'm new to Infer.NET and I find it really interesting.
While playing with some examples and tutorials, I ran into a question I don't know how to resolve. Basically, I am wondering whether we can specify partially observed values for array variables.
For example, the document http://research.microsoft.com/enus/um/cambridge/projects/infernet/docs/cloning%20ranges.aspx uses the Mixed-Membership Stochastic Blockmodel as an example of cloning ranges.
When hooking up the data, it has the following:
Y.ObservedValue = YObs;
where YObs is a 2D array of all the observed data. My question is: what if some of the entries are not observed? How can I specify this for the InferenceEngine?
I tried calling ClearObservedValue on individual entries of Y,
Y[a][b].ClearObservedValue();
but it failed: the error message told me to use the method on the array rather than on an individual entry.
What should I do to leave some entries unobserved? Should I change the declaration of Y?
Thanks,
Zheng
Tuesday, June 18, 2013 7:57 AM
All replies

Hi Zheng
There are a couple of ways to partially observe array variables. http://research.microsoft.com/enus/um/cambridge/projects/infernet/docs/How%20to%20handle%20missing%20data.aspx gives one way, i.e. conditioning on an observed VariableArray<bool> (or a more complex array). Alternatively, you can use the SubArray factor with an observed VariableArray<int> to pull out a subset of the elements. With both of these methods you can set the observation pattern at run time, so there will be no recompilation of the model (as there would be if you were able to clear individual values).
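For reference, a minimal sketch of the SubArray approach might look like the following (the variable names, sizes, and data here are illustrative, not taken from the blockmodel example):

```csharp
// Hypothetical sketch: observe only a subset of an array's elements via SubArray.
Range item = new Range(5);
Range obs = new Range(3);
var x = Variable.Array<double>(item);
x[item] = Variable.GaussianFromMeanAndPrecision(0, 1).ForEach(item);

// Indices of the observed entries; settable at run time.
var obsIndices = Variable.Array<int>(obs);
obsIndices.ObservedValue = new int[] { 0, 2, 4 };

// SubArray pulls out the selected elements, which can then be observed.
var xObs = Variable.Subarray(x, obsIndices);
xObs.ObservedValue = new double[] { 1.1, -0.3, 0.7 };
```

Changing obsIndices.ObservedValue between runs changes the observation pattern without recompiling the model.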
John
 Marked as answer by JakeZChen Wednesday, June 26, 2013 4:02 AM
Wednesday, June 19, 2013 7:52 AM | Owner
Hi John,
Thank you so much for the answer. I think I now understand how to leave a variable unobserved. However, I still have some questions about prediction with the mixed-membership stochastic blockmodel. I mainly followed the example on the "Cloning Ranges" page and the prediction example in the Bayes Point Machine tutorial.
Basically, I am interested in the link prediction problem. After inferring pi and B given a partially observed Y, I would like to know the posterior of Y for the entries that are unobserved.
Here is what I did for the missing data.
Say I generated a mask which sets the entries of missing links to true (called isMissing below).
VariableArray2D<bool> isTestingVar = Variable.Observed(isMissing, p, q);
using (Variable.ForEach(p))
{
    using (Variable.ForEach(q))
    {
        using (Variable.IfNot(isTestingVar[p, q]))
        {
            var z1 = Variable.Discrete(pi[p]).Named("z1"); // Draw initiator membership indicator
            var z2 = Variable.Discrete(pi[q]).Named("z2"); // Draw receiver membership indicator
            z2.SetValueRange(kq);
            using (Variable.Switch(z1))
            using (Variable.Switch(z2))
                Y[p][q] = Variable.Bernoulli(B[z1, z2]); // Sample interaction value
        }
    }
}
where isMissing is the 2D array of missing-value indicators.
Running the code as in the example, I can get the posteriors of pi and B:
Dirichlet[] posteriorPi = engine.Infer<Dirichlet[]>(pi);
Beta[,] posteriorB = engine.Infer<Beta[,]>(B);
Inspired by the Bayes Point Machine tutorial, I assume that I need to build a new model for prediction. The idea, I think, is to create a new pi and B, using posteriorPi and posteriorB as their priors respectively, and to create a new yTest variable with the same structure as Y but without any observed data. yTest is linked to the other variables exactly as in the original tutorial (without the missing-variable part).
Here is what I do for the prediction.
var yTest = Variable.Array(Variable.Array<bool>(q), p);
VariableArray<Vector> pPi = Variable.Array<Vector>(p);
VariableArray2D<double> pB = Variable.Array<double>(kp, kq);
for (int i = 0; i < N; ++i)
{
    pPi[i] = Variable.Random(posteriorPi[i]);
}
for (int i = 0; i < K; ++i)
{
    for (int j = 0; j < K; ++j)
    {
        pB[i, j] = Variable.Random(posteriorB[i, j]);
    }
}
using (Variable.ForEach(p))
{
    using (Variable.ForEach(q))
    {
        var z1 = Variable.Discrete(pPi[p]).Named("z1"); // Draw initiator membership indicator
        var z2 = Variable.Discrete(pPi[q]).Named("z2"); // Draw receiver membership indicator
        z1.SetValueRange(kp); // It seems that I have to add this line, which is not in the tutorial
        z2.SetValueRange(kq);
        using (Variable.Switch(z1))
        using (Variable.Switch(z2))
            yTest[p][q] = Variable.Bernoulli(pB[z1, z2]); // Sample interaction value
    }
}
var postY = engine.Infer<Bernoulli[,]>(yTest);
The loop part seems very ugly, but I cannot find a better way to do it. I tried the following:
VariableArray<Vector> pPi = Variable.Random(posteriorPi);
but it doesn't work; it seems the engine cannot figure out that the random variable corresponding to posteriorPi should be a VariableArray<Vector>. Please let me know if there is a more elegant way to do it.
My hope is that postY would in the end hold the posterior Bernoulli distributions that fill in the unobserved entries.
However, the results are strange: the mean values (or point estimates) of postY are 0 for all entries. I think there might be something wrong with my code; at the very least, postY should recover some of the training entries in YObs.
Thanks a lot for answering my questions.
Best Regards,
Zheng
Thursday, June 20, 2013 4:33 AM 
Hi Zheng
If you want to feed posteriorPi back into a prediction model, you need to do something like the following:
var vPiPosterior = Variable.Array<Dirichlet>(p);
vPiPosterior.ObservedValue = posteriorPi;
var pPi = Variable.Array<Vector>(p);
pPi[p] = Variable<Vector>.Random(vPiPosterior[p]);
However, you don't need a separate model for what you want to do. Here is a way you can modify the existing model to learn and predict at the same time:
Add two variables YPred and YIsObserved:
var YPred = Variable.Array(Variable.Array<bool>(q), p);
var YIsObserved = Variable.Array(Variable.Array<bool>(q), p);
// Example values indicating what is observed
YIsObserved.ObservedValue = new bool[][]
{
    new bool[] { true, true, false, true, false },
    new bool[] { true, true, false, true, true },
    new bool[] { false, true, false, true, false },
    new bool[] { true, false, false, true, true },
    new bool[] { true, true, true, true, true },
};
Then modify the inner loop as follows:
using (Variable.ForEach(p))
{
    using (Variable.ForEach(q))
    {
        var z1 = Variable.Discrete(pi[p]).Named("z1"); // Draw initiator membership indicator
        var z2 = Variable.Discrete(pi[q]).Named("z2"); // Draw receiver membership indicator
        z2.SetValueRange(kq);
        using (Variable.Switch(z1))
        {
            using (Variable.Switch(z2))
            {
                using (Variable.If(YIsObserved[p][q]))
                {
                    Y[p][q] = Variable.Bernoulli(B[z1, z2]); // Sample interaction value
                    YPred[p][q] = Variable.Copy(Y[p][q]);
                }
                using (Variable.IfNot(YIsObserved[p][q]))
                {
                    YPred[p][q] = Variable.Bernoulli(B[z1, z2]); // Sample interaction value
                }
            }
        }
    }
}
Then you can get the predictions as follows:
var posteriorYPred = engine.Infer<Bernoulli[][]>(YPred);
John
 Marked as answer by JakeZChen Wednesday, June 26, 2013 4:09 AM
Monday, June 24, 2013 8:52 AM | Owner
Hi John,
Thanks again for your reply. I was able to make it work using the second method after I posted my questions; sorry I didn't post my own solution earlier.
The first solution still seems a little obscure to me. As I understand it, posteriorPi is of type Dirichlet[], and Variable.Random takes distributions as arguments and constructs random variables. Why do we need an intermediate layer that wraps it in a variable array, sets its ObservedValue, and then applies Variable.Random? In other words, why does the Bayes Point Machine tutorial have no such indirection, applying Variable.Random directly to the inferred posterior?
My final question is about the SetValueRange function. Why is it needed for z2 inside the loop but not for z1? I searched the related documents but could not find any clue.
In the end, thanks again for your detailed answer and example code. It is really very helpful!
Best regards,
Zheng
Wednesday, June 26, 2013 4:08 AM 
Hi Zheng
Two good questions!
1. You are correct that you do not need the additional layer; in other words, you can use the unary Random factor directly on the distribution. Wrapping it in a VariableArray<Dirichlet> does one additional thing: it means the distribution itself can be observed at run time, so you can use the same model for training and testing. Anything wrapped in Variable or VariableArray causes a corresponding variable of the associated .NET type to be generated in the inference code. Usually these will be random variables that we want to infer, but sometimes they will be variables that we want to observe at run time. When the type in the Variable or VariableArray is a distribution, it will always be because we want to observe the distribution at run time (we cannot infer a distribution type; we can only infer its parameters).
2. z1 derives its value range kp automatically from pi (which is a Dirichlet over kp). Since z2 is also derived from pi, it will, by default, also pick up the value range kp, so we need to explicitly change it to kq.
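To make the first point concrete, here is a minimal sketch (the names piFixed, piPrior, and piRuntime are illustrative) contrasting the two uses of the unary Random factor:

```csharp
// Direct: the distribution is a compile-time constant; changing it
// would require recompiling the model.
Variable<Vector> piFixed = Variable<Vector>.Random(new Dirichlet(1, 1, 1));

// Wrapped: the distribution is an observed variable, so it can be
// changed at run time without recompilation.
Variable<Dirichlet> piPrior = Variable.New<Dirichlet>();
Variable<Vector> piRuntime = Variable<Vector>.Random(piPrior);
piPrior.ObservedValue = new Dirichlet(1, 1, 1);
```

The VariableArray<Dirichlet> in the earlier snippet is the array form of the second pattern.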
John
Wednesday, June 26, 2013 8:43 AM | Owner
Hi John,
Thank you again for the detailed answers. I think I get it now. I'll keep playing with the codes and examples and will come to the forum again if there is anything unclear.
Best regards,
Zheng
Thursday, June 27, 2013 6:13 PM