Asked by:
Feedback and Questions on documentation
Question

1) ObservedValue is/was a source of confusion for me:
First, the documentation does not define what it does: http://research.microsoft.com/en-us/um/cambridge/projects/infernet/codedoc/html/P_MicrosoftResearch_Infer_Models_Variable_1_ObservedValue.htm
So there is only one observed value per variable? You cannot have more than one observed value at the same time, right?
2) Why do we need to define a random variable for each data point in Tutorial 3: Learning a Gaussian?
for (int i = 0; i < data.Length; i++)
{
Variable<double> x = Variable.GaussianFromMeanAndPrecision(mean, precision);
x.ObservedValue = data[i];
}
Why create and assign "x" without actually using it? How do we combine all these variables to produce the posterior over the mean and precision? And how is the learning phase done: set a value as a prior, infer, then use the result as the prior for the next step? Is that applied here or not?
 Edited by Tonata Monday, June 22, 2015 4:13 PM
Monday, June 22, 2015 3:55 PM
All replies

So it creates 100 Gaussian factors, and then inference actually works backwards to reconstruct the mean and precision in the case where these factors have the provided observed values. Is that correct?
 Edited by Tonata Monday, June 22, 2015 4:17 PM
Monday, June 22, 2015 4:11 PM 
Hi Tonata,
1) The observed value is one per variable. If the variable is an array, the whole array has to be either observed or unobserved.
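A minimal sketch of this, assuming the standard Infer.NET 2.x modelling API (variable names are illustrative only):

```csharp
// A scalar variable has exactly one ObservedValue; assigning it
// again replaces the previous observation rather than adding a second one.
Variable<double> x = Variable.GaussianFromMeanAndPrecision(0, 1);
x.ObservedValue = 2.5;   // observed
x.ObservedValue = 3.0;   // replaces 2.5; there is never more than one

// An array variable is observed as a whole: all elements or none.
Range n = new Range(3);
VariableArray<double> xs = Variable.Array<double>(n);
xs.ObservedValue = new double[] { 1.0, 2.0, 3.0 };
```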
2) We do use the variable "x": we created a reference to it, and the variables are combined using factors. Your understanding of learning is correct, but let me elaborate a little on it. I'd order the steps as 1) define a model, 2) set the priors, 3) observe data, and 4) run inference. The posteriors are inferred at step 4. At this point you can set the priors to the inferred posteriors and observe more training data; running inference again in this setup is called online/incremental learning. Or you can set the priors to the inferred posteriors, unobserve the training data, and infer the variables that were previously observed to the training data; this is called prediction.
Please take a look at the answer to this forum post and let me know if you have any further questions.
Y
 Proposed as answer by Yordan Zaykov (Microsoft employee) Monday, June 22, 2015 11:37 PM
Monday, June 22, 2015 9:25 PM 
Thank you for your answer.
I think there is a difference between the scope of a C# variable and a random variable. I suspect that once a random variable is declared, it stays until program termination. Is that correct? That is why all the "x" variables stay around. They get internal names, or they are named explicitly, and they are all stored in a list used by the inference engine. My deduction is supported by the fact that the inference engine automatically knows about all random variables.
In the Figaro probabilistic programming language this scope is called a "universe". In Figaro you can switch universes and have variables with the same names in each one. So I suppose that in Infer.NET there is a single universe. Question: can you remove variables from the universe in Infer.NET?
Could you please elaborate a bit on factors and random variables? We do not explicitly create factors, only implicitly as part of variables? The leaves in the factor graph of LearningAGaussian are random variables of type "observed". The documentation says: "A stochastic factor such as a Gaussian, represents the distribution of the random variable that the edge points to, conditioned by the factor’s input variables." So a factor uses other variables as input and its own distribution (based on the type of the factor) to generate a new random variable. Is that correct?
 Edited by Tonata Tuesday, June 23, 2015 12:11 PM
Tuesday, June 23, 2015 12:08 PM 
 Your understanding regarding the scope of the variables in Infer.NET is correct.
 I don't know a way of undeclaring a variable in Infer.NET. But if you want to do this, why declare it in the first place?
 Factors connect variables in order to form a factor graph. When you call Infer() on a given variable, all variables connected to it are considered in inference. The Infer.NET compiler creates a schedule for passing messages in this factor graph. The output of this operation is C# code which we call a generated algorithm. The generated algorithm then runs for several iterations to come up with the marginal you're looking for.
 If you write Variable<double> x = Variable.GaussianFromMeanAndPrecision(a, b), then you can see that the Gaussian factor has three ports (or edges), one for each of the variables "x", "a", and "b". Depending on the schedule, messages will be passed in various directions through these three edges. The variable "a" can be assigned a prior distribution "aPrior" using the Random factor, which adds an extra edge in the factor graph from "a" to "aPrior". "b" can be observed, and "x" can be multiplied by "y" using the Product factor (or equivalently the * operator). You can continue adding variables to this factor graph using the Infer.NET modelling API until you're satisfied with the assumptions captured by the model.
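Putting those pieces together in code, this might look roughly as follows (a hedged sketch assuming the Infer.NET 2.x API; the Gamma prior on "b" is an illustrative choice):

```csharp
// "a" gets its prior through the Random factor: one extra edge a -- aPrior.
Variable<Gaussian> aPrior = Variable.New<Gaussian>();
Variable<double> a = Variable<double>.Random(aPrior);

// "b" could instead be observed; here it gets a Gamma prior for illustration.
Variable<double> b = Variable.GammaFromShapeAndScale(1, 1);

// The Gaussian factor has three edges: to x, a, and b.
Variable<double> x = Variable.GaussianFromMeanAndPrecision(a, b);

// The * operator adds a Product factor connecting x, y, and z.
Variable<double> y = Variable.GaussianFromMeanAndPrecision(0, 1);
Variable<double> z = x * y;
```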
Y
Tuesday, June 23, 2015 12:46 PM 
Thanks again.
So the line:
controlGroup[i] = Variable.Bernoulli(probAll).ForEach(i);
is not an assignment (in .NET terms it is), but rather a link that connects each variable "controlGroup[i]" to "probAll". The edge between the two in the graph is a factor that uses the Bernoulli distribution. That is why we can write "controlGroup[i] =" many times in the code: each time we enlarge the graph without destroying the old links in the factor graph. That is not obvious to a C# programmer :)
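For context, the line in question comes from the array-over-a-range pattern, which might be set up like this (a sketch assuming the clinical-trial tutorial's setup; the Beta prior is an illustrative assumption):

```csharp
// probAll is a shared random variable with a Beta prior.
Variable<double> probAll = Variable.Beta(1, 1);

// One Bernoulli factor is added per element of the range,
// all sharing probAll; nothing is overwritten.
Range i = new Range(100);
VariableArray<bool> controlGroup = Variable.Array<bool>(i);
controlGroup[i] = Variable.Bernoulli(probAll).ForEach(i);
```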
Can we assume that a factor roughly equals a distribution? If not then what is the main difference?
Thanks again.
Tuesday, June 23, 2015 2:02 PM 
The code above will create the following model:
[Factor graph image: round nodes are variables, rectangular nodes are factors.]
The variables here are round and the factors are rectangular. So the factors are not edges in the factor graph. You might find it useful to set engine.ShowFactorGraph = true before running inference. This will display the factor graph, although it won't show arrays as nicely as in the picture above.
As to your question on whether factors are distributions, the answer is no. Please take a look at our list of factors here.
Y
Tuesday, June 23, 2015 5:28 PM