Multinomial observations?
Question

I'm trying to learn a simple model with binomial and multinomial observations:
 there's a certain number of datasets;
 among each dataset, a random share of the points are used (this share should also be learned);
 for each used point, there's a discrete observation with 7 outcomes and probability vector theta, so it's multinomial in total.
Here's the code; the point is to learn share and theta given some observations of pointsUsed and results:
Range pRange = new Range(7);
Variable<double> share = Variable.Beta(1, 1);
Variable<Vector> theta = Variable.Dirichlet(pRange, new double[] { 1, 1, 1, 1, 1, 1, 1 });
Variable<int> numDatasets = Variable.New<int>();
Range dRange = new Range(numDatasets);
VariableArray<int> pointsInDataset = Variable.Array<int>(dRange);
VariableArray<int> pointsUsed = Variable.Array<int>(dRange);
using (Variable.ForEach(dRange))
    pointsUsed[dRange] = Variable.Binomial(pointsInDataset[dRange], share);
VariableArray<int[]> results = Variable.Array<int[]>(dRange);
using (Variable.ForEach(dRange))
{
    results[dRange] = Variable.Multinomial(pointsUsed[dRange], theta);
}
ie = new InferenceEngine();
This model compiles fine (as a C# program) but does not compile as a model, with any of the three algorithms, all of them reporting unsupported Factor.Multinomial. I don't think I've grokked the differences in usage between VariableArray, VariableArray2D, and regular C# arrays, so I hope it's possible and I just didn't get it right. Also, numDatasets and pointsInDataset will always be observed, so if it might help to hardcode them somehow (Variable.Observed?), I can.
Thank you for your help!
 Edited by Sergey Nikolenko Friday, December 9, 2011 4:36 PM
Friday, December 9, 2011 4:23 PM
Answers

Hi Sergey
I've lost track of what exactly you want to do; for example, results does not enter into your second bit of code. Also, it seems that the number of points in each data set is observed (it is just the sum of the observed counts). I would also advise against using .NET arrays unless numDatasets is very small.
The following code may be what you want; let me know.
int numOutcomes = 7; // seven possible outcomes, as in the question
Range outcomeRange = new Range(numOutcomes);
Variable<Vector> theta = Variable.Dirichlet(outcomeRange, Vector.Constant(numOutcomes, 1.0));
Variable<int> numDatasets = Variable.New<int>().Named("NumDataSets");
Range dRange = new Range(numDatasets).Named("d");
VariableArray<int> numPointsInDS = Variable.Array<int>(dRange).Named("NumPtsInDS");
Range pRange = new Range(numPointsInDS[dRange]).Named("p");
var results = Variable.Array(Variable.Array<int>(pRange), dRange).Named("results");
results[dRange] = Variable.Multinomial(numPointsInDS[dRange], theta);
InferenceEngine engine = new InferenceEngine();
results.ObservedValue = new int[][] { new int[] { 3, 2, 1, 2, 3, 4, 1 }, new int[] { 2, 3, 4, 5, 3, 2, 1 } };
numDatasets.ObservedValue = results.ObservedValue.Length;
numPointsInDS.ObservedValue = results.ObservedValue.Select(r => r.Sum()).ToArray();
Dirichlet thetaPosterior = engine.Infer<Dirichlet>(theta);
 Marked as answer by Sergey Nikolenko Monday, December 19, 2011 4:03 PM
Tuesday, December 13, 2011 2:04 PM Owner
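The posterior for the observed counts in the code above can also be checked by hand, because a Dirichlet prior is conjugate to multinomial counts. The following is a worked sketch, assuming the uniform Dirichlet(1, ..., 1) prior from the code and independence of the two datasets given theta. The symbols \alpha_k and n^{(d)}_k are notation introduced here and do not appear in the thread.

```latex
\begin{aligned}
p(\theta \mid n^{(1)}, n^{(2)})
  &\propto \prod_{k=1}^{7} \theta_k^{\alpha_k - 1}
     \prod_{d=1}^{2} \prod_{k=1}^{7} \theta_k^{\,n^{(d)}_k}
   = \prod_{k=1}^{7} \theta_k^{\,\alpha_k + n^{(1)}_k + n^{(2)}_k - 1}, \\
n^{(1)} &= (3, 2, 1, 2, 3, 4, 1), \qquad n^{(2)} = (2, 3, 4, 5, 3, 2, 1), \\
n^{(1)} + n^{(2)} &= (5, 5, 5, 7, 6, 6, 2), \qquad \alpha = (1, 1, 1, 1, 1, 1, 1), \\
\theta \mid n^{(1)}, n^{(2)} &\sim \mathrm{Dirichlet}(6, 6, 6, 8, 7, 7, 3).
\end{aligned}
```

If inference is exact for this model, thetaPosterior should carry these pseudo-counts; for instance, the posterior mean of the fourth outcome would be 8/43, roughly 0.186.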
All replies

I have found on the forum and in the factor table that Multinomial factors may not be working yet. Still, is there a workaround? Suppose my datasets are not too large and I can represent them as individual Discrete variables. Also, I realized that a combination of binomial and multinomial variables is just a multinomial variable with one more possible outcome. :) So in total, this gets me here:
// numDatasets is a plain int and pointsInDataset a plain int[] here (both observed)
Range dRange = new Range(numDatasets);
Range[] pRange = new Range[numDatasets];
VariableArray<int>[] points = new VariableArray<int>[numDatasets];
Variable<int>[][] results = new Variable<int>[numDatasets][];
for (int i = 0; i < numDatasets; ++i)
{
    results[i] = new Variable<int>[8];
    pRange[i] = new Range(pointsInDataset[i]);
    points[i] = Variable.Array<int>(pRange[i]);
    points[i][pRange[i]] = Variable.Discrete(theta).ForEach(pRange[i]);
}
and this compiles fine. The question now is: how do I count results[i] given all these variables? I haven't found anything like Count or Add or IncrementByOne (which I could use inside the loop via Variable.Switch(points[i][pRange[i]])). Maybe SumWhere can help, but it appears that I'd need a completely separate three-dimensional array of stochastic Boolean variables and an auxiliary array of constant ones as inputs, which seems way too complicated. What is the right way to count the results?
 Edited by Sergey Nikolenko Monday, December 12, 2011 9:30 AM
Monday, December 12, 2011 9:29 AM 
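The remark that a binomial share followed by a multinomial is a single multinomial with one extra outcome can be written out explicitly. This is a sketch for one dataset, using notation introduced here rather than taken from the thread: N is the number of points in the dataset, m the number of points used, s the share, and k_1, ..., k_7 the outcome counts with k_1 + ... + k_7 = m.

```latex
\begin{aligned}
P(m, k \mid N, s, \theta)
  &= \binom{N}{m} s^{m} (1 - s)^{N - m}
     \cdot \frac{m!}{\prod_{i=1}^{7} k_i!} \prod_{i=1}^{7} \theta_i^{k_i} \\
  &= \frac{N!}{(N - m)!\, \prod_{i=1}^{7} k_i!}\,
     (1 - s)^{N - m} \prod_{i=1}^{7} (s\,\theta_i)^{k_i},
\end{aligned}
```

This is the probability mass function of a multinomial with N trials over eight outcomes, with probability vector (1 - s, s\theta_1, ..., s\theta_7) and counts (N - m, k_1, ..., k_7). It matches the eight entries allocated per dataset in results[i] above, with the extra outcome standing for "point not used".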
Thank you, John! It looks like my problem was that I hadn't been able to think of the construction
var results = Variable.Array(Variable.Array<int>(pRange), dRange).Named("results");
I might still get stuck along the way (I need more complex observations), but this problem is definitely solved, thank you very much!
Monday, December 19, 2011 4:02 PM