# Training the model on multiple input values (unfair dice)

### Question

• I'm trying to play with Infer.NET. As a simple example, I tried to train an unfair dice by creating a Discrete variable with uniform prior distribution and then providing several observations for it:

```
var dice = Variable.DiscreteUniform(6).Named("dice");
var n = new Range(200).Named("n");
var values = Variable.Array<int>(n).Named("values");
values[n].SetTo(dice.ForEach(n)); // ???
values.ObservedValue = Enumerable.Range(0, 200).Select(i => i < 100 ? 0 : (i - 100) / 20 + 1).ToArray();
Console.WriteLine(new InferenceEngine().Infer(dice));
```

What I wanted to say in the line marked with ??? is "for each i in n, values[i] is the value of the random variable dice".

However, the Infer.NET model does not compile: "System.InvalidOperationException: Variable 'vint[]1[n]' was consumed by variable.SetTo().  It can no longer be used or inferred."

I looked through the examples but didn't find a similar situation; basically, I'm trying to train a distribution given a number of observations. Could you please give me a hint on how to do that?

Sunday, May 29, 2011 3:35 PM

### All replies

• The issue here is that you are trying to learn a distribution over the probabilities of each dice value (i.e. a set of 6 numbers that add up to 1).  In Infer.NET this set of probabilities is represented as a Vector and the relevant distribution over Vectors that add up to one is the Dirichlet distribution.

This code should do what you want:

var dice = Variable.DirichletUniform(6);
var n = new Range(200).Named("n");
var values = Variable.Array<int>(n).Named("values");
values[n] = Variable.Discrete(dice).ForEach(n);
values.ObservedValue = Enumerable.Range(0, 200).Select(i => i < 100 ? 0 : (i - 100) / 20 + 1).ToArray();
var dist = new InferenceEngine().Infer(dice);
Console.WriteLine(dist);
Console.WriteLine("Mean of distribution over dice probabilities: "+((Dirichlet)dist).GetMean());

The last two lines print out the Dirichlet distribution parameters and the mean of this distribution - which is probably what you were looking for.  Note however that this code allows you to become more or less certain about the probabilities as you try and learn from more or fewer examples.
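Because the Dirichlet prior is conjugate to the Discrete likelihood, the posterior for this particular model can also be worked out by hand, which makes a handy sanity check on the printed result. The sketch below is plain C# and does not use Infer.NET; the class and method names are made up for illustration, and it assumes the uniform Dirichlet prior corresponds to a pseudo-count of 1 per face.

```csharp
using System;
using System.Linq;

public static class DirichletPosteriorCheck
{
    // Conjugate update: posterior pseudo-counts = prior pseudo-counts + number of times each face was thrown.
    public static double[] PosteriorPseudoCounts(double[] prior, int[] throws)
    {
        var posterior = (double[])prior.Clone();
        foreach (int face in throws)
        {
            posterior[face] += 1.0;
        }
        return posterior;
    }

    // Mean of a Dirichlet: each pseudo-count divided by the total.
    public static double[] Mean(double[] pseudoCounts)
    {
        double total = pseudoCounts.Sum();
        return pseudoCounts.Select(a => a / total).ToArray();
    }

    public static void Main()
    {
        // Same observations as in the thread: 100 throws of face 0, then 20 each of faces 1..5.
        int[] throws = Enumerable.Range(0, 200).Select(i => i < 100 ? 0 : (i - 100) / 20 + 1).ToArray();
        // Assumption: the uniform prior is Dirichlet(1, 1, 1, 1, 1, 1).
        double[] prior = Enumerable.Repeat(1.0, 6).ToArray();

        double[] posterior = PosteriorPseudoCounts(prior, throws);
        Console.WriteLine("Posterior pseudo-counts: " + string.Join(" ", posterior));
        Console.WriteLine("Posterior mean: " + string.Join(" ", Mean(posterior)));
    }
}
```

Under that assumption the pseudo-counts come out as (101, 21, 21, 21, 21, 21), so the mean is 101/206 (about 0.49) for face 0 and 21/206 (about 0.10) for each of the others. Halving the number of throws would leave the mean almost unchanged but make the distribution wider, which is the "more or less certain" effect mentioned above.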

Cheers and welcome to the new forum!

John W.

Tuesday, May 31, 2011 2:36 PM
• Thanks for the answer, John! I suspected I'd have to learn the distribution over the target distribution's parameters. But how could I then condition on the discrete variable (say, the same dice)?
Let me explain why I need this. I need to learn the following conditional probability: p(c = c_j | s = s_k & m = m_l), where c, s, and m are all discrete variables with values in ranges like 1..30. The training data is organized in two levels: low-level units (let's call them segments) make up higher-level units (let's call them buckets); s is the same for all segments inside a bucket, while c and m differ from segment to segment. In the buckets used for training, the distribution parameters of c and m (the probability vectors) are observed for each segment. For each bucket I then want to infer the conditional probability (so that the result for the (i-1)-th bucket serves as a prior for the i-th bucket) and the s variable.
Initially I thought of organizing it like this:
```
// n is the range of segments inside each bucket
using (Variable.ForEach(n))
{
    foreach (var mVal in Enumerable.Range(0, mRange.SizeAsInt))
    {
        using (Variable.Case(m, mVal))
        {
            foreach (var sVal in Enumerable.Range(0, sRange.SizeAsInt))
            {
                using (Variable.Case(s, sVal))
                {
                    c[n].SetTo(Variable.DirichletUniform(cRange).Named("prior_m" + mVal + "_s" + sVal));
                }
            }
        }
    }
}
```
The problem is that I want to learn the s variable, so I now have a Dirichlet distribution in its place. What do I use instead of Variable.Case(s, sVal)?

Wednesday, June 1, 2011 2:06 PM
• I think you're still confusing int and Vector valued variables.  A dice is represented by a Vector of probabilities (p_1,p_2,p_3...) representing the probability of throwing a 1,2,3 etc. A throw of a dice is represented by an int, indicating the number thrown.  In the code I posted the variable 'dice' is a Variable<Vector> - in other words an (uncertain) Vector representing the dice, in terms of the probabilities of each side. The array variable 'values' is an array of ints representing throws of this dice.  Because the 'dice' variable is used as the argument to Variable.Discrete() for all elements of this array, we are saying that all the throws come from the same dice. Note that Variable.Discrete() creates a variable representing a throw of the dice from a variable representing the dice probabilities.

So for your new problem, you need to be clear when you are trying to infer vectors of probabilities (like dice probabilities) and when you are trying to infer integer values (like individual dice throws). It would help to make a list of what the vectors/ints are, which ones you observe and which ones you are trying to infer.

Best

John W.

Wednesday, June 1, 2011 3:31 PM
• John, I observe vectors of probabilities at each segment (c and m), and I need to infer vectors of probabilities: s and cPrior. However, cPrior is actually the conditional probability given the integer values of c and m (the results of the "dice throws" for the current segment); that's what caused some misunderstanding on my side.

Let me explain the background to make the problem clearer. I need to train a simple model of a song. I have segmented song information, from which I derive probabilities for the current segment's mode (m); I also have chord labels, from which I obtain a probability vector for the current segment's chord (c). s is the vector of probabilities of the discrete variable that represents the style of the song, and I want to train it. Further, some chords are more likely to appear within a given mode in one style and less likely in another; this leads to the need to train the conditional probability cPrior = p(c = someChord | m = someMode ^ s = someStyle). At the moment, I don't understand how this could be programmed using Infer.NET.

Given your previous answer, I thought that it should look like this:

```
var songLength = Variable.New<int>().Named("songLength");
var n = new Range(songLength).Named("n");
var chords = Variable.Array<Vector>(n).Named("chords");
var modes = Variable.Array<Vector>(n).Named("modes");
var style = Variable.DirichletUniform(NStyles);
var chordRange = new Range(Chords.StandardChords.Count).Named("cr");
var modeRange = new Range(Modes.StandardModes.Count).Named("mr");
var styleRange = new Range(NStyles).Named("sr");
var chordsPrior = Variable.Array(Variable.Array<Vector>(styleRange), modeRange);
chordsPrior[modeRange][styleRange] = Variable.DirichletUniform(chordRange).ForEach(modeRange).ForEach(styleRange);
var s = Variable.Discrete(style);
using (Variable.ForEach(n))
{
    var m = Variable.Discrete(modes[n]);
    foreach (var mVal in Enumerable.Range(0, modeRange.SizeAsInt))
    {
        using (Variable.Case(m, mVal))
        {
            foreach (var sVal in Enumerable.Range(0, styleRange.SizeAsInt))
            {
                using (Variable.Case(s, sVal))
                {
                    chords[n] = Variable.Dirichlet(chordsPrior[mVal][sVal]);
                }
            }
        }
    }
}
// Set up observed values for chords[n] and modes[n]

var engine = new InferenceEngine();
engine.Infer(style);
engine.Infer(chordsPrior);
```
However, when compiling the model, I get multiple pairs of error messages of the kind:

```
System.ArgumentException:
System.ArgumentException: Vector is not of type Dirichlet for argument 1 of method DirichletFromPseudoCountsOp.LogEvidenceRatio
Parameter     Provided   Expected
---------     --------   --------
sample        Vector     Dirichlet
pseudoCounts  Dirichlet  Vector
System.ArgumentException: Dirichlet is not of type Vector for argument 2 of method DirichletFromPseudoCountsOp.LogEvidenceRatio
Parameter     Provided   Expected
---------     --------   --------
sample        Vector     Vector
pseudoCounts  Dirichlet  Vector
   at MicrosoftResearch.Infer.Transforms.FactorManager.FactorInfo.GetMessageFcnInfo(FactorManager factorManager, String methodSuffix, String targetParameter, IDictionary`2 parameterTypes)
   at MicrosoftResearch.Infer.Transforms.MessageTransform.GetOperatorStatement(IAlgorithm alg, FactorInfo info, IDictionary`2 msgInfo, String methodSuffix, String targetParameter, IExpression index, IDictionary`2 argumentTypes, Boolean isVariableFactor)
```
```
System.ArgumentException: System.MissingMethodException: pseudoCountAverageConditional not found in DirichletFromPseudoCountsOp using parameter types: [sampleFromPseudoCounts] Vector,[pseudoCount] Dirichlet,[to_pseudoCount] Dirichlet,[result] Dirichlet
   at MicrosoftResearch.Infer.Transforms.FactorManager.FactorInfo.GetMessageFcnInfo(FactorManager factorManager, String methodSuffix, String targetParameter, IDictionary`2 parameterTypes)
   at MicrosoftResearch.Infer.Transforms.MessageTransform.GetOperatorStatement(IAlgorithm alg, FactorInfo info, IDictionary`2 msgInfo, String methodSuffix, String targetParameter, IExpression index, IDictionary`2 argumentTypes, Boolean isVariableFactor)
```

I also noted that the same compile error appears when I'm trying to infer a Dirichlet variable with the pseudo-count taken from another Dirichlet variable:

```
var dd = Variable.Dirichlet(new[]{0.1, 0.1, 0.5, 0.1, 0.1, 0.1});
var engine = new InferenceEngine();
var dirichlet = Variable.Dirichlet(dd);
var dist = engine.Infer<Dirichlet>(dirichlet); // 1
// var dist = engine.Infer<Discrete>(Variable.Discrete(dd)); // 2
Console.WriteLine(dist);
Console.WriteLine(dist.GetMean());
```

If I comment out the line marked // 1 and uncomment the line marked // 2, the inference works fine; if I compile the model as shown in the sample, I get errors similar to the above.

Could you please show me how to implement the model correctly?

Thursday, June 2, 2011 2:05 PM
• (As an aside: Gosh, the "Reply"/"Edit" form is nearly unusable! Not only there is no preview, the text as you see it before hitting Submit has nothing in common with the comment that appears on the forum, but it simply doesn't work in IE9 (nothing happens after Submit) and Firefox (no formatting at all)!) :((
Friday, June 3, 2011 12:20 PM
• I am not sure I fully understand your model yet, but if you want a posterior for the chord probabilities conditioned on the style and mode, you need something more like the following (note that this will not give anything sensible yet, because there is no symmetry breaking and only one song):

```
static void Main(string[] args)
{
    var numModes = Variable.New<int>().Named("numModes");
    var numStyles = Variable.New<int>().Named("numStyles");
    var numChords = Variable.New<int>().Named("numChords");

    var mRange = new Range(numModes).Named("mRange");
    var sRange = new Range(numStyles).Named("sRange");
    var cRange = new Range(numChords).Named("cRange");
    var mProbs = Variable.DirichletUniform(mRange).Named("mProbs");
    var sProbs = Variable.DirichletUniform(sRange).Named("sProbs");
    // Chord probabilities are conditioned on style and mode:
    var cProbs = Variable.Array(Variable.Array<Vector>(mRange), sRange).Named("cProbs");
    cProbs[sRange][mRange] = Variable.DirichletUniform(cRange).ForEach(sRange, mRange);

    var songLength = Variable.New<int>().Named("songLength");
    var n = new Range(songLength).Named("n");
    var modes = Variable.Array<Vector>(n).Named("modes");

    var chords = Variable.Array<int>(n).Named("chords");

    var s = Variable.Discrete(sProbs).Named("s");
    using (Variable.Switch(s))
    {
        using (Variable.ForEach(n))
        {
            var m = Variable.Discrete(mProbs).Named("m");
            using (Variable.Switch(m))
            {
                chords[n] = Variable.Discrete(cProbs[s][m]);
            }
        }
    }

    numModes.ObservedValue = 2;
    numStyles.ObservedValue = 2;
    numChords.ObservedValue = 2;
    int[] chordVals = new int[] { 0, 1, 0, 1 };
    songLength.ObservedValue = chordVals.Length;
    chords.ObservedValue = chordVals;

    var engine = new InferenceEngine();
    Console.WriteLine(engine.Infer(s));
    Console.WriteLine(engine.Infer(cProbs));
}
```
Friday, June 3, 2011 1:44 PM
• Yes - the edit function does seem to be weak. On the plus side, the code block functionality for new posts is much better than on our previous forum site.

John

Friday, June 3, 2011 1:50 PM
• John, I observe chord probabilities, not the actual values of the chords as your model assumes. This is because I have a limited set of "basic" chords (namely, all triads) and represent the actual complex chords (like seventh chords) as a probability vector that has high values for the corresponding triads. I also observe mode probabilities, which I derive from audio.

So instead of

```
int[] chordVals = new int[] { 0, 1, 0, 1 };
songLength.ObservedValue = chordVals.Length;
chords.ObservedValue = chordVals;
```

I have something like

```
double[][] observedChordProbs = new [] { new[] {0.8, 0.2}, new[] {0.1, 0.9}, new[] {0.75, 0.25}, new[]{0.3, 0.7} };
songLength.ObservedValue = observedChordProbs.Length;
chords.ObservedValue = observedChordProbs;
```

Please take a look at the recent edit of my previous post. It seems to be very close to what you posted, with the difference that it is not the chords that are observed but vectors of probabilities.

To restate the question: I don't understand the compile error I am getting. What can be fixed so that the model compiles successfully?

Friday, June 3, 2011 2:06 PM
• I am confused about your use of chordsPrior. A prior is a distribution, not a random variable as you have it in the code. Can you clarify this?
Friday, June 3, 2011 3:14 PM
• The motivation behind chordsPrior is the following: in different styles, some chords are more likely to appear with a given mode than others. So chordsPrior answers the following question: given a song in style s and a song segment in mode m, what is the probability that chord c is played in the segment? I have some songs labeled with chords, and some songs not labeled with chords. I want to learn chordsPrior on the labeled songs. To do that, I want to declare that for each segment, the prior probability of the random variable whose values are chord probabilities is chordsPrior[m][s], given m and s.

```
chords[n] = Variable.Dirichlet(chordsPrior[mVal][sVal]);
```
Here, I want to say that the vector of chord probabilities is drawn from the Dirichlet distribution with parameters taken from chordsPrior[mVal][sVal].

Afterwards, I want to set the observed values for chord probabilities and mode probabilities and tell Infer.NET to infer chordsPrior and song style.

Friday, June 3, 2011 3:30 PM
• I'm still trying to track down the compilation error that arises from using a Dirichlet variable whose pseudo-counts are specified as another Dirichlet variable (see my post on Thursday, June 2, 2011 2:05 PM).

I see that the code transformation fails at the Message step because a method LogEvidenceRatio is not found that accepts a Dirichlet and a Vector. Looking at the sources, I found this:

```
namespace MicrosoftResearch.Infer.Factors
{
    public static class DirichletFromPseudoCountsOp
    {
        // ....

        [Skip]
        public static double LogEvidenceRatio(Dirichlet sample, Vector pseudoCounts) { return 0.0; }
    }
}
```

It seems the method is there, but it is not implemented. Is this a limitation of Infer.NET? Is there a workaround in this case?

Friday, June 3, 2011 3:50 PM
• We don't support the learning of a pseudo-count random variable drawn from a Dirichlet in Infer.NET. It doesn't make a lot of sense, because typically pseudo-counts will not be constrained to sum to 1 - i.e. the domain of a Dirichlet-distributed variable is the wrong domain for pseudo-count vectors.

There are other ways to model this. You might want to consider your probabilities as being generated from a multivariate Gaussian of unknown mean and precision (to be inferred). If, as I assume, you want the probabilities to be positive and sum to one, you will then need to put the output through a Softmax factor to generate your observed chord probabilities. Alternatively you can use univariate Gaussians. Let me know if you need help with this.

John

Friday, June 3, 2011 4:43 PM
• Sorry, but there's another technical question. I'm trying this approach on a toy problem: we are observing an array of "vectors of probabilities", which we consider as being drawn from a Softmaxed multivariate Gaussian. I have this code:

```
var n = new Range(2000).Named("n");
var values = Variable.Array<Vector>(n).Named("values");
var means = Variable.VectorGaussianFromMeanAndPrecision(Vector.Constant(6, 1.0/6), PositiveDefiniteMatrix.IdentityScaledBy(6, 0.1));
var precisions = Variable.WishartFromShapeAndScale(100, PositiveDefiniteMatrix.IdentityScaledBy(6, 0.1));
values[n] = Variable.Softmax(Variable.VectorGaussianFromMeanAndPrecision(means, precisions)).ForEach(n); // 2
var truth = new Dirichlet(0.1, 0.1, 0.5, 0.1, 0.1, 0.1);
values.ObservedValue = Enumerable.Range(0, n.SizeAsInt).Select(_ => truth.Sample()).ToArray();
var engine = new InferenceEngine(new VariationalMessagePassing());
engine.Compiler.GenerateInMemory = false;
engine.Compiler.WriteSourceFiles = true;
engine.Compiler.IncludeDebugInformation = true;
var gaussian = engine.Infer<VectorGaussian>(Variable.Softmax(Variable.VectorGaussianFromMeanAndPrecision(means, precisions))); // 1
Console.WriteLine(gaussian);
Console.WriteLine(gaussian.GetMean());
var vector = Vector.Zero(6);
var prec = PositiveDefiniteMatrix.Identity(6);
gaussian.GetMeanAndPrecision(vector, prec);
Console.WriteLine(vector);
Console.WriteLine(prec);
```

Unfortunately, model compilation fails with an ArgumentOutOfRangeException:

```
System.ArgumentOutOfRangeException was unhandled
Message="Index was out of range. Must be non-negative and less than the size of the collection.\r\nParameter name: index"
Source="mscorlib"
ParamName="index"
StackTrace:
at System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource)
at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.get_Item(Int32 index)
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.GetArrayLengthExpression(IExpression source, Object target) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 385
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.InferMarginalPrototype(IAssignExpression iae) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 624
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.ConvertAssign(IAssignExpression iae) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 837
at MicrosoftResearch.Transforms.CopyTransform.DoConvertExpression(IExpression expr) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 561
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.DoConvertExpression(IExpression expr) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 1128
at MicrosoftResearch.Transforms.CopyTransform.ConvertExpression(IExpression expr) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 554
at MicrosoftResearch.Transforms.CopyTransform.ConvertExpressionStatement(IExpressionStatement ies) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 471
at MicrosoftResearch.Transforms.CopyTransform.DoConvertStatement(IStatement ist) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 408
at MicrosoftResearch.Transforms.CopyTransform.ConvertStatement(IStatement ist) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 392
at MicrosoftResearch.Transforms.CopyTransform.ConvertStatements(IStatementCollection outputs, IStatementCollection inputs) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 351
at MicrosoftResearch.Transforms.CopyTransform.DoConvertMethodBody(IStatementCollection outputs, IStatementCollection inputs) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 278
at MicrosoftResearch.Transforms.CopyTransform.ConvertMethod(IMethodDeclaration md, IMethodDeclaration imd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 262
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.ConvertMethod(IMethodDeclaration md, IMethodDeclaration imd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 62
at MicrosoftResearch.Transforms.CopyTransform.DoConvertMethod(IMethodDeclaration imd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 139
at MicrosoftResearch.Infer.Transforms.ModelAnalysisTransform.DoConvertMethod(IMethodDeclaration imd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Transforms\ModelAnalysisTransform.cs:line 51
at MicrosoftResearch.Transforms.CopyTransform.ConvertMethods(ITypeDeclaration td, ITypeDeclaration itd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 122
at MicrosoftResearch.Transforms.CopyTransform.ConvertType(Object owner, ITypeDeclaration itd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 57
at MicrosoftResearch.Transforms.CopyTransform.Transform(ITypeDeclaration itd) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CopyTransform.cs:line 39
at MicrosoftResearch.Transforms.CodeTransformer.TransformToDeclaration(ITypeDeclaration typeDecl) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\CodeTransformer.cs:line 65
at MicrosoftResearch.Transforms.TransformerChain.TransformToDeclaration(ITypeDeclaration itd, AttributeRegistry`2 inputAttributes, Boolean trackTransform, Boolean showProgress, Boolean showWarnings) in C:\infernetBuilds\17-12-2010_15-34\Compiler\TransformFramework\TransformerChain.cs:line 46
at MicrosoftResearch.Infer.ModelCompiler.GetTransformedDeclaration(ITypeDeclaration itd, MethodBase method, AttributeRegistry`2 inputAttributes) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\ModelCompiler.cs:line 414
at MicrosoftResearch.Infer.ModelCompiler.CompileWithoutParams(ITypeDeclaration itd, MethodBase method, AttributeRegistry`2 inputAttributes) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\ModelCompiler.cs:line 426
at MicrosoftResearch.Infer.InferenceEngine.Compile() in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\InferenceEngine.cs:line 208
at MicrosoftResearch.Infer.InferenceEngine.BuildAndCompile(Boolean inferOnlySpecifiedVars, IEnumerable`1 vars) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\InferenceEngine.cs:line 602
at MicrosoftResearch.Infer.InferenceEngine.GetCompiledInferenceAlgorithm(Boolean inferOnlySpecifiedVars, IVariable var) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\InferenceEngine.cs:line 571
at MicrosoftResearch.Infer.InferenceEngine.InferAll(Boolean inferOnlySpecifiedVars, IVariable var) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\InferenceEngine.cs:line 488
at MicrosoftResearch.Infer.InferenceEngine.Infer[TReturn](IVariable var) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\InferenceEngine.cs:line 288
at Sandbox.UnfairDice.DiceProbsAsArray() in Z:\dev\Harmodelic\Sandbox\UnfairDice.cs:line 39

```

As debugging suggests, this occurs when the transformer is processing the expression "values[n] = MMath.Softmax(vVector2)". This looks like it's processing the line marked as // 1.

If I try to infer just the VectorGaussian rather than Softmax(VectorGaussian), I get the same error, this time with the line marked // 2.

Could you help with this exception?

(Of course, I can remove the Softmax from the model altogether and just infer a multivariate Gaussian, and then apply MMath.Softmax to the GetMean() of the inferred distribution, but that is not quite what I wanted, because it distorts the relations between the probabilities. In this dice example, the inferred multivariate Gaussian distribution had a mean vector close enough to

(0.1, 0.1, 0.5, 0.1, 0.1, 0.1),

while the softmax was

(0.1538 0.1542 0.2307 0.1543 0.1543 0.1526)

)
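The flattening described in the parenthetical above can be reproduced without any inference, because a softmax exponentiates its input before normalizing. The following is a minimal sketch in plain C# (no Infer.NET; the class and method names are made up for illustration) that applies a softmax directly to the vector (0.1, 0.1, 0.5, 0.1, 0.1, 0.1).

```csharp
using System;
using System.Linq;

public static class SoftmaxDistortionDemo
{
    // Numerically stable softmax: subtract the maximum before exponentiating.
    public static double[] Softmax(double[] x)
    {
        double max = x.Max();
        double[] exps = x.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    public static void Main()
    {
        // A vector that already sums to 1, treated here as the softmax *input*.
        double[] mean = { 0.1, 0.1, 0.5, 0.1, 0.1, 0.1 };
        double[] squashed = Softmax(mean);
        Console.WriteLine("input:   " + string.Join(" ", mean));
        Console.WriteLine("softmax: " + string.Join(" ", squashed));
    }
}
```

The result is roughly 0.154 for the small entries and 0.230 for the large one, close to the figures quoted above: the ratio between entries drops from 5 to exp(0.4), about 1.5. A softmax only returns the original probabilities when its input is their logarithm (up to an additive constant), which is why the Gaussian has to live in log-odds space rather than in probability space.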

Saturday, June 4, 2011 3:09 PM
• OK - this is quite a tricky one to answer, as there are several aspects.

1. The softmax operators don't take a VectorGaussian distributed message - we have to convert these to Gaussian messages (using the ArrayFromVector factor)
2. Some of the Softmax implementations do not allow a derived variable (the output of a deterministic factor such as ArrayFromVector) as input, so we have to tell the model compiler to give priority to an implementation that does support this
3. The Softmax is not designed to observe probabilities (which is quite a unique requirement on your part), so we have to work around that by constraining the outputs to point mass distributions (using ConstrainEqualRandom)
4. With the above, learning the precision of the VectorGaussian can often lead to improper messages. We may be able to get around this (with initialisation for example), but I suggest fixing the precision for now, and just learning the means

Putting this all together looks as follows. Apologies for not working through this in more detail in my original post.

```
var n = new Range(2000).Named("n");
var k = new Range(6);
double precision = 10.0;
var values = Variable.Array<Vector>(n).Named("values");
var means = Variable.VectorGaussianFromMeanAndPrecision(Vector.Constant(6, 1.0 / 6), PositiveDefiniteMatrix.IdentityScaledBy(6, 0.1));
var precisions = PositiveDefiniteMatrix.IdentityScaledBy(6, precision);
var vg = Variable.Array<Vector>(n).Named("vg");
vg[n] = Variable.VectorGaussianFromMeanAndPrecision(means, precisions).ForEach(n);

using (Variable.ForEach(n))
{
    var ag = Variable.ArrayFromVector(vg[n], k);
    values[n] = Variable.Softmax(ag);
}

var obs = Variable.Array<Dirichlet>(n).Named("obs");
Variable.ConstrainEqualRandom(values[n], obs[n]);
var truth = new Dirichlet(0.1, 0.1, 0.5, 0.1, 0.1, 0.1);
obs.ObservedValue = Enumerable.Range(0, n.SizeAsInt).Select(_ => Dirichlet.PointMass(truth.Sample())).ToArray();

var engine = new InferenceEngine(new VariationalMessagePassing());
engine.Compiler.GivePriorityTo(typeof(MicrosoftResearch.Infer.Factors.SaulJordanSoftmaxOp_NCVMP));
Console.WriteLine(engine.Infer(means));
```

Monday, June 6, 2011 12:44 PM
• Hi all,
The softmax factor does not support observed output because it is a deterministic factor (i.e. given the value of the input the output is known), and Variational Message Passing cannot support deterministic factors with observed output (note that Expectation Propagation can, and this will be supported in the next release of Infer.NET for the analogous "max" factor).
The solution John proposed will work but the uncertainties will be larger than for your original model. This is because the softmax factor actually treats the Dirichlet point mass as a single Discrete observation. There are two solutions to this I would suggest you try.
1) The simplest is just to multiply the probabilities you have by some large(ish) value, T: the larger T is, the more certain you are claiming to be about these probabilities. The limit you are asking for, of observed probabilities, corresponds to the limit as T goes to infinity. You can interpret T as the equivalent number of Discrete observations (dice throws in the dice example) you used to estimate your probability vector. Code for this would be:
```
var truthd = new double[][]{
    new double[] { 0.1, 0.1, 0.5, 0.1, 0.1, 0.1 },
    new double[] { 0.1, 0.1, 0.2, 0.1, 0.4, 0.1 },
    new double[] { 0.1, 0.3, 0.2, 0.1, 0.2, 0.1 },
};
// n ranges over the observed probability vectors
var n = new Range(truthd.Length).Named("n");
// k ranges over the entries of each vector (6 faces), not over the observations
var k = new Range(truthd[0].Length);
var values = Variable.Array<Vector>(n).Named("values");
var means = Variable.Array<double>(k).Named("means");
means[k] = Variable.GaussianFromMeanAndPrecision(0, 1).ForEach(k);
var precisions = Variable.Array<double>(k).Named("precisions");
precisions.ObservedValue = System.Linq.Enumerable.Range(0, k.SizeAsInt).Select(_ => 1.0).ToArray();
var g = Variable.Array(Variable.Array<double>(k), n).Named("vg");
g[n][k] = Variable.GaussianFromMeanAndPrecision(means[k], precisions[k]).ForEach(n);
values[n] = Variable.Softmax(g[n]);

var obs = Variable.Array<Dirichlet>(n).Named("obs");
Variable.ConstrainEqualRandom(values[n], obs[n]);

// Scale the probabilities by T to obtain Dirichlet pseudo-counts
double T = 10;
truthd = truthd.Select(o => o.Select(p => p * T).ToArray()).ToArray();
var truth = truthd.Select(o => new Dirichlet(o)).ToArray();
obs.ObservedValue = truth;

var engine = new InferenceEngine(new VariationalMessagePassing());
engine.Compiler.GivePriorityTo(typeof(MicrosoftResearch.Infer.Factors.SaulJordanSoftmaxOp_NCVMP));
Console.WriteLine(engine.Infer(means));
```
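To get a feel for what T does, it helps to look at the Dirichlet with pseudo-counts T·p on its own: its mean is always p, while the variance of component i is p_i(1 − p_i)/(T + 1), so the spread shrinks as T grows. The sketch below is plain C# with no Infer.NET dependency; the class and method names are made up for illustration, and it only describes the Dirichlet itself, not what the inference engine does with it internally.

```csharp
using System;
using System.Linq;

public static class ScaledDirichletDemo
{
    // Turn a probability vector into Dirichlet pseudo-counts by multiplying by t.
    public static double[] Scale(double[] probs, double t)
    {
        return probs.Select(p => p * t).ToArray();
    }

    // Standard deviation of each component of Dirichlet(alpha):
    // Var_i = alpha_i * (alpha_0 - alpha_i) / (alpha_0^2 * (alpha_0 + 1)), with alpha_0 = sum of alpha.
    public static double[] StdDev(double[] alpha)
    {
        double a0 = alpha.Sum();
        return alpha.Select(a => Math.Sqrt(a * (a0 - a) / (a0 * a0 * (a0 + 1.0)))).ToArray();
    }

    public static void Main()
    {
        double[] probs = { 0.1, 0.1, 0.5, 0.1, 0.1, 0.1 };
        foreach (double t in new[] { 1.0, 10.0, 100.0, 1000.0 })
        {
            double[] sd = StdDev(Scale(probs, t));
            Console.WriteLine("T = " + t + ": std dev of the 0.5 component = " + sd[2]);
        }
    }
}
```

For the 0.5 entry the variance is 0.25/(T + 1), so going from T = 1 to T = 10 already cuts the standard deviation from about 0.35 to about 0.15. Choosing T is therefore a statement about how much evidence stands behind each observed probability vector.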

2) Since softmax is a deterministic function you can simply manually invert it to get the "log odds". The only catch is that the model is overparameterised: fixing the probability only gives you the log odds up to an additive constant. An easy solution is to constrain the log odds to have zero mean. Note that this will only work when you have all non-zero probabilities (otherwise you'll be taking log(0)). The following code implements this:
```
var truthd = new double[][]{
    new double[] { 0.1, 0.1, 0.5, 0.1, 0.1, 0.1 },
    new double[] { 0.1, 0.1, 0.2, 0.1, 0.4, 0.1 },
    new double[] { 0.1, 0.3, 0.2, 0.1, 0.2, 0.1 },
};
// K is the number of entries per probability vector (6 faces), not the number of observations
int K = truthd[0].Length;
var n = new Range(truthd.Length).Named("n");
var k = new Range(K);
var values = Variable.Array<Vector>(n).Named("values");
var means = Variable.Array<double>(k).Named("means");
means[k] = Variable.GaussianFromMeanAndPrecision(0, 1).ForEach(k);
var precisions = Variable.Array<double>(k).Named("precisions");
precisions.ObservedValue = System.Linq.Enumerable.Range(0, k.SizeAsInt).Select(_ => 1.0).ToArray();
var g = Variable.Array(Variable.Array<double>(k), n).Named("vg");
g[n][k] = Variable.GaussianFromMeanAndPrecision(means[k], precisions[k]).ForEach(n);

// Invert the softmax by hand: take logs, then subtract their mean so the log odds sum to zero
g.ObservedValue = truthd.Select(o =>
{
    var logs = o.Select(p => Math.Log(p)).ToArray();
    double mean = logs.Sum() / K;
    return logs.Select(l => l - mean).ToArray();
}).ToArray();

var engine = new InferenceEngine(new VariationalMessagePassing());
engine.Compiler.GivePriorityTo(typeof(MicrosoftResearch.Infer.Factors.SaulJordanSoftmaxOp_NCVMP));
Console.WriteLine(engine.Infer(means));
```
You'll have to play around a bit to see which solution seems to work better for your application.
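The manual inversion in solution 2 can be checked in isolation: taking logs, subtracting their mean, and then applying a softmax should return the original probabilities. A minimal sketch in plain C# follows (no Infer.NET; the class and method names are made up for illustration).

```csharp
using System;
using System.Linq;

public static class LogOddsDemo
{
    // Invert the softmax up to an additive constant, then fix that constant
    // by forcing the log odds to have zero mean. Requires all probabilities > 0.
    public static double[] ZeroMeanLogOdds(double[] probs)
    {
        double[] logs = probs.Select(p => Math.Log(p)).ToArray();
        double mean = logs.Average();
        return logs.Select(l => l - mean).ToArray();
    }

    // Numerically stable softmax.
    public static double[] Softmax(double[] x)
    {
        double max = x.Max();
        double[] exps = x.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    public static void Main()
    {
        double[] probs = { 0.1, 0.1, 0.2, 0.1, 0.4, 0.1 };
        double[] logOdds = ZeroMeanLogOdds(probs);
        Console.WriteLine("log odds:       " + string.Join(" ", logOdds));
        Console.WriteLine("sum (about 0):  " + logOdds.Sum());
        Console.WriteLine("softmax(log odds): " + string.Join(" ", Softmax(logOdds)));
    }
}
```

The round trip recovers the input up to floating-point error, and the zero-mean constraint is what makes the log odds unique. Note that the mean must be taken over the entries of one probability vector, so the divisor is the vector length.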
Hope that helps!
David.
Monday, June 6, 2011 4:07 PM
• Hi David and John,

Thanks for your replies! In the meantime, I was trying to get by without Softmax; instead, I constrain the generated probabilities to sum to 1. As far as I can see from my mini-tests with some real data, the means of the inferred means seem to be quite good. I'll certainly consider using Softmax if I run into problems with strange results.

However, there is another technical problem with my main model (the one that trains chord probabilities conditioned on mode and style).

Even for songs of quite modest length, the generated model is so big that the program ends up with an OutOfMemoryException.

First, the current model code:

```
public static void TrainModelUni(int nChords, int nModes, int nModesInType, int nStyles, Vector[] observedChords, Vector[] observedModes, double precision)
{
    var cr = new Range(nChords).Named("cr");
    // This differs from the previous models I posted here -
    // I actually have not many kinds of modes, all other modes are simply cyclic shifts of them,
    // so I use this fact to minimize the model size
    var mtr = new Range(nModes/nModesInType).Named("mtr");
    var sr = new Range(nStyles).Named("sr");

    Debug.Assert(observedChords.Length == observedModes.Length);
    var n = new Range(observedChords.Length).Named("n");

    var pc = Variable.Array(Variable.Array<double>(cr), n).Named("pc");
    var pm = Variable.Array<Vector>(n).Named("pm");
    var ps = Variable.DirichletUniform(sr).Named("ps");

    var pcMeans = Variable.Array(Variable.Array(Variable.Array<double>(cr), mtr), sr)
        .Named("chordsProbsMeans");
    // Chord probabilities conditioned on style and mode
    pcMeans[sr][mtr][cr] = Variable.GaussianFromMeanAndPrecision(0, 1)
        .ForEach(sr, mtr, cr);
    var pcPrecs = Variable.Array(Variable.Array<double>(mtr), sr).Named("chordProbsPrecs");
    pcPrecs[sr][mtr] = Variable.GammaFromShapeAndScale(3, 3)
        .ForEach(sr, mtr);

    // This is a collection of index arrays: [0, 1, 2 ... n-1], [n-1, 0, 1 ... n-2] ... [1, 2, ... 0]
    var leftShift = CollectionUtil.AllRightRotations(Enumerable.Range(0, nChords).ToArray())
        .Select(a=>Variable.Constant(a)).ToArray();

    var s = Variable.Discrete(ps).Named("s");
    using (Variable.Switch(s))
    {
        using (Variable.ForEach(n))
        {
            var m = Variable.Discrete(pm[n]).Named("m_" + n);
            foreach (var mVal in Enumerable.Range(0, nModes))
            {
                using (Variable.Case(m, mVal))
                {
                    var p = Variable.Array<double>(cr);
                    p[cr] = Variable.GaussianFromMeanAndPrecision(pcMeans[s][mVal/nModesInType][cr],
                        pcPrecs[s][mVal/nModesInType]);
                    // The observed chord probabilities are generated by the variable with parameters
                    // taken from the first mode of the current mode type,
                    // here they are shifted to obtain the probabilities for the current mode mVal
                    pc[n] = Variable.Subarray(p, leftShift[mVal%nModesInType]);
                    Variable.ConstrainEqual(Variable.Sum(pc[n]), 1.0);
                }
            }
        }
    }

    pm.ObservedValue = observedModes;
    pc.ObservedValue = observedChords.Select(c=>c.ToArray()).ToArray();

    var engine = new InferenceEngine(new VariationalMessagePassing());
    Console.WriteLine(engine.Infer(s));
    Console.WriteLine(engine.Infer(pcMeans));
}
```

I'm trying to make it work on at least one song. The parameters are as follows.

```nChords = 24
nModes = 36
nModesInType = 12
nStyles = 5
observedChords.Count = observedModes.Count = 756
```

This means that pcMeans is a 5x3x24 array of Gaussians (360 in total).

When I inspect the heap, I observe the following:

```898,988,328 bytes are eaten up by Gaussian[],
among them
3538080 Gaussian[] objects of size 204 bytes, and
436518 Gaussian[] objects of size 396 bytes.
```

I wonder whether the number of Gaussian arrays can somehow be reduced; it seems excessive given the parameters above.
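As a quick sanity check on the heap figures, the two array populations can be totalled in a few lines of plain C#. This is only arithmetic on the numbers reported above; the class `HeapArithmetic` is a made-up name, and the element counts assume a 12-byte array header (32-bit CLR) and a 16-byte Gaussian struct (two doubles), neither of which is stated in the thread.

```csharp
using System;

public static class HeapArithmetic
{
    // Total bytes held by `count` arrays of `bytesEach` bytes.
    public static long Bytes(long count, long bytesEach)
    {
        return count * bytesEach;
    }

    // Number of elements in an array of `arrayBytes` bytes,
    // ASSUMING a fixed header of `headerBytes` and elements of `elementBytes`.
    public static long Elements(long arrayBytes, long elementBytes, long headerBytes)
    {
        return (arrayBytes - headerBytes) / elementBytes;
    }

    public static void Main()
    {
        long small = Bytes(3538080, 204);
        long large = Bytes(436518, 396);
        Console.WriteLine(small);                 // 721768320
        Console.WriteLine(large);                 // 172861128
        Console.WriteLine(small + large);         // 894629448
        Console.WriteLine(Elements(204, 16, 12)); // 12
        Console.WriteLine(Elements(396, 16, 12)); // 24
    }
}
```

The two populations account for 894,629,448 of the 898,988,328 bytes, and about 80% of that sits in the smaller arrays, so those are the ones worth eliminating first.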

I tried using multivariate Gaussians, but they, along with their precisions, eat far more memory in the form of Double[]: in that case most of the memory is consumed by 291395 arrays of size 4620 bytes (= size of double[576 = 24*24]) and 179506 arrays of size 204 bytes (= size of double[24]).

Could you please give any hints on how to reduce the model? Perhaps I just don't see the duplication looking at the model code.

Monday, June 6, 2011 8:16 PM
• The reason there is much more memory used than you expect is that by default every message for every possible value of switch, and every possible case value, and every data point needs to be in memory to do the inference. It would help if you could post everything I need to run your code (toy data + utility routines such as CollectionUtil methods). That way, I can look at the generated code. As you have named all your variables and ranges, we should be able to see directly in the fields of the generated class which arrays use the most memory.
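To get a feel for that scaling, the number of (style, mode, data point) combinations can be computed from the parameters posted above. The sketch below is plain arithmetic, not Infer.NET code, and `GateCount` is a made-up name. That the reported count of 204-byte arrays is an exact multiple of this product is only an observation about the posted numbers; the thread does not say what the factor corresponds to.

```csharp
using System;

public static class GateCount
{
    // Number of (style, mode, data point) combinations the model has to cover.
    public static long Combinations(int nStyles, int nModes, int nPoints)
    {
        return (long)nStyles * nModes * nPoints;
    }

    public static void Main()
    {
        long combos = Combinations(5, 36, 756);
        Console.WriteLine(combos);           // 136080
        Console.WriteLine(3538080 / combos); // 26
        Console.WriteLine(3538080 % combos); // 0
    }
}
```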

There are standard ways to split up the graphical model so that the operator graph is not all in memory at once. See the sections in the user guide on Shared Variables. I can help with this, but I recommend that you stabilise/refine your model first before adding the additional complexity. For example, it would be nice to get rid of the Case statement (it'll clean up the generated code), and use Switch instead - you should be able to do this by defining index arrays as observed variable arrays.
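As a rough illustration of the index-array idea, the lookup tables can be built in plain C# before they are handed to the model. The helper names (`IndexArrays`, `ModeType`, `ModeShift`, `RightRotation`) are hypothetical; in the model itself each table would be assigned to the `ObservedValue` of a `Variable.Array<int>`, so that a random mode can index into it inside a Switch block.

```csharp
using System;
using System.Linq;

public static class IndexArrays
{
    // For each mode m: the mode type it belongs to (m / nModesInType).
    public static int[] ModeType(int nModes, int nModesInType)
    {
        return Enumerable.Range(0, nModes).Select(m => m / nModesInType).ToArray();
    }

    // For each mode m: its position within the type (m % nModesInType),
    // i.e. how far the chord probabilities are rotated.
    public static int[] ModeShift(int nModes, int nModesInType)
    {
        return Enumerable.Range(0, nModes).Select(m => m % nModesInType).ToArray();
    }

    // Indices 0..len-1 rotated to the right by `shift` (0 <= shift <= len).
    public static int[] RightRotation(int len, int shift)
    {
        return Enumerable.Range(0, len).Select(i => (i - shift + len) % len).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ModeType(4, 2)));      // 0,0,1,1
        Console.WriteLine(string.Join(",", ModeShift(4, 2)));     // 0,1,0,1
        Console.WriteLine(string.Join(",", RightRotation(3, 1))); // 2,0,1
    }
}
```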

John

Tuesday, June 7, 2011 9:51 AM
• Thanks, John!

Here's all the code to run it in one place:

```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;
using Enumerable = System.Linq.Enumerable;

namespace ForForum
{
class Program
{
static void Main(string[] args)
{
const int len = 1000;
const int nModes = 4;
const int nChords = 3;
const int nStyles = 3;
var rand = new Random();
var trueModes = Enumerable.Range(0, len).Select(_ => rand.Next(nModes)).ToArray();
var trueChordsByMode = new[]
{
// TYPE 1
new[] {0.6, 0.2, 0.2},
new[] {0.2, 0.6, 0.2}, // - this one is just a cyclic shift of the previous one
// TYPE 2
new[] {0.1, 0.3, 0.6},
new[] {0.6, 0.1, 0.3}, // - this one is just a cyclic shift of the previous one
}.Select(a => new Dirichlet(a)).ToArray();

const double d = 0.5 / nModes;
const double e = 1.0 / nModes;
var observedModes = trueModes.Select(m => Vector.FromArray(Enumerable.Range(0, nModes).Select(i => i == m ? e + d * (nModes - 1) : e - d).ToArray()));
var observedChords = trueModes.Select(m => trueChordsByMode[m].Sample());
TrainModelUni(nChords, nModes, 2, nStyles, observedChords.ToArray(), observedModes.ToArray(), 10.0);
}

public static void TrainModelUni(int nChords, int nModes, int nModesInType, int nStyles,
Vector[] observedChords, Vector[] observedModes, double precision)
{
var cr = new Range(nChords).Named("cr");
// This differs from the previous models I posted here -
// I actually have not many kinds of modes, all other modes are simply cyclic shifts of them,
// so I use this fact to minimize the model size
var mtr = new Range(nModes/nModesInType).Named("mtr");
var sr = new Range(nStyles).Named("sr");

Debug.Assert(observedChords.Length == observedModes.Length);
var n = new Range(observedChords.Length).Named("n");

var pc = Variable.Array(Variable.Array<double>(cr), n).Named("pc");
var pm = Variable.Array<Vector>(n).Named("pm");
var ps = Variable.DirichletUniform(sr).Named("ps");

var pcMeans = Variable.Array(Variable.Array(Variable.Array<double>(cr), mtr), sr)
.Named("pcMeans");
// Chord probabilities conditioned on style and mode
pcMeans[sr][mtr][cr] = Variable.GaussianFromMeanAndPrecision(0, 1)
.ForEach(sr, mtr, cr);
var pcPrecs = Variable.Array(Variable.Array<double>(mtr), sr)
.Named("pcPrecs");
pcPrecs[sr][mtr] = Variable.GammaFromShapeAndScale(3, 3)
.ForEach(sr, mtr);

// This is a collection of index arrays: [0, 1, 2 ... n-1], [n-1, 0, 1 ... n-2] ... [1, 2, ... 0]
var leftShift = AllRightRotations(Enumerable.Range(0, nChords).ToArray())
.Select(a=>Variable.Constant(a)).ToArray();

var s = Variable.Discrete(ps).Named("s");
using (Variable.Switch(s))
{
using (Variable.ForEach(n))
{
var m = Variable.Discrete(pm[n]).Named("m_" + n);
foreach (var mVal in Enumerable.Range(0, nModes))
{
using (Variable.Case(m, mVal))
{
var p = Variable.Array<double>(cr);
p[cr] = Variable.GaussianFromMeanAndPrecision(pcMeans[s][mVal/nModesInType][cr],
pcPrecs[s][mVal/nModesInType]);
// The observed chord probabilities are generated by the variable with parameters
// taken from the first mode of the current mode type,
// here they are shifted to obtain the probabilities for the current mode mVal
pc[n] = Variable.Subarray(p, leftShift[mVal%nModesInType]);
Variable.ConstrainEqual(Variable.Sum(pc[n]), 1.0);
}
}
}
}

pm.ObservedValue = observedModes;
pc.ObservedValue = observedChords.Select(c=>c.ToArray()).ToArray();

var engine = new InferenceEngine(new VariationalMessagePassing());
Console.WriteLine(engine.Infer(s));
Console.WriteLine(engine.Infer(pcMeans));
}

public static IEnumerable<T[]> AllRightRotations<T>(params T[] a)
{
int len = a.Length;
var range = Enumerable.Range(0, len);
return range.Select(start => range.Select(i => a[(i - start + len) % len]).ToArray());
}
}
}

```

Tuesday, June 7, 2011 1:28 PM
• I was trying to make pcMeans a shared variable array. Sadly, I found no examples on how to create a jagged shared variable array, although there is a method in the API that seems to do the job.

I can't answer the following questions when trying to share pcMeans.

1. The level of sharing, i.e. what the type should be: a "shared variable array of variable array of variable array", or a "shared variable array of shared variable array of shared variable array"?
2. In either case, I found it difficult to provide a proper prior. I just don't understand what the prior should be for the outer-level arrays. Of course, the innermost array has the prior DistributionStructArray<Gaussian, double>; but what is the prior for the levels above it?

How do I make pcMeans shared?

Yet another question concerns the possibility of reducing the model's memory usage by splitting a song into several chunks and feeding them to the model separately. The problem is that I have some tracks with known style and some tracks with unknown style; how do I constrain chunks from the same song to be of the same style?

Tuesday, June 7, 2011 8:52 PM
• I'll address the shared variable question in a follow-up post. In the meantime, you can get rid of the Case statement as follows (check the outer indexing on AllRightRotations, because it was not clear from your original code whether the outer index should be the mtr range or the chord range):

```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;
using Enumerable = System.Linq.Enumerable;

namespace ForForum
{
class Program
{
static void Main(string[] args)
{
const int len = 1000;
const int nModes = 4;
const int nChords = 3;
const int nStyles = 3;
var rand = new Random();
var trueModes = Enumerable.Range(0, len).Select(_ => rand.Next(nModes)).ToArray();
var trueChordsByMode = new[]
{
// TYPE 1
new[] {0.6, 0.2, 0.2},
new[] {0.2, 0.6, 0.2}, // - this one is just a cyclic shift of the previous one
// TYPE 2
new[] {0.1, 0.3, 0.6},
new[] {0.6, 0.1, 0.3}, // - this one is just a cyclic shift of the previous one
}.Select(a => new Dirichlet(a)).ToArray();

const double d = 0.5 / nModes;
const double e = 1.0 / nModes;
var observedModes = trueModes.Select(m => Vector.FromArray(Enumerable.Range(0, nModes).Select(i => i == m ? e + d * (nModes - 1) : e - d).ToArray()));
var observedChords = trueModes.Select(m => trueChordsByMode[m].Sample());
TrainModelUni(nChords, nModes, 2, nStyles, observedChords.ToArray(), observedModes.ToArray(), 10.0);
}

public static void TrainModelUni(int nChords, int nModes, int nModesInType, int nStyles,
Vector[] observedChords, Vector[] observedModes, double precision)
{
var cr = new Range(nChords).Named("cr");
var cr1 = cr.Clone().Named("cr1");
// This differs from the previous models I posted here -
// I actually have not many kinds of modes, all other modes are simply cyclic shifts of them,
// so I use this fact to minimize the model size
var mtr = new Range(nModes / nModesInType).Named("mtr");
var sr = new Range(nStyles).Named("sr");

Debug.Assert(observedChords.Length == observedModes.Length);
var n = new Range(observedChords.Length).Named("n");

var pc = Variable.Array(Variable.Array<double>(cr), n).Named("pc");
var pm = Variable.Array<Vector>(n).Named("pm");
var ps = Variable.DirichletUniform(sr).Named("ps");

var mr = new Range(nModes).Named("mr");
var m_div_mt = Variable.Array<int>(mr).Named("m_div_mt");
var m_mod_mt = Variable.Array<int>(mr).Named("m_mod_mt");
var leftShift = Variable.Array(Variable.Array<int>(cr), mtr);

var pcMeans = Variable.Array(Variable.Array(Variable.Array<double>(cr), mtr), sr)
.Named("pcMeans");
// Chord probabilities conditioned on style and mode
pcMeans[sr][mtr][cr] = Variable.GaussianFromMeanAndPrecision(0, 1)
.ForEach(sr, mtr, cr);
var pcPrecs = Variable.Array(Variable.Array<double>(mtr), sr)
.Named("pcPrecs");
pcPrecs[sr][mtr] = Variable.GammaFromShapeAndScale(3, 3)
.ForEach(sr, mtr);

var s = Variable.Discrete(ps).Named("s");
using (Variable.Switch(s))
{
using (Variable.ForEach(n))
{
var m = Variable.Discrete(mr, pm[n]).Named("m_" + n);
using (Variable.Switch(m))
{
var p = Variable.Array<double>(cr);
p[cr] = Variable.GaussianFromMeanAndPrecision(pcMeans[s][m_div_mt[m]][cr],
pcPrecs[s][m_div_mt[m]]);
// The observed chord probabilities are generated by the variable with parameters
// taken from the first mode of the current mode type,
// here they are shifted to obtain the probabilities for the current mode m
pc[n] = Variable.Subarray(p, leftShift[m_mod_mt[m]]);
Variable.ConstrainEqual(Variable.Sum(pc[n]), 1.0);
}
}
}

pm.ObservedValue = observedModes;
pc.ObservedValue = observedChords.Select(c => c.ToArray()).ToArray();
m_div_mt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i / nModesInType).ToArray();
m_mod_mt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i % nModesInType).ToArray();
leftShift.ObservedValue =
AllRightRotations(Enumerable.Range(0, nChords).ToArray()).Take(nModes / nModesInType).ToArray();

var engine = new InferenceEngine(new VariationalMessagePassing());
Console.WriteLine(engine.Infer(s));
Console.WriteLine(engine.Infer(pcMeans));
}

public static IEnumerable<T[]> AllRightRotations<T>(params T[] a)
{
int len = a.Length;
var range = Enumerable.Range(0, len);
return range.Select(start => range.Select(i => a[(i - start + len) % len]).ToArray());
}
}
}

```
Wednesday, June 8, 2011 9:12 AM
• Thank you very much, John! The improvement is dramatic: I tested the Switch solution on the same song, and I observe a 4x reduction in the memory consumed by Gaussian[] arrays. A closer look reveals that, thanks to your suggestion, we got rid of all 3,538,080 Gaussian[] arrays of size 204 (it looks like they were 12-element arrays).
Wednesday, June 8, 2011 10:09 AM
• OK - here is the shared variable implementation. The syntax is quite tricky.

John

```using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;
using Enumerable = System.Linq.Enumerable;

namespace ForForum
{
using GaussianArr3 = DistributionRefArray<DistributionRefArray<DistributionStructArray<Gaussian, double>, double[]>, double[][]>;
using VarDoubArr2 = VariableArray<VariableArray<double>, double[][]>;

class Program
{
static void Main(string[] args)
{
const int len = 1000;
const int nModes = 4;
const int nChords = 3;
const int nStyles = 3;
var rand = new Random();
var trueModes = Enumerable.Range(0, len).Select(_ => rand.Next(nModes)).ToArray();
var trueChordsByMode = new[]
{
// TYPE 1
new[] {0.6, 0.2, 0.2},
new[] {0.2, 0.6, 0.2}, // - this one is just a cyclic shift of the previous one
// TYPE 2
new[] {0.1, 0.3, 0.6},
new[] {0.6, 0.1, 0.3}, // - this one is just a cyclic shift of the previous one
}.Select(a => new Dirichlet(a)).ToArray();

const double d = 0.5 / nModes;
const double e = 1.0 / nModes;
var observedModes = trueModes.Select(m => Vector.FromArray(Enumerable.Range(0, nModes).Select(i => i == m ? e + d * (nModes - 1) : e - d).ToArray()));
var observedChords = trueModes.Select(m => trueChordsByMode[m].Sample());
TrainModelUni(nChords, nModes, 2, nStyles, observedChords.ToArray(), observedModes.ToArray(), 10.0);
}

public static GaussianArr3 InitGaussianArr3(Gaussian val, int outer, int mid, int inner)
{
var g3d =
Enumerable.Repeat(
Enumerable.Repeat(
Enumerable.Repeat((Gaussian)val.Clone(), inner).ToArray(), mid).ToArray(), outer).ToArray();
return (GaussianArr3)Distribution<double>.Array(g3d);
}

public static void TrainModelUni(int nChords, int nModes, int nModesInType, int nStyles,
Vector[] observedChords, Vector[] observedModes, double precision)
{
// Split the data into chunks. TODO - modify the chunking logic to
// work when the last chunk is not the same size as the others
int numChunks = 10;
var obsChordsSplit = new Vector[numChunks][];
var obsModesSplit = new Vector[numChunks][];
int numPerChunk = observedChords.Length / numChunks;
int idx = 0;
for (int i = 0; i < numChunks; i++)
{
obsChordsSplit[i] = new Vector[numPerChunk];
obsModesSplit[i] = new Vector[numPerChunk];
for (int j = 0; j < numPerChunk; j++, idx++)
{
obsChordsSplit[i][j] = observedChords[idx];
obsModesSplit[i][j] = observedModes[idx];
}
}
var model = new Model(numChunks);

var cr = new Range(nChords).Named("cr");
var mtr = new Range(nModes / nModesInType).Named("mtr");
var sr = new Range(nStyles).Named("sr");

var numChordsInChunk = Variable.New<int>().Named("numCIC");
var n = new Range(numChordsInChunk).Named("n");

var pc = Variable.Array(Variable.Array<double>(cr), n).Named("pc");
var pm = Variable.Array<Vector>(n).Named("pm");
var ps = Variable.DirichletUniform(sr).Named("ps");

var mr = new Range(nModes).Named("mr");
var m_div_mt = Variable.Array<int>(mr).Named("m_div_mt");
var m_mod_mt = Variable.Array<int>(mr).Named("m_mod_mt");
var leftShift = Variable.Array(Variable.Array<int>(cr), mtr).Named("leftShift");

// The priors
var pcMeansPrior =
InitGaussianArr3(Gaussian.FromMeanAndPrecision(0, 1), sr.SizeAsInt, mtr.SizeAsInt, cr.SizeAsInt);

// Chord probabilities conditioned on style and mode
var pcMeans = SharedVariable<double[][][]>.Random<VarDoubArr2, double[][][], GaussianArr3>(
Variable.Array(Variable.Array<double>(cr), mtr), sr, pcMeansPrior).Named("pcMeans");
var pcMeansCopy = pcMeans.GetCopyFor(model).Named("pcMeansCopy");
var pcPrecs = Variable.Array(Variable.Array<double>(mtr), sr)
.Named("pcPrecs");
pcPrecs[sr][mtr] = Variable.GammaFromShapeAndScale(3, 3).ForEach(sr, mtr);

var s = Variable.Discrete(ps).Named("s");

using (Variable.Switch(s))
{
using (Variable.ForEach(n))
{
var m = Variable.Discrete(mr, pm[n]).Named("m");
using (Variable.Switch(m))
{
var p = Variable.Array<double>(cr).Named("p");
p[cr] = Variable.GaussianFromMeanAndPrecision(
pcMeansCopy[s][m_div_mt[m]][cr],
pcPrecs[s][m_div_mt[m]]);
// The observed chord probabilities are generated by the variable with parameters
// taken from the first mode of the current mode type,
// here they are shifted to obtain the probabilities for the current mode m
pc[n] = Variable.Subarray(p, leftShift[m_mod_mt[m]]);
Variable.ConstrainEqual(Variable.Sum(pc[n]), 1.0);
}
}
}

m_div_mt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i / nModesInType).ToArray();
m_mod_mt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i % nModesInType).ToArray();
leftShift.ObservedValue =
AllRightRotations(Enumerable.Range(0, nChords).ToArray()).Take(nModes / nModesInType).ToArray();

var engine = new InferenceEngine(new VariationalMessagePassing());

engine.NumberOfIterations = 10;
for (int pass = 0; pass < 5; pass++)
{
for (int i = 0; i < numChunks; i++)
{
numChordsInChunk.ObservedValue = obsModesSplit[i].Length;
pm.ObservedValue = obsModesSplit[i];
pc.ObservedValue = obsChordsSplit[i].Select(c => c.ToArray()).ToArray();
model.InferShared(engine, i);
}
}
Console.WriteLine(pcMeans.Marginal<GaussianArr3>());
}

public static IEnumerable<T[]> AllRightRotations<T>(params T[] a)
{
int len = a.Length;
var range = Enumerable.Range(0, len);
return range.Select(start => range.Select(i => a[(i - start + len) % len]).ToArray());
}
}
}

```
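The TODO in the chunking loop above (a last chunk that is shorter than the others) could be handled along the following lines. This is a plain C# sketch with a hypothetical helper `Chunking.Split`, not part of Infer.NET; since the chunk size is already fed to the model through `numChordsInChunk.ObservedValue`, chunks of different lengths should fit the existing inference loop, with the number of chunks passed to `new Model(...)` taken from the resulting list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Chunking
{
    // Splits `items` into consecutive chunks of at most `maxPerChunk` elements.
    // Only the last chunk may be shorter than the others.
    public static List<T[]> Split<T>(T[] items, int maxPerChunk)
    {
        if (maxPerChunk <= 0) throw new ArgumentOutOfRangeException("maxPerChunk");
        var chunks = new List<T[]>();
        for (int start = 0; start < items.Length; start += maxPerChunk)
        {
            int len = Math.Min(maxPerChunk, items.Length - start);
            var chunk = new T[len];
            Array.Copy(items, start, chunk, 0, len);
            chunks.Add(chunk);
        }
        return chunks;
    }

    public static void Main()
    {
        var chunks = Split(Enumerable.Range(0, 1000).ToArray(), 300);
        // Prints the chunk lengths: 300,300,300,100
        Console.WriteLine(string.Join(",", chunks.Select(c => c.Length)));
    }
}
```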
Wednesday, June 8, 2011 1:43 PM
• Just as a note for the future, we have revisited the type parameters for the Random<ItemType, ArrayType, DistributionArrayType> method in the SharedVariable<DomainType> class. It looks like the ArrayType is not needed (always being the same as DomainType) and removing it allows C# to infer the method's type parameters from the arguments. So in the next release, you should be able to do:

var pcMeans = SharedVariable<double[][][]>.Random(
Variable.Array(Variable.Array<double>(cr), mtr), sr, pcMeansPrior).Named("pcMeans");

Wednesday, June 8, 2011 1:58 PM
• Thanks, John! One question remains, though, regarding the style. The training phase consists of supervised learning (I specify the observed value for the style) and unsupervised learning (I expect the style to be inferred). Because the songs will be split into chunks anyway, I'll need to make the style a shared variable.

The question is: how do I "reset" it between the runs? How do I specify that the scope of sharing style differs from the scope of sharing pcMeans?

Wednesday, June 8, 2011 2:43 PM
• The new syntax looks good, thanks! The abundance of type parameters does clutter the code.
Wednesday, June 8, 2011 2:45 PM
• Regarding my question about the style: I introduced a bool variable that selects whether the style distribution parameters are taken from an observed vector or from a SharedVariableArray indexed by the song number.

I have a problem with this approach — namely, when running the toy example, I provide during the supervised learning phase two songs of each of the 3 styles, so I expect the inferred pcMeans to improve in the slice corresponding to the style. However, I observe that only style 0 is being learnt, regardless of what I set as observation for the style probabilities. Slices corresponding to all other styles stay untrained.

Here is the runnable code.

```using System;
using System.Collections.Generic;
using System.Linq;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;

namespace ForForum
{
using Gaussian3DArray =
DistributionRefArray<
DistributionRefArray<
DistributionStructArray<
Gaussian,
double>,
double[]>,
double[][]>;

using Gamma2DArray =
DistributionRefArray<
DistributionStructArray<
Gamma,
double>,
double[]>;
using DirichletArray = DistributionRefArray<Dirichlet, Vector>;

using Double2DArray = VariableArray<VariableArray<double>, double[][]>;

/// <summary>
/// Usage:
/// <list type="number">
/// <item>Construct the model. Do not forget to specify all songs in the nSegsInSongs parameter, including those on which training is done and those on which testing is done.</item>
/// <item>Perform supervised learning specifying modes and chords data, and the known song style.</item>
/// <item>Perform unsupervised learning specifying modes and chords data for other songs.</item>
/// <item>Perform testing specifying modes data for the test songs.</item>
/// </list>
/// </summary>
public class HarmonicModel2
{
private readonly TrainModel myTrainModel;

public static void Main(string[] args)
{
const int len = 1000;
const int nModes = 4;
const int nChords = 3;
const int nStyles = 3;
const int nSongs = 2 * nStyles + 1;
const int nSongsWithKnownStyles = 2 * nStyles;

var rand = new Random();
var trueModesGenerator = Enumerable.Range(0, len).Select(_ => rand.Next(nModes));

var trueChordsByModeStyle = new[] {
new[] {
// TYPE 1
new[] {0.6, 0.2, 0.2},
new[] {0.2, 0.6, 0.2}, // - this one is just a cyclic shift of the previous one
// TYPE 2
new[] {0.1, 0.3, 0.6},
new[] {0.6, 0.1, 0.3}, // - this one is just a cyclic shift of the previous one
},
new[] {
// Type 1
new[] {0.4, 0.4, 0.2},
new[] {0.2, 0.4, 0.4},
// Type 2
new[] {0.33, 0.34, 0.33},
new[] {0.33, 0.33, 0.34},
},
new[] {
// Type 1
new[] {0.01, 0.01, 0.98},
new[] {0.98, 0.01, 0.01},
// Type 2
new[] {0.5, 0.2, 0.3},
new[] {0.3, 0.5, 0.2},
},
}.Select(sty => sty.Select(m => new Dirichlet(m)).ToArray()).ToArray();

const double d = 0.5 / nModes;
const double e = 1.0 / nModes;
var model = new HarmonicModel2(nChords, nModes, 2, nStyles, len, nSongsWithKnownStyles, Enumerable.Repeat(len, nSongs));
IEnumerable<double[]> observedModes;
IEnumerable<double[]> observedChords;
// First, train on songs, two from each style
for (int n = 0; n < 2; ++n)
{
for (int sty = 0; sty < nStyles; ++sty)
{
CreateSong(nModes, trueModesGenerator, e, d, trueChordsByModeStyle[sty], out observedModes, out observedChords);
model.LearnSupervised(observedModes.ToArray(), observedChords.ToArray(), sty);
}
}
// Let's see the output on the following song...
CreateSong(nModes, trueModesGenerator, e, d, trueChordsByModeStyle, out observedModes, out observedChords);
var style = model.LearnUnsupervised(observedModes.ToArray(), observedChords.ToArray());
Console.WriteLine(style);
}

private static void CreateSong(int nModes, IEnumerable<int> trueModesGenerator, double e, double d, Dirichlet[] trueChordsByMode, out IEnumerable<double[]> observedModes, out IEnumerable<double[]> observedChords)
{
var trueModes = trueModesGenerator.ToArray();
observedModes = trueModes.Select(m => Enumerable.Range(0, nModes).Select(i => i == m ? e + d * (nModes - 1) : e - d).ToArray());
observedChords = trueModes.Select(m => trueChordsByMode[m].Sample().ToArray());
}

#pragma warning disable 1573
/// <param name="nSegsInSongs">Regardless of the order, must contain all songs, including the training and the testing set.</param>
public HarmonicModel2(int nChords, int nModes, int nModesInType, int nStyles, int maxSegmentsInBatch, int nSongsWithKnownStyles, IEnumerable<int> nSegsInSongs)
{
var nBatchesOverall = nSegsInSongs.Aggregate(0.With(0), (nb, nSegs) => (nb.fst + nSegs / maxSegmentsInBatch + (nSegs % maxSegmentsInBatch == 0 ? 0 : 1)).With(nb.snd + 1));
myTrainModel = new TrainModel(nChords, nModes, nModesInType, nStyles, maxSegmentsInBatch, nSongsWithKnownStyles, nBatchesOverall.fst, nBatchesOverall.snd);
}
#pragma warning restore 1573

public void LearnSupervised(double[][] observedModes, double[][] observedChords, int style)
{
if (observedModes.Length != observedChords.Length) throw new ModelException("Observed modes and chords sequences do not match: " + observedModes.Length + " " + observedChords.Length);
myTrainModel.LearnSupervised(observedModes.Select(a => Vector.FromArray(a)).ToArray(), observedChords, style);
}

public Dirichlet LearnUnsupervised(double[][] observedModes, double[][] observedChords)
{
if (observedModes.Length != observedChords.Length) throw new ModelException("Observed modes and chords sequences do not match: " + observedModes.Length + " " + observedChords.Length);
return myTrainModel.LearnUnsupervised(observedModes.Select(a => Vector.FromArray(a)).ToArray(), observedChords);
}

public class ModelException : Exception
{
public ModelException(string message)
: base(message)
{
}
}

private class TrainModel
{
// ReSharper disable InconsistentNaming
private readonly InferenceEngine engine;
private readonly Model model;

private readonly Variable<int> batchLength;

/// <summary>Chord probabilities for each segment.</summary>
private readonly VariableArray<VariableArray<double>, double[][]> pc;
/// <summary>Mode probabilities for each segment.</summary>
private readonly VariableArray<Vector> pm;

#region Chord probabilities conditioned on style and mode
/// <summary>Means of distributions from which chord distributions are drawn, conditioned on mode and style.
/// The order of ranges, from the outermost to innermost: style, mode, chord.</summary>
private readonly ISharedVariableArray<Double2DArray, double[][][]> pcMeans;
/// <summary>Precisions of distributions from which chord distributions are drawn, conditioned on mode and style.
/// Precisions are shared by all chords for the same mode&style.
/// The order of ranges, from the outermost to innermost: style, mode.</summary>
private readonly ISharedVariableArray<VariableArray<double>, double[][]> pcPrecs;
#endregion

#region Song style
/// <summary>Known style probabilities for supervised learning.</summary>
private readonly Variable<Vector> knownStyle;
/// <summary>Unknown style probabilities for all songs for unsupervised learning and testing.</summary>
private readonly SharedVariableArray<Vector> psUnknown;
/// <summary>Current song with unknown style - used to index into <see cref="psUnknown"/></summary>
private readonly Variable<int> unknownSongNumber;
/// <summary>Governs from which style probabilities - known or unknown - the discrete style variable is derived.</summary>
private readonly Variable<bool> isCurSongKnown;
#endregion

/// <summary>Range of chords.</summary>
private readonly Range cr;
/// <summary>Range of mode types.</summary>
private readonly Range mtr;
/// <summary>Range of styles.</summary>
private readonly Range sr;
// ReSharper restore InconsistentNaming

/// <summary>Counter for songs with unknown style.</summary>
private int myCurUnknownSong = 0;
/// <summary>Number of all songs for which the model is built.</summary>
private readonly int mySongsCount;
/// <summary>Batch counter. Each song produces one or more batches.</summary>
private int myCurSongFirstBatch = 0;

private readonly int myMaxSegmentsInBatch;

public TrainModel(int nChords, int nModes, int nModesInType, int nStyles, int maxSegsInBatch, int nSongsWithKnownStyles, int nBatchesOverall, int nSongs)
{
engine = new InferenceEngine(new VariationalMessagePassing());

model = new Model(nBatchesOverall);
myMaxSegmentsInBatch = maxSegsInBatch;
mySongsCount = nSongs;

int modePeriod = nModes / nModesInType;

cr = new Range(nChords).Named("cr");
mtr = new Range(modePeriod).Named("mtr");
sr = new Range(nStyles).Named("sr");

batchLength = Variable.New<int>().Named("songLength");
var n = new Range(batchLength);
pc = Variable.Array(Variable.Array<double>(cr), n).Named("pc");
pm = Variable.Array<Vector>(n).Named("pm");

#region Chord probabilities conditioned on style and mode
// Chord probabilities conditioned on style and mode are sampled from Gaussians
// Means of the gaussians
var pcMeansPriorElem = Gaussian.FromMeanAndPrecision(0, 1);
var pcMeansPrior = (Gaussian3DArray)Distribution<double>.Array(
Enumerable.Repeat(
Enumerable.Repeat(
Enumerable.Repeat((Gaussian)pcMeansPriorElem.Clone(), nChords).ToArray(), modePeriod).ToArray(), nStyles).ToArray());
pcMeans = SharedVariable<double[][][]>.Random<Double2DArray, double[][][], Gaussian3DArray>(
Variable.Array(Variable.Array<double>(cr), mtr), sr, pcMeansPrior).Named("pcMeans");
var pcMeansCopy = pcMeans.GetCopyFor(model).Named("pcMeansCopy");

// Precisions (the same precision for all chords for one mode and one style)
var pcPrecsPriorElem = Gamma.FromShapeAndRate(3, 3);
var pcPrecsPrior = (Gamma2DArray)Distribution<double>.Array(
Enumerable.Repeat(
Enumerable.Repeat((Gamma)pcPrecsPriorElem.Clone(), modePeriod).ToArray(), nStyles).ToArray());
pcPrecs = SharedVariable<double[][]>.Random<VariableArray<double>, double[][], Gamma2DArray>(
Variable.Array<double>(mtr), sr, pcPrecsPrior);
var pcPrecsCopy = pcPrecs.GetCopyFor(model).Named("pcPrecsCopy");
#endregion

#region Song style
knownStyle = Variable.New<Vector>().Named("knownStyle");

var unknownSongsRange = new Range(nSongs);
var psUnknownPrior = (DirichletArray)Distribution<Vector>.Array(
Enumerable.Repeat(Dirichlet.Uniform(nStyles), nSongs - nSongsWithKnownStyles).ToArray());
psUnknown = SharedVariable<Vector>.Random(unknownSongsRange, psUnknownPrior).Named("psUnknown");
var psUnknownCopy = psUnknown.GetCopyFor(model);
unknownSongNumber = Variable.New<int>().Named("unknownSongNumber");
unknownSongNumber.ObservedValue = 0;

isCurSongKnown = Variable.New<bool>().Named("isCurSongKnown");

var s = Variable.New<int>();
using (Variable.If(isCurSongKnown))
{
s.SetTo(Variable.Discrete(sr, knownStyle));
}
using (Variable.IfNot(isCurSongKnown))
{
s.SetTo(Variable.Discrete(sr, psUnknownCopy[unknownSongNumber]));
}
#endregion

var mr = new Range(nModes).Named("mr");
var mDivMt = Variable.Array<int>(mr).Named("m_div_mt");
var mModMt = Variable.Array<int>(mr).Named("m_mod_mt");
var leftShift = Variable.Array(Variable.Array<int>(cr), mtr);

using (Variable.Switch(s))
{
using (Variable.ForEach(n))
{
var m = Variable.Discrete(mr, pm[n]).Named("m_" + n);
using (Variable.Switch(m))
{
var p = Variable.Array<double>(cr);
p[cr] = Variable.GaussianFromMeanAndPrecision(pcMeansCopy[s][mDivMt[m]][cr],
pcPrecsCopy[s][mDivMt[m]]);
// The observed chord probabilities are generated by the variable with parameters
// taken from the first mode of the current mode type,
// here they are shifted to obtain the probabilities for the current mode m
pc[n] = Variable.Subarray(p, leftShift[mModMt[m]]);
Variable.ConstrainEqual(Variable.Sum(pc[n]), 1.0);
}
}
}

mDivMt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i / nModesInType).ToArray();
mModMt.ObservedValue = Enumerable.Range(0, nModes).Select(i => i % nModesInType).ToArray();
leftShift.ObservedValue = Util.AllRightRotations(Enumerable.Range(0, nChords).ToArray()).ToArray();
}

public void LearnSupervised(Vector[] songModes, double[][] songChords, int style)
{
const double zero = 1e-6;
double one = 1 - zero * (sr.SizeAsInt - 1);
knownStyle.ObservedValue = Vector.FromArray(Enumerable.Range(0, sr.SizeAsInt).Select(i => i == style ? one : zero).ToArray());
isCurSongKnown.ObservedValue = true;

Train(songModes, songChords);

Console.WriteLine(pcMeans.Marginal<Gaussian3DArray>());
Console.WriteLine(pcPrecs.Marginal<Gamma2DArray>());
}

public Dirichlet LearnUnsupervised(Vector[] songModes, double[][] songChords)
{
unknownSongNumber.ObservedValue = myCurUnknownSong;
isCurSongKnown.ObservedValue = false;

Train(songModes, songChords);

Console.WriteLine(pcMeans.Marginal<Gaussian3DArray>());
Console.WriteLine(pcPrecs.Marginal<Gamma2DArray>());
var styles = psUnknown.Marginal<Dirichlet[]>();
Console.WriteLine(styles);
var style = styles[myCurUnknownSong];

myCurUnknownSong += 1;

return style;
}

private void Train(Vector[] songModes, double[][] songChords)
{
int songLen = songModes.Length;
// Divide the song into batches of at most myMaxSegmentsInBatch segments
var lastBatchLength = songLen % myMaxSegmentsInBatch;
int nFullSizeBatches = songLen / myMaxSegmentsInBatch;
int nBatches = nFullSizeBatches + (lastBatchLength > 0 ? 1 : 0);

engine.NumberOfIterations = 10;
for (int pass = 0; pass < 5; ++pass)
{
for (int batch = 0; batch < nBatches; ++batch)
{
var curBatchLen = batch < nFullSizeBatches ? myMaxSegmentsInBatch : lastBatchLength;
batchLength.ObservedValue = curBatchLen;
var curBatchStart = batch * myMaxSegmentsInBatch;
pc.ObservedValue = songChords.SubArray(curBatchStart, curBatchLen);
pm.ObservedValue = songModes.SubArray(curBatchStart, curBatchLen);
model.InferShared(engine, myCurSongFirstBatch + batch);
}
}
myCurSongFirstBatch += nBatches;
}

public Dirichlet TestLabeled(Vector[] modes, double[][] chords)
{
throw new NotImplementedException();
}
}

public Dirichlet TestLabeled(Vector[] modes, double[][] chords)
{
return myTrainModel.TestLabeled(modes, chords);
}
}

// Various utils

public class Pair<A, B>
{
public A fst { get; private set; }
public B snd { get; private set; }

public Pair(A a, B b)
{
fst = a;
snd = b;
}

public bool Equals(Pair<A, B> other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return Equals(other.fst, fst) && Equals(other.snd, snd);
}

public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof(Pair<A, B>)) return false;
return Equals((Pair<A, B>)obj);
}

public override int GetHashCode()
{
unchecked
{
return (fst.GetHashCode() * 397) ^ snd.GetHashCode();
}
}

public override string ToString()
{
return "(" + fst + ", " + snd + ')';
}
}

public static class Util
{
public static Pair<T1, T2> With<T1, T2>(this T1 fst, T2 snd)
{
return new Pair<T1, T2>(fst, snd);
}

public static T[] SubArray<T>(this T[] array, int from, int len)
{
var ret = new T[len];
for (int i = 0; i < len; ++i)
{
ret[i] = array[from + i];
}
return ret;
}

public static IEnumerable<T[]> AllRightRotations<T>(params T[] a)
{
int len = a.Length;
var range = Enumerable.Range(0, len);
return range.Select(start => range.Select(i => a[(i - start + len) % len]).ToArray());
}
}
}
```
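The batch arithmetic in Train can be checked in isolation. The sketch below is plain C# with no Infer.NET dependency; `BatchPlan` and `Split` are hypothetical names introduced for illustration, and the logic mirrors the posted loop (full-size batches followed by one shorter batch when the song length is not a multiple of the batch size).

```csharp
using System;
using System.Collections.Generic;

public static class BatchPlan
{
    // Returns (start, length) pairs covering [0, songLen) in batches of at most maxBatch.
    public static List<KeyValuePair<int, int>> Split(int songLen, int maxBatch)
    {
        if (maxBatch <= 0) throw new ArgumentOutOfRangeException("maxBatch");
        var batches = new List<KeyValuePair<int, int>>();
        int lastBatchLength = songLen % maxBatch;
        int nFullSizeBatches = songLen / maxBatch;
        int nBatches = nFullSizeBatches + (lastBatchLength > 0 ? 1 : 0);
        for (int batch = 0; batch < nBatches; ++batch)
        {
            // Every batch is full-size except possibly the last one.
            int len = batch < nFullSizeBatches ? maxBatch : lastBatchLength;
            batches.Add(new KeyValuePair<int, int>(batch * maxBatch, len));
        }
        return batches;
    }
}
```

For example, `BatchPlan.Split(10, 4)` yields batches starting at 0, 4, and 8 with lengths 4, 4, and 2.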
Wednesday, June 8, 2011 9:33 PM
• Note that for known styles, I do not specify e.g. "1.0 0.0 0.0" but "0.99998 0.00001 0.00001" instead because otherwise I get "model has zero probability" exceptions. Perhaps this is a related problem.
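For reference, this smoothing is the same computation as the `zero`/`one` lines in LearnSupervised. A stand-alone sketch in plain C# (no Infer.NET; `SoftOneHot` and `Make` are hypothetical names used only for illustration):

```csharp
using System;
using System.Linq;

public static class SoftOneHot
{
    // Puts eps on every entry except `index`, which gets the remaining mass,
    // so the vector sums to 1 and no entry is exactly zero.
    public static double[] Make(int size, int index, double eps)
    {
        if (index < 0 || index >= size) throw new ArgumentOutOfRangeException("index");
        double one = 1 - eps * (size - 1);
        return Enumerable.Range(0, size).Select(i => i == index ? one : eps).ToArray();
    }
}
```

With three styles, `SoftOneHot.Make(3, 0, 1e-5)` gives approximately (0.99998, 0.00001, 0.00001), matching the values quoted above.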
Wednesday, June 8, 2011 9:37 PM
• This is getting a bit too long and involved for me to support efficiently and I'm starting to get lost in your code. So anything you can do to distill the posted code down to the bare essentials would be appreciated. I have some general comments.

1. Are the memory issues such that you are required to split individual songs into chunks? It would be better to split at song boundaries so that style can be local to a given copy of the model and does not need to be shared. The difficulty is that style is only shared within a song whereas the shared variable wrapper classes assume that the sharing is across all copies of the model.
2. You could do away with the isCurSongKnown switch and use the same model for both cases by either (a) setting the style prior close to a Dirichlet point mass or (b) observing s for known songs. In (b), when you run in supervised mode, you will say s.ObservedValue = ..., and when you run in unsupervised mode you will say s.ClearObservedValue(). If you will be going back and forth between supervised and unsupervised, you can create two instances of your model so as to avoid the recompilation caused by switching.
3. I note that there is still a discrepancy in your leftShift variable. The outer dimension is nChords, but its use in the model requires dimension mtr. This needs to be resolved; the only reason you are not getting a run-time error is because nChords is > mtr.
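
To make point 3 concrete: the rotation helper returns one row per element of its input, so passing a range of length nChords always produces nChords rows, whatever range the array was declared over. A plain C# check follows; `RotationCheck` is a hypothetical wrapper name, and the rotation logic is copied from the posted Util class.

```csharp
using System;
using System.Linq;

public static class RotationCheck
{
    // Same logic as Util.AllRightRotations in the posted code,
    // materialized as a jagged array so the outer dimension is easy to inspect.
    public static T[][] AllRightRotations<T>(params T[] a)
    {
        int len = a.Length;
        var range = Enumerable.Range(0, len);
        return range
            .Select(start => range.Select(i => a[(i - start + len) % len]).ToArray())
            .ToArray();
    }
}
```

For the input (0, 1, 2) this produces the three rows (0, 1, 2), (2, 0, 1), and (1, 2, 0). If fewer rows are needed, something like `.Take(k).ToArray()` on the result would trim the outer dimension, where k is assumed to be the size of the range the model actually indexes with.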

John

Thursday, June 9, 2011 9:35 AM
• I have bumped into the need to have a MOD operation.

Without repeating all the code above, in the main Switch for the conditional probability of chords (on the mode and style) I have:

```
// s - style,
// m - mode,
// pcSCopy is an array of arrays of vectors with Dirichlet priors
// cSUnwrapped is an auxiliary array of discrete variables
// cS is an array of discrete variables whose values are observed
using (Variable.Switch(s))
{
    using (Variable.ForEach(n))
    {
        var m = Variable.Discrete(mr, pm[n]).Named("m_" + n);
        using (Variable.Switch(m))
        {
            cSUnwrapped[n] = Variable.Discrete(pcSCopy[s][mDivMt[m]]) + mModMt[m];
            // ? cS[n] = cSUnwrapped[n] % 12;
```

Now I need to tie cSUnwrapped[n] to cS[n], where cS is an array of observed values. The problem is that all observed values are in the range 0..11, so cS[n] should in fact be cSUnwrapped[n] % 12, but I have failed to find any working way to emulate this.

Is there a way to express it without writing a custom Factor?

I tried

1) using Variable.If:

```
// v12 = Variable.Constant(12);
using (Variable.If(cSUnwrapped[n] < v12))
{
    cS[n] = cSUnwrapped[n];
}
using (Variable.IfNot(cSUnwrapped[n] < v12))
{
    cS[n] = cSUnwrapped[n] - v12;
}
```

but compilation of the model fails with an ArgumentOutOfRangeException:

```
at System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource)
at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.get_Item(Int32 index)
at MicrosoftResearch.Infer.Models.Variable.IsPrefixOf(IList`1 prefix, IList`1 list) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Models\Variable.cs: line 158
at MicrosoftResearch.Infer.Models.Variable.<GetDefinitionsMadeWithin>d__0.MoveNext() in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Models\Variable.cs: line 248
at MicrosoftResearch.Infer.Models.Variable`1.SetTo(Variable`1 variable) in C:\infernetBuilds\17-12-2010_15-34\Compiler\Infer\Models\Variable.cs: line 2973
// my program up the stack
at Harmodelic.HarmonicModel2.TrainModel..ctor(Int32 nChords, Int32 nChordsInType, Int32 nModes, Int32 nModesInType, Int32 nStyles, Int32 maxSegsInBatch, Int32 nBatchesOverall, Int32 nSongs) in HarmonicModel2.cs: line 243...
```

2) Using an auxiliary array of variables whose observed value is set to a[i] = i % 12. I have to use a Switch block on the index to use it, but model generation still fails:

`System.ArgumentException:  System.ArgumentException: int is not of type IDistribution<T> as required by the type constraint of method CopyOp`1.LogEvidenceRatio`

You should perhaps look at 1), because it seems like a bug: even if my code doesn't make sense, a more helpful error than an obscure ArgumentOutOfRangeException inside the model compiler would be great :).

Saturday, June 11, 2011 3:00 PM
• Both approaches 1 and 2 can be made to work although you will get better results with 2.  The code for 2 would look something like this (using mod 2 instead of mod 12):

```
Range unwrappedValue = new Range(4).Named("unwrappedValue");
Variable<int> unwrapped = Variable.DiscreteUniform(unwrappedValue).Named("unwrapped");
Variable<int> wrapped = Variable.New<int>().Named("wrapped");
VariableArray<int> modulo2 = Variable.Observed(new int[] { 0, 1, 0, 1 }, unwrappedValue).Named("modulo2");
using (Variable.Switch(unwrapped)) {
    wrapped.SetTo(modulo2[unwrapped]);
    // Workaround for bug in 2.4 Beta 2
    //wrapped.SetTo(modulo2[unwrapped]+0);
}
wrapped.ObservedValue = 1;
InferenceEngine engine = new InferenceEngine();
Console.WriteLine(engine.Infer(unwrapped));
```

As you point out, this doesn't quite work due to a bug in Factor.Copy. This will be fixed in the next release. Meanwhile, a simple workaround is to use the commented out line above, where 0 is added. We will also fix the bug in approach 1. Thanks for pointing out these issues!
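
As a sanity check on what the lookup-table model should return once the workaround is in place, the posterior can be computed by hand: with a uniform prior over the four unwrapped values and wrapped observed as 1, only the unwrapped values whose table entry is 1 keep any mass. The sketch below is plain C#, not Infer.NET, and `ModuloPosterior` is a hypothetical name used only for illustration.

```csharp
using System;
using System.Linq;

public static class ModuloPosterior
{
    // Posterior over the unwrapped value: keep prior[u] where table[u] equals
    // the observed wrapped value, zero it elsewhere, then renormalize.
    public static double[] Infer(double[] prior, int[] table, int observedWrapped)
    {
        if (prior.Length != table.Length)
            throw new ArgumentException("prior and table must have equal length");
        double[] unnormalized = prior
            .Select((p, u) => table[u] == observedWrapped ? p : 0.0)
            .ToArray();
        double total = unnormalized.Sum();
        if (total == 0)
            throw new InvalidOperationException("Observation has zero probability under the model");
        return unnormalized.Select(p => p / total).ToArray();
    }
}
```

With a uniform prior of 0.25 per value and the table { 0, 1, 0, 1 }, observing 1 gives (0, 0.5, 0, 0.5), so the engine would be expected to report a Discrete distribution of that shape for unwrapped.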
Saturday, June 11, 2011 4:11 PM
• Thanks for the reply, Tom.

But there's another problem here: I get ArgumentOutOfRangeException when running the model. Seems like the problem lies in the mismatch between the switch range and the unwrapped range.

First, a bit of code with the names:

```
var csPlusMsR = new Range(nChordsInType + nModesInType - 1).Named("cs+ms r");
var kModN = Variable.Observed(
    //... array of length csPlusMsR.SizeAsInt
    csPlusMsR).Named("k MOD ncit");

// ....

var unwrapped = v1 + v2;
unwrapped.SetValueRange(csPlusMsR);
unwrapped.Name = "d";
using (Variable.Switch(unwrapped))
{
    v3 = Variable.Discrete(kModN[unwrapped]);
}
```

v1 takes values in a range of size nChordsInType (= 3), and v2 in a range of size nModesInType (= 2).

At the point of the exception, up the stack frame I see the following code:

```
// Message to 'm_index2_cases_uses' from CasesInt factor
this.m_index2_cases_uses_B[index2][mr] = Bernoulli.FromLogOdds(IntCasesOp.LogEvidenceRatio(this.d_cases_B[index2][mr], this.d_F[index2][mr]))
```

up the stack:

```
public static double LogEvidenceRatio(IList<Bernoulli> cases, Discrete i)
{
    if (i.IsPointMass) return cases[i.Point].LogOdds;
    else
    {
        double[] logOdds = new double[cases.Count];
        for (int j = 0; j < cases.Count; j++)
        {
            logOdds[j] = cases[j].LogOdds + i.GetLogProb(j);
        }
        return MMath.LogSumExp(logOdds);
    }
}
```
The exception is thrown in the expression i.GetLogProb(j).
Here, cases.Count = 4 (= nChordsInType + nModesInType -1), and i is Discrete (0.333, 0.333, 0.333). Note that its value range is 3.

I suspect that this is because i is the sum of two Discrete variables. It would be logical to assume that the resulting distribution will have more dimensions, but as we see it doesn't.

Is this a bug which can be worked around, or do I have to bump up the range of possible values of v1 and/or v2?

Saturday, June 11, 2011 5:43 PM
• If you suspect that Infer.NET is using the wrong number of dimensions for unwrapped, you can override this by adding the following line:

`unwrapped.AddAttribute(new MarginalPrototype(new Discrete(4)));`

This will force the marginal distribution of unwrapped to have 4 dimensions (i.e. to range over 0,1,2,3).
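
The count of 4 follows from the supports of the two addends: a value in 0..2 plus a value in 0..1 ranges over 0..3. Assuming v1 and v2 are independent, the distribution of their sum is the discrete convolution of the two distributions. A plain C# sketch (no Infer.NET; `DiscreteSum` is a hypothetical name):

```csharp
using System;

public static class DiscreteSum
{
    // Distribution of a + b for independent a ~ pa and b ~ pb (discrete convolution).
    // The result has pa.Length + pb.Length - 1 entries.
    public static double[] Convolve(double[] pa, double[] pb)
    {
        if (pa.Length == 0 || pb.Length == 0)
            throw new ArgumentException("Both distributions must be non-empty");
        var result = new double[pa.Length + pb.Length - 1];
        for (int i = 0; i < pa.Length; ++i)
        {
            for (int j = 0; j < pb.Length; ++j)
            {
                result[i + j] += pa[i] * pb[j];
            }
        }
        return result;
    }
}
```

For a uniform distribution over 3 values and a uniform distribution over 2 values, the result has 4 entries: (1/6, 1/3, 1/3, 1/6).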

Saturday, June 11, 2011 6:32 PM
• Thanks, Tom — that helped!

There is another problem though. I'm getting "the model has zero probability" AllZeroException. Of course I blamed my model for that, but after trying to distill the cause of the problem I suspected something strange.

What I want to do in this case boils down to this. An observed discrete variable v is drawn from a Discrete distribution whose probability vector is taken from an array indexed by a discrete variable d (the array emulates MOD). d is the sum of two discrete variables: m, which is drawn from an observed vector of probabilities pm, and c, whose probability vector is drawn from the target (to be learned) Dirichlet distribution pc.

Actually, even the simplest code that tries to infer the distribution of an addend fails with an AllZeroException:

```
var v = Variable.DirichletUniform(3);
var n = new Range(1000);
var values = Variable.Array<int>(n);
values[n] = (Variable.Discrete(v) + 1).ForEach(n);
values.ObservedValue = Enumerable.Range(0, 1000).Select(k => k%3 > 0 ? 1 : k%2 > 0 ? 2 : 3).ToArray();
Console.WriteLine(new InferenceEngine().Infer(Variable.Discrete(v)));
```

Given the specified observed sequence, I would expect the result to be along the lines of Discrete (2/3, 1/6, 1/6).
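
That expectation can be checked by counting the observations directly. The sketch below is plain C# with no Infer.NET dependency; `ObservedFrequencies` is a hypothetical name, and the selector is copied from the snippet above.

```csharp
using System;
using System.Linq;

public static class ObservedFrequencies
{
    // The observation sequence from the snippet above: values in {1, 2, 3}.
    public static int[] Observations(int count)
    {
        return Enumerable.Range(0, count)
            .Select(k => k % 3 > 0 ? 1 : k % 2 > 0 ? 2 : 3)
            .ToArray();
    }

    // counts[v - 1] is the number of times value v occurs.
    public static int[] Counts(int[] observations, int nValues)
    {
        var counts = new int[nValues];
        foreach (int v in observations) counts[v - 1]++;
        return counts;
    }
}
```

For 1000 observations the counts are 666, 167, and 167, i.e. relative frequencies of about 0.666, 0.167, and 0.167.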

Saturday, June 11, 2011 11:13 PM
• The problem here is the line:

```
values[n] = (Variable.Discrete(v) + 1).ForEach(n);
```

I think you meant to say:

```
values[n] = Variable.Discrete(v).ForEach(n) + 1;
```

ForEach applies to the single operation it is attached to.  In the first line, you are drawing a single random value and adding 1 to it n times.  In the second line, you are drawing n random values and adding 1 to each one.
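
A plain C# analogy may help here. This is not Infer.NET code, and `SharedVsPerElement` is a hypothetical name: drawing once and reusing the value yields an array whose entries are all identical, whereas drawing inside the loop yields independent entries. Since the observed data contains different values, a model of the first kind presumably cannot explain it, which would be consistent with the zero-probability exception.

```csharp
using System;
using System.Linq;

public static class SharedVsPerElement
{
    // One draw, reused for every element: all entries are equal.
    public static int[] SharedDraw(Random rng, int n, int nValues)
    {
        int single = rng.Next(nValues) + 1;
        return Enumerable.Repeat(single, n).ToArray();
    }

    // A fresh draw per element: entries vary independently.
    public static int[] PerElementDraw(Random rng, int n, int nValues)
    {
        return Enumerable.Range(0, n).Select(_ => rng.Next(nValues) + 1).ToArray();
    }
}
```

Calling `SharedDraw` always returns an array with a single distinct value, while `PerElementDraw` will typically return a mix of the values 1 through nValues.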

Sunday, June 12, 2011 9:20 AM