'VariableArrayName' is not defined in all cases. It is only defined for (...) in

I am continuing to build a simple Thurstonian model for learning to rank. The score of each item to rank is Gaussian distributed. The mean of each score is the inner product of a parameter vector "w" (a VectorGaussian RV; inferred in training, fixed in prediction) and the item's feature vector (given and fixed). The ranking comes from comparing the scores of two items in an example and distinguishing three cases: draw, left > right, and left < right. An example is composed of a variable number of items. The dataset consists of a number of examples.
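For readers following along, the description above can be written compactly as below. This is only a sketch: the symbols s (score), x (feature vector), r (pairwise rank) and m (draw margin) are notation introduced here, the unit variance and the 0/1/2 encoding are read off the code in this post, and the symmetric margin is an assumption about the intended model.

```latex
\begin{align*}
s_{e,i} &\sim \mathcal{N}\!\left(w^\top x_{e,i},\, 1\right), \\
r_{e,k} &=
\begin{cases}
1 & \text{if } s_{e,k} - s_{e,k+1} > m_e \quad \text{(left wins)}, \\
2 & \text{if } s_{e,k} - s_{e,k+1} < -m_e \quad \text{(right wins)}, \\
0 & \text{otherwise (draw)}.
\end{cases}
\end{align*}
```

Here e indexes examples, i indexes items within an example, and k indexes adjacent item pairs (k, k+1).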
The training appears to be working as I can infer (1) the posterior for "w" which is a VectorGaussian and (2) a global, shared mean for all draw RVs (Gaussian; one per training example).
The model is written as follows:
    using (Variable.ForEach(example))
    {
        using (Variable.ForEach(item))
        {
            var mean = Variable.InnerProduct(w, features[example][item]).Named("mean");
            scores[example][item] = Variable.GaussianFromMeanAndVariance(mean, 1.0);
        }
        using (ForEachBlock pairBlock = Variable.ForEach(rank))
        {
            var idx = pairBlock.Index;
            var left = scores[example][idx];
            var right = scores[example][idx + 1];
            using (Variable.If(Variable.IsBetween(left - right, -drawMargin[example], drawMargin[example])))
                ranks[example][rank] = 0;
            using (Variable.If(left > right + drawMargin[example]))
                ranks[example][rank] = 1;
            using (Variable.If(left < right + drawMargin[example]))
                ranks[example][rank] = 2;
        }
    }
The relevant model variables are defined as:
    numExamples = Variable.New<int>();
    example = new Range(numExamples).Named("exampleRange");

    // Jagged arrays for (items, features)
    exampleSize = Variable.Array<int>(example).Named("itemSizes");
    item = new Range(exampleSize[example]).Named("itemRange");
    scores = Variable.Array(Variable.Array<double>(item), example).Named("scores");
    features = Variable.Array(Variable.Array<Vector>(item), example).Named("features");

    // Jagged array for item pair ranks
    rankSize = Variable.Array<int>(example);
    rank = new Range(rankSize[example]).Named("rankRange");
    ranks = Variable.Array(Variable.Array<int>(rank), example).Named("pairwiseRanks");

    // Draw
    drawMeanPrior = Variable.New<Gaussian>();
    drawMean = Variable.Random<double, Gaussian>(drawMeanPrior);
    var drawMargin = Variable.Array<double>(example);
    using (Variable.ForEach(example))
    {
        drawMargin[example] = Variable.GaussianFromMeanAndVariance(drawMean, 1.0).Named("drawMargin");
        Variable.ConstrainPositive(drawMargin[example]);
    }

    // Model parameters
    wPrior = Variable.New<VectorGaussian>();
    w = Variable.Random<Vector, VectorGaussian>(wPrior).Named("w");
At the prediction stage, I observe all the variables pertaining to array sizes, initial priors and feature vectors, as in training. The differences this time are that I (1) observe the prior for "w" (it is trained and fixed), (2) observe the draw mean prior and (3) don't observe "ranks", as I want to infer them.
To infer pairwise rankings, I provide one single example that is composed of 6 feature vectors (6 items to rank). When I run the inference, I get the error:
'ranks' is not defined in all cases. It is only defined for (vbool7=true)(vbool8=true)(vbool9=true) in:
Intuitively, I get that, depending on the pairwise rankings, not all "ranks" elements will be set. I am not sure how to fix this error though.
The full code is on GitHub:
https://github.com/usptact/Infer.NETLTR/tree/master/Infer.NETLTR
The data are in SVMLight format; I used LETOR MQ2007 and MQ2008 datasets for experiments and validation.
 Edited by usptact Monday, December 4, 2017 10:50 PM
Question
Answers

    using (Variable.If(leftWins))
        ranks[example][rank] = 1;
    using (Variable.IfNot(leftWins))
    {
        using (Variable.If(rightWins))
            ranks[example][rank] = 2;
        using (Variable.IfNot(rightWins))
            ranks[example][rank] = 0;
    }
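Why the nested If/IfNot layout above defines "ranks" in every case can be illustrated without Infer.NET at all. The following is a plain C# sketch, not Infer.NET code; the class and method names are hypothetical. It assumes the model compiler treats the branch conditions as unrelated booleans, which seems consistent with the error message listing three separate vbool conditions. Under that assumption, three separate If blocks leave one truth assignment with no definition and several with more than one, while pairing each If with an IfNot on the same condition reaches exactly one branch for every assignment.

```csharp
using System;

static class BranchCoverage
{
    // How many of three independent `if` blocks would assign the rank
    // for one truth assignment of the three conditions.
    public static int SeparateIfAssignments(bool isDraw, bool leftWins, bool rightWins)
    {
        int assignments = 0;
        if (isDraw) assignments++;
        if (leftWins) assignments++;
        if (rightWins) assignments++;
        return assignments;
    }

    // Mirrors the nested If/IfNot layout: every input reaches exactly one branch.
    public static int NestedRank(bool leftWins, bool rightWins)
    {
        if (leftWins) return 1;   // left wins
        if (rightWins) return 2;  // not left, but right wins
        return 0;                 // neither wins: draw
    }

    // Counts truth assignments by how many separate `if` blocks fire:
    // index 0 = none, 1 = exactly one, 2 = more than one.
    public static int[] CountSeparateIfCases()
    {
        var counts = new int[3];
        bool[] values = { false, true };
        foreach (bool d in values)
            foreach (bool l in values)
                foreach (bool r in values)
                    counts[Math.Min(SeparateIfAssignments(d, l, r), 2)]++;
        return counts;
    }

    public static void Main()
    {
        int[] c = CountSeparateIfCases();
        // Prints: undefined: 1, defined once: 3, defined repeatedly: 4
        Console.WriteLine($"undefined: {c[0]}, defined once: {c[1]}, defined repeatedly: {c[2]}");
    }
}
```

Only 3 of the 8 combinations get exactly one definition with separate blocks, whereas NestedRank returns a value for all 4 combinations of its two conditions.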
All replies

I searched the forum posts and found a couple of posts where it was suggested to create separate Variable<bool> variables that are used for branching. In my case I created the following three:
    var isDraw = Variable.IsBetween(diff, -drawMargin[example], drawMargin[example]);
    var leftWins = diff > drawMargin[example];
    var rightWins = diff < -drawMargin[example];
Then I noticed that instead of simply assigning a discrete value, people use <var>.SetTo(Variable.Constant(<x>)). In my case I wrote:
    using (Variable.If(isDraw))
        ranks[example][idx].SetTo(Variable.Constant(0));
    using (Variable.If(leftWins))
        ranks[example][idx].SetTo(Variable.Constant(1));
    using (Variable.If(rightWins))
        ranks[example][idx].SetTo(Variable.Constant(2));
Unfortunately, I get the same error as before.
While trying to find the bug, I fixed a separate bug in how the learned parameters "w" and "drawMargin" are observed. During prediction, these two are no longer inferred by the engine.

I also tried to merge the two inner ForEach loops into a single loop over the "item" range. Since the "ranks" array is one element smaller than "scores", I wrapped the former second block of the model in a branch on the index (using (Variable.If(idx > 0))). This resulted in a weird schedule error. Now I am truly stuck, with all options exhausted.
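The size relationship mentioned here (one fewer adjacent pair than items) is what the observed values for "rankSize" need to respect in the two-loop version, since the pair loop reads scores at idx and idx + 1. A minimal plain C# sketch of that bookkeeping follows; the class and method names are hypothetical and it assumes only adjacent pairs are compared.

```csharp
using System;
using System.Linq;

static class PairSizes
{
    // For adjacent-pair comparisons (idx, idx + 1), n items yield n - 1 pairs;
    // examples with fewer than two items yield no pairs.
    public static int[] RankSizes(int[] itemSizes)
    {
        return itemSizes.Select(n => Math.Max(n - 1, 0)).ToArray();
    }

    public static void Main()
    {
        // Prints: 5, 0, 2
        Console.WriteLine(string.Join(", ", RankSizes(new[] { 6, 1, 3 })));
    }
}
```

For the single six-item prediction example described in the question, this gives five pairwise ranks.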

    using (Variable.If(leftWins))
        ranks[example][rank] = 1;
    using (Variable.IfNot(leftWins))
    {
        using (Variable.If(rightWins))
            ranks[example][rank] = 2;
        using (Variable.IfNot(rightWins))
            ranks[example][rank] = 0;
    }

Thank you, Tom! I missed this part!
After making the change, I now get a "model zero probability" error. What are the best ways to debug this issue?
My model now looks like this:
    using (Variable.ForEach(example))
    {
        using (ForEachBlock itemBlock = Variable.ForEach(item))
        {
            var mean = Variable.InnerProduct(w, features[example][item]);
            scores[example][item] = Variable.GaussianFromMeanAndVariance(mean, 1.0);
        }
        using (ForEachBlock pairBlock = Variable.ForEach(rank))
        {
            var idx = pairBlock.Index;
            var diff = scores[example][idx] - scores[example][idx + 1];
            var isDraw = Variable.IsBetween(diff, -drawMargin[example], drawMargin[example]);
            var leftWins = diff > drawMargin[example];
            var rightWins = diff < -drawMargin[example];
            using (Variable.If(leftWins))
                ranks[example][rank] = 1;
            using (Variable.IfNot(leftWins))
            {
                using (Variable.If(rightWins))
                    ranks[example][rank] = 2;
                using (Variable.IfNot(rightWins))
                    ranks[example][rank] = 0;
            }
        }
    }
A quick search of the forum Q&A shows that this may be due to some values being outside the range. How can I check which RVs get values outside the range?
UPDATE: I think I know where the issue is. The "diff" variable must include noise, e.g. make it a Gaussian RV with nonzero variance.
UPDATE 2: The problem with zero probability is elsewhere. I suspect something is wrong with getting the scores in three intervals. One of the branches is never (?) visited.
Thanks
 Edited by usptact Tuesday, December 5, 2017 7:52 PM

Following the FAQ, I switched from assigning point masses to "ranks" to a slightly noisy version using the SetTo() method.
    var isDraw = Variable.IsBetween(diff, -margin, margin);
    var leftWins = (diff > margin).Named("leftWins");
    var rightWins = (diff < -margin).Named("rightWins");
    using (Variable.If(isDraw))
        ranks[example][rank].SetTo(Variable.Discrete(new double[] { 0.99, 0.005, 0.005 }));
    using (Variable.IfNot(isDraw))
    {
        using (Variable.If(rightWins))
            ranks[example][rank].SetTo(Variable.Discrete(new double[] { 0.005, 0.005, 0.99 }));
        using (Variable.IfNot(rightWins))
            ranks[example][rank].SetTo(Variable.Discrete(new double[] { 0.005, 0.99, 0.005 }));
    }
The prediction in my examples is always 0 (draw), which is incorrect, but at least I get past this error.
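The hand-typed probability vectors in the snippet above all follow one pattern: the intended label gets most of the mass and every other label gets a small eps. A small plain C# helper can generate them, which avoids transposition mistakes when the noise level is changed. This is a sketch only; the class name, method name and the eps parameter are hypothetical and not part of Infer.NET.

```csharp
using System;
using System.Linq;

static class NoisyLabels
{
    // Probability vector over `numLabels` outcomes: the intended label gets
    // 1 - (numLabels - 1) * eps, every other label gets eps.
    public static double[] Around(int label, int numLabels, double eps)
    {
        if (label < 0 || label >= numLabels)
            throw new ArgumentOutOfRangeException(nameof(label));
        if (eps < 0 || eps * (numLabels - 1) >= 1)
            throw new ArgumentOutOfRangeException(nameof(eps));

        double[] probs = Enumerable.Repeat(eps, numLabels).ToArray();
        probs[label] = 1.0 - eps * (numLabels - 1);
        return probs;
    }

    public static void Main()
    {
        // Vector for a noisy "draw" label (label 0 of 3) with eps = 0.005.
        Console.WriteLine(string.Join(", ", Around(0, 3, 0.005)));
    }
}
```

The result could then be passed where the literal arrays are used above, for example Variable.Discrete(NoisyLabels.Around(0, 3, 0.005)) for the draw branch.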


Interesting! If I understand the constraints part correctly, you refer to Variable.ConstrainBetween(), Variable.ConstrainPositive() and Variable.ConstrainNegative(). I see that there is even a Variable.Constrain() that can accept a custom factor.
In the prediction part (ranks are unobserved; w is observed), I guess I still need to do the branching based on "diff".