I am attempting to implement the Thurstonian model for pairwise ranking. I ran into a problem defining the elements of the pairwise ranking VariableArray<bool> from pairs of elements of the scores VariableArray<double>.
An item is represented as a VariableArray<double> of scores of length n. Each element of the array is a Gaussian RV whose mean is the inner product of the respective feature vector (observed) and the parameter vector (a VectorGaussian, which is learned).
The variance is hard-coded for now.
I need to declare and statistically define the pairwise ranking VariableArray<bool> associated with the scores array described above. The size of this array is n-1. To define its first element, I need to compare the first and second RVs of the scores array using the ">" factor. To define the second element, I need to compare the second and third RVs of the scores array, and so forth. How can I write that in Infer.NET using "Variable.ForEach(...)" constructs?
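In symbols, the single-item model described above can be written roughly as follows. The symbol names (x_i for the observed feature vectors, w for the parameter vector, m and V for its prior mean and covariance, sigma^2 for the fixed variance, s_i for the scores, y_i for the pairwise outcomes) are notation introduced here for illustration only and are not names from the code.

```latex
\begin{align}
  w   &\sim \mathcal{N}(m, V) \\
  s_i &\sim \mathcal{N}\!\left(w^\top x_i,\ \sigma^2\right), & i &= 1, \dots, n \\
  y_i &= \mathbb{1}\!\left[\, s_i > s_{i+1} \,\right],       & i &= 1, \dots, n-1
\end{align}
```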
To put it differently:
To train the model (that is, to infer the posterior of the VectorGaussian parameters), I have an array of items. Each item can have a different size (a different number of feature vectors), and therefore a different number of scores and outcomes. I created one jagged array for the features and another for the outcomes, only to realize that I cannot index pairs from the scores array
VariableArray<VariableArray<double>, double[][]>
to statistically define the outcomes array
VariableArray<VariableArray<bool>, bool[][]>
Also, I cannot write a comparison such as "using (Variable.If(scores[item][feature] > scores[item][feature + 1]))".
The problem looks twofold: (1) the scores array of arrays and the outcomes array of arrays have different inner ranges, and (2) it is unclear how to write the comparison when adding an integer to or subtracting one from a range inside a "Variable.If(...)" construct is not allowed.
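One possible direction for point (2), sketched for a single item only, is to avoid range arithmetic altogether and index the scores array through two observed integer arrays defined over a separate range of size n-1, similar to how the TrueSkill tutorial indexes player skills by winner and loser. The sketch below is an assumption-laden illustration, not code from the repository: the toy feature vectors and outcomes are made up, the names (vec, pair, first, second) are hypothetical, the prior on the parameters is an arbitrary standard VectorGaussian, and the namespaces are those of the open-source Microsoft.ML.Probabilistic release (older releases use MicrosoftResearch.Infer instead). It does not address the jagged case in point (1).

```csharp
using System;
using System.Linq;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Math;
using Microsoft.ML.Probabilistic.Models;
// Avoids the clash with System.Range on newer .NET versions.
using Range = Microsoft.ML.Probabilistic.Models.Range;

class PairwiseSketch
{
    static void Main()
    {
        // Hypothetical toy data: one item with 3 feature vectors of dimension 2.
        Vector[] featureData =
        {
            Vector.FromArray(1.0, 0.0),
            Vector.FromArray(0.5, 0.5),
            Vector.FromArray(0.0, 1.0)
        };
        // Hypothetical outcomes: score[0] > score[1] and score[1] > score[2].
        bool[] outcomeData = { true, true };
        int n = featureData.Length;

        Range vec = new Range(n);       // ranges over the n scores
        Range pair = new Range(n - 1);  // ranges over the n-1 adjacent pairs

        // Parameter vector with an assumed standard VectorGaussian prior.
        Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
            Vector.Zero(2), PositiveDefiniteMatrix.Identity(2));

        // Scores: Gaussian with mean <w, x_i> and a hard-coded variance.
        VariableArray<Vector> features = Variable.Observed(featureData, vec);
        VariableArray<double> scores = Variable.Array<double>(vec);
        scores[vec] = Variable.GaussianFromMeanAndVariance(
            Variable.InnerProduct(w, features[vec]), 1.0);

        // Observed index arrays: pair k compares score first[k] with score second[k].
        VariableArray<int> first =
            Variable.Observed(Enumerable.Range(0, n - 1).ToArray(), pair);
        VariableArray<int> second =
            Variable.Observed(Enumerable.Range(1, n - 1).ToArray(), pair);

        // Outcomes are defined over the pair range, not the score range.
        VariableArray<bool> outcomes = Variable.Array<bool>(pair);
        outcomes[pair] = scores[first[pair]] > scores[second[pair]];
        outcomes.ObservedValue = outcomeData;

        var engine = new InferenceEngine();
        VectorGaussian posterior = engine.Infer<VectorGaussian>(w);
        Console.WriteLine(posterior);
    }
}
```

The point of the design is that the "+1" offset lives in ordinary observed data (the second index array) rather than in the model's range expressions, so the comparison is written once over the pair range.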
The code so far is on GitHub:
https://github.com/usptact/Infer.NET-LTR/tree/master/Infer.NET-LTR
Look at the Model.cs file, where I defined the variables and started to design the model. Reader.cs and Program.cs should give an idea of the data. The sample data is in the "sample.ltr" file and follows the SVM-Rank format. There are two items, and each has 3 feature vectors.
In my code I have three outcomes: ">" is true, "<" is true, and a draw. Perhaps I should simplify the code to two outcomes for now.
One route I would rather not take is to use plain .NET arrays instead of VariableArrays. I guess that would be fairly inefficient.
Thanks