Answered by:
Simplest learning to rank model
Question

I am looking for pointers to implement a simple learning to rank model in Infer.NET. I have a set of examples for training. A training example is comprised of some number of binary feature vectors and a rank (positive integer). What model could I use to learn a model from this data to rank an example with no rank information? The number of feature vectors in an example may be different from example to example.
A training example can look like this:
3 qid:10 3:1 4:1 5:1 2 qid:10 2:1 3:1 1 qid:10 1:1 4:1 5:1
Here the training example (id=10) is built from three feature vectors. The first feature vector has highest rank (3) and the feature vector is (0, 0, 1, 1, 1).
I read the SoftRank paper but it was not clear to me how exactly to implement it. May be try ordered logistic regression where the constraints would be used to enforce the rank ordering?
There are many learning to rank software packages that will work on my data. I am looking for a prediction that is not simply a rank (integer) or score (real value) but a distribution over ranks.
I tried to search some models but I am unsure where to look.
Thanks
 Edited by usptact Friday, November 10, 2017 9:40 AM
Friday, November 10, 2017 9:37 AM
Answers

I recommend a Thurstonian model (as in TrueSkill) where each item has a latent random performance variable and the ranks arise from sorting the performances. The model would end up looking exactly like TrueSkill except that the skills are computed from the feature vectors. For example, skill could be the inner product of the binary features and some weights. For prediction, the model would give a distribution over performances, not ranks, but you can convert this into ranks in various ways as described in the SoftRank paper.
 Marked as answer by usptact Friday, November 10, 2017 5:23 PM
Friday, November 10, 2017 11:51 AMOwner
All replies

I recommend a Thurstonian model (as in TrueSkill) where each item has a latent random performance variable and the ranks arise from sorting the performances. The model would end up looking exactly like TrueSkill except that the skills are computed from the feature vectors. For example, skill could be the inner product of the binary features and some weights. For prediction, the model would give a distribution over performances, not ranks, but you can convert this into ranks in various ways as described in the SoftRank paper.
 Marked as answer by usptact Friday, November 10, 2017 5:23 PM
Friday, November 10, 2017 11:51 AMOwner 
Thanks for the pointer! I get the idea! I happen to be experimenting with the TrueSkill model, I think I know how to implement this model. Less certain about converting the distribution over performances to ranks but I need to get to this stage yet. Much appreciated!
Last night I stumbled upon a paper from Cramer & Singer, 2001 "Pranking with ranking". The idea is to learn jointly a regression model (w vector; why authors omitted bias term?) and also the thresholds b_1 ... b_K (the number of thresholds K is fixed). An item to be ranked (x) must fall inside the interval to be ranked correctly. The predicted rank is the smallest integer such that the score (from dot product) is smaller than the threshold value: min r \in {1, ..., K} (r: <w, x> + < b_r)
The method is a discriminative one but I can see how it can be made Bayesian. The w vector can be Vector Gaussian as in classic regression; The K thresholds (b_1, .. , b_K) can be Gaussian RVs (?) with constraints b_1 < b_2 < .. < b_K. It is less clear about how to get the distribution over ranks here though...
 Edited by usptact Friday, November 10, 2017 5:38 PM
Friday, November 10, 2017 5:25 PM