I am trying to get more practice with the framework and decided to extend the Bayes Point Machine to multiple classes. Conceptually this is not very complicated: (1) create one weight vector for each class, (2) take the dot product of the feature vector with each weight
vector, (3) apply softmax and require that the element for the true class (given by the ground truth) be larger than the other K-1 elements.

Suppose that I have a K=3 class classifier (weight vectors w_1, w_2, w_3), a feature vector x (d dimensions) and a ground-truth vector y={false, true, false} (x belongs to class 2). My idea is to take the dot product of x with each w_i, apply softmax (Variable.SoftMax?),
and then use the ground truth "y". What I don't know is how to use the information in "y" to state that one of the values in the softmax result (a Vector?) must be larger than the other K-1 elements.
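
To make the setup concrete, here is a minimal sketch of the computation in plain Python. It is not Infer.NET code; the weight and feature values are made up for illustration, and the helper names (dot, softmax, constraint_holds) are my own. It only shows the deterministic part: the scores, the softmax, and the condition I would like to impose.

```python
import math

def dot(w, x):
    """Dot product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for K=3 classes and d=3 features (illustration only).
weights = [
    [0.2, -0.5, 0.1],   # w_1
    [0.7, 0.3, -0.2],   # w_2
    [-0.4, 0.1, 0.6],   # w_3
]
x = [1.0, 2.0, 0.5]
y = [False, True, False]  # x belongs to class 2

scores = [dot(w, x) for w in weights]
probs = softmax(scores)

# Index of the true class according to the ground-truth vector.
true_class = y.index(True)

# The condition to enforce: the true class has the largest softmax value.
constraint_holds = all(
    probs[true_class] > probs[k]
    for k in range(len(probs))
    if k != true_class
)
```

Since softmax preserves the ordering of its inputs, this condition is the same as requiring the score for the true class to exceed each of the other K-1 scores, so the comparison could presumably be stated on the dot products directly.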

Are there any similar examples that I can look at and use as a starting point?