# Q. Can I declare a variable for dataset having dynamic size? (Migrated from community.research.microsoft.com)

### Question

• sungchul kim posted on 12-06-2010 1:35 AM

Hi all,

I have a question. In my system, I want to declare a 'VariableArray<Vector>' or 'VariableArray<double[]>' for a dataset that has a dynamic size.

AdId Term T1 T2 T3 T4 T5

1 great 1 0.4 2.3 2.1 4.1 5.2

1 hotel 11 5.3 19.2 1.4 5.2 3.2

1 book 1 13 5.2 1.2 5.2 5.2

2 cheap 3 3.2 5.1 5.1 5.2

2 price 3 12 5 23.2 1.2 5.

where T1~T5 are the features of each word. However, I need to use the set of term vectors that share the same AdId together. I've searched this forum and the documentation many times, but I cannot find the answer yet.

Friday, June 3, 2011 6:07 PM

### All replies

• John Guiver replied on 12-06-2010 4:00 AM

Hi

The answer depends on what the emphasis of your question is. Defining random variable arrays of dynamic size is straightforward, and is described in Jagged Arrays. A separate question is how you want to use these arrays in your model. The Sparse Bayes Point Machine gives an example of a model which uses dynamic feature vectors. Can you describe your model in more detail?

John G.

Friday, June 3, 2011 6:08 PM
• sungchul kim replied on 12-06-2010 8:36 PM

Actually, the random variable that I want to use does not need to have a dynamic dimension. Sorry for the short explanation. What I want to do is this: suppose there are two nodes

A -> B

where A is a set of n-dimensional vectors and B is a list of scores (scalar values). Each score in B depends on a certain number of vectors in A, and that number is dynamic. For example, in my model, A consists of the feature vectors of the terms in each ad, and B holds the ad scores computed from A as follows.

A = {a1, a2, a3, a4, a5}

AdId Term T1 T2 T3 T4 T5

1 great 1 0.4 2.3 2.1 4.1 5.2

1 hotel 11 5.3 19.2 1.4 5.2 3.2

1 book 1 13 5.2 1.2 5.2 5.2

2 cheap 3 3.2 5.1 5.1 5.2

2 price 3 12 5 23.2 1.2 5.

B = {b1, b2}

1 MAX(<wa1>, <wa2>, <wa3>) SUM(<wa1>, <wa2>, <wa3>) MEAN(<wa1>, <wa2>, <wa3>)

2 MAX(<wb1>, <wb2>) SUM(<wb1>, <wb2>) MEAN(<wb1>, <wb2>)

where <wa1> means the inner product of w and a1 (assume that w is given). Each score for ad 1 uses 3 term vectors from A, and each score for ad 2 uses 2. Is this possible? If so, is there a provided syntax that can be applied to this case?
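To make the score definitions above concrete, here is a minimal plain C# sketch (no Infer.NET involved) that groups term vectors by AdId and reduces the inner products with MAX, SUM and MEAN. The class name `AdScoreSketch`, its helper methods, and the weight vector `w = (1, 1, 1, 1)` are hypothetical choices for illustration only. Because the table above has inconsistent column counts, the rows are the four-feature vectors that appear in the reply's training data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdScoreSketch
{
    // Term vectors tagged with the ad they belong to (four features each).
    public static (int AdId, double[] Features)[] SampleRows() => new[]
    {
        (1, new[] { 0.4, 2.3, 2.1, 4.1 }),
        (1, new[] { 5.3, 19.2, 1.4, 5.2 }),
        (1, new[] { 5.2, 1.2, 5.2, 5.2 }),
        (2, new[] { 3.2, 5.1, 5.1, 5.2 }),
        (2, new[] { 5.0, 23.2, 1.2, 5.0 }),
    };

    // Inner product <w, a> of the weight vector and one term vector.
    public static double Dot(double[] w, double[] a) =>
        w.Zip(a, (wi, ai) => wi * ai).Sum();

    // Group the rows by AdId, then reduce each group's inner products three ways.
    // Groups may have different sizes; that is the "dynamic" part of the question.
    public static Dictionary<int, (double Max, double Sum, double Mean)> Score(
        IEnumerable<(int AdId, double[] Features)> rows, double[] w) =>
        rows.GroupBy(r => r.AdId)
            .ToDictionary(
                g => g.Key,
                g =>
                {
                    double[] dots = g.Select(r => Dot(w, r.Features)).ToArray();
                    return (Max: dots.Max(), Sum: dots.Sum(), Mean: dots.Average());
                });

    public static void Main()
    {
        // Hypothetical weight vector; the question only says that w is given.
        double[] w = { 1.0, 1.0, 1.0, 1.0 };
        foreach (var kv in Score(SampleRows(), w))
        {
            Console.WriteLine(
                $"Ad {kv.Key}: max={kv.Value.Max:F2} sum={kv.Value.Sum:F2} mean={kv.Value.Mean:F2}");
        }
    }
}
```

This only computes the deterministic scores for a fixed w. Treating w as a random variable and inferring it from observed scores is a separate modelling question.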

Thanks again.

Friday, June 3, 2011 6:08 PM
• John Guiver replied on 12-07-2010 5:22 AM

```csharp
using System;
using System.Linq;
// Namespaces from the Infer.NET release used in this thread;
// later open-source releases use Microsoft.ML.Probabilistic.* instead.
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Distributions;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;

namespace sungchul_kim1
{
    class Program
    {
        static void Main(string[] args)
        {
            // The model
            var numAds = Variable.New<int>();            // Number of ads (observed below)
            var a = new Range(numAds);                   // Corresponding range
            var numTVs = Variable.Array<int>(a);         // Number of term vectors per ad
            var t = new Range(numTVs[a]);                // Corresponding range
            var multiplier = Variable.Array<double>(a);  // Multiplier to create average
            var tv = Variable.Array(Variable.Array<Vector>(t), a);    // Term vectors
            var wPrior = Variable.New<VectorGaussian>(); // Prior weight distribution
            var w = Variable.Random<Vector, VectorGaussian>(wPrior);
            var w_tv = Variable.Array(Variable.Array<double>(t), a);  // Dot products
            var noise = 0.1;                             // Used as the precision of the score
            var score = Variable.Array<double>(a);
            using (Variable.ForEach(a))                  // For each ad
            {
                using (Variable.ForEach(t))              // For each of the ad's term vectors
                {
                    w_tv[a][t] = Variable.InnerProduct(w, tv[a][t]);
                }

                var average = Variable.Sum(w_tv[a]) * multiplier[a];
                score[a] = Variable.GaussianFromMeanAndPrecision(average, noise);
            }

            // Inference engine
            var engine = new InferenceEngine();

            // For training, set an uninformative prior, and observe the scores
            Vector[][] trainingData = new Vector[][]
            {
                new Vector[]
                {
                    Vector.FromArray(0.4, 2.3, 2.1, 4.1),
                    Vector.FromArray(5.3, 19.2, 1.4, 5.2),
                    Vector.FromArray(5.2, 1.2, 5.2, 5.2)
                },
                new Vector[]
                {
                    Vector.FromArray(3.2, 5.1, 5.1, 5.2),
                    Vector.FromArray(5, 23.2, 1.2, 5.0)
                }
            };
            double[] trainingScores = new double[] { 1.0, 2.0 };
            numAds.ObservedValue = trainingData.Length;
            numTVs.ObservedValue = trainingData.Select(d => d.Length).ToArray();
            multiplier.ObservedValue = trainingData.Select(d => 1.0 / d.Length).ToArray();
            tv.ObservedValue = trainingData;

            int numFeatures = trainingData[0][0].Count;
            wPrior.ObservedValue = VectorGaussian.FromMeanAndPrecision(
                Vector.Zero(numFeatures), PositiveDefiniteMatrix.Identity(numFeatures));
            score.ObservedValue = trainingScores;

            var wPosterior = engine.Infer<VectorGaussian>(w);
            Console.WriteLine(wPosterior.GetMean());

            // For test, set weight prior to posterior from training, and clear the score obs.
            Vector[][] testData = new Vector[][]
            {
                new Vector[]
                {
                    Vector.FromArray(1.7, 2.7, 1.51, 4.5),
                    Vector.FromArray(3.6, 14.1, 2.4, 5.7)
                }
            };
            numAds.ObservedValue = testData.Length;
            numTVs.ObservedValue = testData.Select(d => d.Length).ToArray();
            multiplier.ObservedValue = testData.Select(d => 1.0 / d.Length).ToArray();
            tv.ObservedValue = testData;
            wPrior.ObservedValue = (VectorGaussian)wPosterior.Clone();
            score.ClearObservedValue();

            var predictedScores = engine.Infer<Gaussian[]>(score);
            Console.WriteLine(predictedScores[0]);
        }
    }
}
```
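
Read as a generative model, the program above appears to correspond to the following, written in notation that is not in the original post: $n_a$ is the number of term vectors for ad $a$ (`numTVs`), $x_{a,t}$ is term vector $t$ of ad $a$ (`tv`), and $s_a$ is the ad's score (`score`). The second argument of each Gaussian below is a precision, not a variance, matching `FromMeanAndPrecision` in the code.

```latex
\begin{aligned}
w &\sim \mathcal{N}\!\left(0,\ \Lambda = I\right)
  && \text{(training prior, precision matrix } I\text{)} \\
s_a \mid w &\sim \mathcal{N}\!\left(\frac{1}{n_a}\sum_{t=1}^{n_a} w^{\top} x_{a,t},\ \lambda = 0.1\right)
  && \text{(score precision } 0.1\text{, i.e. variance } 10\text{)}
\end{aligned}
```

The factor $1/n_a$ is the observed `multiplier` array, which turns the `Sum` factor into a mean without needing a dedicated mean factor.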

Friday, June 3, 2011 6:08 PM
• sungchul kim replied on 12-07-2010 5:34 AM

Thanks a lot. Your efforts are really helpful to me and my research. I think I could even do research all night long. ^^

Friday, June 3, 2011 6:08 PM
• sungchul kim replied on 12-07-2010 5:41 AM

I have one more question. Do you think it is possible to use Variance(scores) or Max(scores) as a random variable for another score in this model?

Friday, June 3, 2011 6:08 PM
• John Guiver replied on 12-07-2010 9:30 AM

I don't know of a good way to do this. There is a Max factor but it only takes two values - you would need to create a Max factor taking an array, and work out the mathematics for the message updates (as an example, see how the Sum factor and message operators are implemented).

Friday, June 3, 2011 6:08 PM