Add Hierarchical level to the Recommendation System base

  • Question

  • Hi 

    I'm working on my master's project and I'm using the Recommender System as a base. I'd like to add a hierarchical layer to the traits prior, but I'm having some trouble. I would like each user and item traits vector in the Recommender System to be sampled from a multivariate Gaussian whose mean is sampled from another multivariate Gaussian and whose precision matrix is sampled from a Wishart distribution.

    The idea is to use something similar to Bayesian Probabilistic Matrix Factorization (www.cs.utoronto.ca/~amnih.ca/papers/bpmf.pdf, page 3, figure 1b), where

      mu_0,P_0   W_0,mu_0          mu_0,P_0   W_0,mu_0
          |          |                 |          |
        mu_u     Lambda_u            mu_v     Lambda_v
            \      /                     \      /
              U_i                          V_j

    U_i and V_j are the traits vectors, mu_u and mu_v are the mean vectors sampled from a Gaussian, and Lambda_u and Lambda_v are the precision matrices sampled from the Wishart distribution.
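
    Written out, the structure described above is roughly the following. This is a sketch under assumptions: P_0 is read as a precision matrix (matching the VectorGaussianFromMeanAndPrecision call used in the code in this post), and \nu_0, the Wishart degrees of freedom, is a symbol introduced here that the diagram does not name. Note that the BPMF paper itself uses a Gaussian-Wishart prior in which the mean is coupled to the precision; the independent form below follows the description in this post instead.

    ```latex
    \begin{aligned}
    \mu_U &\sim \mathcal{N}\!\left(\mu_0,\; P_0^{-1}\right), &
    \Lambda_U &\sim \mathcal{W}\!\left(W_0,\; \nu_0\right), &
    U_i \mid \mu_U, \Lambda_U &\sim \mathcal{N}\!\left(\mu_U,\; \Lambda_U^{-1}\right), \\
    \mu_V &\sim \mathcal{N}\!\left(\mu_0,\; P_0^{-1}\right), &
    \Lambda_V &\sim \mathcal{W}\!\left(W_0,\; \nu_0\right), &
    V_j \mid \mu_V, \Lambda_V &\sim \mathcal{N}\!\left(\mu_V,\; \Lambda_V^{-1}\right).
    \end{aligned}
    ```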

    I have tried to update the model as in the 'Gaussian Mixture' example, so I have the following code:

    // Define ranges
                int numRBPs = RBPs;
                int numGenes = Genes;
                int numTraits = numTrait;
                Variable<int> numObservations = Variable.Observed(tmpGene.Length).Named("numObservations");
                int numLevels = 1;
    
                // Define ranges
                Range RBP = new Range(numRBPs).Named("RBP");
                Range gene = new Range(numGenes).Named("gene");
                Range trait = new Range(numTraits).Named("trait");
                Range observation = new Range(numObservations).Named("observation");
                Range level = new Range(numLevels).Named("level");
    
                // Define latent variables
                var RBPTraits = Variable.Array<Vector>(RBP).Named("RBPTraits");
                var geneTraits = Variable.Array<Vector>(gene).Named("geneTraits");
                var RBPBias = Variable.Array<double>(RBP).Named("RBPBias");
                var geneBias = Variable.Array<double>(gene).Named("geneBias");
                var RBPThresholds = Variable.Array<double>(RBP).Named("RBPThresholds");
    
                Variable<Vector> traitPriorMean = Variable.VectorGaussianFromMeanAndPrecision(
                    Vector.Constant(numTraits, 0.0), PositiveDefiniteMatrix.IdentityScaledBy(numTraits, 2));
                Variable<PositiveDefiniteMatrix> traitPriorPrecision = Variable.WishartFromShapeAndScale(numTraits, PositiveDefiniteMatrix.IdentityScaledBy(numTraits, 2));
    
                //Variable<double> traitPriorMean = Variable.VectorGaussianFromMeanAndVariance(0, 2);
                //Variable<double> traitPriorPrecision = Variable.GammaFromMeanAndVariance(3, 3);
    
                Variable<double> biasPriorMean = Variable.GaussianFromMeanAndVariance(0, 2);
                Variable<double> biasPriorPrecision = Variable.GammaFromMeanAndVariance(3, 3);
    
    
    
                //Variable<double> traitPrior = Variable.GaussianFromMeanAndPrecision(traitPriorMean, traitPriorPrecision);
    
    
                // Define latent variables statistically
                RBPTraits[RBP] = Variable.VectorGaussianFromMeanAndPrecision(traitPriorMean, traitPriorPrecision).ForEach(RBP);
                geneTraits[gene] = Variable.VectorGaussianFromMeanAndPrecision(traitPriorMean, traitPriorPrecision).ForEach(gene);
                RBPBias[RBP] = Variable.GaussianFromMeanAndPrecision(biasPriorMean, biasPriorPrecision).ForEach(RBP);
                geneBias[gene] = Variable.GaussianFromMeanAndPrecision(biasPriorMean, biasPriorPrecision).ForEach(gene);
                RBPThresholds[RBP] = Variable.GaussianFromMeanAndPrecision(0, 1).ForEach(RBP);


    The problem now is in the model definition:

    // Model
                using (Variable.ForEach(observation))
                {
                    VariableArray<double> products = Variable.Array<double>(trait);//.Named("products");
                    products[trait] = RBPTraits[RBPData[observation]][trait] * geneTraits[geneData[observation]][trait];
    
                    Variable<double> bias = (RBPBias[RBPData[observation]] + geneBias[geneData[observation]]);//.Named("bias");
                    Variable<double> affinity = (bias + Variable.Sum(products));//.Named("productSum")).Named("affinity");
                    Variable<double> noisyAffinity = Variable.GaussianFromMeanAndVariance(affinity, affinityNoiseVariance);//.Named("noisyAffinity");
    
                    Variable<double> noisyThresholds = Variable.GaussianFromMeanAndVariance(RBPThresholds[RBPData[observation]], thresholdsNoiseVariance);
                    ratingData[observation] = noisyAffinity > noisyThresholds;
                }


    The line

                    products[trait] = RBPTraits[RBPData[observation]][trait] * geneTraits[geneData[observation]][trait];

    which computes the product, was correct when RBPTraits and geneTraits were defined as

                var RBPTraits = Variable.Array(Variable.Array<double>(trait), RBP).Named("RBPTraits");
                var geneTraits = Variable.Array(Variable.Array<double>(trait), gene).Named("geneTraits");

    but it is no longer valid now that I'm using Variable<Vector> sampled from multivariate distributions.

    So, my questions are:

    - Have I done this correctly? I have removed the "RBPTraitsPrior, geneTraitsPrior, RBPBiasPrior, geneBiasPrior" structure, but the model still seems correct to me.

    - How can I access the elements of a Variable<Vector> as if it were a VariableArray<double>?

    Thanks in advance

    Marco

    • Edited by MarcoASDF Tuesday, June 25, 2013 4:42 PM
    Tuesday, June 25, 2013 4:37 PM

All replies

  • Hi Marco,

    You can't access separately the elements of a Variable<Vector>. The idea of this construct is that you have a distribution over vectors, so you have to work with the vectors as a whole. Unfortunately, you can't use the InnerProduct factor either, because we don't have support for two stochastic inputs there. Therefore, you'll have to convert the Vector to a VariableArray by using the ArrayFromVector factor, and then define the rest of the model as before. Don't forget to give priority to the Product_SHG09 factor for this multiplication.
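
    A minimal sketch of that conversion, not a tested implementation. It assumes that the ArrayFromVector factor mentioned above is exposed as Variable.ArrayFromVector(vector, range), reuses the variables declared in the question (RBPTraits and geneTraits as arrays of Vector, plus RBPData, geneData, trait, observation and engine), and uses the Infer.NET 2.x namespaces. The names rbpTraitArray and geneTraitArray are hypothetical, and GaussianProductOp_SHG09 is assumed to be the operator class meant by "Product_SHG09".

    ```csharp
    // Fragment for the model-building method; assumes Infer.NET 2.x namespaces.
    using MicrosoftResearch.Infer;
    using MicrosoftResearch.Infer.Factors;
    using MicrosoftResearch.Infer.Maths;
    using MicrosoftResearch.Infer.Models;

    using (Variable.ForEach(observation))
    {
        // Unpack the Vector-valued traits of the selected RBP and gene into
        // arrays over the trait range (hypothetical local names).
        VariableArray<double> rbpTraitArray =
            Variable.ArrayFromVector(RBPTraits[RBPData[observation]], trait);
        VariableArray<double> geneTraitArray =
            Variable.ArrayFromVector(geneTraits[geneData[observation]], trait);

        // Element-wise product, as in the original array-based model.
        VariableArray<double> products = Variable.Array<double>(trait);
        products[trait] = rbpTraitArray[trait] * geneTraitArray[trait];

        // ... bias, affinity, noise and threshold comparison as in the question ...
    }

    // Prefer the SHG09 product operator for the multiplication above.
    engine.Compiler.GivePriorityTo(typeof(GaussianProductOp_SHG09));
    ```

    Converting inside the observation loop keeps the change local to the product line; whether inference then converges on this model is not something the sketch checks.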

    By the way, you can try to do things more incrementally. That is, first add the hierarchy, then test your model, and only then convert the variable arrays to vectors. You can have a hierarchy on each element of the traits instead of having it on the whole trait vector. Your mean and precision will now be of type double (with Gaussian and Gamma priors respectively) instead of types Vector and PositiveDefiniteMatrix (with VectorGaussian and Wishart priors respectively).
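
    One possible reading of the element-wise hierarchy, as an untested sketch. It assumes the RBP and trait ranges from the question, borrows the hyper-prior constants (0, 2) and (3, 3) from the commented-out lines there, and introduces the hypothetical names traitMean and traitPrec for a mean and precision shared per trait across all RBPs. The gene traits would be defined analogously.

    ```csharp
    // Fragment for the model-building method; assumes Infer.NET 2.x namespaces.
    using MicrosoftResearch.Infer;
    using MicrosoftResearch.Infer.Models;

    // Shared hyper-parameters, one mean and one precision per trait
    // (hypothetical names, constants taken from the question).
    var traitMean = Variable.Array<double>(trait).Named("traitMean");
    var traitPrec = Variable.Array<double>(trait).Named("traitPrec");
    traitMean[trait] = Variable.GaussianFromMeanAndVariance(0, 2).ForEach(trait);
    traitPrec[trait] = Variable.GammaFromMeanAndVariance(3, 3).ForEach(trait);

    // Scalar traits, as in the original array-based model, but with a
    // stochastic mean and precision instead of fixed priors.
    var RBPTraits = Variable.Array(Variable.Array<double>(trait), RBP).Named("RBPTraits");
    using (Variable.ForEach(RBP))
    using (Variable.ForEach(trait))
    {
        RBPTraits[RBP][trait] =
            Variable.GaussianFromMeanAndPrecision(traitMean[trait], traitPrec[trait]);
    }
    ```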

    Also, as mentioned in previous forum posts, you'll need to add a sequential attribute on the observation range (I forgot to include this in the example).
    observation.AddAttribute(new Sequential());
    engine.Compiler.UseSerialSchedules = true;

    Finally, let me warn you that adding a hierarchy will affect the way you make predictions. You can no longer feed in the learned posteriors over the traits as new priors, because of the hierarchy on the prior.

    Cheers,
    Yordan

    Wednesday, June 26, 2013 10:12 AM
  • Dear Yordan

    Thank you for your reply. As you suggested, I have implemented a version where there is a single prior for each user and item. This means that the prior matrices are now not over the traits themselves, but over the mean and the precision of each trait. The following code shows how I have defined the hierarchical layer:

    // Define ranges
                int numRBPs = RBPs;
                int numGenes = Genes;
                int numTraits = numTrait;
                Variable<int> numObservations = Variable.Observed(tmpGene.Length).Named("numObservations");
                int numLevels = 1;
    
                // Define ranges
                Range RBP = new Range(numRBPs).Named("RBP");
                Range gene = new Range(numGenes).Named("gene");
                Range trait = new Range(numTraits).Named("trait");
                Range observation = new Range(numObservations).Named("observation");
                Range level = new Range(numLevels).Named("level");
    
                // Define latent variables
                var RBPTraits = Variable.Array(Variable.Array<double>(trait), RBP).Named("RBPTraits");
                var geneTraits = Variable.Array(Variable.Array<double>(trait), gene).Named("geneTraits");
                var RBPBias = Variable.Array<double>(RBP).Named("RBPBias");
                var geneBias = Variable.Array<double>(gene).Named("geneBias");
                var RBPThresholds = Variable.Array<double>(RBP).Named("RBPThresholds");
    
                var RBPTraitsMean = Variable.Array(Variable.Array<double>(trait), RBP).Named("RBPTraitsMean");
                var RBPTraitsPrec = Variable.Array(Variable.Array<double>(trait), RBP).Named("RBPTraitsPrec");
                var geneTraitsMean = Variable.Array(Variable.Array<double>(trait), gene).Named("geneTraitsMean");
                var geneTraitsPrec = Variable.Array(Variable.Array<double>(trait), gene).Named("geneTraitsPrec");
    
                var RBPBiasMean = Variable.Array<double>(RBP).Named("RBPBiasMean");
                var RBPBiasPrec = Variable.Array<double>(RBP).Named("RBPBiasPrec");
                var geneBiasMean = Variable.Array<double>(gene).Named("geneBiasMean");
                var geneBiasPrec = Variable.Array<double>(gene).Named("geneBiasPrec");
    
    
                // Define priors
    
                var RBPTraitsPrior = Variable.Array(Variable.Array<Gaussian>(trait), RBP).Named("RBPTraitsPrior");
                var geneTraitsPrior = Variable.Array(Variable.Array<Gaussian>(trait), gene).Named("geneTraitsPrior");
                var RBPBiasPrior = Variable.Array<Gaussian>(RBP).Named("RBPBiasPrior");
                var geneBiasPrior = Variable.Array<Gaussian>(gene).Named("geneBiasPrior");
                var RBPThresholdsPrior = Variable.Array<Gaussian>(RBP).Named("RBPThresholdsPrior");
    
                var RBPTraitsPriorMean = Variable.Array(Variable.Array<Gaussian>(trait), RBP).Named("RBPTraitsPriorMean");
                var RBPTraitsPriorPrec = Variable.Array(Variable.Array<Gamma>(trait), RBP).Named("RBPTraitsPriorPrec");
    
                var geneTraitsPriorMean = Variable.Array(Variable.Array<Gaussian>(trait), gene).Named("geneTraitsPriorMean");
                var geneTraitsPriorPrec = Variable.Array(Variable.Array<Gamma>(trait), gene).Named("geneTraitsPriorPrec");
    
                var RBPbiasPriorMean = Variable.Array<Gaussian>(RBP).Named("RBPBiasPriorMean");
                var RBPbiasPriorPrec = Variable.Array<Gamma>(RBP).Named("RBPBiasPriorPrec");
    
                var geneBiasPriorMean = Variable.Array<Gaussian>(gene).Named("geneBiasPriorMean");
                var geneBiasPriorPrec = Variable.Array<Gamma>(gene).Named("geneBiasPriorPrec");
    
    
                // Define latent variables statistically
    
                RBPTraitsMean[RBP][trait] = Variable<double>.Random(RBPTraitsPriorMean[RBP][trait]);
                RBPTraitsPrec[RBP][trait] = Variable<double>.Random(RBPTraitsPriorPrec[RBP][trait]);
                geneTraitsMean[gene][trait] = Variable<double>.Random(geneTraitsPriorMean[gene][trait]);
                geneTraitsPrec[gene][trait] = Variable<double>.Random(geneTraitsPriorPrec[gene][trait]);
    
                RBPBiasMean[RBP] = Variable<double>.Random(RBPbiasPriorMean[RBP]);
                RBPBiasPrec[RBP] = Variable<double>.Random(RBPbiasPriorPrec[RBP]);
                geneBiasMean[gene] = Variable<double>.Random(geneBiasPriorMean[gene]);
                geneBiasPrec[gene] = Variable<double>.Random(geneBiasPriorPrec[gene]);
    
    
                /****************************************************************************/
                RBPTraits[RBP][trait] = Variable.GaussianFromMeanAndPrecision(RBPTraitsMean[RBP][trait],
                                                                           RBPTraitsPrec[RBP][trait]);
                geneTraits[gene][trait] = Variable.GaussianFromMeanAndPrecision(geneTraitsMean[gene][trait],
                                                                           geneTraitsPrec[gene][trait]);
                RBPBias[RBP] = Variable.GaussianFromMeanAndPrecision(RBPBiasMean[RBP],
                                                                           RBPBiasPrec[RBP]);
                geneBias[gene] = Variable.GaussianFromMeanAndPrecision(geneBiasMean[gene],
                                                                           geneBiasPrec[gene]);
                RBPThresholds[RBP] = Variable<double>.Random(RBPThresholdsPrior[RBP]);
                /****************************************************************************/
    
    
                // Initialise priors
    
                Gaussian traitPriorMean = Gaussian.FromMeanAndVariance(0, 3);
                Gamma traitPriorPrec = Gamma.FromShapeAndScale(8, 2);
    
                Gaussian biasPriorMean = Gaussian.FromMeanAndVariance(0, 3);
                Gamma biasPriorPrec = Gamma.FromShapeAndScale(8, 2);
    
                /********* Create two matrices of distributions: one for the means and one for the precisions for all **********
                ********** the RBPs and genes Traits, for all the RBPs and gene bias                                  **********/
    
                RBPTraitsPriorMean.ObservedValue = Util.ArrayInit(numRBPs, u => Util.ArrayInit(numTraits, t => traitPriorMean));
                RBPTraitsPriorPrec.ObservedValue = Util.ArrayInit(numRBPs, u => Util.ArrayInit(numTraits, t => traitPriorPrec));
                geneTraitsPriorMean.ObservedValue = Util.ArrayInit(numGenes, i => Util.ArrayInit(numTraits, t => traitPriorMean));
                geneTraitsPriorPrec.ObservedValue = Util.ArrayInit(numGenes, i => Util.ArrayInit(numTraits, t => traitPriorPrec));
    
                RBPbiasPriorMean.ObservedValue = Util.ArrayInit(numRBPs, u => biasPriorMean);
                RBPbiasPriorPrec.ObservedValue = Util.ArrayInit(numRBPs, u => biasPriorPrec);
    
                geneBiasPriorMean.ObservedValue = Util.ArrayInit(numGenes, i => biasPriorMean);
                geneBiasPriorPrec.ObservedValue = Util.ArrayInit(numGenes, i => biasPriorPrec);
    
                
                RBPThresholdsPrior.ObservedValue = Util.ArrayInit(numRBPs, u => Gaussian.FromMeanAndVariance(0, 1.0));
    
               
                InferenceEngine engine = new InferenceEngine();
                engine.NumberOfIterations = iteration;
    
    
                // Declare training data variables
                var RBPData = Variable.Array<int>(observation).Named("RBPData");
                var geneData = Variable.Array<int>(observation).Named("geneData");
                var ratingData = Variable.Array<bool>(observation).Named("ratingData");
    
                // Set model noises explicitly
                Variable<double> affinityNoiseVariance = Variable.Observed(noiseVar).Named("affinityNoiseVariance");
                Variable<double> thresholdsNoiseVariance = Variable.Observed(noiseVar).Named("thresholdsNoiseVariance");
    
    
                // Model
                using (Variable.ForEach(observation))
                {
                    VariableArray<double> products = Variable.Array<double>(trait);//.Named("products");
                    products[trait] = RBPTraits[RBPData[observation]][trait] * geneTraits[geneData[observation]][trait];
    
                    Variable<double> bias = (RBPBias[RBPData[observation]] + geneBias[geneData[observation]]);//.Named("bias");
                    Variable<double> affinity = (bias + Variable.Sum(products));//.Named("productSum")).Named("affinity");
                    Variable<double> noisyAffinity = Variable.GaussianFromMeanAndVariance(affinity, affinityNoiseVariance);//.Named("noisyAffinity");
    
                    Variable<double> noisyThresholds = Variable.GaussianFromMeanAndVariance(RBPThresholds[RBPData[observation]], thresholdsNoiseVariance);
                    ratingData[observation] = noisyAffinity > noisyThresholds;
                }
    
                // Observe training data
                GenerateData(numRBPs, numGenes, numTraits, numObservations.ObservedValue, numLevels,
                                         RBPData, geneData, ratingData, tmpRBP, tmpGene, tmpRating);
    
    
                // Allow EP to process the product factor as if running VMP
                // as in Stern, Herbrich, Graepel paper.
                engine.Compiler.GivePriorityTo(typeof(GaussianProductOp_SHG09));
                engine.Compiler.ShowWarnings = true;
                //engine.ShowFactorGraph = true;
    
    
             
                observation.AddAttribute(new Sequential());  // needed to get stable convergence 
                engine.Compiler.UseSerialSchedules = true; 
                
                
                // Run inference
    
                var RBPTraitsMeanPosterior = engine.Infer<Gaussian[][]>(RBPTraitsMean);
                var RBPTraitsPrecPosterior = engine.Infer<Gamma[][]>(RBPTraitsPrec);
                var geneTraitsMeanPosterior = engine.Infer<Gaussian[][]>(geneTraitsMean);
                var geneTraitsPrecPosterior = engine.Infer<Gamma[][]>(geneTraitsPrec);
                              
                var RBPbiasMeanPosterior = engine.Infer<Gaussian[]>(RBPBiasMean);
                var RBPbiasPrecPosterior = engine.Infer<Gamma[]>(RBPBiasPrec);
    
                var geneBiasMeanPosterior = engine.Infer<Gaussian[]>(geneBiasMean);
                var geneBiasPrecPosterior = engine.Infer<Gamma[]>(geneBiasPrec);
    
                var RBPThresholdsPosterior = engine.Infer<Gaussian[]>(RBPThresholds);
    
    
                // Feed in the inferred posteriors as the new priors
                RBPTraitsPriorMean.ObservedValue = RBPTraitsMeanPosterior;
                RBPTraitsPriorPrec.ObservedValue = RBPTraitsPrecPosterior;
                geneTraitsPriorMean.ObservedValue = geneTraitsMeanPosterior;
                geneTraitsPriorPrec.ObservedValue = geneTraitsPrecPosterior;
    
                RBPbiasPriorMean.ObservedValue = RBPbiasMeanPosterior;
                RBPbiasPriorPrec.ObservedValue = RBPbiasPrecPosterior;
    
                geneBiasPriorMean.ObservedValue = geneBiasMeanPosterior;
                geneBiasPriorPrec.ObservedValue = geneBiasPrecPosterior;
    
                RBPThresholdsPrior.ObservedValue = RBPThresholdsPosterior;

    As you can see, I have tried to 'elevate' the matrices of prior distributions, which are now exactly

    RBPTraitsPriorMean, RBPTraitsPriorPrec, geneTraitsPriorMean, geneTraitsPriorPrec, RBPbiasPriorMean, RBPbiasPriorPrec, geneBiasPriorMean, geneBiasPriorPrec, RBPThresholdsPrior.

    In this way I have tried to keep a structure that can be used for prediction. In fact, I infer the posteriors over all of these distributions so that the system can recompute the traits from the new distributions.

    However, several problems occur:

    1) Do you think this new system is correct for making predictions?

    2) Using this structure, independently of the constants of the distributions, I get the following error during the iteration phase of the EP algorithm (indeed, I don't even complete the first iteration):

    "result is none"

    This error refers to the line

                var RBPTraitsMeanPosterior = engine.Infer<Gaussian[][]>(RBPTraitsMean);

    which is the first line where I do inference. Do you have any idea what it means?

    3) I don't get the previous error if I define the traits distribution with a constant precision instead of the Gamma random variable, that is,

    Variable.GaussianFromMeanAndPrecision(RBPTraitsMean[RBP][trait], 2);

    instead of

    Variable.GaussianFromMeanAndPrecision(RBPTraitsMean[RBP][trait], RBPTraitsPrec[RBP][trait]);

    for the traits and for the biases. In this configuration I get another error, this time in the prediction phase. Instead of recompiling the model once with the new distributions and then making the prediction without any EP iterations, the system recompiles the model 4 times, runs the EP iterations, and then returns an error. These are the messages I see:

    Compiling model...done

    Iterating:

    .1

    Compiling model...done

    Compiling model...done

    Compiling model...done

    Compiling model...done

    Iterating:

    .1

    The error I get is:

    value.Length (225810) != this.Count (12478)

    at the line

    Bernoulli[] predictedRatings = engine.Infer<Bernoulli[]>(ratingData);

    which is exactly the prediction phase. Do you have any idea what this error means and what causes it?

    Thank you in advance

    Marco

    Wednesday, June 26, 2013 12:42 PM