Why do we need to generate the worker's label?

  • Question

  • In an attempt to understand the sample code from Microsoft (great job by the way!), I have come across this piece of code which I fail to understand. Actually, I don't understand why we need it.

    /// <summary>
    /// Initializes the ranges, the generative process and the inference engine of the BCC model.
    /// </summary>
    /// <param name="taskCount">The number of tasks.</param>
    /// <param name="labelCount">The number of labels.</param>
    public virtual void CreateModel(int taskCount, int labelCount)
    {
        Evidence = Variable<bool>.Random(this.EvidencePrior);
        var evidenceBlock = Variable.If(Evidence);
        DefineVariablesAndRanges(taskCount, labelCount);
        DefineGenerativeProcess();
        DefineInferenceEngine();
        evidenceBlock.CloseBlock();
    }
    
    

    Why do we need the DefineGenerativeProcess function? Why do we need to generate the worker's label? Aren't we already assigning it a value in the input file?

    Thanks!

    • Changed type cindyak Friday, March 13, 2015 6:09 AM
    Wednesday, March 11, 2015 9:38 AM

Answers

  • Hey there,

    In Infer.NET we usually work with a model: a set of assumptions about how the data is generated. It can be easier to think of it as a data sampler. To oversimplify for a moment: the model contains some latent parameters (which we learn) and some observed data (which, as you pointed out, we have at hand, for example in an input file). The model typically describes the forward process, going from the parameters to the data. Once the model is specified in Infer.NET, the compiler generates the backward process, going from the data to the parameters. Running that lets us learn the parameters from the data, which is typically what happens during training. In prediction, however, we want to run the forward process. We use the very same model, but with a different observation pattern: the parameters are now known (because we learned them in training) and the data that was observed in training is now unknown. We can ask the compiler to generate an algorithm for this observation pattern and thus make predictions for the worker's label.

    Note that I split the problem into training (backward process) and prediction (forward process) simply because this is a typical scenario. However, it's important to understand that the model parameters are learned jointly over the whole graph, where there really isn't any notion of direction. In the end, observed data is simply parameters with zero variance (or, equivalently, infinite precision). The process of learning the values of the parameters in this fashion is called inference. Given a model (specified as a factor graph) and an observation pattern, the Infer.NET compiler can generate an algorithm which runs inference for that model and pattern.
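
    To make the forward/backward idea concrete, here is a minimal sketch in plain C#. It deliberately does not use Infer.NET, and it is not the BCC model from the sample: it assumes a single worker whose accuracy has a Beta prior, and binary "was this label correct" observations. All names (WorkerModelSketch, SampleLabels, InferAccuracy, PredictCorrect) are hypothetical and exist only for this illustration. In Infer.NET the compiler derives the backward step for you; here it is written by hand using the standard Beta-Bernoulli conjugate update.

    ```csharp
    using System;
    using System.Linq;

    // Toy stand-in for a generative model (an assumption for illustration, not the BCC sample).
    public static class WorkerModelSketch
    {
        // Forward process: parameters -> data.
        // Given the worker's accuracy, sample whether each of `count` labels is correct.
        public static bool[] SampleLabels(double accuracy, int count, Random rng)
        {
            return Enumerable.Range(0, count)
                             .Select(_ => rng.NextDouble() < accuracy)
                             .ToArray();
        }

        // Backward process: data -> parameters.
        // A Beta(priorA, priorB) prior combined with observed outcomes gives a Beta posterior.
        public static (double A, double B) InferAccuracy(
            bool[] observed, double priorA = 1.0, double priorB = 1.0)
        {
            int correct = observed.Count(x => x);
            return (priorA + correct, priorB + (observed.Length - correct));
        }

        // Prediction: the learned posterior is now "known" and the next label is unknown.
        // The probability that the next label is correct is the posterior mean.
        public static double PredictCorrect((double A, double B) posterior)
        {
            return posterior.A / (posterior.A + posterior.B);
        }

        public static void Main()
        {
            var rng = new Random(0);

            // Pretend this came from the input file; here we sample it from the model itself.
            bool[] data = SampleLabels(0.8, 200, rng);

            var posterior = InferAccuracy(data);
            Console.WriteLine($"Posterior: Beta({posterior.A}, {posterior.B})");
            Console.WriteLine($"P(next label correct) = {PredictCorrect(posterior):F3}");
        }
    }
    ```

    The same model description is used three times: SampleLabels is the generative definition, InferAccuracy is what training amounts to once the labels are observed, and PredictCorrect is the forward pass once the parameter is fixed by its posterior. That is why the sample needs a generative process even though the labels are already in the input file.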

    -Y-

    • Marked as answer by cindyak Friday, March 13, 2015 6:09 AM
    Wednesday, March 11, 2015 4:55 PM

All replies

  • Thanks Yordan, that was really helpful.
    Friday, March 13, 2015 6:09 AM