Difference between training and prediction models - Infer.NET

  • Question

  • Hi

    In the CyclingTime example, I can't understand why the training and prediction models should be defined separately.

    Why can't we just use the inferred times as tomorrow's prediction? I mean, I don't understand why we need a separate "model" for prediction.

    Thank you very much

    Zahra

    Wednesday, June 4, 2014 3:50 PM

All replies

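  • I assume you are suggesting using the inferred averageTime in place of tomorrowsTime. If you run the code, you will see that these have different distributions: averageTime is the posterior over the long-run mean, whereas tomorrowsTime also includes the traffic noise of a single trip. So in general you cannot use one in place of the other.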
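
To make the difference concrete, here is a minimal Python sketch (not Infer.NET code). It assumes a simplified version of the cycling model in which the traffic-noise precision is known, whereas the tutorial model learns it, so both distributions have closed forms. The `Gaussian` class, the function names and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    variance: float


def posterior_average_time(prior, noise_precision, travel_times):
    """Conjugate update of the mean travel time, assuming known noise precision."""
    post_precision = 1.0 / prior.variance + noise_precision * len(travel_times)
    post_mean = (prior.mean / prior.variance
                 + noise_precision * sum(travel_times)) / post_precision
    return Gaussian(post_mean, 1.0 / post_precision)


def predict_tomorrows_time(average_time, noise_precision):
    """Predictive distribution of one new trip: mean uncertainty plus trip noise."""
    return Gaussian(average_time.mean,
                    average_time.variance + 1.0 / noise_precision)


# Illustrative values only.
prior = Gaussian(mean=15.0, variance=100.0)
travel_times = [13.0, 17.0, 16.0]
noise_precision = 0.5  # assumed known here; noise variance is 1 / 0.5 = 2

average_time = posterior_average_time(prior, noise_precision, travel_times)
tomorrows_time = predict_tomorrows_time(average_time, noise_precision)

print("averageTime:  ", average_time)
print("tomorrowsTime:", tomorrows_time)
```

In this sketch the two distributions share a mean, but tomorrowsTime is wider by exactly the noise variance, which is why one cannot stand in for the other.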
    Wednesday, June 4, 2014 4:08 PM
    Owner
  • Thank you Tom. I understand this. But what I don't understand is that, in the prediction model, we don't observe anything. So why are the posterior distributions different from the prior distributions?

    Without any observed data, what does it mean to call Infer?

    The last reply by Yordan in this post says the same thing. I think I just mixed things up.

    Thursday, June 5, 2014 2:15 PM
  • Take a look at RunCyclingTime2. As explained in the 101 paper: "RunCyclingTime2 calls SetModelData to specify the prediction model’s priors. It then calls InferTomorrowsTime to obtain the predicted distribution for tomorrow’s travel time. These priors are presumably the posteriors that were obtained from the most recent training session". So it's not like nothing is observed - the priors are. And in this model this is sufficient to make predictions.
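
As a rough illustration of what inference means when only the priors are set, the following Python sketch draws samples from a prediction model with no observed travel times. It is a Monte Carlo stand-in, not what Infer.NET does internally (Infer.NET uses message passing). The posterior parameters passed in as priors are hypothetical numbers, and `sample_tomorrows_time` is a name invented for this example.

```python
import random
import statistics


def sample_tomorrows_time(rng, avg_mean, avg_variance, noise_shape, noise_scale):
    """Draw one (average_time, tomorrows_time) pair from the prediction model."""
    average_time = rng.gauss(avg_mean, avg_variance ** 0.5)
    # traffic_noise is a precision, drawn from a Gamma(shape, scale) prior.
    traffic_noise = rng.gammavariate(noise_shape, noise_scale)
    tomorrows_time = rng.gauss(average_time, (1.0 / traffic_noise) ** 0.5)
    return average_time, tomorrows_time


rng = random.Random(0)

# Hypothetical posteriors from a training run, reused here as priors.
draws = [sample_tomorrows_time(rng, 15.3, 0.66, 5.0, 0.1) for _ in range(20000)]
average_samples = [a for a, _ in draws]
tomorrow_samples = [t for _, t in draws]

print("averageTime   mean/var:",
      statistics.fmean(average_samples), statistics.pvariance(average_samples))
print("tomorrowsTime mean/var:",
      statistics.fmean(tomorrow_samples), statistics.pvariance(tomorrow_samples))
```

Nothing is observed, yet the marginal of tomorrowsTime is still well defined: it is the prior uncertainty in the mean and in the noise pushed through the model, and it is wider than the distribution over averageTime alone.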

    I think the only reason to use separate models for training and prediction here is that the training model has an array of Gaussian variables derived from AverageTime and TrafficNoise, while the prediction model has only one such variable. It is perfectly fine to use the training model for prediction if you set the observed value of NumTrips to 1, leave TravelTimes unobserved, infer the Gaussian[] posterior over TravelTimes and return element [0] of this array.

    As an aside, although the probabilistic model as such can be shared between training and prediction here, note that the generated code will differ. Infer.NET will compile different algorithms for training and prediction because of the different observation patterns - in training TravelTimes is observed, while in prediction it's not. This significantly affects the way messages are passed around.

    Finally, note that the approach taken for online learning on this model cannot be applied to your time series genes model. Here, TravelTimes is different for each batch, while in your model w is shared across batches (or at least that's my understanding). Hence the need for ConstrainEqualRandom, as explained in my second post here.
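
For the cycling model, the online-learning pattern (the posterior from one batch becomes the prior for the next) can be sketched in Python as below. This is a simplification that assumes a known noise precision so the update is a closed-form conjugate step; the names and numbers are illustrative, and it does not cover the shared-variable case that needs ConstrainEqualRandom.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    variance: float


def update_average_time(prior, noise_precision, travel_times):
    """Conjugate update of the mean travel time, assuming known noise precision."""
    precision = 1.0 / prior.variance + noise_precision * len(travel_times)
    mean = (prior.mean / prior.variance
            + noise_precision * sum(travel_times)) / precision
    return Gaussian(mean, 1.0 / precision)


noise_precision = 0.5  # assumed known in this sketch
initial_prior = Gaussian(mean=15.0, variance=100.0)
batches = [[13.0, 17.0, 16.0], [12.0, 13.0, 12.0]]  # illustrative data

# Online learning: each batch has its own travel times, so only the
# belief about the shared parameter needs to carry over between batches.
belief = initial_prior
for batch in batches:
    belief = update_average_time(belief, noise_precision, batch)

# Reference: process all trips in a single batch.
all_trips = [t for batch in batches for t in batch]
all_at_once = update_average_time(initial_prior, noise_precision, all_trips)

print("sequential: ", belief)
print("all at once:", all_at_once)
```

Chaining works here because the only thing shared between batches is the parameter whose posterior is passed along; the per-batch travel times never need to be revisited.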

    • Marked as answer by RazinR Tuesday, June 10, 2014 2:03 PM
    Saturday, June 7, 2014 1:31 PM
  • Thanks a lot Yordan.

    Now I understand what my mistake was. Thanks a lot for the great reply on the other post as well.

    Tuesday, June 10, 2014 2:08 PM