training sets (Migrated from community.research.microsoft.com)

  • Question

  • anderson posted on 06-09-2010 1:21 PM

    How much data makes up a good training set?  How do you know you have enough data in a training set to predict accurately on a test set?

    Friday, June 3, 2011 5:43 PM


  • jwinn replied on 06-09-2010 1:32 PM

    The size of a 'good' training set will depend heavily on your model and on the nature of the data.  Bayesian methods, such as those implemented in Infer.NET, give you a posterior distribution over your model parameters, so you can tell that your data set is large enough when all of your parameter distributions are tight.  However, the accuracy of prediction on your test set will still depend on how good your model is!
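
    To illustrate what "tight" means here, the following is a minimal Python sketch, not Infer.NET code. It assumes a simple conjugate Beta-Bernoulli model as a stand-in for a real model, and the function names, the uniform Beta(1, 1) prior, and the rate of 0.3 are all hypothetical choices made for the example. It shows the posterior standard deviation of a single parameter shrinking as the training set grows.

```python
import math


def beta_posterior_std(successes, failures, prior_a=1.0, prior_b=1.0):
    """Standard deviation of the Beta posterior over a Bernoulli rate.

    Assumes a Beta(prior_a, prior_b) prior; Beta(1, 1) is uniform.
    """
    a = prior_a + successes
    b = prior_b + failures
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return math.sqrt(variance)


def posterior_widths(rate=0.3, sizes=(10, 100, 1000, 10000)):
    """Posterior std for training sets of increasing size.

    Uses idealised counts (exactly rate * n successes) rather than
    random samples, so the result is deterministic.
    """
    widths = {}
    for n in sizes:
        successes = round(rate * n)
        widths[n] = beta_posterior_std(successes, n - successes)
    return widths


if __name__ == "__main__":
    for n, std in posterior_widths().items():
        print(f"N={n:>6}  posterior std={std:.4f}")
```

    In this toy setting the width falls roughly in proportion to 1/sqrt(N), so one possible stopping rule is to collect data until every parameter's posterior is narrower than the precision you need.  A tight posterior only says the parameters are well determined under the assumed model; it does not say the model itself is a good one.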

    John W.

    Friday, June 3, 2011 5:43 PM