LDA postTheta values

  • Question

  • I am working on topic detection using Infer.NET's LDA code. The values in postTheta are the marginal distributions over topic mixtures. Why do they not sum to 1 per document? Should I be normalizing the values?
    Friday, May 10, 2013 10:21 PM

All replies

  • Each marginal distribution is a Dirichlet distribution, that is, a distribution over probability vectors. The values you are seeing are its parameters, the pseudo-counts, which are not probabilities and so need not sum to 1. You can ask the distribution for its mean (call its GetMean method) or draw a sample from it (call its Sample method). In either case you get a probability vector that sums to 1. The mean is simply the pseudo-counts divided by their total, so normalizing the values yourself gives the same vector as GetMean.
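
    To make the difference between pseudo-counts and probabilities concrete, here is a small sketch in plain Python. It is not Infer.NET code and does not use its API; it only reproduces the arithmetic behind a Dirichlet's mean and one common way of sampling from it (normalizing independent Gamma draws). The names dirichlet_mean, dirichlet_sample and post_theta_doc are hypothetical, and the pseudo-count values are made up for illustration.

    ```python
    import random


    def dirichlet_mean(pseudo_counts):
        """Mean of a Dirichlet: each pseudo-count divided by their total."""
        total = sum(pseudo_counts)
        return [a / total for a in pseudo_counts]


    def dirichlet_sample(pseudo_counts, rng=random):
        """Draw one probability vector by normalizing independent Gamma(a, 1) draws."""
        draws = [rng.gammavariate(a, 1.0) for a in pseudo_counts]
        total = sum(draws)
        return [g / total for g in draws]


    # Hypothetical pseudo-counts for one document over three topics
    # (illustrative values, not output from any real model).
    post_theta_doc = [12.0, 1.0, 3.0]

    mean = dirichlet_mean(post_theta_doc)
    sample = dirichlet_sample(post_theta_doc)

    print("pseudo-counts total:", sum(post_theta_doc))  # parameters, not probabilities
    print("mean:", mean, "total:", sum(mean))
    print("sample:", sample, "total:", sum(sample))
    ```

    The mean is the same for every run, while each sample differs; both are probability vectors over the topics. For per-document topic proportions the mean is usually the value you want.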

    Monday, May 13, 2013 8:23 AM
    Owner