Text Classification with BPM and Naive Bayes
Question

I would like to classify a data set of text movie reviews using the BPM and a Naive Bayes implementation.
I'm new to probabilistic programming and don't know how to start with pure text classification in Infer.NET. I've seen the Gender prediction tutorial, but I don't know which mapping is better suited to this kind of application.
About the data set: it is composed of 2000 .txt files, classified as good and bad movie reviews, 1000 of each.
My question is: what is the best way to pass the data to the learner in BPM?
Any help is appreciated.
Friday, June 9, 2017 3:52 PM
All replies

Hi George,
Did you try running the "Learner.exe" binary in the Bin directory of the Infer.NET folder? Did you look at the options and their descriptions on this web page:
http://infernet.azurewebsites.net/docs/Infer.NET%20Learners%20%20Bayes%20Point%20Machine%20classifiers%20%20Runners.aspx
I guess the most difficult part for you will be to create the data files in the required format.
Cheers,
Vlad
Monday, June 12, 2017 3:22 PM 
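As an illustration of the data-preparation step mentioned above, here is a minimal Python sketch that turns reviews into bag-of-words count features and writes one line per review. The `label index:count` layout, the function names, and the toy reviews are all assumptions made for illustration; the format the Learner actually expects should be checked against the runners documentation linked above.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents, min_count=1):
    """Map each word seen at least min_count times to an integer feature index."""
    counts = Counter(word for doc in documents for word in tokenize(doc))
    kept = sorted(word for word, n in counts.items() if n >= min_count)
    return {word: index for index, word in enumerate(kept)}

def to_sparse_line(label, text, vocabulary):
    """Format one review as 'label index:count ...' (assumed layout, not verified)."""
    counts = Counter(vocabulary[w] for w in tokenize(text) if w in vocabulary)
    features = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{label} {features}".rstrip()

# Toy stand-ins for the 2000 review files; real code would read each .txt file.
reviews = [("good", "A great, great movie"), ("bad", "A dull movie")]
vocabulary = build_vocabulary(text for _, text in reviews)
lines = [to_sparse_line(label, text, vocabulary) for label, text in reviews]

for line in lines:
    print(line)
# good 0:1 2:2 3:1
# bad 0:1 1:1 3:1
```

Raising `min_count` drops rare words, which keeps the feature space manageable for a corpus of a few thousand reviews.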
Hi Vlad,
I'll take another look at that. It may indeed do the job.
Thanks for the reply.
What about a Naive Bayes implementation? Any help with that?
Thursday, June 15, 2017 3:09 PM 
The Bayes Point Machine (BPM) and Naive Bayes are very different. The former is a completely Bayesian approach to decision making (read: classification); the latter is *not* a Bayesian approach! I suggest you read the classic on the Bayesian approach to classification (e.g. Hart "Pattern Recognition", first chapter).
Saturday, June 17, 2017 4:48 AM

In the context of classifying documents, Naïve Bayes generally refers to a model where each word in a document of a particular class is assumed to be independently drawn from a common discrete distribution. We don't have a specific example of this model, but it is equivalent to Latent Dirichlet Allocation with one topic.
Edited by Tom Minka (Microsoft employee, Owner), Monday, June 19, 2017 12:56 PM
Monday, June 19, 2017 12:22 PM
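The word-level model described above can be sketched in plain Python. This is not Infer.NET code and does not perform full Bayesian inference: it uses point estimates, where each class's word probabilities are the posterior mean under a symmetric Dirichlet prior with parameter `alpha` (add-alpha smoothing). The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def train(documents, alpha=1.0):
    """Estimate class log-priors and smoothed per-class word log-probabilities.

    documents is a list of (label, list_of_words) pairs.
    """
    class_counts = Counter(label for label, _ in documents)
    word_counts = {label: Counter() for label in class_counts}
    for label, words in documents:
        word_counts[label].update(words)
    vocabulary = {w for _, words in documents for w in words}

    log_prior = {c: math.log(n / len(documents)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c, counts in word_counts.items():
        denominator = sum(counts.values()) + alpha * len(vocabulary)
        log_likelihood[c] = {
            w: math.log((counts[w] + alpha) / denominator) for w in vocabulary
        }
    return log_prior, log_likelihood

def classify(words, log_prior, log_likelihood):
    """Return the class with the highest log-posterior; unseen words are skipped."""
    scores = {
        c: log_prior[c]
        + sum(log_likelihood[c][w] for w in words if w in log_likelihood[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Toy training set standing in for the tokenized reviews.
training = [
    ("good", ["great", "fun", "great"]),
    ("bad", ["dull", "boring"]),
    ("good", ["fun", "charming"]),
    ("bad", ["boring", "slow"]),
]
log_prior, log_likelihood = train(training)
print(classify(["great", "fun"], log_prior, log_likelihood))    # good
print(classify(["boring", "slow"], log_prior, log_likelihood))  # bad
```

Working in log space avoids numerical underflow when a review contains hundreds of words, since the per-word probabilities are summed rather than multiplied.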
The Bayes Point Machine (BPM) and Naive Bayes are very different. The former is a completely Bayesian approach to decision making (read: classification); the latter is *not* a Bayesian approach! I suggest you read the classic on the Bayesian approach to classification (e.g. Hart "Pattern Recognition", first chapter).
Thanks. I wanted to know whether there is an example in Infer.NET, or whether one is included in the DLLs that ship with the Infer.NET package. Nonetheless, I will take a look at the book you mentioned, since I don't recall reading it, and it may be of great help.
Maybe the second question needed some further explanation: I'm writing my course conclusion monograph (not sure that is the right term in English, sorry) and want to compare these two, BPM and Naive Bayes, in Infer.NET as part of the work. That's the reason for the question.
Thank you once again for helping out.
Wednesday, June 21, 2017 1:18 AM 
In the context of classifying documents, Naïve Bayes generally refers to a model where each word in a document of a particular class is assumed to be independently drawn from a common discrete distribution. We don't have a specific example of this model, but it is equivalent to Latent Dirichlet Allocation with one topic.
That's what I intend to do. I will look carefully at the link.
Thanks, that's surely helpful.
Wednesday, June 21, 2017 1:27 AM