Bayes Point Machine Discrete & Discrete Features

  • Question

  • Hi,

    According to the specification of the Bayes Point Machine classifier, the predictive distribution is a linear discriminant function, and the prior distribution of the weights is Gamma-Gamma. Does this imply that when features are read from a standard text file, they are checked for being 'double' or 'int' and then mapped as continuous or discrete?

    If that is not the case, a quick workaround would be to break the categorical features down into dummy variables!

    How do you recommend including categorical or ordinal features in a Bayes Point Machine classifier model?

    Many thanks,

    Nic


    • Edited by Nic_M_M Tuesday, February 10, 2015 2:54 PM
    Tuesday, February 10, 2015 2:54 PM

Answers

  • Hi Nic,

    The BPM doesn't do any automatic feature engineering for you; that is left to the developer. Feature values are assumed to be numeric, so the BPM won't distinguish between integers (1, 2, 3) and doubles (1.0, 2.0, 3.0). If your feature values are strings, like "Monday", "Tuesday", etc., then the BPM will throw an exception. In that case you might want to consider converting the strings to indicator arrays. This is sometimes a good idea even with integers, as long as it doesn't lead to feature explosion.

    If you're working with the BPM Learner, you have two options. First, you can provide a self-defined data mapping, as explained here. This requires the feature type to be an Infer.NET Vector, which is a dense or sparse numeric array. Second, you can provide a text file, the format of which is defined here.

    -Y-

    Tuesday, February 10, 2015 5:43 PM
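
The indicator-array conversion suggested in the answer can be sketched roughly as follows. This is a minimal, library-free illustration in Python rather than Infer.NET code. The helper names `build_vocabulary` and `to_indicator` are hypothetical, as is the sample data, and the resulting numeric rows would still have to be passed to the BPM through a data mapping or a text file as described above.

```python
# Hypothetical helpers for illustration; not part of Infer.NET.

def build_vocabulary(values):
    """Map each distinct category to a column index, in first-seen order."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
    return vocab


def to_indicator(value, vocab):
    """Return a dense 0/1 row with a single 1.0 at the category's index.

    Raises KeyError for a category that is not in the vocabulary.
    """
    row = [0.0] * len(vocab)
    row[vocab[value]] = 1.0
    return row


# Made-up sample of a string-valued feature.
days = ["Monday", "Tuesday", "Monday", "Friday"]

vocab = build_vocabulary(days)
encoded = [to_indicator(d, vocab) for d in days]

for day, row in zip(days, encoded):
    print(day, row)
```

Each category adds one column, so a feature with many distinct values widens the feature vector accordingly. That is the feature explosion the answer warns about, and a case where a sparse representation is likely the better fit.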