Using multiple string templates with observed text strings


  • My dataset is an array of short strings such as:

    my name is <NAME>john</NAME>
    find nearest <BUSINESS>starbucks</BUSINESS> around here

    The total number of unique strings is about 10,000. This constitutes my training data. In the application, the XML tags will be removed.

    Extracting the templates from the strings (e.g. "my name is {0}", "find nearest {0} around here") yields a number of unique templates in the low hundreds.
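    To make the extraction step concrete, here is a minimal sketch in plain Python (not Infer.NET). It assumes tags are upper-case, properly closed, and not nested; the helper name `extract_template` and the third sample string are hypothetical and only there for illustration.

```python
import re
from collections import Counter

# Matches <TAG>value</TAG>; assumes upper-case tag names and no nesting.
TAG_RE = re.compile(r"<([A-Z_]+)>(.*?)</\1>")

def extract_template(tagged):
    """Return (template, slot_values) for one tagged string."""
    values = []

    def repl(match):
        # Record the slot value and substitute a numbered placeholder.
        values.append(match.group(2))
        return "{%d}" % (len(values) - 1)

    return TAG_RE.sub(repl, tagged), values

data = [
    "my name is <NAME>john</NAME>",
    "find nearest <BUSINESS>starbucks</BUSINESS> around here",
    "my name is <NAME>mary</NAME>",  # hypothetical extra example
]

# How often each template occurs in the training data.
template_counts = Counter(extract_template(s)[0] for s in data)
```

    The resulting counts could later serve as a prior over templates.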

    I went through the tutorial with inference on strings. I would like to extend that to multiple templates. I want to obtain a probability distribution over all templates given an observed text string.

    The tutorial uses only one template and shows how to train the model with several observed strings. How can I build a similar model in which multiple templates give rise to the observed strings? I am thinking of using the switch logic from the Gaussian mixture model example, but I am uncertain where to start.
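    To illustrate the kind of output I am after, here is a rough sketch in plain Python (again, not Infer.NET). It treats the template as a latent choice with a prior proportional to training counts and applies Bayes' rule. The likelihood is a deliberately crude assumption: 1 if the untagged string matches the template with non-empty slots, 0 otherwise. The function names and the counts are hypothetical.

```python
import re

def template_to_regex(template):
    """Turn 'my name is {0}' into a regex with one capture group per slot."""
    literal_parts = re.split(r"\{\d+\}", template)
    pattern = "(.+?)".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + pattern + "$")

def template_posterior(text, template_counts):
    """P(template | text), with prior ~ counts and a 0/1 match likelihood."""
    scores = {}
    for template, count in template_counts.items():
        if template_to_regex(template).match(text):
            scores[template] = count  # prior weight times likelihood of 1
    total = sum(scores.values())
    if total == 0:
        return {}  # no template explains the string
    return {t: s / total for t, s in scores.items()}

# Hypothetical training counts for three templates.
template_counts = {
    "my name is {0}": 3,
    "my {0} is {1}": 1,
    "find nearest {0} around here": 2,
}

posterior = template_posterior("my name is john", template_counts)
```

    Here two templates can explain "my name is john", so the posterior splits between them according to the prior. A real model would replace the 0/1 likelihood with a distribution over slot contents, which is the part I would like to learn from the data.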

    Thank you!

    • Edited by usptact Tuesday, May 01, 2018 5:44 PM
    Tuesday, May 01, 2018 5:36 PM