# Attempt to learn a small Bayesian HMM (Migrated from community.research.microsoft.com) • ### Question

• jvangael posted on 01-14-2009 12:37 PM

Hi,

I was trying to implement a small Bayesian HMM using Infer.NET but the inference doesn't seem to make a lot of sense to me. Here is the code in F#

let K = 4
let Krange = (new Models.Range(K)).Named("K")
let alpha = [| for i in 1..K -> 1.0 |]
let Y = [|0;1;0;1;0;1;2;3;2;3;2;3;0;1;0;1;0;1;2;0;2;0;2;0;2;0;2;0|]
let T = Y.Length

let v = Models.Variable<Vector>.Dirichlet(Krange, alpha).Named("v")
let pi = Array.init K (fun i -> Models.Variable<Vector>.Dirichlet(Krange, alpha).Named(sprintf "pi%d" i))
let theta = Array.init K (fun i -> Models.Variable<Vector>.Dirichlet(Krange, alpha).Named(sprintf "theta%d" i))

let s : Models.Variable<int> [ = Array.zero_create T
let y : Models.Variable<int> [ = Array.zero_create T

s. <- Models.Variable.Discrete(v)
for t in 1 .. T-1 do
do use c = Models.Variable.Case(s.[t-1], 0) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 1) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 2) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 3) in s.[t] <- Models.Variable.Discrete(pi.)

for t in 0 .. T-1 do
do use d = Models.Variable.Case(s.[t], 0) in y.[t] <- Models.Variable.Discrete(theta.)
do use d = Models.Variable.Case(s.[t], 1) in y.[t] <- Models.Variable.Discrete(theta.)
do use d = Models.Variable.Case(s.[t], 2) in y.[t] <- Models.Variable.Discrete(theta.)
do use d = Models.Variable.Case(s.[t], 3) in y.[t] <- Models.Variable.Discrete(theta.)
y.[t].ObservedValue <- Y.[t]

let ie = new InferenceEngine()
for t in 0 .. T-1 do
printfn "Posterior on s%d = %A" t (ie.Infer<int>(s.[t]))
printfn "Posterior on theta0 = %A" (ie.Infer<Vector>(theta.))
printfn "Posterior on theta1 = %A" (ie.Infer<Vector>(theta.))
printfn "Posterior on theta2 = %A" (ie.Infer<Vector>(theta.))
printfn "Posterior on theta3 = %A" (ie.Infer<Vector>(theta.))

Using EP (which I believe should be very similar to belief propagation or forward-backward in this case) I get the following results for the output distributions:

Posterior on theta0 = Dirichlet(1 1 1 1)
Posterior on theta1 = Dirichlet(1 1 1 1)
Posterior on theta2 = Dirichlet(1 1 1 1)
Posterior on theta3 = Dirichlet(11.39 6.761 8.698 3.745)

Any suggestions? Cheers,

Jurgen

PS How can I color code the code?

Friday, June 3, 2011 5:39 PM

• Todd replied on 04-16-2010 10:23 AM

Yes, your change (with the angle brackets around "int" replaced with square brackets) solves the problem.  Thanks!

Todd

Friday, June 3, 2011 5:41 PM

### All replies

• minka replied on 01-14-2009 3:49 PM

Hi Jurgen,

The problem is these lines:

do use c = Models.Variable.Case(s.[t-1], 0) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 1) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 2) in s.[t] <- Models.Variable.Discrete(pi.)
do use c = Models.Variable.Case(s.[t-1], 3) in s.[t] <- Models.Variable.Discrete(pi.)

Because you are using the assignment operator in F# (<-), each line is overwriting the variable definition of the previous line.  So s.[t] is eventually only defined by the last assignment.  Similarly for y.[t].  That is why only pi. and theta. receive any data.

What you want is for s.[t] to have a conditional definition.  This is done by first creating a variable object using Variable.New and then calling SetTo for each conditional definition.  So the model definition becomes:

s. <- Models.Variable.Discrete(v)

for t in 1 .. T-1 do
s.[t] <- (Models.Variable.New<int>()).Named(sprintf "s%d" t)
for k in 0 .. K-1 do
use caseBlock = Models.Variable.Case(s.[t-1], k) in s.[t].SetTo(Models.Variable.Discrete(pi.[k]))

for t in 0 .. T-1 do
y.[t] <- (Models.Variable.New<int>()).Named(sprintf "y%d" t)
for k in 0 .. K-1 do
use caseBlock = Models.Variable.Case(s.[t], k) in y.[t].SetTo(Models.Variable.Discrete(theta.[k]))
y.[t].ObservedValue <- Y.[t]

Another thing to remember is that for a symmetric model like this one, you need to break symmetry in order to get a local solution from message-passing (this is illustrated in the mixture of Gaussians tutorial).  For the HMM, a reasonable way of breaking symmetry is by initializing theta to point masses as follows:

for

k in 0 .. K-1 do

let oneOfK = Array.init K ( fun i -> if i=k then 1.0 else 0.01)
theta.[k] <- theta.[k].InitialiseTo(Distributions.Dirichlet.PointMass(oneOfK))

This initialization worked for me to get VariationalMessagePassing to converge to a local maximum.

By the way, EP is not the same as forward-backward on this model.  This is because EP is simultaneously estimating the unknown parameters.  This has a significant impact on the results.

Friday, June 3, 2011 5:40 PM
• minka replied on 01-16-2009 7:26 AM

By the way, this model will be more compact and run more efficiently if you use a single Variable.Switch block instead of multiple Variable.Case blocks when defining s.[t] and y.[t].

Friday, June 3, 2011 5:40 PM
• Todd replied on 04-15-2010 12:34 AM

Hi,

I'm resurrecting this old thread since I'm having trouble figuring out how to implement a hidden Markov model in with Infer.NET in IronPython.  First, though, I wanted to thank the developers for building such an amazing package.  (It's also feels somewhat surreal that the package works so well on a Macintosh with Mono. )

My first question is: how do I represent the hidden states (s in the F# example above) in IronPython?  I gather that the states must be "unrolled", but I'm not clear what kind of array structure to use.  Should I use an IronPython list, a .NET System.Array, an Infer.NET VariableArray, or something else entirely?  If the answer is not an IronPython list or a .NET System.Array, how do I access just the individual elements of the array in order to specify the (first-order) Markov structure?  This last question sounds a little weird since normally it's simple to access the elements of an array, but the elements of Infer.NET Variable.Array's seem (please correct me if I'm wrong!) to be accessible only by Range objects.

Once I have the states, can I select the emission model for each state by following the example in the mixtures of Gaussians example?

Thanks,

Todd

Friday, June 3, 2011 5:40 PM
• minka replied on 04-15-2010 3:07 AM

You can use a .NET System.Array, IronPython list, or any data structure you want, as long as it is not a VariableArray.  The F# example above is using a System.Array, although I guess this might not be clear if you are unfamiliar with F#.  You can select the emission model using Variable.Switch as in the mixture of Gaussians example.

Friday, June 3, 2011 5:40 PM
• Todd replied on 04-15-2010 12:01 PM

Thanks for the prompt reply!

Considering the mixture of Gaussians example for simplicity (and ignoring efficiency issues for the moment), if the latent variable "z" is changed from a Variable.Array to a System.Array, do the "data", "means", and "precs" variables also have to be converted to System.Arrays in order to be indexed by ints rather than Ranges in the switch block?

Todd

Friday, June 3, 2011 5:40 PM
• Todd replied on 04-15-2010 8:26 PM

The answer to my question in the previous comment  about whether the other variables have to converted to System.Arrays is "no."

I think I almost have a working code, but I'm hitting one last (I hope!) problem.  In the code below, "emission" is an array containing the observations; I'm assuming that the emission is Gaussian in each state with different means and precisions.  "pInit" is the probability for the state for the first observation.  "pTrans" is the transition matrix.

nData = len(emission)

nDataRange = Range(nData)

nStates = 2

nStatesRange = Range(nStates)

pInit = Variable.Dirichlet(nStatesRange, Vector(nStates, 1.))

pTrans = Variable.Array[Vector](nStatesRange)

pTrans[nStatesRange] = Variable.Dirichlet(nStatesRange, Vector(nStates, 1.)).ForEach(nStatesRange)

states = [Variable.Discrete(pInit) for i in range(nData)]

states.InitialiseTo(Discrete.PointMass(Rand.Int(nStates), nStates))

for i in range(1, nData):

states .InitialiseTo(Discrete.PointMass(Rand.Int(nStates), nStates))

with (Variable.Switch(states[i-1])):

states .SetTo(Variable.Discrete(pTrans[states[i-1]]))

mu = Variable.Array[float](nStatesRange)

mu[nStatesRange] = Variable.GaussianFromMeanAndPrecision(0., 1.e-6).ForEach(nStatesRange)

tau = Variable.Array[float](nStatesRange)

tau[nStatesRange] = Variable.GammaFromShapeAndRate(0.001, 0.001).ForEach(nStatesRange)

y = Variable.Array[float](nDataRange)

y.ObservedValue = emission

for i in range(nData):

with (Variable.Switch(states )):

y[Variable.Constant(i)].SetTo(Variable.GaussianFromMeanAndPrecision(mu[states ], tau[states ]))

ie = InferenceEngine(VariationalMessagePassing())

ie.InferAll(pTrans, mu, tau)

When I try to execute the code, I get the following error message associated with the Switch block for the states:

SystemError: Cannot define a variable more than once in the same condition block context: ([vint2==-1])

The second Switch block for the emission variable "y" executes without complaint, so I'm guessing that the problem has something to do with having "states" on both sides of

states .SetTo(Variable.Discrete(pTrans[states[i-1]]))

But since the two instances of "states" have different indices, I'm not sure what the cause of the problem is.  Does anyone have any ideas?

Thanks for your help,

Todd

Friday, June 3, 2011 5:40 PM
• minka replied on 04-16-2010 6:06 AM

data would need to be converted to a System.Array to match z.  However, means and precs have different constraints.  If you are using Variable.Switch, these must remain VariableArrays because you are indexing them by a random variable.  Alternatively, you could use Variable.Case as in the original post and then means and precs could be System.Arrays because then you would be indexing them by an integer

Friday, June 3, 2011 5:40 PM
• Todd replied on 04-16-2010 9:01 AM

Hi Tom,

Thanks for the clarification.

I had actually figured out a lot of this on my own and written a post answering my own question and asking a further question.  However, when I submitted the post, the forum software displayed a message saying that this is a moderated forum and that the post would be reviewed by a moderator.  Do you have any idea where the post may have gone?  If it has vaporized, I'll write it again.

Todd

Friday, June 3, 2011 5:41 PM
• Todd replied on 04-16-2010 9:55 AM Todd:

Hi Tom,

Thanks for the clarification.

I had actually figured out a lot of this on my own and written a post answering my own question and asking a further question.  However, when I submitted the post, the forum software displayed a message saying that this is a moderated forum and that the post would be reviewed by a moderator.  Do you have any idea where the post may have gone?  If it has vaporized, I'll write it again.

Todd

The missing post I was referring to has now appeared above.  At least in my Safari browser, the forum software has transformed by index "i" into a jaunty light bulb.

Todd

Friday, June 3, 2011 5:41 PM
• minka replied on 04-16-2010 10:07 AM

I think the issue is this line:

states = [Variable.Discrete(pInit) for i in range(nData)]

This will create nData variables and give them definitions.  When you try to change their definition in the switch block, you get the error.  Try changing this to:

states = [Variable.New<int>() for i in range(nData)]
states = Variable.Discrete(pInit)

Friday, June 3, 2011 5:41 PM
• Todd replied on 04-16-2010 10:23 AM

Yes, your change (with the angle brackets around "int" replaced with square brackets) solves the problem.  Thanks!

Todd

Friday, June 3, 2011 5:41 PM