Extension of Hidden Markov Models – Formulas for a Second-Order HMM

This post extends the hidden Markov model so that the current state depends on the past two states (a second-order HMM). This can be useful in simulating games or in less sophisticated pricing models (a more accurate treatment would apply an exponential decay over all past terms).

Notations

I have kept the notation close to that used in Chapter 6 of “Speech and Language Processing”, Second Edition, by Jurafsky and Martin.

\(\lambda\) is the common term for all HMM parameters. All probabilities below are conditioned on this term, i.e., \(P(\,\cdot\,|\lambda)\)

\(T\) denotes the total number of time steps

\(N\) denotes the total number of states

\(o_{t}\) denotes the observation at time step \(t\)

\(q_{t}\) denotes the hidden variable (state) at time step \(t\); state \(0\) is the non-emitting start state \(q_{0}\)

\(b_{j}(o_{t})\) denotes the emission probability \(P(o_{t}|q_{t}=j)\)

Formulas

Definition of \(\alpha_{t}(i,j)\)

\(\alpha_{t}(i,j)\) denotes the joint probability of all observations up to time \(t\), together with the previous and current hidden states.

\[
\alpha_{t}(i,j)=P(o_{1},o_{2},\dots,o_{t},q_{t-1}=i,q_{t}=j|\lambda)
\]

where \(\lambda\) denotes the given HMM parameters.

Base cases of \(\alpha_{t}(i,j)\)

Consider the following notation:

\[
a_{ij}=P(q_{t}=j|q_{t-1}=i)
\]

\[
a_{ijk}=P(q_{t+1}=k|q_{t}=j,q_{t-1}=i)
\]
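For concreteness, these parameters can be stored as arrays. A minimal NumPy sketch, used by the later code snippets as well (the names `a0`, `a2`, `a3`, `b` and the Dirichlet initialization are my own illustrative choices, not from the text):

```python
import numpy as np

N = 3  # number of hidden states (excluding the start and final states)
M = 5  # number of distinct observation symbols

rng = np.random.default_rng(0)

# a0[j]       = P(q_1 = j | q_0)                        -- entry probabilities
# a2[i, j]    = P(q_t = j | q_{t-1} = i)                -- first-order a_{ij}
# a3[i, j, k] = P(q_{t+1} = k | q_t = j, q_{t-1} = i)   -- second-order a_{ijk}
# b[j, o]     = P(o_t = o | q_t = j)                    -- emission probabilities
a0 = rng.dirichlet(np.ones(N))
a2 = rng.dirichlet(np.ones(N), size=N)       # each row sums to 1
a3 = rng.dirichlet(np.ones(N), size=(N, N))  # each a3[i, j, :] sums to 1
b = rng.dirichlet(np.ones(M), size=N)        # each row sums to 1
```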

With this notation, the base cases are:

\[
\alpha_{1}(0,j)=a_{0j}\,b_{j}(o_{1})
\]

\[
\alpha_{2}(i,j)=\alpha_{1}(0,i)\,a_{ij}\,b_{j}(o_{2})
\]

It should be noted that if the second state (\(q_{2}\)) is also allowed to be an entry point, then an additional base case is needed:

\[
\alpha_{2}(0,j)=\sum_{i=1}^{N}\alpha_{1}(0,i)\,a_{ij}\,b_{j}(o_{2})
\]

Inductive step

\[
\alpha_{t}(j,k)=\sum_{i=1}^{N}\alpha_{t-1}(i,j)\times a_{ijk}\times b_{k}(o_{t})\text{ where, }3\leq t\leq T
\]

Termination step

\[
P(O|\lambda)=\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{T}(i,j)\times a_{ijF}
\]

where \(a_{ijF}=P(q_{F}|q_{T}=j,q_{T-1}=i)\) denotes the transition into the non-emitting final state \(q_{F}\).
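Putting the base cases, the inductive step, and the termination together: a sketch of the full forward pass under the array layout above, with an extra array `aF[i, j]` \(=a_{ijF}\) for the transition into the final state (all function and variable names are illustrative; the optional entry-point base case is omitted):

```python
import numpy as np

def forward_2nd_order(obs, a0, a2, a3, aF, b):
    """Likelihood P(O | lambda) for a second-order HMM; assumes T >= 2.

    obs is the 0-indexed observation sequence o_1..o_T;
    a0, a2, a3, b are shaped as in the previous sketch, and
    aF[i, j] = a_{ijF} = P(q_F | q_T = j, q_{T-1} = i).
    """
    T = len(obs)

    # Base case t = 1: alpha_1(0, j) = a_{0j} b_j(o_1)
    alpha1 = a0 * b[:, obs[0]]                         # shape (N,)

    # Base case t = 2: alpha_2(i, j) = alpha_1(0, i) a_{ij} b_j(o_2)
    alpha = alpha1[:, None] * a2 * b[None, :, obs[1]]  # shape (N, N)

    # Inductive step for 3 <= t <= T:
    # alpha_t(j, k) = sum_i alpha_{t-1}(i, j) a_{ijk} b_k(o_t)
    for t in range(2, T):
        alpha = np.einsum('ij,ijk->jk', alpha, a3) * b[None, :, obs[t]]

    # Termination: P(O | lambda) = sum_{i,j} alpha_T(i, j) a_{ijF}
    return float(np.sum(alpha * aF))
```

For the transition probabilities to remain normalized, `a3[i, j, :]` together with `aF[i, j]` should sum to one for each pair \((i,j)\); in practice one would also work in log space to avoid underflow on long sequences.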

Estimated Expected Transitions

\(\xi_{t}(i,j,k)\)

\(\xi_{t}(i,j,k)\) is the probability of being in state \(k\) at \(t+1\), state \(j\) at \(t\), and state \(i\) at \(t-1\), given all the observations. It is expressed with the backward probability \(\beta_{t}(i,j)=P(o_{t+1},\dots,o_{T}|q_{t-1}=i,q_{t}=j,\lambda)\), the second-order analogue of the textbook's \(\beta_{t}(j)\).

\[
\xi_{t}(i,j,k)=P(q_{t+1}=k,q_{t}=j,q_{t-1}=i|o_{1},\dots,o_{T})=\frac{\alpha_{t}(i,j)\times\beta_{t+1}(j,k)\times a_{ijk}\,b_{k}(o_{t+1})}{\alpha(q_{F})}
\]

\(\gamma_{t}(i)\)

\(\gamma_{t}(i)\) is the probability of being in state \(i\) at time \(t\), given all the observations

\[
\gamma_{t}(i)=P(q_{t}=i|o_{1},\dots,o_{T})=\frac{\sum_{j=1}^{N}\alpha_{t}(j,i)\,\beta_{t}(j,i)}{\alpha(q_{F})}
\]

where \(\alpha(q_{F})=P(o_{1},\dots,o_{T}|\lambda)\) is the probability of the full observation sequence, as computed in the termination step.

*Please note that in the \(\gamma_{t}\) formula, the \(i\) and \(j\) are interchanged in the \(\alpha\) and \(\beta\) terms, because I wanted to keep the query as \(q_{t}=i\), as stated in the problem, whereas the textbook convention is that \(i\) is the antecedent of \(j\).
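A sketch computing \(\beta\), \(\xi\), and \(\gamma\) under the same assumed array layout. The backward recursion \(\beta_{t}(i,j)=\sum_{k}a_{ijk}\,b_{k}(o_{t+1})\,\beta_{t+1}(j,k)\) with \(\beta_{T}(i,j)=a_{ijF}\) is my own filling-in, consistent with the definitions above (detailed proofs are deferred to the link below):

```python
import numpy as np

def xi_gamma(obs, a0, a2, a3, aF, b):
    """xi_t(i, j, k) for t = 2..T-1 and gamma_t(i) for t = 2..T.

    Index s = t - 1 is 0-based; slot s = 0 (t = 1) is left zero because
    alpha_1 involves the start state and is handled separately.
    """
    T, N = len(obs), len(a0)

    # Forward lattice: alpha[s] = alpha_{s+1}(i, j)
    alpha = np.zeros((T, N, N))
    alpha1 = a0 * b[:, obs[0]]                             # alpha_1(0, j)
    alpha[1] = alpha1[:, None] * a2 * b[None, :, obs[1]]   # alpha_2(i, j)
    for s in range(2, T):
        alpha[s] = np.einsum('ij,ijk->jk', alpha[s - 1], a3) * b[None, :, obs[s]]

    # Backward lattice: beta[s] = beta_{s+1}(i, j); beta_T(i, j) = a_{ijF}
    beta = np.zeros((T, N, N))
    beta[T - 1] = aF
    for s in range(T - 2, 0, -1):
        # beta_t(i, j) = sum_k a_{ijk} b_k(o_{t+1}) beta_{t+1}(j, k)
        beta[s] = np.einsum('ijk,k,jk->ij', a3, b[:, obs[s + 1]], beta[s + 1])

    total = np.sum(alpha[T - 1] * aF)                      # alpha(q_F) = P(O | lambda)

    # xi_t(i, j, k) = alpha_t(i, j) a_{ijk} b_k(o_{t+1}) beta_{t+1}(j, k) / alpha(q_F)
    xi = np.zeros((T - 1, N, N, N))
    for s in range(1, T - 1):
        xi[s] = (alpha[s][:, :, None] * a3
                 * b[None, None, :, obs[s + 1]]
                 * beta[s + 1][None, :, :]) / total

    # gamma_t(i) = sum_j alpha_t(j, i) beta_t(j, i) / alpha(q_F)
    gamma = np.einsum('tji,tji->ti', alpha, beta) / total
    return xi, gamma
```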

Detailed proofs are provided here

Written on February 10, 2015