Distinct Hidden Markov Models can be trained individually on data obtained from each class of a classification problem. If we create a set of those models and specialize each model to recognize each of the separate classes, we can then use the HMMs ability to calculate the likelihood that a given sequence belongs to the model to determine the most likely class of an unknown sequence.
| This post is a continuation for the previous article, Hidden Markov Models in C#. |
The code presented here is also part of the Accord.NET Framework. The Accord.NET Framework is a framework for developing machine learning, computer vision, computer audition, statistics and math applications. It is based on the already excellent AForge.NET Framework. Please see the starting guide for mode details. The latest version of the framework includes the latest version of this code plus many other statistics and machine learning tools.
Introduction
Hidden Markov Models
An introduction about Hidden Markov Models (HMM) was presented in a previous article, entitled Hidden Markov Models in C#. HMMs are models are stochastic methods to model temporal and sequence data.
They allow, among other things, (1) to infer the most likely sequence of states that produced a given output sequence, to (2) infer which will be the most likely next state (and thus predicting the next output) and (3) calculate the probability that a given sequence of outputs originated from the system (allowing the use of hidden Markov models for sequence classification). In the context of this article, we will be more interested in results from ability (3).
Hidden Markov Models can be seem as finite state machines where for each sequence unit observation there is a state transition and, for each state, there is a output symbol emission. The picture on the left summarizes the overall definition of a HMM given in the previous article.
Hidden Markov Model -Based Sequence Classifier
Hidden Markov Models can be trained individually on data obtained from each class of a classification problem. If we create a set of models and specialize each model to recognize each of the separated classes, then we will be able to explore the HMMs ability to calculate the likelihood that a given sequence belongs to itself to perform classification of discrete sequences.
After all models have been trained, the probability of the unknown-class sequence can be computed for each model. As each model specialized in a given class, the one which outputs the highest probability can be used to determine the most likely class for the new sequence, as shown in the picture below.
Schematic representation of the sequence classification procedure.
Source Code
The implementation is very straightforward, as shown in the picture below. The Tag property of the HiddenMarkovModel class can be used to store additional information about which class the model will be representing.
Class diagram for the HMM-based sequence classifier.
Computing the most likely class for a given sequence
Computing the most likely class for a given sequence is as simple as iterating through all models in the set and selecting the one which has produced the highest likelihood value.
/// <summary>
/// Computes the most likely class for a given sequence.
/// </summary>
public int Compute(int[] sequence, out double likelihood)
{
int label = 0;
likelihood = 0.0;
// For every model in the set,
for (int i = 0; i < models.Length; i++)
{
// Evaluate the probability for the given sequence
double p = models[i].Evaluate(sequence);
// And select the one which produces the highest probability
if (p > likelihood)
{
label = i;
likelihood = p;
}
}
// Returns the index of the most likely model.
return label;
}
Training each model to recognize each of the output classes
To train each model to recognize each of the output classes we have to split the training data into subsets whose outputs corresponds to each of the classes from the classification problem.
/// <summary>
/// Trains each model to recognize each of the output labels.
/// </summary>
/// <returns>The sum likelihood for all models after training.</returns>
public double Learn(int[][] inputs, int[] outputs, int iterations, double limit)
{
double sum = 0;
// For each model,
for (int i = 0; i < models.Length; i++)
{
// Select the input/output set corresponding
// to the model's specialization class
int[] inx = outputs.Find(y => y == i);
int[][] observations = inputs.Submatrix(inx);
// Train the current model in the input/output subset
sum += models[i].Learn(observations, iterations, limit);
}
// Returns the sum likelihood for all models.
return sum;
}
Using the Code
Code usage is very simple. Once one has the input data available in the form of a sequence array and output data in form of a integer array, create a new HiddenMarkovClassifier object with the appropriate parameters.
Then, just call Learn() passing the input and output data and the convergence limit for the learning algorithm. After that, call Compute() passing a new integer sequence to be classified by the system.
// Declare some testing data
int[][] inputs = new int[][]
{
new int[] { 0,1,1,0 }, // Class 0
new int[] { 0,0,1,0 }, // Class 0
new int[] { 0,1,1,1,0 }, // Class 0
new int[] { 0,1,0 }, // Class 0
new int[] { 1,0,0,1 }, // Class 1
new int[] { 1,1,0,1 }, // Class 1
new int[] { 1,0,0,0,1 }, // Class 1
new int[] { 1,0,1 }, // Class 1
};
int[] outputs = new int[]
{
0,0,0,0, // First four sequences are of class 0
1,1,1,1, // Last four sequences are of class 1
};
// We are trying to predict two different classes
int classes = 2;
// Each sequence may have up to two symbols (0 or 1)
int symbols = 2;
// Nested models will have two states each
int[] states = new int[] { 2, 2 };
// Creates a new Hidden Markov Model Classifier with the given parameters
HiddenMarkovClassifier hmc = new HiddenMarkovClassifier(classes, symbols, states);
// Will train until convergence of the average likelihood
double likelihood = hmc.Learn(inputs, outputs, 0.001);
// Will assert the models have learned the sequences correctly.
for (int i = 0; i < inputs.Length; i++)
{
int expected = outputs[i];
int actual = hmc.Compute(inputs[i], out likelihood);
Assert.AreEqual(expected, actual);
}
Sample Application
The accompanying sample application can read Excel spreadsheets containing integer sequences and labels for those sequences. An example spreadsheet contained inside the Resources folder in the compressed archive file demonstrates the expected format for this data.
Left: Input sequences data. Right: Hidden Markov Models contained in the Sequence Classifier with initial configurations.
Left: Trained models. Right: Performance test for the Hidden Markov Model Sequence Classifier.
See also
- Hidden Markov Models in C#
- Logistic Regression Analysis in C#
- Principal Component Analysis (PCA)
- Kernel Principal Component Analysis (KPCA)
- Linear Discriminant Analysis (LDA)
- Non-Linear Discriminant Analysis with Kernels (KDA)
References
Kadous, Mohammed W., “A general architecture for supervised classification of multivariate time series”, Technical Report UNSW-CSE-TR-9806. School of Computer Science & Engineering, University of New South Wales, 1998.
No comments:
Post a Comment