Simple pattern recognition
Roberto Lopez.
[email protected]
Artelnics - Making intelligent use of data

In this section, a simple pattern recognition problem with just one input variable and one output variable is solved by means of a multilayer perceptron. The data and the source code for this problem can be found within OpenNN.

Contents:
  1. Problem statement.
  2. Data set.
  3. Neural network.
  4. Performance functional.
  5. Training strategy.
  6. Testing analysis.

1. Problem statement

In this example we have a data set with 101 instances, one input variable (x) and one target variable (y). The objective is to design a neural network that can predict y values for given x values. A subset of the data set is listed next.

0.0  0.454
0.1  0.723
0.2  0.908
0.3  0.854
0.4  0.587
0.5  0.545
0.6  0.306
0.7  0.129
0.8  0.185
0.9  0.263
1.0  0.503

The following figure shows the data set for this example.

2. Data set

The first step is to set up the data set. In order to do that we construct a DataSet object and load the data from a file.

DataSet data_set;

data_set.load_data("simplepatternrecognition.dat");

Then we set the uses and names of the inputs and targets in the Variables object, and arrange that information into matrices that will later be used to configure the neural network inputs and outputs.

Variables* variables_pointer = data_set.get_variables_pointer();

variables_pointer->set_use(0, Variables::Input);
variables_pointer->set_use(1, Variables::Target);

variables_pointer->set_name(0, "x");
variables_pointer->set_name(1, "y");

Matrix<std::string> inputs_information = variables_pointer->arrange_inputs_information();
Matrix<std::string> targets_information = variables_pointer->arrange_targets_information();

We also need to set the uses of the instances (training, generalization or testing). For simplicity, here we will use all the instances for training; a sketch of an alternative random split is shown after the code below.

Instances* instances_pointer = data_set.get_instances_pointer();

instances_pointer->set_training();
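If we later wanted separate training, generalization and testing subsets, the instances could instead be split at random. The call below is only a minimal sketch; the method name split_random_indices and its three ratio arguments are assumptions about the Instances interface, not something used in this example.

// Hypothetical alternative to set_training(): assign 60% of the instances to
// training, 20% to generalization and 20% to testing at random.
// (split_random_indices is an assumed method name.)
instances_pointer->split_random_indices(0.6, 0.2, 0.2);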

Finally we scale all the input and target variables so that they fall in the range [-1,1].

Vector< Statistics<double> > inputs_statistics = data_set.scale_inputs_minimum_maximum();
Vector< Statistics<double> > targets_statistics = data_set.scale_targets_minimum_maximum();
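For reference, the minimum-maximum method maps each value linearly from its [minimum, maximum] range to [-1, 1]. The standalone helper below (plain C++, not an OpenNN call; the function name is only illustrative) shows that transformation.

// Illustrative minimum-maximum scaling: maps x from [minimum, maximum] to [-1, 1].
double scale_minimum_maximum(const double x, const double minimum, const double maximum)
{
    return 2.0*(x - minimum)/(maximum - minimum) - 1.0;
}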

3. Neural network

The second step is to choose a network architecture to represent the pattern recognition function. Here a multilayer perceptron with a sigmoid hidden layer and a linear output layer is used. The multilayer perceptron must have one input, since there is one input variable, and one output neuron, since there is one target variable. The size of the hidden layer is set to 3, as in the construction code below, so this neural network can be denoted as 1:3:1. It defines a family V of parameterized functions y(x) of dimension s = 10, which is the number of neural parameters (weights and biases) in the multilayer perceptron.

The following figure is a graphical representation of that network architecture. The neural parameters are initialized at random with a normal distribution of mean 0 and standard deviation 1.

// Neural network

NeuralNetwork neural_network(1, 3, 1);

Inputs* inputs_pointer = neural_network.get_inputs_pointer();
inputs_pointer->set_information(inputs_information);

Outputs* outputs_pointer = neural_network.get_outputs_pointer();
outputs_pointer->set_information(targets_information);

neural_network.construct_scaling_layer();
ScalingLayer* scaling_layer_pointer = neural_network.get_scaling_layer_pointer();
scaling_layer_pointer->set_statistics(inputs_statistics);
scaling_layer_pointer->set_scaling_method(ScalingLayer::NoScaling);

neural_network.construct_unscaling_layer();
UnscalingLayer* unscaling_layer_pointer = neural_network.get_unscaling_layer_pointer();
unscaling_layer_pointer->set_statistics(targets_statistics);
unscaling_layer_pointer->set_unscaling_method(UnscalingLayer::NoUnscaling);
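To make the random initialization and the parameter count mentioned above explicit, one could use calls such as the following. Both method names, randomize_parameters_normal and count_parameters_number, are assumptions about the NeuralNetwork interface rather than calls taken from this example.

// Assumed calls: draw the weights and biases from a normal distribution and
// query the number of parameters (10 for the 1:3:1 architecture used here).
neural_network.randomize_parameters_normal();

const size_t parameters_number = neural_network.count_parameters_number();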

4. Performance functional

The third step is to assign the neural network a performance functional. This is to be the normalized squared error. The variational statement of the pattern recognition problem considered here is then to find a function y*(x) ∈ V for which the functional

E[y(x)] = [ Σ_{q=1}^{Q} (y(x^(q)) - t^(q))^2 ] / [ Σ_{q=1}^{Q} (t^(q) - t_mean)^2 ],

defined on V, takes on a minimum value, where t_mean is the mean of the target values. Note that Q is here the number of training instances. Evaluating the objective functional only requires an explicit expression for the function represented by the multilayer perceptron. Evaluating the objective gradient vector, on the other hand, is done with the back-propagation algorithm derived in Section 5, which gives the greatest accuracy and numerical efficiency.

PerformanceFunctional performance_functional(&neural_network, &data_set);
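To make the objective concrete, the following standalone function (plain C++, independent of the OpenNN API) computes the normalized squared error defined above for a set of outputs and targets.

#include <cmath>
#include <cstddef>
#include <vector>

// Normalized squared error: the sum of squared errors over the instances,
// divided by the sum of squared deviations of the targets from their mean.
double normalized_squared_error(const std::vector<double>& outputs,
                                const std::vector<double>& targets)
{
    const std::size_t Q = targets.size();   // number of training instances

    double targets_mean = 0.0;

    for(std::size_t q = 0; q < Q; q++)
    {
        targets_mean += targets[q];
    }

    targets_mean /= Q;

    double sum_squared_error = 0.0;
    double normalization_coefficient = 0.0;

    for(std::size_t q = 0; q < Q; q++)
    {
        sum_squared_error += std::pow(outputs[q] - targets[q], 2);
        normalization_coefficient += std::pow(targets[q] - targets_mean, 2);
    }

    return sum_squared_error/normalization_coefficient;
}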

5. Training strategy

The fourth step is to choose a training strategy to minimize the performance functional. We use the quasi-Newton method described in Section 6 for training. In this example, as the code below shows, the quasi-Newton method stops when the performance increase between two successive epochs falls below 10^-3.

TrainingStrategy training_strategy(&performance_functional);

QuasiNewtonMethod* quasi_Newton_method_pointer = training_strategy.get_quasi_Newton_method_pointer();

quasi_Newton_method_pointer->set_minimum_performance_increase(1.0e-3);

TrainingStrategy::Results training_strategy_results = training_strategy.perform_training();

The presence of noise in the training data set causes the objective function to have local minima. This means that, when solving pattern recognition problems, we should always repeat the learning process from several different starting positions. During the training process the objective function decreases until the stopping criterion is satisfied. The training results reported for this problem include the final parameter vector norm, the final normalized squared error, the final gradient norm, the number of epochs, and the training time on a PC.
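A simple way to follow that advice is to train from several random starting points and keep the parameters with the best performance. The sketch below assumes the method names randomize_parameters_normal, calculate_performance, arrange_parameters and set_parameters; treat them as assumptions about the OpenNN interface, not as calls taken from this example.

// Multi-start training sketch (the method names below are assumed).
double best_performance = 1.0e99;
Vector<double> best_parameters;

for(size_t trial = 0; trial < 5; trial++)
{
    neural_network.randomize_parameters_normal();   // new random starting point

    training_strategy.perform_training();           // train from that point

    const double performance = performance_functional.calculate_performance();

    if(performance < best_performance)
    {
        best_performance = performance;
        best_parameters = neural_network.arrange_parameters();
    }
}

neural_network.set_parameters(best_parameters);     // keep the best solution found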

6. Testing analysis

The last step is to test the generalization performance of the trained neural network. Here we compare the values provided by the neural network with the values actually observed in the data set. The TestingAnalysis class is used for that purpose.

TestingAnalysis testing_analysis(&neural_network, &data_set);
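Continuing from the objects constructed above, we can also compute the output for one of the listed x values and compare it with the corresponding target. The calculate_outputs call below, taking a Vector<double>, is an assumption about the NeuralNetwork interface rather than code taken from this example.

// Compare the prediction for x = 0.5 with the observed target (0.545 in the data listed above).
// calculate_outputs is an assumed method name; printing requires <iostream>.
Vector<double> inputs(1, 0.5);

const Vector<double> outputs = neural_network.calculate_outputs(inputs);

std::cout << "y(0.5) predicted: " << outputs[0] << ", observed: 0.545" << std::endl;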

OpenNN Copyright © 2014 Roberto Lopez (Artelnics)