In this tutorial we formulate the learning problem for neural networks and describe some learning tasks that they can solve.
The objective is the most important term in the performance functional expression. It defines the task that the neural network is required to accomplish.
For data modeling applications, such as function regression or pattern recognition, the sum squared error is the reference objective functional. It measures the difference between the outputs from a neural network and the targets in a data set. Related objective functionals are the normalized squared error and the Minkowski error.
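As an illustration, the sum squared error simply accumulates the squared differences between the network outputs and the data set targets. The following standalone sketch shows the formula only; it is not the OpenNN implementation.

#include <cstddef>
#include <vector>

// Sum squared error between network outputs and data set targets.
// Illustrative sketch of the formula, not the OpenNN class.
double sum_squared_error(const std::vector<double>& outputs,
                         const std::vector<double>& targets)
{
    double sum = 0.0;

    for(std::size_t i = 0; i < outputs.size(); i++)
    {
        const double difference = outputs[i] - targets[i];

        sum += difference*difference;
    }

    return sum;
}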
Applications in which the neural network learns from a mathematical model require other objective functionals. For instance, we can talk about minimum final time or desired trajectory optimal control problems. Those two performance terms are called the independent parameters error and the solutions error in OpenNN, respectively.
A problem is called well-posed if its solution satisfies the conditions of existence, uniqueness and stability. A solution is said to be stable when small changes in the independent variable lead to small changes in the dependent variable. Otherwise the problem is said to be ill-posed.
An approach to ill-posed problems is to control the effective complexity of the neural network. This can be achieved by including a regularization term in the performance functional.
One of the simplest forms of regularization term consists of the norm of the neural parameters vector. Adding that term to the objective functional causes the neural network to have smaller weights and biases, which forces its response to be smoother.
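For example, with the parameters norm as the regularization term and a tunable regularization weight, a common form of the resulting performance functional is:

performance functional = objective term + regularization weight × norm of the neural parameters vector

Larger values of the regularization weight penalize large weights and biases more strongly, at the cost of a worse fit to the objective.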
Regularization can be used in problems which learn from a data set. Function regression or pattern recognition problems with noisy data sets are common applications. It is also useful in optimal control problems which aim to save control action.
A variational problem for a neural network can be specified by a set of constraints, which are equalities or inequalities that the solution must satisfy. Such constraints are expressed as functionals.
Here the aim is to find a solution which satisfies all the constraints and makes the objective functional an extremum.
Constraints are required in many optimal control or optimal shape design problems. For instance, we can talk about length, area or volume constraints.
The performance measure is a functional of the neural network which can take the following forms:
performance functional = Functional[neural network]
performance functional = Functional[neural network, data set]
performance functional = Functional[neural network, mathematical model]
performance functional = Functional[neural network, mathematical model, data set]
In order to perform a particular task, a neural network must be associated with a performance functional, which depends on the variational problem at hand. The learning problem is thus formulated in terms of the minimization of the performance functional.
The performance functional defines the task that the neural network is required to accomplish and provides a measure of the quality of the representation that the neural network is required to learn. In this way, the choice of a suitable performance functional depends on the particular application.
The learning problem can then be stated as finding a neural network for which the performance functional takes on a minimum or a maximum value. This is a variational problem.
In the context of neural networks, the variational problem can be treated as a function optimization problem. The variational approach looks at the performance as a functional of the function represented by the neural network. The optimization approach looks at the performance as a function of the parameters of the neural network.
A performance function can then be visualized as a hypersurface with the neural network parameters as coordinates. The performance function evaluates the performance of the neural network as a function of its parameters. Further calculations yield the partial derivatives of the performance with respect to the parameters: the first partial derivatives are arranged in the gradient vector, and the second partial derivatives are arranged in the Hessian matrix.
When the desired output of the neural network for a given input is known, the gradient and Hessian can usually be found analytically using back-propagation. In other circumstances, exact evaluation of those quantities is not possible and numerical differentiation must be used.
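As a rough illustration of the numerical alternative, central differences approximate each component of the gradient from two performance evaluations. The sketch below is generic and is not the OpenNN numerical differentiation code.

#include <cstddef>
#include <functional>
#include <vector>

// Central-difference approximation of the partial derivative of the
// performance with respect to parameter i. Generic sketch only.
double numerical_partial_derivative(
    const std::function<double(const std::vector<double>&)>& performance,
    std::vector<double> parameters,
    const std::size_t i,
    const double h = 1.0e-6)
{
    parameters[i] += h;
    const double forward = performance(parameters);   // f(x + h)

    parameters[i] -= 2.0*h;
    const double backward = performance(parameters);  // f(x - h)

    return (forward - backward)/(2.0*h);
}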
performance functional = objective term + regularization term + constraints term
OpenNN implements the PerformanceFunctional concrete class. This class manages different objective, regularization and constraints terms in order to construct a performance functional suitable for the problem at hand.
All those members are declared as private, and they can only be accessed through their corresponding get and set methods.
As has been said, the choice of the performance functional depends on the particular application. A performance functional built with the default constructor is not associated with a neural network, and it is not measured on a data set or a mathematical model:
PerformanceFunctional pf;
The default objective functional in the performance functional above is the normalized squared error, which is widely used in function regression, pattern recognition and time series prediction.
That performance functional does not contain any regularization or constraints terms by default, and it does not include numerical differentiation utilities.
The following statements construct a performance functional associated with a neural network and measured on a data set.
NeuralNetwork nn(1, 1);

DataSet ds(1, 1, 1);
ds.initialize_data(0.0);

PerformanceFunctional pf(&nn, &ds);
As before, the default objective functional is the normalized squared error.
The calculate_performance method calculates the performance of the associated neural network. It is called as follows,
double performance = pf.calculate_performance();
Note that the evaluation of the performance functional is the sum of the objective, regularization and constraints terms.
The calculate_gradient method calculates the partial derivatives of the performance with respect to the neural network parameters.
PerformanceFunctional pf;

Vector<double> gradient = pf.calculate_gradient();
As before, the gradient of the performance functional is the sum of the objective, regularization and constraints gradients. In most cases an analytical expression for the gradient is available; the normalized squared error is one example. Other applications require numerical differentiation; the outputs integrals performance term is one example.
Similarly, the Hessian matrix can be computed using the calculate_Hessian method,
Matrix<double> Hessian = pf.calculate_Hessian();
The Hessian of the performance functional is likewise the sum of the objective, regularization and constraints matrices of second derivatives.
If the user wants an objective functional other than the default, they can write:
pf.construct_objective_term("MEAN_SQUARED_ERROR");
The above sets the objective functional to be the mean squared error.
The performance functional is not regularized by default. To change that, the following can be used:
pf.construct_regularization_term("NEURAL_PARAMETERS_NORM");
The above sets the regularization method to be the neural parameters norm, and it also sets a default weight for that regularization term.
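Putting these pieces together, a minimal end-to-end sketch, using only the constructors and methods shown above, could look like this:

NeuralNetwork nn(1, 1);

DataSet ds(1, 1, 1);
ds.initialize_data(0.0);

PerformanceFunctional pf(&nn, &ds);

pf.construct_objective_term("MEAN_SQUARED_ERROR");           // objective term
pf.construct_regularization_term("NEURAL_PARAMETERS_NORM");  // regularization term

double performance = pf.calculate_performance();   // objective + regularization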
Finally, the performance functional does not include a constraints term by default. The use of constraints can be involved, so the interested reader is referred to the examples included in OpenNN.