The class of neural network implemented in OpenNN is based on the multilayer perceptron. That model is extended here to contain scaling, unscaling, bounding, probabilistic and conditions layers. A set of independent parameters associated with the neural network is also included for convenience.
Layers of perceptrons can be composed to form a multilayer perceptron. Most neural networks, even biological ones, exhibit a layered structure. Here layers and forward propagation are the basis for determining the architecture of a multilayer perceptron. This neural network represents an explicit function which can be used for a variety of purposes.
The architecture of a multilayer perceptron refers to the number of neurons, their arrangement and their connectivity. Any architecture can be symbolized as a directed and labelled graph, where nodes represent neurons and edges represent connections among them. An edge label represents the parameter of the neuron into which the connection flows. Thus, a neural network typically consists of a set of sensorial nodes which constitute the input layer, one or more hidden layers of neurons, and a set of neurons which constitute the output layer.
There are two main categories of network architectures: acyclic or feed-forward networks and cyclic or recurrent networks. A feed-forward network represents a function of its current input; a recurrent neural network, by contrast, feeds outputs back into its own inputs. As noted above, the characteristic neuron model of the multilayer perceptron is the perceptron, and the multilayer perceptron has a feed-forward network architecture.
Hence, neurons in a feed-forward neural network are grouped into a sequence of layers, so that neurons in any layer are connected only to neurons in the next layer. The input layer consists of external inputs and is not a layer of neurons; the hidden layers contain neurons; and the output layer is composed of output neurons. The following figure shows the network architecture of a multilayer perceptron. Communication proceeds layer by layer from the input layer via the hidden layers up to the output layer. The states of the output neurons represent the result of the computation.
In this way, in a feed-forward neural network, the output of each neuron is a function of the inputs. Thus, given an input to such a neural network, the activations of all neurons in the output layer can be computed in a single deterministic forward pass.
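To make the forward pass concrete, the following standalone sketch (plain C++, not the OpenNN API) computes the outputs of a single layer: each neuron forms a weighted combination of the layer inputs plus a bias and applies an activation function. A multilayer perceptron chains such calls, feeding each layer's outputs to the next.

#include <vector>
#include <cmath>

// Forward pass through one layer of perceptrons:
// outputs[j] = tanh(biases[j] + sum_i weights[j][i]*inputs[i]).
std::vector<double> layer_forward(const std::vector<double>& inputs,
                                  const std::vector<std::vector<double>>& weights,
                                  const std::vector<double>& biases)
{
    std::vector<double> outputs(biases.size());

    for(std::size_t j = 0; j < biases.size(); j++)
    {
        double combination = biases[j];

        for(std::size_t i = 0; i < inputs.size(); i++)
        {
            combination += weights[j][i]*inputs[i];
        }

        outputs[j] = std::tanh(combination); // hyperbolic tangent activation
    }

    return outputs;
}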
In practice it is always convenient to scale the inputs so that all of them are of order zero. In this way, if all the neural parameters are of order zero, the outputs will also be of order zero. Conversely, the scaled outputs must be unscaled in order to recover the original units.
In the context of neural networks, the scaling function can be thought of as an additional layer connected to the input layer of the multilayer perceptron. The number of scaling neurons is the number of inputs, and the connectivity of that layer is not total, but one-to-one. The following figure illustrates a scaling layer. The scaling layer contains some basic statistics on the inputs, namely the mean, standard deviation, minimum and maximum values. Two scaling methods widely used in practice are the minimum-maximum and the mean-standard deviation methods, sketched below.
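As an illustration, here is a minimal standalone sketch of the two scaling methods applied to a single input (plain C++, not the OpenNN API); the statistics are assumed to have been computed beforehand, and the minimum-maximum method is assumed to map the input range to [-1, 1].

// Minimum-maximum scaling: maps [minimum, maximum] to [-1, 1].
double scale_minimum_maximum(double x, double minimum, double maximum)
{
    return 2.0*(x - minimum)/(maximum - minimum) - 1.0;
}

// Mean-standard deviation scaling: zero mean and unit standard deviation.
double scale_mean_standard_deviation(double x, double mean, double standard_deviation)
{
    return (x - mean)/standard_deviation;
}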
The unscaling layer contains some basic statistics on the outputs, namely the mean, standard deviation, minimum and maximum values. Two unscaling methods widely used in practice are the minimum-maximum and the mean-standard deviation methods.
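These unscaling methods are simply the inverses of the corresponding scaling methods; a sketch under the same assumptions as above:

// Inverse of minimum-maximum scaling: maps [-1, 1] back to [minimum, maximum].
double unscale_minimum_maximum(double y, double minimum, double maximum)
{
    return 0.5*(y + 1.0)*(maximum - minimum) + minimum;
}

// Inverse of mean-standard deviation scaling.
double unscale_mean_standard_deviation(double y, double mean, double standard_deviation)
{
    return y*standard_deviation + mean;
}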
Lower and upper bounds are an essential issue for those problems in which some variables are restricted to fall within an interval. Such problems could be intractable if bounds were not applied.
An easy way to treat lower and upper bounds is to post-process the outputs of the neural network with a bounding function. That function can also be interpreted as an additional layer connected to the outputs. The following figure represents a bounding layer.
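The bounding function itself is just a clamp; a minimal sketch in plain C++ (not the OpenNN API):

#include <algorithm>

// Bounding function: clamps an output to the interval [lower_bound, upper_bound].
double bound(double y, double lower_bound, double upper_bound)
{
    return std::min(std::max(y, lower_bound), upper_bound);
}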
A probabilistic function takes the outputs and produces new outputs whose elements can be interpreted as probabilities. In this way, the probabilistic outputs always fall in the range [0,1], and their sum is always 1. This form of post-processing is often used in pattern recognition problems.
The probabilistic function can be interpreted as an additional layer connected to the output layer of the network architecture. The next figure shows a probabilistic layer. Note that the probabilistic layer has total connectivity, and that it does not contain any parameters. Two well-known probabilistic methods are the competitive and the softmax methods, sketched below.
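In plain C++ (not the OpenNN API), the softmax method exponentiates and normalizes the outputs, while the competitive method assigns probability 1 to the largest output and 0 to the rest:

#include <vector>
#include <cmath>
#include <algorithm>

// Softmax method: probabilities proportional to exp(output), summing to 1.
std::vector<double> softmax(const std::vector<double>& outputs)
{
    std::vector<double> probabilities(outputs.size());
    double sum = 0.0;

    for(std::size_t i = 0; i < outputs.size(); i++)
    {
        probabilities[i] = std::exp(outputs[i]);
        sum += probabilities[i];
    }

    for(std::size_t i = 0; i < outputs.size(); i++)
    {
        probabilities[i] /= sum;
    }

    return probabilities;
}

// Competitive method: 1 for the largest output, 0 elsewhere.
std::vector<double> competitive(const std::vector<double>& outputs)
{
    std::vector<double> probabilities(outputs.size(), 0.0);

    const std::size_t winner = std::distance(outputs.begin(),
        std::max_element(outputs.begin(), outputs.end()));

    probabilities[winner] = 1.0;

    return probabilities;
}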
A neural network defines a function which is of the following form:
outputs = function(inputs).
The most important element of an OpenNN neural network is the multilayer perceptron. That composition of layers of perceptrons is a very good function approximator.
Many practical applications require, however, extensions to the multilayer perceptron. OpenNN provides a neural network with some of the most standard extensions: the scaling, unscaling, bounding, probabilistic and conditions layers.
For instance, a function regression problem might require a multilayer perceptron with scaling and unscaling layers. On the other hand, an optimal control problem may need a multilayer perceptron with a conditions layer.
Finally, some problems might require adjustable parameters other than those belonging to the multilayer perceptron. Such parameters are called independent parameters.
Some basic information related to the input and output variables of a neural network includes the names, descriptions and units of those variables. That information helps to avoid errors such as interchanging the roles of the variables, misunderstanding the significance of a variable or using the wrong system of units.
The concepts studied in the previous section are characterized in classes as follows:
The next task is then to establish which classes are abstract and to derive the necessary concrete classes to be added to the system.
The neural network class in OpenNN will be intensively used by any application. Therefore, for performance reasons, all the composing classes have been designed to be concrete.
Let us then examine the classes we have so far. An attribute is a named value or relationship that exists for all or some instances of a class. An operation is a procedure associated with a class.
In UML class diagrams, classes are depicted as boxes with three sections: the top one indicates the name of the class, the one in the middle lists the attributes of the class, and the bottom one lists the operations.
As noted, OpenNN implements quite a general neural network in the class NeuralNetwork. It contains a multilayer perceptron with an arbitrary number of layers of perceptrons, together with additional layers for scaling the inputs and for unscaling, bounding, probabilizing or applying conditions to the outputs. This neural network can deal with a wide range of problems. Finally, this class includes independent parameters, which can be useful for some problems.
The NeuralNetwork class is one of the most important in OpenNN, having many different members, constructors and methods. All those members are declared as private, and can only be accessed through their corresponding get and set methods.
There are several constructors for the NeuralNetwork class, with different arguments.
The default activation function is the hyperbolic tangent for the hidden layers and the linear function for the output layer. No default information, statistics, scaling, boundary conditions or bounds are set.
The easiest way of creating a neural network object is by means of the default constructor, which creates an empty neural network.
NeuralNetwork nn;
To construct a neural network having a multilayer perceptron with, for example, 3 inputs and 2 output neurons, we use the one-layer constructor,
NeuralNetwork nn(3, 2);
All the parameters in the multilayer perceptron object that we have constructed so far are initialized with random values chosen from a normal distribution with mean 0 and standard deviation 1. By default, this one-layer perceptron will have a linear activation function.
To construct a neural network containing a multilayer perceptron object with, for example, 1 input, a single hidden layer of 3 neurons and an output layer with 2 neurons, we use the two-layer constructor,
NeuralNetwork nn(1, 3, 2);
All the parameters here are also initialized at random. By default, the hidden layer will have a hyperbolic tangent activation function and the output layer a linear activation function.
In order to construct a neural network with a more complex multilayer perceptron, its architecture must be specified in a vector of unsigned integers. For instance, to construct a multilayer perceptron with 1 input, 3 hidden layers with 2, 4 and 3 neurons and an output layer with 1 neuron we can write
Vector<unsigned> architecture(5);
architecture[0] = 1;
architecture[1] = 2;
architecture[2] = 4;
architecture[3] = 3;
architecture[4] = 1;

NeuralNetwork nn(architecture);
The network parameters here are also initialized at random.
The independent parameters constructor creates a neural network object without a multilayer perceptron but with a given number of independent parameters,
NeuralNetwork nn(3);
The above object can be used, for instance, as the basis for solving a function optimization problem not related to neural networks.
It is possible to construct a neural network by loading its members from an XML file. That is done in the following way,
NeuralNetwork nn("neural_network.xml");
Please follow the format of the neural network file strictly.
Finally, the copy constructor can be used to create an object by copying the members from another object,
NeuralNetwork nn1(2, 4, 3);
NeuralNetwork nn2(nn1);
This class implements get and set methods for each member. The following sentences show the use of some of them,
NeuralNetwork nn(3, 2);
MultilayerPerceptron* mlpp = nn.get_multilayer_perceptron_pointer();
unsigned inputs_number = mlpp->count_inputs_number();
unsigned outputs_number = mlpp->count_outputs_number();
The number of parameters of the neural network above can be accessed as follows
unsigned parameters_number = nn.count_parameters_number();
The network parameters can be initialized with a given value by using the initialize() method,
NeuralNetwork nn(4, 3, 2); nn.initialize(0.0);
To calculate the output vector of the network in response to an input vector, we use the method calculate_outputs(). For instance, the following sentence returns the neural network output value for an input value.
Vector<double> inputs(1);
inputs[0] = 0.5;
Vector<double> outputs = nn.calculate_outputs(inputs);
To calculate the Jacobian matrix of the network in response to an input vector we use the method calculate_Jacobian(). For instance, the following sentence returns the partial derivatives of the outputs with respect to the inputs.
Matrix<double> Jacobian = nn.calculate_Jacobian(inputs);
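As a quick sanity check, the analytical Jacobian can be compared against central finite differences of calculate_outputs(). A sketch, assuming the nn and inputs objects defined above and that Matrix elements are accessed as Jacobian(i, 0):

// Central-difference estimate of the derivatives with respect to inputs[0].
const double h = 1.0e-6;

Vector<double> inputs_forward(inputs);
Vector<double> inputs_backward(inputs);

inputs_forward[0] += h;
inputs_backward[0] -= h;

Vector<double> outputs_forward = nn.calculate_outputs(inputs_forward);
Vector<double> outputs_backward = nn.calculate_outputs(inputs_backward);

// For each output index i, (outputs_forward[i] - outputs_backward[i])/(2.0*h)
// should approximate Jacobian(i, 0).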
We can save a neural network object to a data file by using the method save(). For instance, the next code saves the neural network object to the file neural_network.xml.
NeuralNetwork nn;
nn.save("neural_network.xml");
We can also load a neural network object from a data file by using the method load(). Indeed, the following sentence loads the neural network object from the file neural_network.xml.
nn.load("neural_network.xml");