The data set class
Roberto Lopez.
Artelnics - Making intelligent use of data

The data set contains the information for creating the model. It comprises a data matrix in which columns represent variables and rows represent instances.

Contents:
  1. Basic theory.
  2. Software model.
  3. Main classes.

Basic theory

The data matrix is stored in a file with the following format:

d_1_1   d_1_2   ...   d_1_q
...     ...     ...   ...
d_p_1   d_p_2   ...   d_p_q

Here the number of instances is denoted p, while the number of variables is denoted q.
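
As an illustration of this layout, the following is a minimal sketch that reads such a file into memory using only the C++ standard library. It does not use OpenNN; the names Matrix, read_data_matrix, instances_number and variables_number are hypothetical, and the values are assumed to be separated by whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type, not part of OpenNN: one inner vector per instance (row).
using Matrix = std::vector<std::vector<double>>;

// Reads a whitespace-separated data matrix in which every line is one
// instance (row) and every value on that line is one variable (column).
Matrix read_data_matrix(std::istream& in)
{
    Matrix data;
    std::string line;

    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::vector<double> row;
        double value;

        while (fields >> value)
        {
            row.push_back(value);
        }

        if (row.empty())
        {
            continue; // skip lines with no leading numeric value (e.g. blank lines)
        }

        // Every instance must have the same number of variables q.
        assert(data.empty() || row.size() == data.front().size());

        data.push_back(row);
    }

    return data;
}

// Number of instances p (rows of the matrix).
std::size_t instances_number(const Matrix& data)
{
    return data.size();
}

// Number of variables q (columns of the matrix).
std::size_t variables_number(const Matrix& data)
{
    return data.empty() ? 0 : data.front().size();
}
```

Passing a std::ifstream opened on the data file to read_data_matrix gives a matrix whose row count is p and whose column count is q.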

Variables in a data set can be of three types:

Instances can be:

Software model

To do.

Main classes

The data set is represented in OpenNN by means of the DataSet class. It contains a data matrix, and information about the variables, the instances and the missing values.

The DataSet class has different members, constructors and methods.

Members

A DataSet object contains:

All these members are declared private, and they can only be accessed through their corresponding get or set methods.
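
The access pattern can be sketched with a toy class. This is only an illustration: ToyDataSet and its single member are hypothetical and do not reproduce the actual declaration of DataSet. The accessor names follow the set_data_file_name method used later in this document.

```cpp
#include <cassert>
#include <string>

// Illustrative only: a toy class showing the private-member / accessor
// pattern. The class is hypothetical, not OpenNN's DataSet.
class ToyDataSet
{
public:
    // Get method: read-only access to the private member.
    const std::string& get_data_file_name() const
    {
        return data_file_name;
    }

    // Set method: the only way to modify the private member, which also
    // makes it a natural place to check the new value.
    void set_data_file_name(const std::string& new_data_file_name)
    {
        assert(!new_data_file_name.empty());

        data_file_name = new_data_file_name;
    }

private:
    std::string data_file_name;
};
```

Keeping the member private means that every change goes through the set method, so any validation lives in a single place.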

Constructors

There are several constructors for the DataSet class.

The easiest way of creating a DataSet object is by means of the default constructor, which creates an empty data set.

DataSet ds;

It is also possible to construct a data set by loading its members from an XML file. That is done in the following way:

DataSet ds("data_set.xml");

Methods

The most important methods are those used to load the data matrix from a file: set_data_file_name sets the name of the data file, and load_data reads the matrix from it.

DataSet ds;
ds.set_data_file_name("data.dat");

ds.load_data();

To calculate the output vector of a neural network in response to an input vector, we use the method calculate_outputs() of the NeuralNetwork class (it is not a DataSet method). For instance, assuming nn is a NeuralNetwork object, the following statements return the network output for an input value.

Vector<double> inputs(1);
inputs[0] = 0.5;

Vector<double> outputs = nn.calculate_outputs(inputs);

To calculate the statistics of the data (minimums, maximums, means and standard deviations), the following statement can be used.

Vector< Statistics<double> > data_statistics = ds.calculate_data_statistics();
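
To make the four quantities concrete, here is a minimal sketch that computes them for a single variable (one column of the data matrix) with the C++ standard library. It is not OpenNN's implementation: ColumnStatistics and calculate_column_statistics are hypothetical names, and the use of the sample standard deviation (divisor n - 1) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-variable statistics record.
struct ColumnStatistics
{
    double minimum;
    double maximum;
    double mean;
    double standard_deviation;
};

// Computes the statistics of one variable (one column of the data matrix).
// Assumption: the sample standard deviation (divisor n - 1) is used, and a
// column with a single value is given a standard deviation of 0.
ColumnStatistics calculate_column_statistics(const std::vector<double>& column)
{
    assert(!column.empty());

    const std::size_t n = column.size();

    ColumnStatistics s;
    s.minimum = *std::min_element(column.begin(), column.end());
    s.maximum = *std::max_element(column.begin(), column.end());

    // Mean: sum of the values divided by the number of instances.
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        sum += column[i];
    }
    s.mean = sum / static_cast<double>(n);

    // Standard deviation: square root of the averaged squared deviations.
    double squared_deviations = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        const double deviation = column[i] - s.mean;
        squared_deviations += deviation * deviation;
    }
    s.standard_deviation = (n > 1)
        ? std::sqrt(squared_deviations / static_cast<double>(n - 1))
        : 0.0;

    return s;
}
```

Calling this function once per column yields one record per variable, which mirrors the shape of the vector of statistics returned above.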

We can save a data set object to an XML file by using the method save. For instance, the following statement saves the data set object to the file data_set.xml.

ds.save("data_set.xml");

We can also load a data set object from an XML file by using the method load. The following statement loads the data set object from the file data_set.xml.

ds.load("data_set.xml");

OpenNN Copyright © 2014 Roberto Lopez (Artelnics)