OpenCV  3.0.0-dev
Open Source Computer Vision
Machine Learning

Classes

class  cv::ml::ANN_MLP
 Artificial Neural Networks - Multi-Layer Perceptrons.

class  cv::ml::Boost
 Boosted tree classifier derived from DTrees.

class  cv::ml::DTrees
 The class represents a single decision tree or a collection of decision trees.

class  cv::ml::EM
 The class implements the Expectation Maximization algorithm.

class  cv::ml::SVM::Kernel

class  cv::ml::KNearest
 The class implements the K-Nearest Neighbors model.

class  cv::ml::LogisticRegression
 Implements the Logistic Regression classifier.

class  cv::ml::DTrees::Node
 The class represents a decision tree node.

class  cv::ml::NormalBayesClassifier
 Bayes classifier for normally distributed data.

class  cv::ml::ParamGrid
 The structure represents the logarithmic grid range of statmodel parameters.

class  cv::ml::RTrees
 The class implements the random forest predictor.

class  cv::ml::DTrees::Split
 The class represents a split in a decision tree.

class  cv::ml::StatModel
 Base class for statistical models in OpenCV ML.

class  cv::ml::SVM
 Support Vector Machines.

class  cv::ml::TrainData
 Class encapsulating training data.
 

Enumerations

enum  {
  cv::ml::EM::DEFAULT_NCLUSTERS =5,
  cv::ml::EM::DEFAULT_MAX_ITERS =100
}
 Default parameters.
 
enum  {
  cv::ml::EM::START_E_STEP =1,
  cv::ml::EM::START_M_STEP =2,
  cv::ml::EM::START_AUTO_STEP =0
}
 The initial step.
 
enum  cv::ml::ANN_MLP::ActivationFunctions {
  cv::ml::ANN_MLP::IDENTITY = 0,
  cv::ml::ANN_MLP::SIGMOID_SYM = 1,
  cv::ml::ANN_MLP::GAUSSIAN = 2
}
 
enum  cv::ml::ErrorTypes {
  cv::ml::TEST_ERROR = 0,
  cv::ml::TRAIN_ERROR = 1
}
 Error types.
 
enum  cv::ml::StatModel::Flags {
  cv::ml::StatModel::UPDATE_MODEL = 1,
  cv::ml::StatModel::RAW_OUTPUT =1,
  cv::ml::StatModel::COMPRESSED_INPUT =2,
  cv::ml::StatModel::PREPROCESSED_INPUT =4
}
 
enum  cv::ml::DTrees::Flags {
  cv::ml::DTrees::PREDICT_AUTO =0,
  cv::ml::DTrees::PREDICT_SUM =(1<<8),
  cv::ml::DTrees::PREDICT_MAX_VOTE =(2<<8),
  cv::ml::DTrees::PREDICT_MASK =(3<<8)
}
 
enum  cv::ml::SVM::KernelTypes {
  cv::ml::SVM::CUSTOM =-1,
  cv::ml::SVM::LINEAR =0,
  cv::ml::SVM::POLY =1,
  cv::ml::SVM::RBF =2,
  cv::ml::SVM::SIGMOID =3,
  cv::ml::SVM::CHI2 =4,
  cv::ml::SVM::INTER =5
}
 SVM kernel type.
 
enum  cv::ml::LogisticRegression::Methods {
  cv::ml::LogisticRegression::BATCH = 0,
  cv::ml::LogisticRegression::MINI_BATCH = 1
}
 Training methods.
 
enum  cv::ml::SVM::ParamTypes {
  cv::ml::SVM::C =0,
  cv::ml::SVM::GAMMA =1,
  cv::ml::SVM::P =2,
  cv::ml::SVM::NU =3,
  cv::ml::SVM::COEF =4,
  cv::ml::SVM::DEGREE =5
}
 SVM parameter types.
 
enum  cv::ml::LogisticRegression::RegKinds {
  cv::ml::LogisticRegression::REG_DISABLE = -1,
  cv::ml::LogisticRegression::REG_L1 = 0,
  cv::ml::LogisticRegression::REG_L2 = 1
}
 Regularization kinds.
 
enum  cv::ml::SampleTypes {
  cv::ml::ROW_SAMPLE = 0,
  cv::ml::COL_SAMPLE = 1
}
 Sample types.
 
enum  cv::ml::ANN_MLP::TrainFlags {
  cv::ml::ANN_MLP::UPDATE_WEIGHTS = 1,
  cv::ml::ANN_MLP::NO_INPUT_SCALE = 2,
  cv::ml::ANN_MLP::NO_OUTPUT_SCALE = 4
}
 
enum  cv::ml::ANN_MLP::TrainingMethods {
  cv::ml::ANN_MLP::BACKPROP =0,
  cv::ml::ANN_MLP::RPROP =1
}
 
enum  cv::ml::KNearest::Types {
  cv::ml::KNearest::BRUTE_FORCE =1,
  cv::ml::KNearest::KDTREE =2
}
 Implementations of the KNearest algorithm.
 
enum  cv::ml::SVM::Types {
  cv::ml::SVM::C_SVC =100,
  cv::ml::SVM::NU_SVC =101,
  cv::ml::SVM::ONE_CLASS =102,
  cv::ml::SVM::EPS_SVR =103,
  cv::ml::SVM::NU_SVR =104
}
 SVM type.
 
enum  cv::ml::EM::Types {
  cv::ml::EM::COV_MAT_SPHERICAL =0,
  cv::ml::EM::COV_MAT_DIAGONAL =1,
  cv::ml::EM::COV_MAT_GENERIC =2,
  cv::ml::EM::COV_MAT_DEFAULT =COV_MAT_DIAGONAL
}
 Type of covariance matrices.
 
enum  cv::ml::Boost::Types {
  cv::ml::Boost::DISCRETE =0,
  cv::ml::Boost::REAL =1,
  cv::ml::Boost::LOGIT =2,
  cv::ml::Boost::GENTLE =3
}
 
enum  cv::ml::VariableTypes {
  cv::ml::VAR_NUMERICAL =0,
  cv::ml::VAR_ORDERED =0,
  cv::ml::VAR_CATEGORICAL =1
}
 Variable types.
 

Functions

void cv::ml::createConcentricSpheresTestSet (int nsamples, int nfeatures, int nclasses, OutputArray samples, OutputArray responses)
 Creates test set.
 
void cv::ml::randMVNormal (InputArray mean, InputArray cov, int nsamples, OutputArray samples)
 Generates samples from a multivariate normal distribution.
 

Detailed Description

The Machine Learning Library (MLL) is a set of classes and functions for statistical classification, regression, and clustering of data.

Most of the classification and regression algorithms are implemented as C++ classes. Because the algorithms have different sets of features (such as the ability to handle missing measurements or categorical input variables), there is little common ground between the classes. This common ground is defined by the class cv::ml::StatModel, from which all the other ML classes are derived.

See detailed overview here: Machine Learning Overview.

Enumeration Type Documentation

anonymous enum

Default parameters.

Enumerator
DEFAULT_NCLUSTERS 
DEFAULT_MAX_ITERS 

anonymous enum

The initial step.

Enumerator
START_E_STEP 
START_M_STEP 
START_AUTO_STEP 

enum cv::ml::ANN_MLP::ActivationFunctions

Possible activation functions.

Enumerator
IDENTITY 

Identity function: \(f(x)=x\)

SIGMOID_SYM 

Symmetrical sigmoid: \(f(x)=\beta*(1-e^{-\alpha x})/(1+e^{-\alpha x})\)

Note
If you are using the default sigmoid activation function with the default parameter values fparam1=0 and fparam2=0, then the function used is y = 1.7159*tanh(2/3 * x), so the output ranges over [-1.7159, 1.7159] instead of [0, 1].
GAUSSIAN 

Gaussian function: \(f(x)=\beta e^{-\alpha x*x}\)
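
The three formulas above can be written out directly. The following is a minimal sketch using only the C++ standard library; the function names are hypothetical and this is not OpenCV's implementation. It also checks the identity behind the note on SIGMOID_SYM: \((1-e^{-\alpha x})/(1+e^{-\alpha x}) = \tanh(\alpha x/2)\), so y = 1.7159*tanh(2/3 * x) corresponds to \(\alpha = 4/3\) and \(\beta = 1.7159\).

```cpp
#include <cassert>
#include <cmath>

// IDENTITY: f(x) = x
inline double identityActivation(double x) { return x; }

// SIGMOID_SYM: f(x) = beta * (1 - exp(-alpha*x)) / (1 + exp(-alpha*x)),
// which is algebraically equal to beta * tanh(alpha * x / 2).
inline double symmetricSigmoid(double x, double alpha, double beta) {
    const double e = std::exp(-alpha * x);
    return beta * (1.0 - e) / (1.0 + e);
}

// GAUSSIAN: f(x) = beta * exp(-alpha * x * x)
inline double gaussianActivation(double x, double alpha, double beta) {
    return beta * std::exp(-alpha * x * x);
}

// Self-check of the tanh form quoted in the note (alpha = 4/3, beta = 1.7159).
inline void checkTanhIdentity() {
    const double x = 0.5;
    const double viaSigmoid = symmetricSigmoid(x, 4.0 / 3.0, 1.7159);
    const double viaTanh = 1.7159 * std::tanh(2.0 * x / 3.0);
    assert(std::fabs(viaSigmoid - viaTanh) < 1e-12);
}
```

Since tanh is bounded by -1 and 1, the output of the symmetric sigmoid is bounded by \(-\beta\) and \(\beta\), which is where the [-1.7159, 1.7159] range comes from.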

enum cv::ml::ErrorTypes

Error types.

Enumerator
TEST_ERROR 
TRAIN_ERROR 

enum cv::ml::StatModel::Flags

Predict options.

Enumerator
UPDATE_MODEL 
RAW_OUTPUT 

makes the method return the raw results (the sum), not the class label

COMPRESSED_INPUT 
PREPROCESSED_INPUT 

enum cv::ml::DTrees::Flags

Predict options.

Enumerator
PREDICT_AUTO 
PREDICT_SUM 
PREDICT_MAX_VOTE 
PREDICT_MASK 
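
The DTrees values occupy bits 8 and 9, while the StatModel options such as RAW_OUTPUT occupy the low bits, so the two groups can share one integer. The sketch below assumes the flags are combined with bitwise OR and separated again with PREDICT_MASK. It uses a local mirror of the values listed on this page rather than the OpenCV headers, and the helper names are hypothetical.

```cpp
#include <cassert>

// Local mirror of the values listed on this page (not the OpenCV headers).
enum PredictFlags {
    RAW_OUTPUT       = 1,
    PREDICT_AUTO     = 0,
    PREDICT_SUM      = (1 << 8),
    PREDICT_MAX_VOTE = (2 << 8),
    PREDICT_MASK     = (3 << 8)
};

// Combine a prediction mode with the optional RAW_OUTPUT bit.
inline int makeFlags(int mode, bool rawOutput) {
    assert((mode & ~PREDICT_MASK) == 0);  // mode must fit inside the masked bits
    return mode | (rawOutput ? RAW_OUTPUT : 0);
}

// Extract the prediction mode from a combined flag word.
inline int predictMode(int flags) { return flags & PREDICT_MASK; }

// True when the RAW_OUTPUT bit is set.
inline bool wantsRawOutput(int flags) { return (flags & RAW_OUTPUT) != 0; }
```

With these helpers, `predictMode(makeFlags(PREDICT_SUM, true))` yields `PREDICT_SUM` regardless of the low bits.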

enum cv::ml::SVM::KernelTypes

SVM kernel type.

A comparison of different kernels on the following 2D test case with four classes. Four SVM::C_SVC SVMs were trained (one against the rest) with auto_train and evaluated with three different kernels (SVM::CHI2, SVM::INTER, SVM::RBF). The color depicts the class with the maximum score: bright means max-score > 0, dark means max-score < 0.

[Figure: SVM_Comparison.png]
Enumerator
CUSTOM 

Returned by SVM::getKernelType when a custom kernel has been set.

LINEAR 

Linear kernel. No mapping is done, linear discrimination (or regression) is done in the original feature space. It is the fastest option. \(K(x_i, x_j) = x_i^T x_j\).

POLY 

Polynomial kernel: \(K(x_i, x_j) = (\gamma x_i^T x_j + coef0)^{degree}, \gamma > 0\).

RBF 

Radial basis function (RBF), a good choice in most cases. \(K(x_i, x_j) = e^{-\gamma ||x_i - x_j||^2}, \gamma > 0\).

SIGMOID 

Sigmoid kernel: \(K(x_i, x_j) = \tanh(\gamma x_i^T x_j + coef0)\).

CHI2 

Exponential Chi2 kernel, similar to the RBF kernel: \(K(x_i, x_j) = e^{-\gamma \chi^2(x_i,x_j)}, \chi^2(x_i,x_j) = (x_i-x_j)^2/(x_i+x_j), \gamma > 0\).

INTER 

Histogram intersection kernel. A fast kernel. \(K(x_i, x_j) = min(x_i,x_j)\).
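
The kernel formulas above can be sketched as plain functions over feature vectors. This is an illustration using only the C++ standard library, with hypothetical function names, and not OpenCV's implementation. It assumes that the CHI2 and INTER expressions are applied per component and summed over all components, and that CHI2 terms with a zero denominator are skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

inline double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += a[k] * b[k];
    return s;
}

// LINEAR: K = a^T b
inline double linearKernel(const Vec& a, const Vec& b) { return dot(a, b); }

// POLY: K = (gamma * a^T b + coef0)^degree
inline double polyKernel(const Vec& a, const Vec& b,
                         double gamma, double coef0, double degree) {
    return std::pow(gamma * dot(a, b) + coef0, degree);
}

// RBF: K = exp(-gamma * ||a - b||^2)
inline double rbfKernel(const Vec& a, const Vec& b, double gamma) {
    assert(a.size() == b.size());
    double d2 = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) d2 += (a[k] - b[k]) * (a[k] - b[k]);
    return std::exp(-gamma * d2);
}

// SIGMOID: K = tanh(gamma * a^T b + coef0)
inline double sigmoidKernel(const Vec& a, const Vec& b, double gamma, double coef0) {
    return std::tanh(gamma * dot(a, b) + coef0);
}

// CHI2: K = exp(-gamma * sum_k (a_k - b_k)^2 / (a_k + b_k))
inline double chi2Kernel(const Vec& a, const Vec& b, double gamma) {
    assert(a.size() == b.size());
    double chi2 = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double sum = a[k] + b[k];
        if (sum != 0.0) chi2 += (a[k] - b[k]) * (a[k] - b[k]) / sum;  // skip 0/0 terms
    }
    return std::exp(-gamma * chi2);
}

// INTER: K = sum_k min(a_k, b_k)
inline double interKernel(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += std::min(a[k], b[k]);
    return s;
}
```

For identical inputs the RBF and CHI2 kernels both evaluate to 1, which is one way to see why the text describes CHI2 as similar to RBF.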

enum cv::ml::LogisticRegression::Methods

Training methods.

Enumerator
BATCH 
MINI_BATCH 

Set MiniBatchSize to a positive integer when using this method.

enum cv::ml::SVM::ParamTypes

SVM parameter types.

Enumerator
C 
GAMMA 
P 
NU 
COEF 
DEGREE 

enum cv::ml::LogisticRegression::RegKinds

Regularization kinds.

Enumerator
REG_DISABLE 

Regularization disabled.

REG_L1 

L1 norm

REG_L2 

L2 norm
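
As a reminder of what the two norms measure, the sketch below computes the L1 norm (sum of absolute values) and the L2 norm (square root of the sum of squares) of a weight vector. The function name is hypothetical, the enum is a local mirror of the values listed on this page, and how OpenCV turns these norms into a penalty term during training is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Local mirror of the values listed on this page (not the OpenCV headers).
enum RegKind { REG_DISABLE = -1, REG_L1 = 0, REG_L2 = 1 };

// Norm of the weight vector for the selected regularization kind.
// REG_DISABLE contributes nothing, so it returns 0.
inline double weightNorm(const std::vector<double>& w, RegKind kind) {
    assert(kind == REG_DISABLE || kind == REG_L1 || kind == REG_L2);
    if (kind == REG_DISABLE) return 0.0;
    double acc = 0.0;
    for (double v : w) acc += (kind == REG_L1) ? std::fabs(v) : v * v;
    return (kind == REG_L1) ? acc : std::sqrt(acc);
}
```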

enum cv::ml::SampleTypes

Sample types.

Enumerator
ROW_SAMPLE 

each training sample is a row of samples

COL_SAMPLE 

each training sample occupies a column of samples
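
The difference between the two layouts is only which matrix axis indexes the samples. The sketch below illustrates this with a small row-major matrix type standing in for cv::Mat; the `Matrix` struct and helper names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local mirror of the values listed on this page (not the OpenCV headers).
enum SampleLayout { ROW_SAMPLE = 0, COL_SAMPLE = 1 };

// A dense row-major matrix in a flat vector (a stand-in for cv::Mat).
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    float at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return data[r * cols + c];
    }
};

// Number of training samples stored in the matrix for the given layout.
inline std::size_t sampleCount(const Matrix& m, SampleLayout layout) {
    return layout == ROW_SAMPLE ? m.rows : m.cols;
}

// Value of one feature of one sample for the given layout.
inline float featureOf(const Matrix& m, SampleLayout layout,
                       std::size_t sample, std::size_t feature) {
    return layout == ROW_SAMPLE ? m.at(sample, feature) : m.at(feature, sample);
}
```

A 2x3 matrix therefore holds two samples with three features under ROW_SAMPLE, and three samples with two features under COL_SAMPLE.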

enum cv::ml::ANN_MLP::TrainFlags

Train options.

Enumerator
UPDATE_WEIGHTS 

Update the network weights, rather than compute them from scratch. In the latter case the weights are initialized using the Nguyen-Widrow algorithm.

NO_INPUT_SCALE 

Do not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is expected to be updated frequently, the new training data could differ considerably from the original data. In this case, you should take care of proper normalization yourself.

NO_OUTPUT_SCALE 

Do not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, by transforming it to a certain range that depends on the activation function used.
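
When NO_INPUT_SCALE is set and you normalize the inputs yourself, the per-feature transformation described above (mean 0, standard deviation 1) can be sketched as follows. This is an illustration only: the function name is hypothetical, it assumes the population standard deviation (dividing by the number of samples), and it maps constant features to 0. OpenCV's internal scaling may differ in these details.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// samples[i][j] is feature j of sample i. Each feature column is shifted to
// mean 0 and scaled to standard deviation 1, in place.
inline void standardizeFeatures(std::vector<std::vector<double>>& samples) {
    assert(!samples.empty());
    const std::size_t n = samples.size();
    const std::size_t d = samples[0].size();
    for (std::size_t j = 0; j < d; ++j) {
        double mean = 0.0;
        for (std::size_t i = 0; i < n; ++i) mean += samples[i][j];
        mean /= static_cast<double>(n);

        double var = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double diff = samples[i][j] - mean;
            var += diff * diff;
        }
        var /= static_cast<double>(n);  // population variance (assumption)

        const double sd = std::sqrt(var);
        for (std::size_t i = 0; i < n; ++i) {
            // A constant feature carries no information; map it to 0.
            samples[i][j] = sd > 0.0 ? (samples[i][j] - mean) / sd : 0.0;
        }
    }
}
```

When the network is updated later with new data, the same mean and standard deviation computed from the original data would normally be reused, which is the kind of care the NO_INPUT_SCALE description asks for.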

enum cv::ml::ANN_MLP::TrainingMethods

Available training methods.

Enumerator
BACKPROP 

The back-propagation algorithm.

RPROP 

The RPROP algorithm. See [117] for details.

enum cv::ml::KNearest::Types

Implementations of the KNearest algorithm.

Enumerator
BRUTE_FORCE 
KDTREE 

enum cv::ml::SVM::Types

SVM type.

Enumerator
C_SVC 

C-Support Vector Classification. n-class classification (n \(\geq\) 2), allows imperfect separation of classes with penalty multiplier C for outliers.

NU_SVC 

\(\nu\)-Support Vector Classification. n-class classification with possible imperfect separation. Parameter \(\nu\) (in the range 0..1, the larger the value, the smoother the decision boundary) is used instead of C.

ONE_CLASS 

Distribution Estimation (One-class SVM). All the training data are from the same class, and the SVM builds a boundary that separates the class from the rest of the feature space.

EPS_SVR 

\(\epsilon\)-Support Vector Regression. The distance between feature vectors from the training set and the fitting hyper-plane must be less than p. For outliers the penalty multiplier C is used.

NU_SVR 

\(\nu\)-Support Vector Regression. \(\nu\) is used instead of p. See [26] for details.
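
For EPS_SVR, the roles of p and C can be illustrated with the textbook \(\epsilon\)-insensitive loss: a training sample whose prediction is within p of its target costs nothing, and any excess is penalized in proportion to C. The sketch below shows that standard formulation with hypothetical function names. It is not taken from OpenCV's solver.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Epsilon-insensitive loss: zero inside the tube of half-width p,
// linear in the excess outside it.
inline double epsInsensitiveLoss(double target, double prediction, double p) {
    assert(p >= 0.0);
    return std::max(0.0, std::fabs(target - prediction) - p);
}

// Total outlier penalty: C times the summed loss over all samples.
inline double svrPenalty(const std::vector<double>& targets,
                         const std::vector<double>& predictions,
                         double p, double C) {
    assert(targets.size() == predictions.size());
    double total = 0.0;
    for (std::size_t i = 0; i < targets.size(); ++i) {
        total += epsInsensitiveLoss(targets[i], predictions[i], p);
    }
    return C * total;
}
```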

enum cv::ml::EM::Types

Type of covariance matrices.

Enumerator
COV_MAT_SPHERICAL 

A scaled identity matrix \(\mu_k * I\). There is only one parameter, \(\mu_k\), to be estimated for each matrix. The option may be used in special cases, when the constraint is relevant, or as a first step in the optimization (for example, when the data is preprocessed with PCA). The results of such a preliminary estimation may be passed again to the optimization procedure, this time with covMatType=EM::COV_MAT_DIAGONAL.

COV_MAT_DIAGONAL 

A diagonal matrix with positive diagonal elements. The number of free parameters is d for each matrix. This is the most commonly used option and yields good estimation results.

COV_MAT_GENERIC 

A symmetric positive-definite matrix. The number of free parameters in each matrix is about \(d^2/2\). This option is not recommended unless there is a fairly accurate initial estimate of the parameters and/or a huge number of training samples.

COV_MAT_DEFAULT 
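
The parameter counts quoted above can be made concrete. A symmetric d x d matrix has exactly d(d+1)/2 independent entries, which is the "about \(d^2/2\)" figure for COV_MAT_GENERIC. The sketch below uses a local mirror of the listed values and a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Local mirror of the values listed on this page (not the OpenCV headers).
enum CovMatType {
    COV_MAT_SPHERICAL = 0,
    COV_MAT_DIAGONAL  = 1,
    COV_MAT_GENERIC   = 2,
    COV_MAT_DEFAULT   = COV_MAT_DIAGONAL
};

// Free covariance parameters per mixture component for feature dimension d.
inline std::size_t freeCovParams(CovMatType type, std::size_t d) {
    assert(d > 0);
    switch (type) {
        case COV_MAT_SPHERICAL: return 1;                // single scale mu_k
        case COV_MAT_DIAGONAL:  return d;                // one variance per dimension
        case COV_MAT_GENERIC:   return d * (d + 1) / 2;  // symmetric d x d matrix
    }
    return 0;  // not reached for the values above
}
```

For d = 10 this gives 1, 10, and 55 parameters per component, which shows how quickly the generic option grows and why it needs more training samples.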

enum cv::ml::Boost::Types

Boosting type. Gentle AdaBoost and Real AdaBoost are often the preferable choices.

Enumerator
DISCRETE 

Discrete AdaBoost.

REAL 

Real AdaBoost. It is a technique that utilizes confidence-rated predictions and works well with categorical data.

LOGIT 

LogitBoost. It can produce good regression fits.

GENTLE 

Gentle AdaBoost. It puts less weight on outlier data points and for that reason is often good with regression data.

enum cv::ml::VariableTypes

Variable types.

Enumerator
VAR_NUMERICAL 

same as VAR_ORDERED

VAR_ORDERED 

ordered variables

VAR_CATEGORICAL 

categorical variables

Function Documentation

void cv::ml::createConcentricSpheresTestSet (int nsamples, int nfeatures, int nclasses, OutputArray samples, OutputArray responses)

Creates test set.

void cv::ml::randMVNormal (InputArray mean, InputArray cov, int nsamples, OutputArray samples)

Generates samples from a multivariate normal distribution.

Parameters
mean      an average row vector
cov       symmetric covariance matrix
nsamples  returned samples count
samples   returned samples array
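
One standard way to draw such samples is to factor the covariance matrix as \(cov = L L^T\) (Cholesky decomposition) and compute \(x = mean + L z\), where z is a vector of independent standard normal values. The sketch below illustrates that technique with the C++ standard library only. The names `choleskyLower` and `sampleMVNormal` are hypothetical, and this is not necessarily how cv::ml::randMVNormal is implemented.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Lower-triangular Cholesky factor L with cov = L * L^T.
// Assumes cov is square, symmetric, and positive definite.
inline Matrix choleskyLower(const Matrix& cov) {
    const std::size_t d = cov.size();
    Matrix L(d, std::vector<double>(d, 0.0));
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double s = cov[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
            if (i == j) {
                assert(s > 0.0);  // fails if cov is not positive definite
                L[i][i] = std::sqrt(s);
            } else {
                L[i][j] = s / L[j][j];
            }
        }
    }
    return L;
}

// Returns nsamples rows, each drawn as mean + L * z with z ~ N(0, I).
inline Matrix sampleMVNormal(const std::vector<double>& mean, const Matrix& cov,
                             std::size_t nsamples, std::mt19937& rng) {
    const std::size_t d = mean.size();
    assert(cov.size() == d);
    const Matrix L = choleskyLower(cov);
    std::normal_distribution<double> stdNormal(0.0, 1.0);
    Matrix samples(nsamples, std::vector<double>(d, 0.0));
    std::vector<double> z(d);
    for (std::size_t n = 0; n < nsamples; ++n) {
        for (double& v : z) v = stdNormal(rng);
        for (std::size_t i = 0; i < d; ++i) {
            double x = mean[i];
            for (std::size_t k = 0; k <= i; ++k) x += L[i][k] * z[k];
            samples[n][i] = x;
        }
    }
    return samples;
}
```

As in the function documented above, the mean is treated as a row vector and each returned sample is one row. Seeding the `std::mt19937` generator makes the draws repeatable.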