Operators Catalogue

caffe2/operators/accuracy_op.cc

Accuracy

Accuracy takes two inputs, predictions and labels, and returns a float accuracy value for the batch. Predictions are expected in the form of a 2-D tensor containing a batch of scores for various classes, and labels are expected in the form of a 1-D tensor containing true label indices of samples in the batch. If the score for the label index in the predictions is the highest among all classes, it is considered a correct prediction.

Interface

Inputs
`predictions`	2-D tensor (Tensor) of size (num_batches x num_classes) containing scores
`labels`	1-D tensor (Tensor) of size (num_batches) having the indices of true labels
Outputs
`accuracy`	1-D tensor (Tensor) of size 1 containing accuracy
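
As a rough illustration of the rule described above, the following pure-Python sketch counts a prediction as correct when the labelled class has the highest score. It is not the Caffe2 implementation; the function name `accuracy` and the tie-breaking behaviour are assumptions made for the example.

```python
def accuracy(predictions, labels):
    """Fraction of samples whose true label has the highest score.

    predictions: list of rows, one list of class scores per sample.
    labels: list of true class indices, one per sample.
    Ties go to the lowest class index (an assumption of this sketch).
    """
    correct = 0
    for scores, label in zip(predictions, labels):
        best = max(range(len(scores)), key=lambda j: scores[j])
        if best == label:
            correct += 1
    return correct / len(labels)


preds = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
print(accuracy(preds, [1, 0, 1]))  # two of the three samples are correct
```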

Code

Add

Performs element-wise binary addition (with limited broadcast support). If necessary the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value), or having its shape as a contiguous subset of the first tensor’s shape. The starting of the mutually equal shape is specified by the argument “axis”, and if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions and type as A
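
The shape-matching rule can be sketched in pure Python. The helper below is hypothetical (it is not part of Caffe2): it only checks that the shape of B is a contiguous subset of the shape of A and returns the shape of B padded with 1s so that it lines up with A.

```python
def aligned_shape(shape_a, shape_b, axis=None):
    """Return B's shape padded with 1s so that it lines up with A.

    With axis=None the shape of B must match the trailing dimensions of A
    (suffix matching); otherwise it must match A starting at `axis`.
    """
    if axis is None:
        axis = len(shape_a) - len(shape_b)
    end = axis + len(shape_b)
    if axis < 0 or end > len(shape_a) or tuple(shape_a[axis:end]) != tuple(shape_b):
        raise ValueError("shape of B is not a contiguous subset of the shape of A")
    return (1,) * axis + tuple(shape_b) + (1,) * (len(shape_a) - end)


print(aligned_shape((2, 3, 4, 5), (4, 5)))          # (1, 1, 4, 5)
print(aligned_shape((2, 3, 4, 5), (3, 4), axis=1))  # (1, 3, 4, 1)
```

Each dimension of size 1 in the result marks a dimension along which B would be repeated to match A.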

Code

caffe2/operators/sequence_ops.cc

AddPadding

Given a partitioned tensor T<N, D1…, Dn>, where the partitions are defined as ranges on its outer-most (slowest varying) dimension N, with given range lengths, return a tensor T<N + 2*padding_width, D1 …, Dn> with paddings added to the start and end of each range. Optionally, different paddings can be provided for beginning and end. Paddings provided must be a tensor T<D1…, Dn>. If no padding is provided, add zero padding. If no lengths vector is provided, add padding only once, at the start and end of data.

Interface

Arguments
`padding_width`	Number of copies of padding to add around each range.
`end_padding_width`	(Optional) Specifies a different end-padding width.
Inputs
`data_in`	(T<N, D1…, Dn>) Input data
`lengths`	(i64) Num of elements in each range. sum(lengths) = N.
`start_padding`	T<D1…, Dn> Padding data for range start.
`end_padding`	T<D1…, Dn> (optional) Padding for range end. If not provided, start_padding is used as end_padding as well.
Outputs
`data_out`	(T<N + 2*padding_width, D1…, Dn>) Padded data.
`lengths_out`	(i64, optional) Lengths for each padded range.
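
A minimal sketch of the padding behaviour, assuming the data is a plain Python list of rows and a single padding value is used for both the start and the end of each range. The function name and the `pad_row` parameter are illustrative, not the operator's API.

```python
def add_padding(data, lengths, padding_width, pad_row=0.0):
    """Pad each range of `data` with `padding_width` copies of `pad_row`.

    data: list of rows (the outer-most dimension N).
    lengths: number of rows in each range; must sum to len(data).
    Returns (padded rows, padded lengths).
    """
    if sum(lengths) != len(data):
        raise ValueError("sum(lengths) must equal N")
    out, lengths_out, start = [], [], 0
    pad = [pad_row] * padding_width
    for n in lengths:
        out.extend(pad + data[start:start + n] + pad)
        lengths_out.append(n + 2 * padding_width)
        start += n
    return out, lengths_out


padded, new_lengths = add_padding([1.0, 2.0, 3.0], [2, 1], padding_width=1)
print(padded)       # [0.0, 1.0, 2.0, 0.0, 0.0, 3.0, 0.0]
print(new_lengths)  # [4, 3]
```

In this sketch the output grows by 2 * padding_width rows for every range, since each range is padded separately.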

Code

Alias

Makes the output and the input share the same underlying storage. WARNING: in general, in caffe2’s operator interface different tensors should have different underlying storage, which is the assumption made by components such as the dependency engine and memory optimization. Thus, in normal situations you should not use the AliasOp, especially in a normal forward-backward pass. The Alias op is provided so one can achieve true asynchrony, such as Hogwild, in a graph. But make sure you understand all the implications similar to multi-thread computation before you use it explicitly.

Interface

Inputs
`input`	Input tensor whose storage will be shared.
Outputs
`output`	Tensor of same shape as input, sharing its storage.

Code

caffe2/operators/communicator_op.cc

Allgather

Does an allgather operation among the nodes.

Interface

Inputs
`comm_world`	The common world.
`X`	A tensor to be allgathered.
Outputs
`Y`	The allgathered tensor, same on all nodes.

Code

Allreduce

Does an allreduce operation among the nodes. Currently only Sum is supported.

Interface

Inputs
`comm_world`	The common world.
`X`	A tensor to be allreduced.
Outputs
`Y`	The allreduced tensor, same on all nodes.

Code

caffe2/operators/communicator_op.cc

And

Performs element-wise logical AND (with limited broadcast support). Both input operands should be of type bool. If necessary the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value), or having its shape as a contiguous subset of the first tensor’s shape. The starting of the mutually equal shape is specified by the argument “axis”, and if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

Code

caffe2/operators/loss_op.cc

Append

Append input 2 to the end of input 1. Input 1 must be the same as output, that is, it is required to be in-place. Input 1 may have to be re-allocated in order to accommodate the new size. Currently, an exponential growth ratio is used in order to ensure amortized constant time complexity. All except the outer-most dimension must be the same between input 1 and 2.

Interface

Inputs
`dataset`	The tensor to be appended to.
`new_data`	Tensor to append to the end of dataset.
Outputs
`dataset`	Same as input 0, representing the mutated tensor.

Code

BatchBoxCox

Input data is an N * D matrix. Applies the Box-Cox transform to each column. lambda1 and lambda2 are tensors of size D that define the hyper-parameters for the transform of each column x of the input data:

    ln(x + lambda2), if lambda1 == 0
    ((x + lambda2)^lambda1 - 1)/lambda1, if lambda1 != 0

Interface

Inputs
`data`	input float or double N * D matrix
`lambda1`	tensor of size D with the same type as data
`lambda2`	tensor of size D with the same type as data
Outputs
`output`	output matrix that applied box-cox transform
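
The two cases of the transform can be written out directly. This is a pure-Python sketch on nested lists, with a hypothetical function name, rather than the operator's own code.

```python
import math


def batch_box_cox(data, lambda1, lambda2):
    """Apply the Box-Cox transform column by column to an N x D matrix."""
    out = []
    for row in data:
        new_row = []
        for x, l1, l2 in zip(row, lambda1, lambda2):
            shifted = x + l2
            if l1 == 0:
                new_row.append(math.log(shifted))
            else:
                new_row.append((shifted ** l1 - 1) / l1)
        out.append(new_row)
    return out


# Column 0 uses lambda1 == 0 (log case), column 1 uses lambda1 == 2.
print(batch_box_cox([[1.0, 3.0]], [0.0, 2.0], [0.0, 1.0]))  # [[0.0, 7.5]]
```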

Code

caffe2/operators/batch_box_cox_op.cc

BatchMatMul

Batch Matrix multiplication Yi = Ai * Bi, where A has size (C x M x K), B has size (C x K x N) where C is the batch size and i ranges from 0 to C-1.

Interface

Arguments
`trans_a`	Pass 1 to transpose A before multiplication
`trans_b`	Pass 1 to transpose B before multiplication
Inputs
`A`	3D matrix of size (C x M x K)
`B`	3D matrix of size (C x K x N)
Outputs
`Y`	3D matrix of size (C x M x N)
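
For reference, the batched product without transposition can be sketched with nested lists. The function name is an assumption, and the `trans_a` and `trans_b` arguments are not modelled here.

```python
def batch_matmul(A, B):
    """Compute Y[i] = A[i] * B[i] for A of size (C x M x K), B of size (C x K x N)."""
    return [
        [
            [sum(a_row[k] * b_mat[k][n] for k in range(len(b_mat)))
             for n in range(len(b_mat[0]))]
            for a_row in a_mat
        ]
        for a_mat, b_mat in zip(A, B)
    ]


A = [[[1, 2]], [[3, 4]]]          # C=2, M=1, K=2
B = [[[1], [1]], [[2], [0]]]      # C=2, K=2, N=1
print(batch_matmul(A, B))         # [[[3]], [[6]]]
```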

Code

caffe2/operators/batch_matmul_op.cc

BatchOneHot

Input is a matrix tensor. Its first dimension is the batch size. Expand each column of it using one hot encoding. `lengths` specifies the size of each column after encoding, and `values` holds the dictionary values of the one-hot encoding for each column. For example, if data = [[2, 3], [4, 1], [2, 5]], lengths = [2, 3], and values = [2, 4, 1, 3, 5], then output = [[1, 0, 0, 1, 0], [0, 1, 1, 0, 0], [1, 0, 0, 0, 1]]

Interface

Inputs
`data`	input tensor matrix
`lengths`	the size is the same as the width of the `data`
`values`	one hot encoding dictionary values
Outputs
`output`	output matrix that expands each input column with one hot encoding
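
The example in the description can be reproduced with a short sketch. It assumes that `values` stores the dictionaries of all columns back to back, split according to `lengths`; the function name is illustrative.

```python
def batch_one_hot(data, lengths, values):
    """One-hot encode every column of `data` against its own dictionary."""
    out = []
    for row in data:
        encoded, offset = [], 0
        for x, n in zip(row, lengths):
            vocab = values[offset:offset + n]  # dictionary for this column
            encoded.extend(1 if x == v else 0 for v in vocab)
            offset += n
        out.append(encoded)
    return out


data = [[2, 3], [4, 1], [2, 5]]
print(batch_one_hot(data, lengths=[2, 3], values=[2, 4, 1, 3, 5]))
# [[1, 0, 0, 1, 0], [0, 1, 1, 0, 0], [1, 0, 0, 0, 1]]
```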

BooleanUnmask

Given a series of masks and values, reconstruct values together according to masks. A comprehensive example:

  mask1   = True, False, True, False, False
  values1 = 1.0, 3.0
  mask2   = False, True, False, False, False
  values2 = 2.0
  mask3   = False, False, False, True, True
  values3 = 4.0, 5.0

Reconstruct by:

  output = net.BooleanUnmask([mask1, values1, mask2, values2, mask3, values3], ["output"])

We get:

  output = 1.0, 2.0, 3.0, 4.0, 5.0

Note that for all mask positions, there must be at least one True. If for a field there are multiple True’s, we will accept the first value. For example:

Example 1:

  mask1   = True, False
  values1 = 1.0
  mask2   = False, False
  values2 =

This is not allowed:

  output = net.BooleanUnmask([mask1, values1, mask2, values2], ["output"])

Example 2:

  mask1   = True, False
  values1 = 1.0
  mask2   = True, True
  values2 = 2.0, 2.0

  output = net.BooleanUnmask([mask1, values1, mask2, values2], ["output"])

We get:

  output = 1.0, 2.0

Interface

Outputs
`unmasked_data`	The final reconstructed unmasked data
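
The reconstruction rule can be sketched in pure Python. The function below is hypothetical and takes (mask, values) pairs instead of a flat input list. It also assumes that every True entry consumes one value from its own list, even when an earlier mask already supplied that position; the description does not spell this out.

```python
def boolean_unmask(*pairs):
    """Merge (mask, values) pairs into one list.

    Each `values` list holds one entry per True in its mask, in order.
    When several masks are True at a position, the first pair wins.
    """
    masks = [mask for mask, _ in pairs]
    cursors = [iter(values) for _, values in pairs]
    output = []
    for pos in range(len(masks[0])):
        picked = []
        for mask, cursor in zip(masks, cursors):
            if mask[pos]:
                # Assumption: a True always consumes a value from its list.
                picked.append(next(cursor))
        if not picked:
            raise ValueError("no mask is True at position %d" % pos)
        output.append(picked[0])
    return output


result = boolean_unmask(
    ([True, False, True, False, False], [1.0, 3.0]),
    ([False, True, False, False, False], [2.0]),
    ([False, False, False, True, True], [4.0, 5.0]),
)
print(result)  # [1.0, 2.0, 3.0, 4.0, 5.0]
```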

Code

caffe2/operators/boolean_unmask_ops.cc

Broadcast

Does a broadcast operation from the root node to every other node. The tensor on each node should have been pre-created with the same shape and data type.

Interface

Arguments
`root`	(int, default 0) the root to run broadcast from.
Inputs
`comm_world`	The common world.
`X`	A tensor to be broadcasted.
Outputs
`X`	In-place as input 1.

Code

caffe2/operators/communicator_op.cc

Cast

The operator casts the elements of a given input tensor to a data type specified by the ‘to’ argument and returns an output tensor of the same size in the converted type. The ‘to’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message. If the ‘to’ argument is not provided or is not one of the enumerated types in DataType, Caffe2 throws an Enforce error. NOTE: Casting to and from strings is not supported yet.

Interface

Arguments
`to`	The data type to which the elements of the input tensor are cast. Strictly must be one of the types from DataType enum in TensorProto
Inputs
`input`	Input tensor to be cast.
Outputs
`output`	Output tensor with the same shape as input with type specified by the ‘to’ argument

Code

caffe2/operators/cast_op.cc

CheckAtomicBool

Copy the value of an atomic to a bool

Interface

Inputs
`atomic_bool`	Blob containing a unique_ptr<atomic>
Outputs
`value`	Copy of the value for the atomic

Code

caffe2/operators/atomic_ops.cc

CheckCounterDone

If the internal count value <= 0, outputs true, otherwise outputs false.

Interface

Inputs
`counter`	A blob pointing to an instance of a counter.
Outputs
`done`	true if the internal count is zero or negative.

Code

caffe2/operators/counter_ops.cc

CheckDatasetConsistency

Checks that the given data fields represent a consistent dataset under the schema specified by the fields argument. Operator fails if the fields are not consistent. If data is consistent, each field’s data can be safely appended to an existing dataset, keeping it consistent.

Interface

Arguments
`fields`	List of strings representing the string names in the format specified in the doc for CreateTreeCursor.
Inputs
`field_0`	Data for field 0.

Code

caffe2/operators/load_save_op.cc

Checkpoint

The Checkpoint operator is similar to the Save operator, but allows one to save to db every few iterations, with a db name that is appended with the iteration count. It takes one or more inputs and has no output. The first input must be a TensorCPU of type int with size 1 (i.e. the iteration counter); it is used to determine whether checkpointing is needed.

Interface

Arguments
`absolute_path`	(int, default 0) if set, use the db path directly and do not prepend the current root folder of the workspace.
`db`	(string) a template string that one can combine with the iteration to create the final db name. For example, “/home/lonestarr/checkpoint_%08d.db”
`db_type`	(string) the type of the db.
`every`	(int, default 1) the checkpointing is carried out when (iter mod every) is zero.
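
The interplay of `db` and `every` can be sketched as follows, assuming the template is expanded with printf-style formatting, as the `%08d` in the example suggests. The helper name is hypothetical.

```python
def checkpoint_db_name(db_template, iteration, every=1):
    """Return the db name for this iteration, or None if no checkpoint is due."""
    if iteration % every != 0:
        return None
    return db_template % iteration


template = "/home/lonestarr/checkpoint_%08d.db"
print(checkpoint_db_name(template, 200, every=100))
# /home/lonestarr/checkpoint_00000200.db
print(checkpoint_db_name(template, 150, every=100))  # None
```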

Code

Clip

Clip operator limits the given input within an interval. The interval is specified with arguments ‘min’ and ‘max’. They default to numeric_limits::min() and numeric_limits::max() respectively. The clipping operation can be done in in-place fashion too, where the input and output blobs are the same.

Interface

Arguments
`min`	Minimum value, under which element is replaced by min
`max`	Maximum value, above which element is replaced by max
Inputs
`input`	Input tensor (Tensor) containing elements to be clipped
Outputs
`output`	Output tensor (Tensor) containing clipped input elements

Interface

Inputs
`vector of Tensor`	std::unique_ptr<std::vector >
Outputs
`tensor`	tensor after concatenating

Code

caffe2/operators/atomic_ops.cc

ConditionalSetAtomicBool

Set an atomic<bool> to true if the given condition bool variable is true

Interface

Inputs
`atomic_bool`	Blob containing a unique_ptr<atomic>
`condition`	Blob containing a bool

Code

ConstantFill

The operator fills the elements of the output tensor with a constant value specified by the ‘value’ argument. The data type is specified by the ‘dtype’ argument. The ‘dtype’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message. If the ‘dtype’ argument is not provided, the data type of ‘value’ is used. The output tensor shape is specified by the ‘shape’ argument. If the number of inputs is 1, the shape will be identical to that of the input at run time with optional additional dimensions appended at the end as specified by the ‘extra_shape’ argument. In that case the ‘shape’ argument should not be set. If input_as_shape is set to true, then the input should be a 1D tensor containing the desired output shape (the dimensions specified in extra_shape will also be appended). NOTE: Currently, it supports data type of float, int32, int64, and bool.

Interface

Arguments
`value`	The value for the elements of the output tensor.
`dtype`	The data type for the elements of the output tensor. Strictly must be one of the types from DataType enum in TensorProto.
`shape`	The shape of the output tensor. Cannot set the shape argument and pass in an input at the same time.
`extra_shape`	The additional dimensions appended at the end of the shape indicated by the input blob. Cannot set the extra_shape argument when there is no input blob.
`input_as_shape`	1D tensor containing the desired output shape
Inputs
`input`	Input tensor (optional) to provide shape information.
Outputs
`output`	Output tensor of constant values specified by ‘value’ argument and its type is specified by the ‘dtype’ argument

Code

caffe2/operators/filler_op.cc

Conv

The convolution operator consumes an input vector, the filter blob and the bias blob and computes the output. Note that other parameters, such as the stride and kernel size, or the pads’ sizes in each direction are not necessary for input because they are provided by the ConvPoolOpBase operator. Various dimension checks are done implicitly, and the sizes are specified in the Input docs for this operator. As is expected, the filter is convolved with a subset of the image and the bias is added; this is done throughout the image data and the output is computed. As a side note on the implementation layout: conv_op_impl.h is the templated implementation of the conv_op.h file, which is why they are separate files.

Interface

Inputs
`X`	Input data blob from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the NCHW usage. On the other hand, the NHWC Op has a different set of dimension constraints.
`filter`	The filter blob that will be used in the convolutions; has size (M x C x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel.
`bias`	The 1D bias blob that is added through the convolution; has size (M).
Outputs
`Y`	Output data blob that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.

Code

caffe2/operators/conv_op.cc

ConvGradient

No documentation yet.

Code

caffe2/operators/conv_gradient_op.cc

ConvTranspose

The transposed convolution consumes an input vector, the filter blob, and the bias blob, and computes the output. Note that other parameters, such as the stride and kernel size, or the pads' sizes in each direction are not necessary for input because they are provided by the ConvTransposeUnpoolOpBase operator. Various dimension checks are done implicitly, and the sizes are specified in the Input docs for this operator. As is expected, the filter is deconvolved with a subset of the image and the bias is added; this is done throughout the image data and the output is computed. As a side note on the implementation layout: conv_transpose_op_impl.h is the templated implementation of the conv_transpose_op.h file, which is why they are separate files.

Interface

Inputs
`X`	Input data blob from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the NCHW usage. On the other hand, the NHWC Op has a different set of dimension constraints.
`filter`	The filter blob that will be used in the transposed convolution; has size (M x C x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel.
`bias`	The 1D bias blob that is added through the convolution; has size (C)
Outputs
`Y`	Output data blob that contains the result of the transposed convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.

CosineEmbeddingCriterion

CosineEmbeddingCriterion takes two inputs: the similarity value and the label, and computes the elementwise criterion output as

  output = 1 - s,               if y == 1
           max(0, s - margin),  if y == -1

Interface

Inputs
`S`	The cosine similarity as a 1-dim TensorCPU.
`Y`	The label as a 1-dim TensorCPU with int value of 1 or -1.
Outputs
`loss`	The output loss with the same dimensionality as S.
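
The piecewise definition above maps to a few lines of Python. This is an illustrative sketch; the function name and the default `margin` of 0.0 are assumptions, not taken from the operator schema.

```python
def cosine_embedding_criterion(S, Y, margin=0.0):
    """Elementwise loss: 1 - s for y == 1, max(0, s - margin) for y == -1."""
    loss = []
    for s, y in zip(S, Y):
        if y == 1:
            loss.append(1 - s)
        elif y == -1:
            loss.append(max(0.0, s - margin))
        else:
            raise ValueError("labels must be 1 or -1")
    return loss


print(cosine_embedding_criterion([0.5, 0.5, 0.25], [1, -1, -1], margin=0.25))
# [0.5, 0.25, 0.0]
```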

Code

caffe2/operators/cosine_embedding_criterion_op.cc

CosineEmbeddingCriterionGradient

No documentation yet.

Code

caffe2/operators/cosine_embedding_criterion_op.cc

CosineSimilarity

Given two input float tensors X and Y, produces one output float tensor of the cosine similarity between X and Y.

Interface

Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

  Struct(
      a=Int(),
      b=List(List(Int)),
      c=List(
          Struct(
            c1=String,
            c2=List(Int),
          ),
      ),
  )

the field list will be:

  [
      "a",
      "b:lengths",
      "b:values:lengths",
      "b:values:values",
      "c:lengths",
      "c:c1",
      "c:c2:lengths",
      "c:c2:values",
  ]

And for the following instance of the struct:

  Struct(
      a=3,
      b=[[4, 5], [6, 7, 8], [], [9]],
      c=[
          Struct(c1='alex', c2=[10, 11]),
          Struct(c1='bob', c2=[12]),
      ],
  )

The values of the fields will be:

  {
      "a": [3],
      "b:lengths": [4],
      "b:values:lengths": [2, 3, 0, 1],
      "b:values:values": [4, 5, 6, 7, 8, 9],
      "c:lengths": [2],
      "c:c1": ["alex", "bob"],
      "c:c2:lengths": [2, 1],
      "c:c2:values": [10, 11, 12],
  }

In general, every field name in the format “{prefix}:lengths” defines a domain “{prefix}”, and every subsequent field in the format “{prefix}:{field}” will be in that domain, and the length of the domain is provided for each entry of the parent domain. In the example, “b:lengths” defines a domain of length 4, so every field under domain “b” will have 4 entries. The “lengths” field for a given domain must appear before any reference to that domain. Returns a pointer to an instance of the Cursor, which keeps the current offset on each of the domains defined by fields. Cursor also ensures thread-safety such that ReadNextBatch and ResetCursor can be used safely in parallel. A cursor does not contain data per se, so calls to ReadNextBatch actually need to pass a list of blobs containing the data to read for each one of the fields.

Interface

Arguments
`fields`	A list of strings each one representing a field of the dataset.
Outputs
`cursor`	A blob pointing to an instance of a new TreeCursor.
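
To make the lengths/values layout concrete, the sketch below flattens the list-of-lists field `b` of a single record into its three flat fields. The helper is hypothetical and covers only this one nesting pattern, not the general schema.

```python
def flatten_nested(name, rows):
    """Flatten one record's list-of-lists field into its flat fields."""
    return {
        name + ":lengths": [len(rows)],
        name + ":values:lengths": [len(inner) for inner in rows],
        name + ":values:values": [x for inner in rows for x in inner],
    }


fields = flatten_nested("b", [[4, 5], [6, 7, 8], [], [9]])
for key, value in fields.items():
    print(key, value)
```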

Code

CrossEntropy

Operator computes the cross entropy between the input and the label set. In practice, it is most commonly used at the end of models, after the SoftMax operator and before the AveragedLoss operator. Note that CrossEntropy assumes that the soft labels provided are a 2D array of size N x D (batch size x number of classes). Each entry in the 2D label corresponds to the soft label for the input, where each element represents the correct probability of the class being selected. As such, each element must be between 0 and 1, and all elements in an entry must sum to 1. The formula used is:

                Y[i] = -sum_j (label[i][j] * log(X[i][j]))

where X[i][j] is the classifier’s prediction for the jth class of the ith sample, and i ranges over the batch. Each log has a lower limit for numerical stability.

Interface

Inputs
`X`	Input blob from the previous layer, which is almost always the result of a softmax operation; X is a 2D array of size N x D, where N is the batch size and D is the number of classes
`label`	Blob containing the labels used to compare the input
Outputs
`Y`	Output blob after the cross entropy computation
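
A minimal sketch of the computation for soft labels, using the conventional cross-entropy definition with a leading minus sign. The function name and the value of `LOG_FLOOR` are assumptions; the operator's actual lower limit is not stated here.

```python
import math

LOG_FLOOR = 1e-20  # assumed lower limit applied before taking the log


def cross_entropy(X, label):
    """Per-sample cross entropy between predictions X and soft labels."""
    return [
        -sum(l * math.log(max(x, LOG_FLOOR)) for x, l in zip(x_row, l_row))
        for x_row, l_row in zip(X, label)
    ]


# One sample, two classes, all label mass on class 0.
print(cross_entropy([[0.5, 0.5]], [[1.0, 0.0]]))  # about 0.693, i.e. ln(2)
```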

Code

DotProduct

Given two input float tensors X and Y, produces one output float tensor of the dot product between X and Y.

Interface

Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

Code

DotProductGradient

No documentation yet.

Code

DotProductWithPadding

Given two input float tensors X and Y with different shapes, produces one output float tensor of the dot product between X and Y. Two strategies are currently supported for making the shapes match before the normal dot product is computed: 1) pad the smaller tensor (using pad_value) to the same shape as the other one; 2) replicate the smaller tensor to the same shape as the other one.

Interface

Arguments
`pad_value`	the padding value for tensors with smaller dimension
`replicate`	whether to replicate the smaller tensor or not
Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

Code

ElementwiseLinear

    Given inputs X of size (N x D), a of size D and b of size D,
    the op computes Y of size (N X D) where Y_{nd} = X_{nd} * a_d + b_d

Interface

Arguments
`axis`	default to 1; describes the axis of the inputs; defaults to one because the 0th axis most likely describes the batch_size
Inputs
`X`	2D input tensor of size (N X D) data
`a`	1D scaling factors of size D
`b`	1D biases of size D
Outputs
`Y`	2D output tensor
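
The formula Y_{nd} = X_{nd} * a_d + b_d can be sketched on nested lists as follows; the function name is illustrative and the `axis` argument is not modelled.

```python
def elementwise_linear(X, a, b):
    """Scale column d of X by a[d] and add b[d]."""
    return [[x * a_d + b_d for x, a_d, b_d in zip(row, a, b)] for row in X]


print(elementwise_linear([[1, 2], [3, 4]], a=[10, 100], b=[1, 2]))
# [[11, 202], [31, 402]]
```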

In backward, if the gradient passed-in is sparse gradient, change it to dense gradient in linear time; otherwise, simply pass the dense gradient.

Interface

Inputs
`input`	Input tensors.
Outputs
`output`	Output tensor. Same dimension as inputs.

Code

caffe2/operators/exp_op.cc

Exp

Calculates the exponential of the given input tensor, element-wise. This operation can be done in an in-place fashion too, by providing the same input and output blobs.

Interface

Inputs
`input`	Input tensor
Outputs
`output`	The exponential of the input tensor computed element-wise

Code

ExpandDims

Insert single-dimensional entries to the shape of a tensor. Takes one required argument dims, a list of dimensions that will be inserted. Dimension indices in dims are as seen in the output tensor. For example:

  Given a tensor such that tensor.Shape() = [3, 4, 5], then
  ExpandDims(tensor, dims=[0, 4]).Shape() == [1, 3, 4, 5, 1]

If the same blob is provided in input and output, the operation is copy-free.

Interface

Inputs
`data`	Original tensor
Outputs
`expanded`	Reshaped tensor with same data as input.
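
Because the indices in `dims` refer to positions in the output shape, inserting them in ascending order gives the expected result. The helper below is a hypothetical shape-only sketch.

```python
def expand_dims_shape(shape, dims):
    """Return the output shape after inserting size-1 dimensions at `dims`."""
    out = list(shape)
    for d in sorted(dims):
        if d > len(out):
            raise ValueError("dimension index %d is out of range" % d)
        out.insert(d, 1)
    return out


print(expand_dims_shape([3, 4, 5], dims=[0, 4]))  # [1, 3, 4, 5, 1]
```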

Code

caffe2/operators/extend_tensor_op.cc

ExtendTensor

Extend input 0 if necessary based on max element in input 1. Input 0 must be the same as output, that is, it is required to be in-place. Input 0 may have to be re-allocated in order to accommodate the new size. Currently, an exponential growth ratio is used in order to ensure amortized constant time complexity. All except the outer-most dimension must be the same between input 0 and 1.

Interface

Inputs
`tensor`	The tensor to be extended.
`new_indices`	The size of tensor will be extended based on max element in new_indices.
Outputs
`extended_tensor`	Same as input 0, representing the mutated tensor.

Code

FC

Computes the result of passing an input vector X into a fully connected layer with 2D weight matrix W and 1D bias vector b. The layer computes Y = X * W^T + b, where X has size (M x K), W has size (N x K), b has size (N), and Y has size (M x N), where M is the batch size. Even though b is 1D, it is resized to size (M x N) implicitly and added to each vector in the batch. These dimensions must be matched correctly, or else the operator will throw errors.

Interface

Arguments
`axis`	(int32_t) default to 1; describes the axis of the inputs; defaults to one because the 0th axis most likely describes the batch_size
Inputs
`X`	2D input of size (MxK) data
`W`	2D blob of size (NxK) containing fully connected weight matrix
`b`	1D blob containing bias vector
Outputs
`Y`	2D output tensor
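
The formula Y = X * W^T + b can be sketched with nested lists. The sketch reads W as an (N x K) matrix, as in the formula, so each row of W produces one output column; the function name is illustrative.

```python
def fully_connected(X, W, b):
    """Compute Y = X * W^T + b for X (M x K), W (N x K) and b (N)."""
    return [
        [sum(x_k * w_k for x_k, w_k in zip(x_row, w_row)) + b_n
         for w_row, b_n in zip(W, b)]
        for x_row in X
    ]


X = [[1, 2], [3, 4]]              # M=2, K=2
W = [[1, 0], [0, 1], [1, 1]]      # N=3, K=2
b = [0, 10, 100]
print(fully_connected(X, W, b))   # [[1, 12, 103], [3, 14, 107]]
```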

Code

caffe2/operators/fully_connected_op.cc

FCGradient

No documentation yet.

Code

caffe2/operators/fully_connected_op.cc

FeedBlob

FeedBlobs the content of the blobs. The input and output blobs should be one-to-one inplace.

Interface

Arguments
`value`	(string) if provided then we will use this string as the value for the provided output tensor

Code

caffe2/operators/feed_blob_op.cc

Find

Finds elements of second input from first input, outputting the last (max) index for each query. If a query is not found, inserts missing_value. See IndexGet() for a version that modifies the index when values are not found.

Interface

Arguments
`missing_value`	Placeholder for items that are not found
Inputs
`index`	Index (integers)
`query`	Needles / query
Outputs
`query_indices`	Indices of the needles in index or ‘missing value’
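
A pure-Python sketch of the lookup, assuming -1 as the default `missing_value` (the real default is not stated in this section). The function name is illustrative.

```python
def find(index, query, missing_value=-1):
    """Return the last position of each query element in `index`."""
    last = {}
    for i, value in enumerate(index):
        last[value] = i  # later occurrences overwrite earlier ones
    return [last.get(q, missing_value) for q in query]


print(find([5, 7, 5, 9], [5, 9, 4]))  # [2, 3, -1]
```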

Code

caffe2/operators/find_op.cc

FindDuplicateElements

Finds duplicate elements in a 1-D data tensor and outputs their indices, excluding the first occurrence of each value.

Interface

Inputs
`data`	a 1-D tensor.
Outputs
`indices`	indices of duplicate elements in data, excluding first occurrences.
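
Following the interface above, a sketch that returns the positions of repeated values while skipping each value's first occurrence. The function name is illustrative.

```python
def find_duplicate_elements(data):
    """Indices of elements that repeat an earlier value in `data`."""
    seen, duplicates = set(), []
    for i, value in enumerate(data):
        if value in seen:
            duplicates.append(i)
        else:
            seen.add(value)
    return duplicates


print(find_duplicate_elements([2, 1, 2, 2, 5, 1]))  # [2, 3, 5]
```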

Code

caffe2/operators/find_duplicate_elements_op.cc

Flatten

Flattens the input tensor into a 2D matrix, keeping the first dimension unchanged.

Interface

Inputs
`input`	A tensor of rank >= 2.
Outputs
`output`	A tensor of rank 2 with the contents of the input tensor, with first dimension equal to the first dimension of input, and remaining input dimensions flattened into the inner dimension of the output.

Code

FlattenToVec

Flattens the input tensor into a 1D vector.

Interface

Inputs
`input`	A tensor of rank >= 1.
Outputs
`output`	A tensor of rank 1 with the contents of the input tensor

Code

Performs element-wise greater than comparison > (with limited broadcast support). If necessary the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value), or having its shape as a contiguous subset of the first tensor’s shape. The starting of the mutually equal shape is specified by the argument “axis”, and if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

Code

Gather

Given DATA tensor of rank r >= 1, and INDICES tensor of rank q, gather entries of the outer-most dimension of DATA indexed by INDICES, and concatenate them in an output tensor of rank q + (r - 1). Example:

  DATA  = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]
  INDICES = [
      [0, 1],
      [1, 2],
  ]
  OUTPUT = [
      [
          [1.0, 1.2],
          [2.3, 3.4],
      ],
      [
          [2.3, 3.4],
          [4.5, 5.7],
      ],
  ]

Interface

Inputs
`DATA`	Tensor of rank r >= 1.
`INDICES`	Tensor of int32/int64 indices, of any rank q.
Outputs
`OUTPUT`	Tensor of rank q + (r - 1).
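
The gather rule can be sketched recursively on nested lists: the nesting of INDICES is preserved, and each integer is replaced by the corresponding entry of DATA along its outer-most dimension. The function name is illustrative.

```python
def gather(data, indices):
    """Replace every integer in `indices` by data[integer]."""
    if isinstance(indices, list):
        return [gather(data, i) for i in indices]
    return data[indices]


DATA = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]]
INDICES = [[0, 1], [1, 2]]
print(gather(DATA, INDICES))
# [[[1.0, 1.2], [2.3, 3.4]], [[2.3, 3.4], [4.5, 5.7]]]
```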

Code

caffe2/operators/sequence_ops.cc

GatherPadding

Gather the sum of start and end paddings in a padded input sequence. Used in order to compute the gradients of AddPadding w.r.t the padding tensors.

Interface

Arguments
`padding_width`	Outer-size of padding present around each range.
`end_padding_width`	(Optional) Specifies a different end-padding width.
Inputs
`data_in`	T<N, D1…, Dn> Padded input data
`lengths`	(i64) Num of elements in each range. sum(lengths) = N. If not provided, considers all data as a single segment.
Outputs
`padding_sum`	Sum of all start paddings, or of all paddings if end_padding_sum is not provided.
`end_padding_sum`	T<D1…, Dn> Sum of all end paddings, if provided.

Code

GatherRanges

Given DATA tensor of rank 1, and RANGES tensor of rank 3, gather corresponding ranges into a 1-D tensor OUTPUT. The dimensions of RANGES are:

  1: the list of examples within a batch
  2: the list of features
  3: two values, the start and length of a range (to be applied on DATA)

A second output, LENGTHS, gives the length of each example within OUTPUT. Example:

  DATA  = [1, 2, 3, 4, 5, 6]
  RANGES = [
    [
      [0, 1],
      [2, 2],
    ],
    [
      [4, 1],
      [5, 1],
    ]
  ]
  OUTPUT = [1, 3, 4, 5, 6]
  LENGTHS = [3, 2]

Interface

Inputs
`DATA`	Tensor of rank 1.
`RANGES`	Tensor of int32/int64 ranges, of dims (N, M, 2), where N is the number of examples and M is the size of each example. The last dimension represents a range in the format (start, lengths)
Outputs
`OUTPUT`	1-D tensor of size sum of range lengths
`LENGTHS`	1-D tensor of size N with lengths over gathered data for each row in a batch. sum(LENGTHS) == OUTPUT.size()
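
The example above can be reproduced with a short sketch that walks the (start, length) pairs of each example. The function name is illustrative.

```python
def gather_ranges(data, ranges):
    """Collect the (start, length) slices of `data` for every example."""
    output, lengths = [], []
    for example in ranges:
        total = 0
        for start, length in example:
            output.extend(data[start:start + length])
            total += length
        lengths.append(total)
    return output, lengths


DATA = [1, 2, 3, 4, 5, 6]
RANGES = [[[0, 1], [2, 2]], [[4, 1], [5, 1]]]
print(gather_ranges(DATA, RANGES))  # ([1, 3, 4, 5, 6], [3, 2])
```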

GetGPUMemoryUsage

Fetches GPU memory stats from CUDAContext. Result is stored in output blob with shape (2, num_gpus). First row contains the total current memory usage, and the second row the maximum usage during this execution.

NOTE: --caffe2_gpu_memory_tracking flag must be enabled to use this op.

HasElements

Returns true iff the input tensor has size > 0

Interface

Inputs
`tensor`	Tensor of any type.
Outputs
`has_elements`	Scalar bool tensor. True if input is not empty.

Code

caffe2/operators/h_softmax_op.cc

HuffmanTreeHierarchy

HuffmanTreeHierarchy is an operator to generate a Huffman tree hierarchy given the input labels. It returns the tree as a serialized HierarchyProto.

Interface

Arguments
`num_classes`	The number of classes used to build the hierarchy.
Inputs
`Labels`	The labels vector
Outputs
`Hierarch`	Huffman coding hierarchy of the labels

Code

Im2Col

The Im2Col operator from Matlab.

Interface

Inputs
`X`	4-tensor in NCHW or NHWC.
Outputs
`Y`	4-tensor. For NCHW: N x (C x kH x kW) x outH x outW. For NHWC: N x outH x outW x (kH x kW x C)

Code

caffe2/operators/im2col_op.cc

IndexFreeze

Freezes the given index, disallowing creation of new index entries. Should not be called concurrently with IndexGet.

Interface

Inputs
`handle`	Pointer to an Index instance.
Outputs
`handle`	The input handle.

Code

IndexGet

Given an index handle and a tensor of keys, return an Int tensor of same shape containing the indices for each of the keys. If the index is frozen, unknown entries are given index 0. Otherwise, new entries are added into the index. If an insert is necessary but max_elements has been reached, fail.

Interface

Inputs
`handle`	Pointer to an Index instance.
`keys`	Tensor of keys to be looked up.
Outputs
`indices`	Indices for each of the keys.
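
The lookup rules of IndexGet and IndexFreeze can be illustrated with a toy stand-in. The `Index` class and its method names below are hypothetical, and two details are assumptions rather than documented behavior: new keys receive consecutive indices starting at 1, and `max_elements` counts only real keys (not the reserved index 0).

```python
class Index:
    """Toy model of an index; 0 is reserved for unknown keys."""

    def __init__(self, max_elements):
        self.max_elements = max_elements
        self.frozen = False
        self.table = {}  # key -> index (indices start at 1)

    def freeze(self):
        # After freezing, no new entries may be created.
        self.frozen = True

    def get(self, keys):
        result = []
        for key in keys:
            if key not in self.table:
                if self.frozen:
                    result.append(0)  # unknown key in a frozen index
                    continue
                if len(self.table) >= self.max_elements:
                    raise RuntimeError("max_elements reached")
                self.table[key] = len(self.table) + 1
            result.append(self.table[key])
        return result


index = Index(max_elements=3)
before_freeze = index.get(["a", "b", "a"])
index.freeze()
after_freeze = index.get(["b", "c"])
print(before_freeze)  # [1, 2, 1]
print(after_freeze)   # [2, 0]
```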

Code

IndexLoad

Loads the index from the given 1-D tensor. Elements in the tensor will be given consecutive indexes starting at 1. Fails if tensor contains repeated elements.

Interface

Arguments
`skip_first_entry`	If set, skips the first entry of the tensor. This allows to load tensors that are aligned with an embedding, where the first entry corresponds to the default 0 index entry.
Inputs
`handle`	Pointer to an Index instance.
`items`	1-D tensor with elements starting with index 1.
Outputs
`handle`	The input handle.

Code

IndexSize

Returns the number of entries currently present in the index.

Interface

Inputs
`handle`	Pointer to an Index instance.
Outputs
`items`	Scalar int64 tensor with number of entries.

Code

IndexStore

Stores the keys of this index in a 1-D tensor. Since element 0 is reserved for unknowns, the first element of the output tensor will be element of index 1.

Interface

Inputs
`handle`	Pointer to an Index instance.
Outputs
`items`	1-D tensor with elements starting with index 1.

Code

InstanceNorm

Carries out instance normalization as described in the paper https://arxiv.org/abs/1607.08022. Depending on the mode it is being run, there are multiple cases for the number of outputs, which we list below:

  * Output case #1: output
  * Output case #2: output, saved_mean
    - don't use, doesn't make sense but won't crash
  * Output case #3: output, saved_mean, saved_inv_stdev
    - Makes sense for training only

For training mode, type 3 is faster in the sense that for the backward pass, it is able to reuse the saved mean and inv_stdev in the gradient computation.

Interface

Arguments
`epsilon`	The epsilon value to use to avoid division by zero.
`order`	A StorageOrder string.
Inputs
`input`	The input 4-dimensional tensor of shape NCHW or NHWC depending on the order parameter.
`scale`	The input 1-dimensional scale tensor of size C.
`bias`	The input 1-dimensional bias tensor of size C.
Outputs
`output`	The output 4-dimensional tensor of the same shape as input.
`saved_mean`	Optional saved mean used during training to speed up gradient computation. Should not be used for testing.
`saved_inv_stdev`	Optional saved inverse stdev used during training to speed up gradient computation. Should not be used for testing.

L1Distance

Given two input float tensors X and Y, produces one output float tensor of the L1 difference between X and Y, computed as L1(x,y) = sum over |x-y|.

Interface

Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

Code

L1DistanceGradient

No documentation yet.

Code

caffe2/operators/segment_reduction_op.cc

LE

Performs element-wise less-than-or-equal comparison <= (with limited broadcast support). If necessary the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value), or having its shape as a contiguous subset of the first tensor’s shape. The starting of the mutually equal shape is specified by the argument “axis”, and if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

                            Y[i] = -log(X[i][j])

where X[i][j] is the classifier’s prediction for the jth class (the correct one) of the ith example in the batch. Each log has a lower limit for numerical stability.

Interface

Inputs
`X`	Input blob from the previous layer, which is almost always the result of a softmax operation; X is a 2D array of size N x D, where N is the batch size and D is the number of classes
`label`	Blob containing the labels used to compare the input
Outputs
`Y`	Output blob after the cross entropy computation

Code

LengthsTile

Given DATA tensor of rank r >= 1, and LENGTHS tensor of rank 1, duplicate each entry of the outer-most dimension of DATA according to LENGTHS, and concatenate them in an output tensor of rank r. Example:

  DATA  = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
      [6.8, 7.9],
  ]
  LENGTHS = [0, 1, 3, 2]
  OUTPUT = [
      [2.3, 3.4],
      [4.5, 5.7],
      [4.5, 5.7],
      [4.5, 5.7],
      [6.8, 7.9],
      [6.8, 7.9],
  ]

Interface

Inputs
`DATA`	Tensor of rank r >= 1. First dimension must be equal to the size of lengths
`LENGTHS`	Tensor of int32 lengths of rank 1
Outputs
`OUTPUT`	Tensor of rank r
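
A minimal sketch of the tiling rule, using nested Python lists in place of tensors. The function name `lengths_tile` is hypothetical and the code only mirrors the documented behavior.

```python
def lengths_tile(data, lengths):
    """Repeat the i-th outer entry of data lengths[i] times."""
    if len(data) != len(lengths):
        raise ValueError("first dimension of DATA must equal len(LENGTHS)")
    output = []
    for row, count in zip(data, lengths):
        output.extend([row] * count)  # a count of 0 drops the row entirely
    return output


data = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7], [6.8, 7.9]]
lengths = [0, 1, 3, 2]
tiled = lengths_tile(data, lengths)
for row in tiled:
    print(row)
```

The output has `sum(lengths)` rows, so the first row of `data` (length 0) does not appear at all.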

Code

caffe2/operators/lengths_tile_op.cc

LengthsToRanges

Given a vector of segment lengths, calculates offsets of each segment and packs them next to the lengths. For an input vector of length N the output is an Nx2 matrix with (offset, length) packaged for each segment. For example, [1, 3, 0, 2] transforms into [[0, 1], [1, 3], [4, 0], [4, 2]].

Interface

Inputs
`lengths`	1D tensor of int32 segment lengths.
Outputs
`ranges`	2D tensor of shape len(lengths) X 2 and the same type as `lengths`
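
The offset computation is a running sum over the lengths. A short sketch in plain Python (the function name is hypothetical):

```python
def lengths_to_ranges(lengths):
    """Return [offset, length] pairs, where offset is the sum of all earlier lengths."""
    ranges, offset = [], 0
    for length in lengths:
        ranges.append([offset, length])
        offset += length
    return ranges


ranges = lengths_to_ranges([1, 3, 0, 2])
print(ranges)  # [[0, 1], [1, 3], [4, 0], [4, 2]]
```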

Code

caffe2/operators/pool_gradient_op.cc

LengthsToSegmentIds

Given a vector of segment lengths, returns a zero-based, consecutive vector of segment_ids. For example, [1, 3, 0, 2] will produce [0, 1, 1, 1, 3, 3]. In general, the inverse operation is SegmentIdsToLengths. Notice though that trailing empty sequence lengths can’t be properly recovered from segment ids.

Interface

Inputs
`lengths`	1D tensor of int32 or int64 segment lengths.
Outputs
`segment_ids`	1D tensor of length `sum(lengths)`
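
The sketch below illustrates the mapping and why trailing empty segments cannot be recovered. Both function names are hypothetical, and `segment_ids_to_lengths` is only a simple model of the inverse operation mentioned above.

```python
def lengths_to_segment_ids(lengths):
    """Emit segment id i exactly lengths[i] times."""
    ids = []
    for segment_id, length in enumerate(lengths):
        ids.extend([segment_id] * length)
    return ids


def segment_ids_to_lengths(ids):
    """Count how often each id occurs; the number of segments is inferred as max id + 1."""
    lengths = [0] * (ids[-1] + 1) if ids else []
    for segment_id in ids:
        lengths[segment_id] += 1
    return lengths


ids = lengths_to_segment_ids([1, 3, 0, 2])
print(ids)  # [0, 1, 1, 1, 3, 3]

# A trailing empty segment leaves no trace in the ids, so it is lost on the way back.
recovered = segment_ids_to_lengths(lengths_to_segment_ids([1, 3, 0, 2, 0]))
print(recovered)  # [1, 3, 0, 2]
```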

Code

MaxPoolWithIndex

MaxPoolWithIndex consumes an input blob X and applies max pooling across the blob according to kernel sizes, stride sizes and pad lengths defined by the ConvPoolOpBase operator. It also produces an explicit mask that records the locations where the maximum values were found, which is re-used in the gradient pass. This op is deterministic.

Interface

Inputs
`X`	Input data tensor from the previous operator; dimensions depend on whether the NCHW or NHWC operators are being used. For example, in the former, the input has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. The corresponding permutation of dimensions is used in the latter case.
Outputs
`Y`	Output data tensor from max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes.
`Index`	Mask of location indices of the found maximum values, used in the gradient operator to accumulate dY values to the appropriate locations in Y

Code

caffe2/operators/max_pool_with_index.cu

Mul

Performs element-wise binary multiplication (with limited broadcast support). If necessary the right-hand-side argument will be broadcasted to match the shape of left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value), or having its shape as a contiguous subset of the first tensor’s shape. The starting of the mutually equal shape is specified by the argument “axis”, and if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions and type as A

Code

caffe2/operators/multi_class_accuracy_op.cc

MultiClassAccuracy

Computes the accuracy score of each class separately, given a number of instances and the predicted scores of each class for each instance.

Interface

Inputs
`prediction`	2-D float tensor (N,D,) of predicted scores of each class for each data. N is the number of instances, i.e., batch size. D is number of possible classes/labels.
`labels`	1-D int tensor (N,) of labels for each instance.
Outputs
`accuracies`	1-D float tensor (D,) of accuracy for each class. If a class has no instance in the batch, its accuracy score is set to zero.
`amounts`	1-D int tensor (D,) of number of instances for each class in the batch.
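
A sketch of the per-class computation in plain Python. It assumes that a prediction counts as correct when the highest score falls on the true label (the rule described for the Accuracy operator) and that ties go to the lowest class index; the function name is hypothetical.

```python
def multi_class_accuracy(prediction, labels):
    """Return (accuracies, amounts), each of length D."""
    num_classes = len(prediction[0])
    correct = [0] * num_classes
    amounts = [0] * num_classes
    for scores, label in zip(prediction, labels):
        amounts[label] += 1
        predicted = max(range(num_classes), key=lambda c: scores[c])
        if predicted == label:
            correct[label] += 1
    # Classes with no instance in the batch get accuracy zero.
    accuracies = [
        correct[c] / amounts[c] if amounts[c] else 0.0
        for c in range(num_classes)
    ]
    return accuracies, amounts


prediction = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
]
labels = [0, 1, 1, 1]
accuracies, amounts = multi_class_accuracy(prediction, labels)
print(accuracies)
print(amounts)
```

Here class 0 has one instance (predicted correctly), class 1 has three instances (two predicted correctly) and class 2 has none.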

Code

NCHW2NHWC

The operator switches the order of data in a tensor from NCHW (sample index N, channels C, height H and width W) to the NHWC order.

Interface

Inputs
`data`	The input data (Tensor) in the NCHW order.
Outputs
`output`	The output tensor (Tensor) in the NHWC order.

Code

caffe2/operators/order_switch_ops.cc

NHWC2NCHW

The operator switches the order of data in a tensor from NHWC (sample index N, height H, width W and channels C) to the NCHW order.

Interface

Inputs
`data`	The input data (Tensor) in the NHWC order.
Outputs
`output`	The output tensor (Tensor) in the NCHW order.

Code

caffe2/operators/order_switch_ops.cc

NanCheck

Identity operator, but checks all values for nan or inf

Interface

Inputs
`tensor`	Tensor to check for nan/inf
Outputs
`output`	Tensor to copy input into if no NaNs or inf. Can be in-place

Code

Negative

Computes the element-wise negative of the input.

Interface

Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

Code

caffe2/operators/perplexity_op.cc

PRelu

PRelu takes input data (Tensor) and slope tensor as input, and produces one output data (Tensor) where the function `f(x) = slope * x for x < 0`, `f(x) = x for x >= 0`, is applied to the data tensor elementwise.

Interface

Inputs
`X`	1D input tensor
`Slope`	1D slope tensor. If `Slope` is of size 1, the value is shared across different channels
Outputs
`Y`	1D input tensor

Code

PiecewiseLinearTransform

PiecewiseLinearTransform takes as input predictions, a 2-D or 1-D tensor (Tensor) of size (batch_size x prediction_dimensions). The piecewise linear functions are stored in bounds, slopes and intercepts. The output tensor has the same shape as the input `predictions` and contains the predictions transformed by the piecewise linear functions. Each column of predictions has its own piecewise linear transformation functions. Therefore the size of the piecewise function parameters is pieces x prediction_dimensions, except for binary predictions where only the positive prediction needs them. Note that in each piece, the low bound is excluded while the high bound is included. Also the piecewise linear function must be continuous.

Notes

  - If the input is binary predictions (Nx2 or Nx1 tensor), set the binary arg to true so that only one group of piecewise linear functions is needed (see details below).
  - The transform parameters (bounds, slopes, intercepts) can be passed either through args or through input blobs.
  - If we have multiple groups of piecewise linear functions, each group has the same number of pieces.
  - If a prediction is out of the bounds, it is capped to the smallest or largest bound.

Interface

Arguments
`bounds`	1-D vector of size (prediction_dimensions x (pieces+1)) containing the upper bounds of each piece of linear function. One special case is the first bound, which is the lower bound of the whole piecewise function; we treat it the same as the leftmost function. (bounds, slopes, intercepts) can be passed through either arg or input blobs.
`slopes`	1-D vector of size (prediction_dimensions x pieces) containing the slopes of linear function
`intercepts`	1-D vector of size (prediction_dimensions x pieces) containing the intercepts of linear function
`binary`	If set true, we assume the input is a Nx1 or Nx2 tensor. If it is Nx1 tensor, it is positive predictions. If the input is Nx2 tensor, its first column is negative predictions and second column is positive and negative + positive = 1. We just need one group of piecewise linear functions for the positive predictions.
Inputs
`predictions`	2-D tensor (Tensor) of size (num_batches x num_classes) containing scores
`bounds (optional)`	See bounds in Arg. (bounds, slopes, intercepts) can be passed through either arg or input blobs.
`slopes (optional)`	See slopes in Arg. (bounds, slopes, intercepts) can be passed through either arg or input blobs.
`intercepts (optional)`	See intercepts in Arg. (bounds, slopes, intercepts) can be passed through either arg or input blobs.
Outputs
`transforms`	2-D tensor (Tensor) of size (num_batches x num_classes) containing transformed predictions

Code

caffe2/operators/piecewise_linear_transform_op.cc

Pow

Pow takes input data (Tensor) and an argument exponent, and produces one output data (Tensor) where the function `f(x) = x^exponent` is applied to the data tensor elementwise.

Interface

Arguments
`exponent`	The exponent of the power function.
Inputs
`X`	Input tensor of any shape
Outputs
`Y`	Output tensor (same size as X)

Code

caffe2/operators/math_ops.cc

Print

Logs shape and contents of input tensor to stderr or to a file.

Interface

Arguments
`to_file`	(bool) if 1, saves contents to the root folder of the current workspace, appending the tensor contents to a file named after the blob name. Otherwise, logs to stderr.
Inputs
`tensor`	The tensor to print.

Code

caffe2/operators/metrics_ops.cc

QPSMetric

QPSMetric operator synchronously updates the metric stored in the QPSMetricState blob, which holds the state required for computing QPSMetric. The only output of the operator is a blob with QPSMetricState.

Interface

Inputs
`QPS_METRIC_STATE`	Input Blob QPSMetricState, that needs to be updated
`INPUT_BATCH`	Input Blob containing a tensor with batch of the examples. First dimension of the batch will be used to get the number of examples in the batch.
Outputs
`output`	Blob with QPSMetricState

Code

QPSMetricReport

QPSMetricReport operator synchronously consumes the QPSMetricState blob and reports the information about QPS.

Interface

Outputs
`output`	Blob with QPSMetricState

RecurrentNetwork

Run the input network in a recurrent fashion. This can be used to implement fairly general recurrent neural networks (RNNs). The operator proceeds as follows.

  - First, initialize the states from the input recurrent states.
  - For each timestep T, apply the links (that map offsets from input/output tensors into the inputs/outputs for the `step` network).
  - Finally, alias the recurrent states to the specified output blobs.

This is a fairly special-case meta-operator, and so the implementation is somewhat complex. It trades off generality (and frankly usability) against performance and control (compared to e.g. TF dynamic_rnn, Theano scan, etc). See the usage examples for a flavor of how to use it.

Interface

Inputs
`mat`	The matrix
Outputs
`output`	Output

RemoveDataBlocks

Interface

Inputs
`data`	a N-D data tensor, N >= 1
`indices`	zero-based indices of blocks to be removed
Outputs
`shrunk data`	data after removing data blocks indexed by ‘indices’

Code

caffe2/operators/remove_data_blocks_op.cc

RemovePadding

Remove padding around the edges of each segment of the input data. This is the reverse operation of AddPadding, and uses the same arguments and conventions for input and output data format.

Interface

Arguments
`padding_width`	Outer-size of padding to remove around each range.
`end_padding_width`	(Optional) Specifies a different end-padding width.
Inputs
`data_in`	T<N, D1…, Dn> Input data
`lengths`	(i64) Num of elements in each range. sum(lengths) = N. If not provided, considers all data as a single segment.
Outputs
`data_out`	(T<N - 2*padding_width, D1…, Dn>) Unpadded data.
`lengths_out`	(i64, optional) Lengths for each unpadded range.

Code

caffe2/operators/sequence_ops.cc

ReplaceNaN

Replace the NaN (not a number) element in the input tensor with argument value

Interface

Arguments
`value (optional)`	the value to replace NaN, the default is 0
Inputs
`input`	Input tensor
Outputs
`output`	Output tensor

Code

caffe2/operators/replace_nan_op.cc

ResetCounter

Resets a count-down counter with initial value specified by the ‘init_count’ argument.

Interface

Arguments
`init_count`	Resets counter to this value, must be >= 0.
Inputs
`counter`	A blob pointing to an instance of a new counter.
Outputs
`previous_value`	(optional) Previous value of the counter.

Code

caffe2/operators/counter_ops.cc

ResetCursor

Resets the offsets for the given TreeCursor. This operation is thread safe.

Interface

Inputs
`cursor`	A blob containing a pointer to the cursor.

Code

caffe2/operators/reshape_op.cc

Reshape

Reshape the input tensor similar to numpy.reshape. It takes a tensor as input and an optional tensor specifying the new shape. When the second input is absent, an extra argument shape must be specified. It outputs the reshaped tensor as well as the original shape. At most one dimension of the new shape can be -1. In this case, the value is inferred from the size of the tensor and the remaining dimensions. A dimension could also be 0, in which case the actual dimension value is going to be copied from the input tensor.

Interface

Arguments
`shape`	New shape
Inputs
`data`	An input tensor.
`new_shape`	New shape.
Outputs
`reshaped`	Reshaped data.
`old_shape`	Original shape.
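
The rules for the special values -1 and 0 can be sketched as a small shape-resolution helper. The function name `infer_shape` is hypothetical, and the sketch assumes that a 0 at position i copies dimension i of the input tensor.

```python
def infer_shape(old_shape, new_shape):
    """Resolve 0 and -1 entries of new_shape against old_shape."""
    shape = list(new_shape)
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension can be -1")

    # A 0 copies the dimension at the same position from the input (assumption).
    for i, dim in enumerate(shape):
        if dim == 0:
            if i >= len(old_shape):
                raise ValueError("no input dimension to copy for position %d" % i)
            shape[i] = old_shape[i]

    total = 1
    for dim in old_shape:
        total *= dim
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if -1 in shape:
        # The -1 entry is inferred from the tensor size and the other dimensions.
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        shape[shape.index(-1)] = total // known
    elif known != total:
        raise ValueError("new shape does not match the number of elements")
    return shape


print(infer_shape((2, 3, 4), (0, -1)))  # [2, 12]
print(infer_shape((2, 3, 4), (-1, 4)))  # [6, 4]
```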

Code

ResizeLike

Produces tensor containing data of first input and shape of second input.

Interface

Inputs
`data`	Tensor whose data will be copied into the output.
`shape_tensor`	Tensor whose shape will be applied to output.
Outputs
`output`	Tensor with data of input 0 and shape of input 1.

Code

caffe2/operators/resize_op.cc

ResizeNearest

Resizes the spatial dimensions of the input using nearest neighbor interpolation. The `width_scale` and `height_scale` arguments control the size of the output, which is given by:

  output_width = floor(input_width * width_scale)
  output_height = floor(input_height * height_scale)

Interface

Arguments
`width_scale`	Scale along width dimension
`height_scale`	Scale along height dimension
Inputs
`X`	1D input tensor
Outputs
`Y`	1D input tensor

Code

RetrieveCount

Retrieve the current value from the counter.

Interface

Inputs
`counter`	A blob pointing to an instance of a counter.
Outputs
`count`	current count value.

Code

caffe2/operators/counter_ops.cc

ReversePackedSegs

Reverse segments in a 3-D tensor (lengths, segments, embeddings,), leaving paddings unchanged. This operator is used to reverse input of a recurrent neural network to make it a BRNN.

Interface

Inputs
`data`	a 3-D (lengths, segments, embeddings,) tensor.
`lengths`	length of each segment.
Outputs
`reversed data`	a (lengths, segments, embeddings,) tensor with each segment reversed and paddings unchanged.
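
A sketch of the reversal on nested lists. It assumes the first dimension is the time axis, so that segment `s` occupies positions `0..lengths[s]-1` along it and everything beyond that is padding; the function name is hypothetical.

```python
def reverse_packed_segs(data, lengths):
    """Reverse the first lengths[s] time steps of each segment s; leave padding alone."""
    # Copy the input so that padding entries carry over unchanged.
    out = [[list(embedding) for embedding in step] for step in data]
    for seg, length in enumerate(lengths):
        for t in range(length):
            out[t][seg] = list(data[length - 1 - t][seg])
    return out


# 3 time steps, 2 segments, embedding size 1.
# Segment 0 has length 3; segment 1 has length 2 followed by one padding entry [0].
data = [
    [[1], [4]],
    [[2], [5]],
    [[3], [0]],
]
reversed_data = reverse_packed_segs(data, [3, 2])
for step in reversed_data:
    print(step)
```

Segment 0 becomes 3, 2, 1, while segment 1 becomes 5, 4 with its padding entry still in the last position.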

Code

caffe2/operators/reverse_packed_segs_op.cc

RoIPool

Carries out ROI Pooling for Faster-RCNN. Depending on the mode, there are multiple output cases:

  Output case #1: Y, argmaxes (train mode)
  Output case #2: Y           (test mode)

Interface

Arguments
`is_test`	If set, run in test mode and skip computation of argmaxes (used for gradient computation). Only one output tensor is produced. (Default: false).
`order`	A StorageOrder string (Default: “NCHW”).
`pooled_h`	The pooled output height (Default: 1).
`pooled_w`	The pooled output width (Default: 1).
`spatial_scale`	Multiplicative spatial scale factor to translate ROI coords from their input scale to the scale used when pooling (Default: 1.0).
Inputs
`X`	The input 4-D tensor of data. Only NCHW order is currently supported.
`rois`	RoIs (Regions of Interest) to pool over. Should be a 2-D tensor of shape (num_rois, 5) given as [[batch_id, x1, y1, x2, y2], …].
Outputs
`Y`	RoI pooled output 4-D tensor of shape (num_rois, channels, pooled_h, pooled_w).
`argmaxes`	Argmaxes corresponding to indices in X used for gradient computation. Only output if arg “is_test” is false.

Slice

Produces a slice of the input tensor. Currently, only slicing in a single dimension is supported. Slices are passed as 2 1D vectors with starting and end indices for each dimension of the input data tensor. End indices are non-inclusive. If a negative value is passed for any of the start or end indices, it represents the number of elements before the end of that dimension. Example:

  data = [
      [1, 2, 3, 4],
      [5, 6, 7, 8],
  ]
  starts = [0, 1]
  ends = [-1, 3]

  result = [
      [2, 3],
      [6, 7],
  ]

Interface

Inputs
`data`	Tensor of data to extract slices from.
`starts`	1D tensor: start-indices for each dimension of data.
`ends`	1D tensor: end-indices for each dimension of data.
Outputs
`output`	Sliced data tensor.
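
The index handling can be sketched for a 2-D input as follows. The rule for negative values is an assumption inferred from the example: `ends = [-1, 3]` keeps both rows, so -1 is taken to resolve to the size of the dimension (that is, `dim + 1 + index`), which differs from Python's own negative indexing. The function names are hypothetical.

```python
def resolve(index, dim):
    """Map a possibly negative start/end index to a non-negative position."""
    # Assumption inferred from the example: -1 means "up to the end of the dimension".
    return dim + 1 + index if index < 0 else index


def slice_2d(data, starts, ends):
    """Slice a list of rows; end indices are non-inclusive."""
    rows, cols = len(data), len(data[0])
    r0, r1 = resolve(starts[0], rows), resolve(ends[0], rows)
    c0, c1 = resolve(starts[1], cols), resolve(ends[1], cols)
    return [row[c0:c1] for row in data[r0:r1]]


data = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
result = slice_2d(data, starts=[0, 1], ends=[-1, 3])
print(result)  # [[2, 3], [6, 7]]
```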

Code

caffe2/operators/softmax_op.cc

Softmax

The operator computes the softmax normalized values for each layer in the batch of the given input. The input is a 2-D tensor (Tensor) of size (batch_size x input_feature_dimensions). The output tensor has the same shape and contains the softmax normalized values of the corresponding input.

Interface

Inputs
`input`	The input data as 2-D Tensor.
Outputs
`output`	The softmax normalized output values with the same shape as input tensor.
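
A row-wise softmax can be sketched in plain Python as below. Subtracting the row maximum before exponentiating is a common numerical-stability measure and is an assumption here, not something this page states about the operator.

```python
import math


def softmax(rows):
    """Apply softmax independently to each row of a 2-D list."""
    result = []
    for row in rows:
        shift = max(row)  # keeps exp() from overflowing for large inputs
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


probabilities = softmax([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
for row in probabilities:
    print(row, sum(row))
```

Each output row is non-negative and sums to 1, and the output has the same shape as the input.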

Code

SoftmaxGradient

No documentation yet.

Code

caffe2/operators/softmax_op.cc

SoftmaxWithLoss

Combined Softmax and Cross-Entropy loss operator. The operator computes the softmax normalized values for each layer in the batch of the given input, after which cross-entropy loss is computed. This operator is numerically more stable than separate Softmax and CrossEntropy ops. The inputs are a 2-D tensor (Tensor) of size (batch_size x input_feature_dimensions) and tensor of labels (ground truth). Output is tensor with the probability for each label for each example (N x D) and averaged loss (scalar). Use parameter spatial=1 to enable spatial softmax. Spatial softmax also supports special “don't care” label (-1) that is ignored when computing the loss. Use parameter label_prob=1 to enable inputting labels as a probability distribution. Currently does not handle spatial=1 case.

Optional third input blob can be used to weight the samples for the loss. For the spatial version, weighting is by x,y position of the input.

Interface

Inputs
`logits`	Unscaled log probabilities
`labels`	Ground truth
`weight_tensor`	Optional blob to be used to weight the samples for the loss. With spatial set, weighting is by x,y of the input
Outputs
`softmax`	Tensor with softmax cross entropy loss
`loss`	Average loss

Calculates the softsign (x/(1+|x|)) of the given input tensor element-wise. This operation can be done in an in-place fashion too, by providing the same input and output blobs.

Interface

Inputs
`input`	1-D input tensor
Outputs
`output`	The softsign (x/(1+|x|)) values of the input tensor computed element-wise

Code

caffe2/operators/softsign_op.cc

SoftsignGradient

Calculates the softsign gradient (sgn(x)/(1+|x|)^2) of the given input tensor element-wise.

Interface

Inputs
`input`	1-D input tensor
`input`	1-D input tensor
Outputs
`output`	The softsign gradient (sgn(x)/(1+|x|)^2) values of the input tensor computed element-wise

Code

caffe2/operators/softsign_op.cc

SortAndShuffle

Compute the sorted indices given a field index to sort by, break the sorted indices into chunks of shuffle_size * batch_size and shuffle each chunk; finally, shuffle between batches. If sort_by_field_idx is -1 we skip sort. For example, we have data sorted as 1,2,3,4,5,6,7,8,9,10,11,12 and batchSize = 2 and shuffleSize = 3. When we shuffle each chunk we get:

  [3,1,4,6,5,2] [12,10,11,8,9,7]

After this we will shuffle among different batches with size 2:

  [3,1],[4,6],[5,2],[12,10],[11,8],[9,7]

We may end up with something like:

  [9,7],[5,2],[12,10],[4,6],[3,1],[11,8]

Input(0) is a blob pointing to a TreeCursor, and [Input(1),… Input(num_fields)] a list of tensors containing the data for each field of the dataset. SortAndShuffle is thread safe.

Interface

Inputs
`cursor`	A blob containing a pointer to the cursor.
`dataset_field_0`	First dataset field
Outputs
`indices`	Tensor containing sorted indices.

Code

caffe2/operators/segment_reduction_op.cc

SortedSegmentMean

Applies ‘Mean’ to each segment of input tensor. Segments need to be sorted and contiguous. See also UnsortedSegmentMean that doesn’t have this requirement. SEGMENT_IDS is a vector that maps each of the first dimension slices of the DATA to a particular group (segment). Values belonging to the same segment are aggregated together. The first dimension of the output is equal to the number of input segments, i.e. SEGMENT_IDS[-1]+1 . Other dimensions are inherited from the input tensor. Mean computes the element-wise mean of the input slices. Operation doesn’t change the shape of the individual blocks.

Interface

Inputs
`DATA`	Input tensor, slices of which are aggregated.
`SEGMENT_IDS`	Vector with the same length as the first dimension of DATA and values in the range 0..K-1 and in increasing order that maps each slice of DATA to one of the segments
Outputs
`OUTPUT`	Aggregated output tensor. Has the first dimension of K (the number of segments).
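
A sketch of the aggregation for a 2-D DATA given as nested lists. The function name is hypothetical, and returning zeros for a segment id that never occurs is an assumption made only to keep the sketch total.

```python
def sorted_segment_mean(data, segment_ids):
    """Average the rows of data that share a segment id (ids must be sorted)."""
    if not segment_ids:
        return []
    num_segments = segment_ids[-1] + 1  # first output dimension: SEGMENT_IDS[-1] + 1
    width = len(data[0])
    sums = [[0.0] * width for _ in range(num_segments)]
    counts = [0] * num_segments
    for row, seg in zip(data, segment_ids):
        counts[seg] += 1
        for j, value in enumerate(row):
            sums[seg][j] += value
    return [
        [s / counts[k] if counts[k] else 0.0 for s in sums[k]]
        for k in range(num_segments)
    ]


data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
segment_ids = [0, 0, 1]
means = sorted_segment_mean(data, segment_ids)
print(means)  # [[2.0, 3.0], [5.0, 6.0]]
```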

SparseSortedSegmentWeightedSumGradient

No documentation yet.

Code

SparseToDense

Convert sparse representations to dense with given indices. Transforms a sparse representation of map<id, value> represented as indices vector and values tensor into a compacted tensor where the first dimension is determined by the first dimension of the 3rd input if it is given or the max index. Missing values are filled with zeros. The op supports duplicated indices and performs summation over corresponding values. This behavior is useful for converting GradientSlices into dense representation. After running this op:

  output[indices[i], :] += values[i]  # sum over all indices[i] equal to the index
  output[j, ...] = 0 if j not in indices

Interface

Inputs
`indices`	1-D int32/int64 tensor of concatenated ids of data
`values`	Data tensor, first dimension has to match `indices`, basic numeric types are supported
`data_to_infer_dim`	Optional: if provided, the first dimension of output is the first dimension of this tensor.
Outputs
`output`	Output tensor of the same type as `values` of shape `[len(lengths), len(mask)] + shape(default_value)` (if `lengths` is not provided the first dimension is omitted)
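
The accumulation rule can be sketched for 2-D `values` as follows. The function and parameter names are hypothetical; `first_dim` stands in for the first dimension of the optional third input, and when it is absent the sketch assumes the output has `max(indices) + 1` rows.

```python
def sparse_to_dense(indices, values, first_dim=None):
    """Scatter-add rows of values into a zero-filled dense output."""
    if first_dim is None:
        if not indices:
            return []
        first_dim = max(indices) + 1  # assumption: enough rows to hold the max index
    width = len(values[0]) if values else 0
    output = [[0.0] * width for _ in range(first_dim)]
    for idx, row in zip(indices, values):
        for j, value in enumerate(row):
            output[idx][j] += value  # duplicated indices are summed
    return output


indices = [0, 2, 0]
values = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
dense = sparse_to_dense(indices, values)
padded = sparse_to_dense(indices, values, first_dim=4)
print(dense)   # [[4.0, 4.0], [0.0, 0.0], [2.0, 2.0]]
print(padded)  # [[4.0, 4.0], [0.0, 0.0], [2.0, 2.0], [0.0, 0.0]]
```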

Code

caffe2/operators/sparse_to_dense_op.cc

SparseToDenseMask

Convert sparse representations to dense with given indices. Transforms a sparse representation of map<id, value> represented as indices vector and values tensor into a compacted tensor where the first dimension corresponds to each id provided in mask argument. Missing values are filled with the value of default_value. After running this op:

  output[j, :] = values[i]  # where mask[j] == indices[i]
  output[j, ...] = default_value  # when mask[j] doesn't appear in indices

If lengths is provided and not empty, an extra “batch” dimension is prepended to the output. values and default_value can have additional matching dimensions; the operation is performed on the entire subtensor in this case. For example, if lengths is supplied and values is 1-D vector of floats and default_value is a float scalar, the output is going to be a float matrix of size len(lengths) X len(mask).

Interface

Arguments
`mask`	list(int) argument with desired ids on the ‘dense’ output dimension
`return_presence_mask`	bool whether to return presence mask, false by default
Inputs
`indices`	1-D int32/int64 tensor of concatenated ids of data
`values`	Data tensor, first dimension has to match `indices`
`default_value`	Default value for the output if the id is not present in `indices`. Must have the same type as `values` and the same shape, but without the first dimension
`lengths`	Optional lengths to represent a batch of `indices` and `values`.
Outputs
`output`	Output tensor of the same type as `values` of shape `[len(lengths), len(mask)] + shape(default_value)` (if `lengths` is not provided the first dimension is omitted)
`presence_mask`	Bool tensor of shape `[len(lengths), len(mask)]` (if `lengths` is not provided the first dimension is omitted). True when a value for given id was present, false otherwise.
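
A sketch of the single-example case (no `lengths` input) with 1-D `values`. The function name is hypothetical. Ids that are not listed in `mask` are ignored, and when an id occurs more than once the last value wins; the latter is an assumption.

```python
def sparse_to_dense_mask(indices, values, default_value, mask):
    """Return (output, presence_mask), both of length len(mask)."""
    position = {id_: j for j, id_ in enumerate(mask)}
    output = [default_value] * len(mask)
    presence = [False] * len(mask)
    for id_, value in zip(indices, values):
        j = position.get(id_)
        if j is not None:   # ids outside the mask have no output slot
            output[j] = value
            presence[j] = True
    return output, presence


mask = [10, 20, 30]
indices = [30, 10, 99]
values = [3.0, 1.0, 9.0]
output, presence_mask = sparse_to_dense_mask(indices, values, -1.0, mask)
print(output)         # [1.0, -1.0, 3.0]
print(presence_mask)  # [True, False, True]
```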

Code

caffe2/operators/sparse_to_dense_mask_op.cc

SparseUnsortedSegmentMean

Pulls in slices of the input tensor, groups them into segments and applies ‘Mean’ to each segment. Segment ids can appear in arbitrary order (unlike in SparseSortedSegmentMean). This op is basically Gather and UnsortedSegmentMean fused together. INDICES should contain integers in range 0..N-1 where N is the first dimension of DATA. INDICES represent which slices of DATA need to be pulled in. SEGMENT_IDS is a vector that maps each referenced slice of the DATA to a particular group (segment). Values belonging to the same segment are aggregated together. SEGMENT_IDS should have the same dimension as INDICES. If num_segments argument is passed it would be used as a first dimension for the output. Otherwise, it’d be dynamically calculated as the max value of SEGMENT_IDS plus one. Other output dimensions are inherited from the input tensor. Mean computes the element-wise mean of the input slices. Operation doesn’t change the shape of the individual blocks.

Interface

Inputs
`DATA`	Input tensor, slices of which are aggregated.
`INDICES`	Integer vector containing indices of the first dimension of DATA for the slices that are being aggregated
`SEGMENT_IDS`	Integer vector with the same length as INDICES that maps each slice of DATA referenced by INDICES to one of the segments
Outputs
`OUTPUT`	Aggregated output tensor. Its first dimension equals the number of segments.

Code

caffe2/operators/spatial_batch_norm_gradient_op.cc

Split

Split a tensor into a list of tensors along the specified 'axis'. The lengths of the split can be specified using the argument 'split' or an optional second input blob to the operator. Otherwise, the tensor is split into equal-sized parts.
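The splitting rule can be sketched in plain Python for a 2-D tensor stored as a list of rows. The helper `split_tensor` and its `lengths` parameter are hypothetical names standing in for the operator and its 'split' argument; rejecting a dimension that does not divide evenly is an assumption of the sketch.

```python
def split_tensor(tensor, num_outputs, axis=0, lengths=None):
    """Split a 2-D list-of-rows tensor along axis 0 or 1."""
    dim = len(tensor) if axis == 0 else len(tensor[0])
    if lengths is None:
        # No explicit lengths: fall back to equal-sized parts.
        if dim % num_outputs != 0:
            raise ValueError("dimension is not divisible by the number of outputs")
        lengths = [dim // num_outputs] * num_outputs
    if sum(lengths) != dim:
        raise ValueError("split lengths must sum to the size of the split axis")
    pieces, start = [], 0
    for n in lengths:
        if axis == 0:
            pieces.append(tensor[start:start + n])
        else:
            pieces.append([row[start:start + n] for row in tensor])
        start += n
    return pieces


t = [[1, 2, 3, 4], [5, 6, 7, 8]]
equal_parts = split_tensor(t, 2, axis=1)
uneven_parts = split_tensor(t, 2, axis=1, lengths=[1, 3])
```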

Interface

Arguments
`axis`	Which axis to split on
`split`	length of each output
`order`	Either NHWC or NCHW, will split on C axis
Inputs
`input`	The tensor to split
`split`	Optional list of output lengths (see also arg ‘split’)

Code

caffe2/operators/concat_split_op.cc

Sqr

Square (x^2) the elements of the input

Interface

Inputs
`input`	Input tensor
Outputs
`output`	Squared elements of the input

Code

caffe2/operators/math_ops.cc

SquareRootDivide

Given a DATA tensor with first dimension N and a SCALE vector of the same size N, produces an output tensor with the same dimensions as DATA. The i-th slice of the output is the i-th slice of DATA divided elementwise by sqrt(SCALE[i]). If SCALE[i] == 0, the output slice is identical to the input one (no scaling). Example:

  DATA = [
    [1.0, 2.0],
    [3.0, 4.0]
  ]

  SCALE = [4, 9]

  OUTPUT = [
    [0.5, 1.0],
    [1.0, 1.3333]
  ]
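A minimal plain-Python sketch of the rule as described in the text, including the zero-scale case. The function name is hypothetical and the data is chosen so that every result is exact.

```python
import math


def square_root_divide(data, scale):
    """Divide the i-th row of `data` by sqrt(scale[i]); leave it as-is if scale[i] == 0."""
    return [
        [x / math.sqrt(s) for x in row] if s != 0 else list(row)
        for row, s in zip(data, scale)
    ]


result = square_root_divide(
    [[2.0, 4.0], [3.0, 6.0], [5.0, 7.0]],
    [4, 9, 0],
)
```

The first two rows are divided by 2 and 3 respectively, and the third row passes through unchanged because its scale is 0.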

Code

caffe2/operators/square_root_divide_op.cc

SquaredL2Distance

Given two input float tensors X and Y, produces one output float tensor of the L2 difference between X and Y, computed as ||X - Y||^2 / 2.
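Reading the formula as half the squared Euclidean norm of the difference, a single pair of 1-D vectors can be worked through as follows. This is a sketch under that reading; the function name is hypothetical.

```python
def squared_l2_distance(x, y):
    """Half the sum of squared elementwise differences between x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))


# Differences are (0, 2, 3), so the result is (0 + 4 + 9) / 2.
d = squared_l2_distance([1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```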

Interface

Inputs
`X`	1D input tensor
Outputs
`Y`	1D output tensor

Code

SquaredL2DistanceGradient

No documentation yet.

Code

Squeeze

Remove single-dimensional entries from the shape of a tensor. Takes a parameter `dims` with a list of dimensions to squeeze.

If the same blob is provided in input and output, the operation is copy-free. This is the exact inverse operation of ExpandDims given the same dims arg.
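Since only the shape changes, the effect of `dims` can be illustrated on shapes alone. Both helpers below are hypothetical models, not the operators themselves; the second one assumes that ExpandDims inserts a size-1 dimension at each listed position of the output shape.

```python
def squeeze_shape(shape, dims):
    """Drop the listed dimensions, each of which must have size 1."""
    for d in dims:
        if shape[d] != 1:
            raise ValueError("can only squeeze dimensions of size 1")
    drop = set(dims)
    return [n for i, n in enumerate(shape) if i not in drop]


def expand_dims_shape(shape, dims):
    """Insert size-1 dimensions at the given output positions (assumed semantics)."""
    out = list(shape)
    for d in sorted(dims):
        out.insert(d, 1)
    return out


squeezed = squeeze_shape([1, 3, 1, 5], dims=[0, 2])
restored = expand_dims_shape(squeezed, dims=[0, 2])
```

Applying the second helper with the same `dims` restores the original shape, which is the inverse relationship the text describes.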

Interface

Inputs
`data`	Tensors with at least max(dims) dimensions.
Outputs
`squeezed`	Reshaped tensor with same data as input.

Code

caffe2/operators/stats_ops.cc

StatRegistryCreate

Create a StatRegistry object that will contain a map of performance counters keyed by name. A StatRegistry is used to gather and retrieve performance counts throughout the caffe2 codebase.

Interface

Outputs
`handle`	A Blob pointing to the newly created StatRegistry.

Code

StatRegistryExport

No documentation yet.

Interface

Arguments
`reset`	(default true) Whether to atomically reset the counters afterwards.
Inputs
`handle`	If provided, export values from given StatRegistry. Otherwise, export values from the global singleton StatRegistry.
Outputs
`keys`	1D string tensor with exported key names
`values`	1D int64 tensor with exported values
`timestamps`	The unix timestamp at counter retrieval.

Code

caffe2/operators/stats_ops.cc

StatRegistryUpdate

Update the given StatRegistry, or the global StatRegistry, with the values of counters for the given keys.

Interface

Inputs
`keys`	1D string tensor with the key names to update.
`values`	1D int64 tensor with the values to update.
`handle`	If provided, update the given StatRegistry. Otherwise, update the global singleton.

Sub

Performs element-wise binary subtraction (with limited broadcast support). If necessary, the right-hand-side argument is broadcast to match the shape of the left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value) or have a shape that is a contiguous subset of the first tensor’s shape. The start of the matching dimensions is specified by the argument “axis”; if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.
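The shape rule can be made concrete by treating A as `pre x n x post`, where `n` is the number of elements of B and the split point is given by `axis` (or by suffix matching when `axis` is unset). The sketch below assumes row-major flat storage; `broadcast_sizes` and `sub_broadcast` are hypothetical helper names, not part of the operator's interface.

```python
from math import prod


def broadcast_sizes(shape_a, shape_b, axis=None):
    """Return (pre, n, post) such that A is viewed as pre x n x post and B as n."""
    if axis is None:
        axis = len(shape_a) - len(shape_b)  # suffix matching
    end = axis + len(shape_b)
    if axis < 0 or tuple(shape_a[axis:end]) != tuple(shape_b):
        raise ValueError("shape(B) is not a contiguous subset of shape(A) at this axis")
    return prod(shape_a[:axis]), prod(shape_b), prod(shape_a[end:])


def sub_broadcast(a_flat, b_flat, shape_a, shape_b, axis=None):
    """Element-wise A - B on row-major flat lists, broadcasting B over A."""
    pre, n, post = broadcast_sizes(shape_a, shape_b, axis)
    return [
        a_flat[(i * n + j) * post + k] - b_flat[j]
        for i in range(pre)
        for j in range(n)
        for k in range(post)
    ]


A = [1, 2, 3, 4, 5, 6]  # shape (2, 3)
by_column = sub_broadcast(A, [1, 1, 1], (2, 3), (3,))
by_row = sub_broadcast(A, [10, 20], (2, 3), (2,), axis=0)
```

With suffix matching, B is subtracted from every row of A; with `axis=0`, each element of B is subtracted from a whole row.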

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions and type as A

Code

Sum

Element-wise sum of each of the input tensors. The first input tensor can be used in-place as the output tensor, in which case the sum will be done in place and results will be accumulated in input0. All inputs and outputs must have the same shape and data type.

Interface

Inputs
`data_0`	First of the input tensors. Can be inplace.
Outputs
`sum`	Output tensor. Same dimension as inputs.

Code

caffe2/operators/reduction_ops.cc

SumElements

Sums the elements of the input tensor.

Interface

Arguments
`average`	whether to average or not
Inputs
`X`	Tensor to sum up
Outputs
`sum`	Scalar sum

Code

SumElementsGradient

No documentation yet.

Code

caffe2/operators/reduction_ops.cc

SumInt

No documentation yet.

Code

SumReduceLike

SumReduceLike operator takes 2 tensors as input. It reduce-sums the first input so that the output has the shape of the second one. It assumes that the first input has more dimensions than the second, and that the dimensions of the second input are a contiguous subset of the dimensions of the first. For example, the following tensor shapes are supported:

  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 2, 5), shape(B) = (2), with axis=0
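One way to picture the reduction is to view A as `pre x n x post`, where the middle block matches shape(B), and to sum over the outer and inner parts. The sketch below assumes row-major flat storage and suffix matching when `axis` is unset (as the first example suggests); the function name is hypothetical.

```python
from math import prod


def sum_reduce_like(a_flat, shape_a, shape_b, axis=None):
    """Sum a row-major flat A down to the shape of B; returns a flat list."""
    if axis is None:
        axis = len(shape_a) - len(shape_b)  # assumed default: suffix matching
    end = axis + len(shape_b)
    if axis < 0 or tuple(shape_a[axis:end]) != tuple(shape_b):
        raise ValueError("shape(B) is not a contiguous subset of shape(A) at this axis")
    pre, n, post = prod(shape_a[:axis]), prod(shape_b), prod(shape_a[end:])
    out = [0.0] * n
    for i in range(pre):
        for j in range(n):
            for k in range(post):
                out[j] += a_flat[(i * n + j) * post + k]
    return out


A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # shape (2, 3)
column_sums = sum_reduce_like(A, (2, 3), (3,))
row_sums = sum_reduce_like(A, (2, 3), (2,), axis=0)
total = sum_reduce_like(A, (2, 3), ())
```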

Interface

Arguments
`axis`	If set, defines the starting dimension for reduction. Args `axis` and `axis_str` cannot be used simultaneously.
`axis_str`	If set, it could only be N or C or H or W. `order` arg should also be provided. It defines the reduction dimensions on NCHW or NHWC. Args `axis` and `axis_str` cannot be used simultaneously.
`order`	Either NHWC or NCHW
Inputs
`A`	First operand, should share the type with the second operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions and type as B

Code

caffe2/operators/reduction_ops.cc

SumSqrElements

Sums the squares of the elements of the input tensor.

Interface

Arguments
`average`	whether to average or not
Inputs
`X`	Tensor to sum up
Outputs
`sum`	Scalar sum of squares

Code

Summarize

Summarize computes four statistics of the input tensor: min, max, mean and standard deviation. The output is written to a 1-D tensor of size 4 if an output tensor is provided. Otherwise, if the argument 'to_file' is greater than 0, the values are written to a log file in the root folder.
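The four statistics can be computed over a flat list of values as sketched below. The text does not say which standard deviation is meant, so this sketch assumes the sample standard deviation (dividing by N - 1) and returns 0 for a single element; the function name is hypothetical.

```python
import math


def summarize(data):
    """Return [min, max, mean, standard deviation] for a non-empty flat list."""
    n = len(data)
    mean = sum(data) / n
    # Assumption: sample standard deviation, with 0.0 when there is one element.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1) if n > 1 else 0.0
    return [min(data), max(data), mean, math.sqrt(variance)]


stats = summarize([1.0, 2.0, 3.0, 4.0])
```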

Interface

Arguments
`to_file`	(int, default 0) flag to indicate if the summarized statistics have to be written to a log file.
Inputs
`data`	The input data as Tensor.
Outputs
`output`	1-D tensor (Tensor) of size 4 containing min, max, mean and standard deviation

Code

caffe2/operators/summarize_op.cc

TT

The TT-layer serves as a low-rank decomposition of a fully connected layer. The inputs are the same as to a fully connected layer, but the number of parameters is greatly reduced, and forward computation time can be drastically reduced, especially for layers with large weight matrices. The multiplication is computed as a product of the input vector with each of the cores that make up the TT layer. Given the input sizes (inp_sizes), output sizes (out_sizes), and the ranks of each of the cores (tt_ranks), the ith core will have size:

    inp_sizes[i] * tt_ranks[i] * tt_ranks[i + 1] * out_sizes[i].

The complexity of the computation is dictated by the sizes of inp_sizes, out_sizes, and tt_ranks, with a trade-off between the accuracy of the low-rank decomposition and the speed of the computation.
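To get a feel for the parameter savings, the core-size formula can be evaluated directly and compared with the weight count of a dense layer of the same input and output size. The helper name and the example sizes are hypothetical; the sketch assumes `tt_ranks` has one more entry than `inp_sizes`, which the `tt_ranks[i + 1]` term requires.

```python
from math import prod


def tt_core_sizes(inp_sizes, out_sizes, tt_ranks):
    """Number of elements in each core, following the formula in the text."""
    if len(inp_sizes) != len(out_sizes) or len(tt_ranks) != len(inp_sizes) + 1:
        raise ValueError("expected len(tt_ranks) == len(inp_sizes) + 1 == len(out_sizes) + 1")
    return [
        inp_sizes[i] * tt_ranks[i] * tt_ranks[i + 1] * out_sizes[i]
        for i in range(len(inp_sizes))
    ]


inp_sizes, out_sizes, tt_ranks = [4, 8, 4], [4, 8, 4], [1, 3, 3, 1]
cores = tt_core_sizes(inp_sizes, out_sizes, tt_ranks)
tt_params = sum(cores)
dense_params = prod(inp_sizes) * prod(out_sizes)  # weights of an equivalent dense layer
```

For these example sizes the cores hold far fewer values than the 128 x 128 dense weight matrix, and raising the ranks narrows that gap.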

Interface

Arguments
`inp_sizes`	(int[]) Input sizes of cores. Indicates the input size of the individual cores; the size of the input vector X must match the product of the inp_sizes array.
`out_sizes`	(int[]) Output sizes of cores. Indicates the output size of the individual cores; the size of the output vector Y must match the product of the out_sizes array.
`tt_ranks`	(int[]) Ranks of cores. Indicates the ranks of the individual cores; lower rank means larger compression and faster computation but reduced accuracy.
Inputs
`X`	Input tensor from previous layer with size (M x K), where M is the batch size and K is the input size.
`b`	1D blob containing the bias vector
`cores`	1D blob containing each of the individual cores with sizes specified above.
Outputs
`Y`	Output tensor with size (M x N), where M is the batch size and N is the output size.

Code

caffe2/operators/tt_linear_op.cc

Tanh

Calculates the hyperbolic tangent of the given input tensor element-wise. This operation can be done in an in-place fashion too, by providing the same input and output blobs.

Interface

Inputs
`input`	1-D input tensor
Outputs
`output`	The hyperbolic tangent values of the input tensor computed element-wise

Code

caffe2/operators/pack_segments.cc

UnsafeCoalesce

Coalesce the N inputs into N outputs and a single coalesced output blob. This allows operations that operate over multiple small kernels (e.g. biases in a deep CNN) to be coalesced into a single larger operation, amortizing the kernel launch overhead, synchronization costs for distributed computation, etc. The operator:

  - computes the total size of the coalesced blob by summing the input sizes
  - allocates the coalesced output blob as the total size
  - copies the input vectors into the coalesced blob, at the correct offset
  - aliases each Output(i) to point into the coalesced blob, at the corresponding offset for Input(i)

This is ‘unsafe’ as the output vectors are aliased, so use with caution.
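The aliasing behaviour, and why it calls for caution, can be imitated in plain Python with a single buffer and `memoryview` slices. This is only an analogy for the steps listed above, not the operator's implementation; the function name is hypothetical and the inputs are flat lists of floats.

```python
from array import array


def unsafe_coalesce(inputs):
    """Copy inputs into one buffer and return views that alias into it."""
    total = sum(len(x) for x in inputs)       # total size of the coalesced blob
    coalesced = array("d", [0.0]) * total     # allocate the coalesced blob
    view = memoryview(coalesced)
    outputs, offset = [], 0
    for x in inputs:
        stop = offset + len(x)
        view[offset:stop] = array("d", x)     # copy at the correct offset
        outputs.append(view[offset:stop])     # output i aliases the blob
        offset = stop
    return outputs, coalesced


outputs, blob = unsafe_coalesce([[1.0, 2.0], [3.0]])
before = blob.tolist()
outputs[1][0] = 9.0  # writing through an output also changes the coalesced blob
after = blob.tolist()
```

Because each output is a view rather than a copy, a write through any output is visible in the coalesced blob, and a write to the blob is visible through the outputs.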

Code

UnsortedSegmentMean

Applies ‘Mean’ to each segment of input tensor. Segment ids can appear in arbitrary order (unlike in SortedSegmentMean). SEGMENT_IDS is a vector that maps each of the first dimension slices of the DATA to a particular group (segment). Values belonging to the same segment are aggregated together. If the num_segments argument is passed, it is used as the first dimension of the output. Otherwise, the first dimension is calculated dynamically as the max value of SEGMENT_IDS plus one. Other output dimensions are inherited from the input tensor. Mean computes the element-wise mean of the input slices. The operation doesn’t change the shape of the individual blocks.

Interface

Arguments
`num_segments`	Optional int argument specifying the number of output segments and thus the first dimension of the output
Inputs
`DATA`	Input tensor, slices of which are aggregated.
`SEGMENT_IDS`	Integer vector with the same length as the first dimension of DATA that maps each slice of DATA to one of the segments
Outputs
`OUTPUT`	Aggregated output tensor. Its first dimension equals the number of segments.

Xor

Performs element-wise logical operation xor (with limited broadcast support). Both input operands should be of type bool. If necessary, the right-hand-side argument is broadcast to match the shape of the left-hand-side argument. When broadcasting is specified, the second tensor can either be of size 1 (a scalar value) or have a shape that is a contiguous subset of the first tensor’s shape. The start of the matching dimensions is specified by the argument “axis”; if it is not set, suffix matching is assumed. 1-dim expansion doesn’t work yet. For example, the following tensor shapes are supported (with broadcast=1):

  shape(A) = (2, 3, 4, 5), shape(B) = (,), i.e. B is a scalar
  shape(A) = (2, 3, 4, 5), shape(B) = (5,)
  shape(A) = (2, 3, 4, 5), shape(B) = (4, 5)
  shape(A) = (2, 3, 4, 5), shape(B) = (3, 4), with axis=1
  shape(A) = (2, 3, 4, 5), shape(B) = (2), with axis=0

Argument broadcast=1 needs to be passed to enable broadcasting.

Interface

Arguments
`broadcast`	Pass 1 to enable broadcasting
`axis`	If set, defines the broadcast dimensions. See doc for details.
Inputs
`A`	First operand.
`B`	Second operand. With broadcasting can be of smaller size than A. If broadcasting is disabled it should be of the same size.
Outputs
`C`	Result, has same dimensions as A and type `bool`

Code