Module caffe2.python.recurrent
Functions

def recurrent_net(net, cell_net, inputs, initial_cell_inputs, links, timestep=None, scope=None, outputs_with_grads=(0,), recompute_blobs_on_backward=None)

def MILSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), memory_optimization=False, forget_bias=0.0)
def recurrent.MILSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), memory_optimization=False, forget_bias=0.0)
Adds the MI (multiplicative integration) flavor of the standard LSTM recurrent network operator to a model. See https://arxiv.org/pdf/1606.06630.pdf

model: CNNModelHelper object that the new operators will be added to
input_blob: the input sequence, in T x N x D format, where T is the sequence length, N the batch size, and D the input dimension
seq_lengths: blob containing the sequence lengths, which is passed to the LSTMUnit operator
initial_states: a tuple (hidden_input_blob, cell_input_blob) of inputs to the cell net on the first iteration
dim_in: input dimension
dim_out: output dimension
outputs_with_grads: position indices of the output blobs that will receive an external error gradient during backpropagation
memory_optimization: if enabled, the LSTM step is recomputed in the backward pass, so forward activations do not need to be stored for each timestep; this saves memory at the cost of extra computation
Definition at line 252 of file recurrent.py.
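
A minimal usage sketch, not part of the original documentation. It only exercises the signature above; the blob names ("input_sequence", "seq_lengths", "hidden_init", "cell_init") and the dimensions are illustrative assumptions, and the layout of the returned tuple is not described in this excerpt.

    from caffe2.python import cnn
    from caffe2.python.recurrent import MILSTM

    N, D, H = 4, 16, 32  # batch size, input dim, output dim (illustrative)

    model = cnn.CNNModelHelper(name="milstm_example")

    # A T x N x D blob named "input_sequence" is assumed to already exist in
    # the workspace, along with "seq_lengths" (one length per batch item) and
    # the initial hidden/cell state blobs.
    results = MILSTM(
        model,
        input_blob="input_sequence",
        seq_lengths="seq_lengths",
        initial_states=("hidden_init", "cell_init"),
        dim_in=D,
        dim_out=H,
        scope="milstm",
        outputs_with_grads=(0,),
        memory_optimization=False,
    )
    # `results` holds the recurrent outputs and states; consult recurrent.py
    # for the exact tuple layout.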
def recurrent.recurrent_net(net, cell_net, inputs, initial_cell_inputs, links, timestep=None, scope=None, outputs_with_grads=(0,), recompute_blobs_on_backward=None)
net: the main net that the operator should be added to
cell_net: the cell net, which is executed in a recurrent fashion
inputs: sequences to be fed into the recurrent net. Currently only one input is supported. It must be in T x N x (D1...Dk) format, where T is the sequence length, N the batch size, and (D1...Dk) the remaining dimensions
initial_cell_inputs: inputs to the cell_net for timestep 0. The format for each input is (cell_net_input_name, external_blob_with_data)
links: a dictionary mapping cell_net input names at moment t+1 to output names at moment t. Currently we assume that each output becomes an input for the next timestep.
timestep: name of the timestep blob to be used. If not provided, "timestep" is used.
scope: internal blobs will be scoped as <scope_name>/<blob_name>. If not provided, a scope name is generated automatically.
outputs_with_grads: position indices of the output blobs that will receive an error gradient (from outside the recurrent network) during backpropagation
recompute_blobs_on_backward: a list of blobs that will be recomputed in the backward pass and thus need not be stored for each forward timestep
Definition at line 18 of file recurrent.py.
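
A minimal sketch, not part of the original documentation, of unrolling a trivial cell with recurrent_net. The cell here is a stand-in (an Add op) rather than a real RNN body; all blob names are illustrative, and the (cell_net_blob, external_blob) pairing for `inputs` is an assumption that mirrors the documented format of initial_cell_inputs.

    from caffe2.python import core
    from caffe2.python.recurrent import recurrent_net

    net = core.Net("main")
    cell_net = core.Net("simple_cell")

    # Declare the cell's per-timestep inputs, then define a trivial cell body:
    # hidden_t = hidden_t_prev + input_t.
    hidden_prev, input_t = cell_net.AddExternalInput("hidden_t_prev", "input_t")
    hidden_t = cell_net.Add([hidden_prev, input_t], "hidden_t")

    results = recurrent_net(
        net=net,
        cell_net=cell_net,
        # "input_sequence" is assumed to be a T x N x D blob in the workspace.
        inputs=[(input_t, "input_sequence")],
        initial_cell_inputs=[(hidden_prev, "hidden_init")],
        # Link the cell's output at t to its hidden input at t+1.
        links={str(hidden_prev): str(hidden_t)},
        scope="simple_rnn",
    )
    # `results` holds the outputs of the unrolled recurrent network; see
    # recurrent.py for the exact return layout.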