Module caffe2.python.rnn_cell
Classes

    class LSTMCell
    class LSTMWithAttentionCell
    class MILSTMCell
    class MILSTMWithAttentionCell
    class RNNCell
Functions

    def LSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), return_params=False, memory_optimization=False, forget_bias=0.0)
    def GetLSTMParamNames()
    def InitFromLSTMParams(lstm_pblobs, param_values)
    def cudnn_LSTM(model, input_blob, initial_states, dim_in, dim_out, scope, recurrent_params=None, input_params=None, num_layers=1, return_params=False)
    def LSTMWithAttention(model, decoder_inputs, decoder_input_lengths, initial_decoder_hidden_state, initial_decoder_cell_state, initial_attention_weighted_encoder_context, encoder_output_dim, encoder_outputs, decoder_input_dim, decoder_state_dim, scope, attention_type=AttentionType.Regular, outputs_with_grads=(0, 4), weighted_encoder_outputs=None, lstm_memory_optimization=False, attention_memory_optimization=False, forget_bias=0.0)
    def MILSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), memory_optimization=False, forget_bias=0.0)
def rnn_cell.cudnn_LSTM(model, input_blob, initial_states, dim_in, dim_out, scope, recurrent_params=None, input_params=None, num_layers=1, return_params=False)
CuDNN version of LSTM for GPUs.

input_blob        Blob containing the input. Will need to be available
                  when param_init_net is run, because the sequence lengths
                  and batch sizes will be inferred from the size of this
                  blob.
initial_states    tuple of (hidden_init, cell_init) blobs
dim_in            input dimension
dim_out           output/hidden dimension
scope             namescope to apply
recurrent_params  dict of blobs containing values for the recurrent gate
                  weights and biases (if None, random initial values are
                  used). See GetLSTMParamNames() for the format.
input_params      dict of blobs containing values for the input gate
                  weights and biases (if None, random initial values are
                  used). See GetLSTMParamNames() for the format.
num_layers        number of LSTM layers
return_params     if True, returns (param_extract_net, param_mapping),
                  where param_extract_net is a net that, when run,
                  populates the blobs specified in param_mapping with the
                  current gate weights and biases (input/recurrent).
                  Useful for assigning the values back to a non-cuDNN
                  LSTM.
Definition at line 289 of file rnn_cell.py.
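As a rough illustration only (not taken from the source), the sketch below wires cudnn_LSTM into a CNNModelHelper model on a CUDA device. The blob names, state shapes, device handling and the way the return value is captured are assumptions; check the definition in rnn_cell.py for the exact outputs.

    # Hedged sketch: blob names, shapes and return handling are assumptions.
    import numpy as np
    from caffe2.proto import caffe2_pb2
    from caffe2.python import cnn, core, rnn_cell, workspace

    T, N, D, H = 5, 2, 8, 16  # sequence length, batch size, input dim, hidden dim
    gpu = core.DeviceOption(caffe2_pb2.CUDA, 0)

    model = cnn.CNNModelHelper(name="cudnn_lstm_example")
    with core.DeviceScope(gpu):
        input_blob, h_init, c_init = model.net.AddExternalInputs(
            "input_blob", "hidden_init", "cell_init")
        # return_params=True should also yield (param_extract_net, param_mapping);
        # the exact tuple layout is not documented here, so keep it opaque.
        results = rnn_cell.cudnn_LSTM(
            model, input_blob, (h_init, c_init),
            dim_in=D, dim_out=H, scope="cudnn_lstm",
            num_layers=1, return_params=True)

    # Feed the input BEFORE running param_init_net: sequence length and batch
    # size are inferred from this blob's shape (see the note above).
    workspace.FeedBlob("input_blob",
                       np.random.randn(T, N, D).astype(np.float32),
                       device_option=gpu)
    workspace.FeedBlob("hidden_init",
                       np.zeros((1, N, H), dtype=np.float32), device_option=gpu)
    workspace.FeedBlob("cell_init",
                       np.zeros((1, N, H), dtype=np.float32), device_option=gpu)
    workspace.RunNetOnce(model.param_init_net)
    workspace.RunNetOnce(model.net)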
def rnn_cell.InitFromLSTMParams(lstm_pblobs, param_values)
Set the parameters of an LSTM based on predefined values.
Definition at line 258 of file rnn_cell.py.
def rnn_cell.LSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), return_params=False, memory_optimization=False, forget_bias=0.0)
Adds a standard LSTM recurrent network operator to a model.
model: CNNModelHelper object to which new operators will be added
input_blob: the input sequence in T x N x D format,
    where T is the sequence length, N the batch size and D the input dimension
seq_lengths: blob containing the sequence lengths, which will be passed to
    the LSTMUnit operator
initial_states: a tuple of (hidden_input_blob, cell_input_blob),
    which are the inputs to the cell net on the first iteration
dim_in: input dimension
dim_out: output dimension
outputs_with_grads: position indices of the output blobs that will receive
    an external error gradient during backpropagation
return_params: if True, also returns a dictionary of the LSTM's parameters
memory_optimization: if enabled, the LSTM step is recomputed on the backward
    pass, so the forward activations for each timestep do not need to be
    stored. Saves memory at the cost of extra computation.
Definition at line 202 of file rnn_cell.py.
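A minimal sketch of the call, assuming the T x N x D input layout described above; the blob names, initial-state shapes, and the assumption that the first returned blob holds all hidden states are illustrative, not taken from the source.

    # Hedged sketch: names, shapes and output indexing are assumptions.
    import numpy as np
    from caffe2.python import cnn, rnn_cell, workspace

    T, N, D, H = 5, 2, 8, 16  # sequence length, batch size, input dim, hidden dim

    model = cnn.CNNModelHelper(name="lstm_example")
    input_blob, seq_lengths, h_init, c_init = model.net.AddExternalInputs(
        "input_blob", "seq_lengths", "hidden_init", "cell_init")

    outputs = rnn_cell.LSTM(
        model, input_blob, seq_lengths, (h_init, c_init),
        dim_in=D, dim_out=H, scope="lstm1",
        outputs_with_grads=(0,), memory_optimization=False)

    # Feed the T x N x D input, per-example lengths and zero initial states.
    workspace.FeedBlob("input_blob", np.random.randn(T, N, D).astype(np.float32))
    workspace.FeedBlob("seq_lengths", np.full(N, T, dtype=np.int32))
    workspace.FeedBlob("hidden_init", np.zeros((1, N, H), dtype=np.float32))
    workspace.FeedBlob("cell_init", np.zeros((1, N, H), dtype=np.float32))

    workspace.RunNetOnce(model.param_init_net)
    workspace.RunNetOnce(model.net)
    print(workspace.FetchBlob(str(outputs[0])).shape)  # expected (T, N, H)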
def rnn_cell.LSTMWithAttention(model, decoder_inputs, decoder_input_lengths, initial_decoder_hidden_state, initial_decoder_cell_state, initial_attention_weighted_encoder_context, encoder_output_dim, encoder_outputs, decoder_input_dim, decoder_state_dim, scope, attention_type=AttentionType.Regular, outputs_with_grads=(0, 4), weighted_encoder_outputs=None, lstm_memory_optimization=False, attention_memory_optimization=False, forget_bias=0.0)
Adds an LSTM with an attention mechanism to a model.

The implementation is based on https://arxiv.org/abs/1409.0473, with a small
difference in the order in which the new attention context and the new hidden
state are computed, similarly to https://arxiv.org/abs/1508.04025.

The model uses encoder-decoder naming conventions, where the decoder is the
sequence the op iterates over, while computing the attention context over the
encoder.

model: CNNModelHelper object to which new operators will be added
decoder_inputs: the input sequence in T x N x D format,
    where T is the sequence length, N the batch size and D the input dimension
decoder_input_lengths: blob containing the sequence lengths, which will be
    passed to the LSTMUnit operator
initial_decoder_hidden_state: initial hidden state of the LSTM
initial_decoder_cell_state: initial cell state of the LSTM
initial_attention_weighted_encoder_context: initial attention context
encoder_output_dim: dimension of the encoder outputs
encoder_outputs: the sequence over which the attention context is computed
    at every iteration
decoder_input_dim: input dimension (last dimension of decoder_inputs)
decoder_state_dim: size of the hidden states of the LSTM
attention_type: one of AttentionType.Regular, AttentionType.Recurrent.
    Determines which type of attention mechanism to use.
outputs_with_grads: position indices of the output blobs that will receive
    an external error gradient during backpropagation
weighted_encoder_outputs: encoder outputs to be used to compute the attention
    weights. In the basic case this is just a linear transformation of the
    encoder outputs (the default, when weighted_encoder_outputs is None).
    However, it can be something more complicated, such as a separate
    encoder network (for example, in the case of a convolutional encoder).
lstm_memory_optimization: recompute the LSTM activations on the backward
    pass, so their values do not need to be stored during the forward pass
attention_memory_optimization: recompute the attention context for the
    backward pass
Definition at line 590 of file rnn_cell.py.
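The sketch below shows how the decoder-side arguments line up with the signature; the blob names and dimensions are illustrative assumptions, and feeding the external inputs and running the nets follows the same pattern as the plain LSTM example above.

    # Hedged sketch: only model construction is shown; names and dims are
    # illustrative. Feed the external inputs and run the nets as in the
    # LSTM example above.
    from caffe2.python import cnn, rnn_cell

    D_dec, H_dec, D_enc = 8, 16, 12  # decoder input dim, decoder state dim,
                                     # encoder output dim

    model = cnn.CNNModelHelper(name="attention_example")
    (decoder_inputs, decoder_lengths, h_init, c_init,
     attn_context_init, encoder_outputs) = model.net.AddExternalInputs(
        "decoder_inputs", "decoder_lengths", "decoder_hidden_init",
        "decoder_cell_init", "initial_attention_context", "encoder_outputs")

    # attention_type is left at its default (AttentionType.Regular).
    outputs = rnn_cell.LSTMWithAttention(
        model,
        decoder_inputs, decoder_lengths,
        h_init, c_init, attn_context_init,
        encoder_output_dim=D_enc,
        encoder_outputs=encoder_outputs,
        decoder_input_dim=D_dec,
        decoder_state_dim=H_dec,
        scope="attention_decoder",
        outputs_with_grads=(0, 4))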
def rnn_cell.MILSTM(model, input_blob, seq_lengths, initial_states, dim_in, dim_out, scope, outputs_with_grads=(0,), memory_optimization=False, forget_bias=0.0)
Adds the MI (multiplicative integration) flavor of the standard LSTM
recurrent network operator to a model.

See https://arxiv.org/pdf/1606.06630.pdf

model: CNNModelHelper object to which new operators will be added
input_blob: the input sequence in T x N x D format,
    where T is the sequence length, N the batch size and D the input dimension
seq_lengths: blob containing the sequence lengths, which will be passed to
    the LSTMUnit operator
initial_states: a tuple of (hidden_input_blob, cell_input_blob),
    which are the inputs to the cell net on the first iteration
dim_in: input dimension
dim_out: output dimension
outputs_with_grads: position indices of the output blobs that will receive
    an external error gradient during backpropagation
memory_optimization: if enabled, the LSTM step is recomputed on the backward
    pass, so the forward activations for each timestep do not need to be
    stored. Saves memory at the cost of extra computation.
Definition at line 793 of file rnn_cell.py.
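Since MILSTM takes the same arguments as rnn_cell.LSTM, it can be swapped in as a drop-in replacement; the short sketch below (names are illustrative, not from the source) mirrors the LSTM example above.

    # Hedged sketch: MILSTM is called exactly like rnn_cell.LSTM above.
    from caffe2.python import cnn, rnn_cell

    model = cnn.CNNModelHelper(name="milstm_example")
    input_blob, seq_lengths, h_init, c_init = model.net.AddExternalInputs(
        "input_blob", "seq_lengths", "hidden_init", "cell_init")

    outputs = rnn_cell.MILSTM(
        model, input_blob, seq_lengths, (h_init, c_init),
        dim_in=8, dim_out=16, scope="milstm1",
        outputs_with_grads=(0,), memory_optimization=False, forget_bias=0.0)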