Module caffe2.python.data_parallel_model.
Functions
def Parallelize_GPU(model_helper_obj, input_builder_fun, forward_pass_builder_fun, param_update_builder_fun, devices=range(0, workspace.NumCudaDevices()), rendezvous=None, net_type='dag', broadcast_computed_params=True, optimize_gradient_memory=False)
def ExtractPredictorNet(model, inputs, outputs, device)
def FinalizeAfterCheckpoint(model, blobs, sync_iter=True)
def stripParamName(param)

Variables
log = logging.getLogger("data_parallel_model")
def data_parallel_model.ExtractPredictorNet(model, inputs, outputs, device)
Returns (net, params), which can be exported and used as a prediction net.
Definition at line 202 of file data_parallel_model.py.
def data_parallel_model.Parallelize_GPU(model_helper_obj, input_builder_fun, forward_pass_builder_fun, param_update_builder_fun, devices=range(0, workspace.NumCudaDevices()), rendezvous=None, net_type='dag', broadcast_computed_params=True, optimize_gradient_memory=False)
Creates a model that can run on multiple GPUs.
model_helper_obj: An instance of ModelHelperBase, such as CNNModelHelper.
input_builder_fun:
    Function that adds the input operators.
    Note: Instantiate the reader outside of this function so that all GPUs share the same reader object.
    Signature: input_builder_fun(model)
forward_pass_builder_fun:
    Function that adds the operators to the model. It must return a list of loss-blob references, which are used to build the gradient. A loss scale parameter is passed in because the loss of your model should be scaled by 1.0 / the total number of GPUs.
    Signature: forward_pass_builder_fun(model, loss_scale)
param_update_builder_fun:
    Function that adds the operators that run after the gradient update, such as updating the weights and applying weight decay.
    Signature: param_update_builder_fun(model)
devices: List of GPU ids, such as [0, 1, 2, 3].
rendezvous: Used for rendezvous in distributed computation. If None, only one node is used. To create rendezvous, use <TBD>.
net_type: Network type (default: 'dag').
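The loss_scale argument described above can be illustrated with plain arithmetic. The following is a minimal sketch in ordinary Python, not Caffe2 code: the helper names loss_scale_for and combined_loss and the per-GPU loss values are hypothetical. It only shows why scaling each device's loss by 1.0 / the number of GPUs makes the summed loss equal to the mean across devices.

```python
def loss_scale_for(num_gpus):
    """Scale factor from the docstring: 1.0 / total number of GPUs."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return 1.0 / num_gpus


def combined_loss(per_gpu_losses):
    """Sum the per-GPU losses after multiplying each by the scale factor."""
    scale = loss_scale_for(len(per_gpu_losses))
    return sum(loss * scale for loss in per_gpu_losses)


# Hypothetical loss values, one per device in [0, 1, 2, 3].
losses = [0.5, 1.5, 1.0, 1.0]
total = combined_loss(losses)
mean = sum(losses) / len(losses)
```

Here total and mean agree, so the effective loss does not grow with the number of devices.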
Definition at line 32 of file data_parallel_model.py.
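The note on input_builder_fun (create the reader outside the function) can be pictured with a plain-Python analogy. This sketch does not use Caffe2: FakeReader and FakeModel are hypothetical stand-ins, and it assumes the builder is invoked once per device. It contrasts a reader shared across calls with one recreated inside the builder.

```python
class FakeReader:
    """Hypothetical stand-in for a data reader; not a Caffe2 class."""

    def __init__(self, records):
        self._it = iter(records)

    def read(self):
        return next(self._it)


class FakeModel:
    """Hypothetical stand-in for a model helper; it only records inputs."""

    def __init__(self):
        self.inputs = []


# Created once, outside the builder, so every call uses the same reader.
shared_reader = FakeReader(range(8))


def input_builder_fun(model):
    # Signature from the docstring: input_builder_fun(model)
    model.inputs.append(shared_reader.read())


def input_builder_fun_per_call(model):
    # Anti-pattern: a fresh reader per call restarts from the first record.
    reader = FakeReader(range(8))
    model.inputs.append(reader.read())


devices = [0, 1, 2, 3]

shared_model = FakeModel()
for _device in devices:
    input_builder_fun(shared_model)

per_call_model = FakeModel()
for _device in devices:
    input_builder_fun_per_call(per_call_model)
```

With the shared reader each device receives a different record, while the per-call variant hands every device the same first record.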