Public Member Functions
def __init__ (self, db, db_type)
def init (self, nodes=None, retrieve_from_epoch=None)
def blob_list (self)
def load (self, epoch)
def load_blobs_from_checkpoint (self, blob_names, epoch)
def check_db_exists (self, epoch)
def save (self, epoch)
Controls saving and loading of workspaces on every epoch boundary of a job. If a CheckpointManager instance is passed to JobRunner, then JobRunner will call `init`, `load` and `save` at different moments in between epoch runs.
Definition at line 107 of file checkpoint.py.
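The call order described above can be illustrated with a small stand-in that has no Caffe2 dependency. This is only a sketch: `ToyCheckpointManager` and `run_job` are hypothetical names, the methods record calls instead of building Tasks, and the exact sequencing inside the real JobRunner is an assumption inferred from the method descriptions on this page.

```python
class ToyCheckpointManager:
    """Hypothetical stand-in: records calls instead of building Tasks."""

    def __init__(self):
        self.calls = []

    def init(self, nodes=None, retrieve_from_epoch=None):
        self.calls.append(("init", retrieve_from_epoch))

    def load(self, epoch):
        self.calls.append(("load", epoch))

    def save(self, epoch):
        self.calls.append(("save", epoch))


def run_job(manager, num_epochs, resume_epoch=None):
    """Assumed runner loop: init once, optionally load, then save per epoch."""
    manager.init(retrieve_from_epoch=resume_epoch)
    if resume_epoch is not None:
        # Resuming: restore the blobs saved at resume_epoch, then continue.
        manager.load(resume_epoch)
        first_epoch = resume_epoch + 1
    else:
        # Fresh start: take a snapshot right after initialization.
        manager.save(0)
        first_epoch = 1
    for epoch in range(first_epoch, num_epochs + 1):
        # The epoch's actual work would run here.
        manager.save(epoch)
    return manager.calls
```

Calling `run_job(ToyCheckpointManager(), 2)` records one `init`, then a `save` for epoch 0 and for each of the two epochs; passing `resume_epoch=1` replaces the initial save with a `load`.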
def checkpoint.CheckpointManager.init (self, nodes=None, retrieve_from_epoch=None)
Build a Task that will be run once after the job's `init_group` is run. This task will determine which blobs need to be checkpointed. If retrieve_from_epoch is not None, then the checkpoint metadata is retrieved from a previously saved checkpoint.
Definition at line 121 of file checkpoint.py.
def checkpoint.CheckpointManager.load (self, epoch)
Build a Task that will be run by JobRunner when the job is to be resumed from a given epoch. This task will run a Load op that loads and deserializes all relevant blobs from persistent storage.
Definition at line 152 of file checkpoint.py.
def checkpoint.CheckpointManager.load_blobs_from_checkpoint (self, blob_names, epoch)
Builds a Task that loads only the necessary blobs from a checkpoint of the given epoch. The necessary blobs are given in the blob_names argument.
Args:
    blob_names: A list of strings. Each string is the name of a blob.
    epoch: The checkpoint epoch to load from.
Returns:
    A Task which loads the specified blobs from the checkpoint of the given epoch.
Definition at line 168 of file checkpoint.py.
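The idea of loading only a named subset of blobs can be sketched in plain Python. This is not the Caffe2 implementation: `load_blobs` is a hypothetical helper, checkpoints are modeled as a dict keyed by epoch, and raising `KeyError` for a missing blob is an assumption made for the example.

```python
def load_blobs(checkpoints, blob_names, epoch):
    """Return only the requested blobs from the checkpoint saved at `epoch`.

    `checkpoints` maps an epoch number to a dict of blob name -> value.
    """
    saved = checkpoints[epoch]
    # Fail early and name every blob that the checkpoint does not contain.
    missing = [name for name in blob_names if name not in saved]
    if missing:
        raise KeyError("blobs not in checkpoint %d: %s" % (epoch, missing))
    return {name: saved[name] for name in blob_names}
```

For example, with a checkpoint that holds `w`, `b` and `lr`, requesting `["w", "b"]` returns just those two entries and leaves `lr` untouched.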
def checkpoint.CheckpointManager.save (self, epoch)
Build a Task that is run once after `init_group` and after each epoch is run. This will execute a Save op to serialize and persist blobs present in the global workspace.
Definition at line 207 of file checkpoint.py.
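A per-epoch save and load round trip can be sketched with the standard library alone. Everything here is an assumption made for illustration: the helper names, the use of `pickle` in place of Caffe2's Save and Load ops, a plain dict in place of the workspace, and the one-file-per-epoch naming scheme.

```python
import os
import pickle


def checkpoint_path(directory, epoch):
    # Hypothetical naming scheme: one file per epoch.
    return os.path.join(directory, "checkpoint_%06d.pkl" % epoch)


def checkpoint_exists(directory, epoch):
    """Report whether a checkpoint file was written for `epoch`."""
    return os.path.isfile(checkpoint_path(directory, epoch))


def save_epoch(workspace, blob_names, directory, epoch):
    """Serialize the listed blobs from `workspace` into the epoch's file."""
    blobs = {name: workspace[name] for name in blob_names}
    with open(checkpoint_path(directory, epoch), "wb") as f:
        pickle.dump(blobs, f)


def load_epoch(workspace, directory, epoch):
    """Deserialize the epoch's file and copy its blobs into `workspace`."""
    with open(checkpoint_path(directory, epoch), "rb") as f:
        workspace.update(pickle.load(f))
```

Only the blobs named in `blob_names` are written, which mirrors the split between deciding what to checkpoint once and saving that same set at every epoch boundary.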