Public Member Functions
def __init__(self, fields, name=None)
def init_empty(self, init_net)
def init_from_dataframe(self, net, dataframe)
def get_blobs(self)
def content(self)
def field_names(self)
def field_types(self)
def reader(self, init_net=None, cursor_name=None, batch_size=1)
def random_reader(self, init_net=None, indices=None, cursor_name=None, batch_size=1)
def writer(self, init_net=None)
Public Attributes
schema
fields
field_types
name
field_blobs
Represents an in-memory dataset with a fixed schema. Use this to store and iterate through datasets with complex schemas that fit in memory. Iterating through the entries of this dataset is fast because the dataset is stored as a set of native Caffe2 tensors, so no type conversion or deserialization is necessary.
Definition at line 174 of file dataset.py.
def dataset.Dataset.__init__(self, fields, name=None)
Create an un-initialized dataset with schema provided by `fields`. Before this dataset can be used, it must be initialized, either by `init_empty` or `init_from_dataframe`.
Args:
    fields: either a schema.Struct or a list of field names in a format compatible with the one described in schema.py.
    name: optional name to prepend to blobs that will store the data.
Definition at line 185 of file dataset.py.
def dataset.Dataset.content(self)
Return a Record of BlobReferences pointing to the full content of this dataset.
Definition at line 234 of file dataset.py.
def dataset.Dataset.field_names(self)
Return the list of field names for this dataset.
Definition at line 241 of file dataset.py.
def dataset.Dataset.field_types(self)
Return the list of field dtypes for this dataset. If a list of strings, not a schema.Struct, was passed to the constructor, this will return a list of dtype(np.void).
Definition at line 245 of file dataset.py.
def dataset.Dataset.get_blobs(self)
Return the list of BlobReference objects pointing to the blobs that contain the data for this dataset.
Definition at line 226 of file dataset.py.
def dataset.Dataset.init_empty(self, init_net)
Initialize the blobs for this dataset with empty values. Empty arrays will be immediately fed into the current workspace, and `init_net` will take those blobs as external inputs.
Definition at line 206 of file dataset.py.
def dataset.Dataset.init_from_dataframe(self, net, dataframe)
Initialize the blobs for this dataset from a Pandas dataframe. Each column of the dataframe will be immediately fed into the current workspace, and `net` will take these blobs as external inputs.
Definition at line 215 of file dataset.py.
def dataset.Dataset.random_reader(self, init_net=None, indices=None, cursor_name=None, batch_size=1)
Create a Reader object that is used to iterate through the dataset.
NOTE: The reading order depends on the order in indices.
Args:
    init_net: net that will be run once to create the cursor.
    indices: blob of reading order.
    cursor_name: optional name for the blob containing a pointer to the cursor.
    batch_size: how many samples to read per iteration.
Returns:
    A DatasetReader that can be used to create operators that will iterate through the dataset according to indices.
Definition at line 279 of file dataset.py.
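The effect of `indices` on reading order can be pictured with a small plain-Python sketch. This is not Caffe2 code: `read_in_index_order` is a hypothetical helper, ordinary lists stand in for the tensors and for the `indices` blob, and the only behaviour assumed is what the description states, namely that rows are visited in the order given by `indices`, `batch_size` rows at a time.

```python
def read_in_index_order(columns, indices, batch_size=1):
    """Yield batches of rows, visiting rows in the order given by `indices`.

    Hypothetical illustration only; `columns` maps field name -> list of values.
    """
    for start in range(0, len(indices), batch_size):
        # The next `batch_size` row positions, in the caller-chosen order.
        chunk = indices[start:start + batch_size]
        # Gather the same rows from every field so the columns stay aligned.
        yield {name: [values[i] for i in chunk] for name, values in columns.items()}


columns = {"id": [10, 20, 30, 40], "label": ["a", "b", "c", "d"]}
order = [2, 0, 3, 1]  # stands in for the `indices` blob
random_batches = list(read_in_index_order(columns, order, batch_size=2))
```

Here the first batch holds rows 2 and 0 and the second holds rows 3 and 1, so shuffling `order` is enough to change the iteration order without touching the stored data.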
def dataset.Dataset.reader(self, init_net=None, cursor_name=None, batch_size=1)
Create a Reader object that is used to iterate through the dataset. This will append operations to `init_net` that create a TreeCursor, used to iterate through the data.
NOTE: Currently, it is not safe to append to a dataset while reading.
Args:
    init_net: net that will be run once to create the cursor.
    cursor_name: optional name for the blob containing a pointer to the cursor.
    batch_size: how many samples to read per iteration.
Returns:
    A _DatasetReader that can be used to create operators that will iterate through the dataset.
Definition at line 254 of file dataset.py.
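The cursor-and-batch behaviour of a sequential reader can be pictured with a plain-Python sketch. This is not Caffe2 code: `ToyColumnReader` and its `read` method are hypothetical names, lists stand in for tensors, and an integer offset stands in for the TreeCursor. It assumes only what the description states: the cursor is created once, and each read returns up to `batch_size` entries from every field.

```python
class ToyColumnReader:
    """Hypothetical stand-in for a dataset reader; not part of Caffe2."""

    def __init__(self, columns, batch_size=1):
        # columns: dict mapping field name -> list of values (one list per field).
        self.columns = columns
        self.batch_size = batch_size
        self.cursor = 0  # plays the role of the cursor created once by init_net

    def read(self):
        """Return (should_stop, batch); batch maps field name -> list of values."""
        size = len(next(iter(self.columns.values())))
        start = self.cursor
        end = min(start + self.batch_size, size)
        self.cursor = end  # advance the cursor past the rows just read
        batch = {name: values[start:end] for name, values in self.columns.items()}
        # should_stop is True once the cursor had already reached the end.
        return start >= size, batch


reader = ToyColumnReader({"id": [1, 2, 3], "label": ["a", "b", "c"]}, batch_size=2)
batches = []
while True:
    should_stop, batch = reader.read()
    if should_stop:
        break
    batches.append(batch)
```

With three rows and `batch_size=2`, the loop collects one full batch and one partial batch before the reader signals that it should stop.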
def dataset.Dataset.writer(self, init_net=None)
Create a Writer that can be used to append entries to the dataset.
NOTE: Currently, it is not safe to append to a dataset while reading from it.
NOTE: The current implementation of writer is not thread safe. TODO: fixme
Args:
    init_net: net that will be run once in order to create the writer (currently not used).
Definition at line 301 of file dataset.py.
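Because the writer is documented as not thread safe, a caller that appends from several threads would need to serialize access itself. The following plain-Python sketch shows one way that might look. It is not Caffe2 code: `ToyColumnWriter` and its `write` method are hypothetical, lists stand in for tensors, and the lock is an assumption of this sketch rather than something the library provides.

```python
import threading


class ToyColumnWriter:
    """Hypothetical stand-in for a dataset writer; not part of Caffe2."""

    def __init__(self, columns):
        self.columns = columns         # dict: field name -> list of values
        self._lock = threading.Lock()  # caller-side guard (assumption of this sketch)

    def write(self, record):
        # Reject records whose fields do not match the fixed schema.
        if set(record) != set(self.columns):
            raise ValueError("record fields do not match the dataset schema")
        # Hold the lock for the whole record so every column grows together.
        with self._lock:
            for name, value in record.items():
                self.columns[name].append(value)


columns = {"id": [], "label": []}
writer = ToyColumnWriter(columns)
threads = [
    threading.Thread(target=writer.write, args=({"id": i, "label": str(i)},))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Holding the lock across all fields of one record keeps the columns aligned row by row. The same lock would also have to cover reads, since appending while reading is documented as unsafe.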