Bases: object
Walk through file system to audit objects
Entrypoint to object_audit, with a failsafe generic exception handler.
Audits the given object location.
Parameters: location – an audit location (from diskfile.object_audit_location_generator)
Based on the config’s object_size_stats setting, keeps track of how many objects fall into the specified size ranges. For example, with the following setting:
object_size_stats = 10, 100, 1024
and a system holding 3 objects of sizes 5, 20, and 10000 bytes, the log will look like: {"10": 1, "100": 1, "1024": 0, "OVER": 1}
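The bucketing described above can be sketched in a few lines of Python. This is only an illustration: the function name bucket_sizes is hypothetical, and treating each bound as an inclusive upper limit is an assumption that happens to reproduce the example log.

```python
def bucket_sizes(bounds, sizes):
    """Count how many sizes fall at or below each bound; the rest go to OVER."""
    counts = {str(bound): 0 for bound in bounds}
    counts["OVER"] = 0
    for size in sizes:
        for bound in sorted(bounds):
            if size <= bound:  # assumption: upper bounds are inclusive
                counts[str(bound)] += 1
                break
        else:
            # No bound was large enough for this object.
            counts["OVER"] += 1
    return counts


print(bucket_sizes([10, 100, 1024], [5, 20, 10000]))
```

Each object is counted exactly once, in the smallest range that can hold it.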
Bases: swift.common.daemon.Daemon
Audit objects.
Audit loop
Clear recon cache entries
Child execution
Run the object audit
Run the object audit until stopped.
Run the object audit once
Disk File Interface for the Swift Object Server
The DiskFile, DiskFileWriter and DiskFileReader classes combined define the on-disk abstraction layer for supporting the object server REST API interfaces (excluding REPLICATE). Other implementations wishing to provide an alternative backend for the object server must implement the three classes. An example alternative implementation can be found in the mem_server.py and mem_diskfile.py modules alongside this one.
The DiskFileManager class is specific to the reference implementation and is not part of the backend API.
The remaining methods in this module are considered implementation specific and are also not considered part of the backend API.
Bases: object
Represents an object location to be audited.
Other than being a bucket of data, the only useful thing this does is stringify to a filesystem path so the auditor’s logs look okay.
Bases: object
Manage object files.
This specific implementation manages object files on a disk formatted with a POSIX-compliant file system that supports extended attributes as metadata on a file or directory.
Note
The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.
Parameters: |
|
---|
Context manager to create a file. We create a temporary file first, and then return a DiskFileWriter object to encapsulate the state.
Note
An implementation is not required to perform on-disk preallocations even if the parameter is specified. But if it does and it fails, it must raise a DiskFileNoSpace exception.
Parameters: size – optional initial size of file to explicitly allocate on disk
Raises DiskFileNoSpace: if a size is specified and allocation fails
Delete the object.
This implementation creates a tombstone file using the given timestamp, and removes any older versions of the object file. Any file that has an older timestamp than timestamp will be deleted.
Note
An implementation is free to use or ignore the timestamp parameter.
Parameters: timestamp – timestamp to compare with each file
Raises DiskFileError: this implementation will raise the same errors as the create() method
Provide the metadata for a previously opened object as a dictionary.
Returns: object’s metadata dictionary
Raises DiskFileNotOpen: if the swift.obj.diskfile.DiskFile.open() method was not previously invoked
Open the object.
This implementation opens the data file representing the object, reads the associated metadata in the extended attributes, additionally combining metadata from fast-POST .meta files.
Note
An implementation is allowed to raise any of the following exceptions, but is only required to raise DiskFileNotExist when the object representation does not exist.
Raises: |
|
---|---|
Returns: itself for use as a context manager
Return the metadata for an object without requiring the caller to open the object first.
Returns: metadata dictionary for an object
Raises DiskFileError: this implementation will raise the same errors as the open() method
Return a swift.common.swob.Response class compatible “app_iter” object as defined by swift.obj.diskfile.DiskFileReader.
For this implementation, the responsibility of closing the open file is passed to the swift.obj.diskfile.DiskFileReader object.
Parameters: |
|
---|---|
Returns: a swift.obj.diskfile.DiskFileReader object
Write a block of metadata to an object without requiring the caller to create the object first. Supports fast-POST behavior semantics.
Parameters: metadata – dictionary of metadata to be associated with the object
Raises DiskFileError: this implementation will raise the same errors as the create() method
Bases: object
Management class for devices, providing common place for shared parameters and methods not provided by the DiskFile class (which primarily services the object server REST API layer).
The get_diskfile() method is how this implementation creates a DiskFile object.
Note
This class is reference implementation specific and not part of the pluggable on-disk backend API.
Note
TODO(portante): Not sure what the right name to recommend here, as “manager” seemed generic enough, though suggestions are welcome.
Parameters: |
|
---|
Construct the path to a device without checking if it is mounted.
Parameters: device – name of target device
Returns: full path to the device
Return the path to a device, checking to see that it is a proper mount point based on a configuration parameter.
Parameters: |
|
---|---|
Returns: full path to the device, or None if the path to the device is not a proper mount point
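A minimal sketch of the mount-checked lookup, assuming the check amounts to os.path.ismount on the joined path. The parameter names devices and mount_check are assumptions made for illustration, not the documented signature.

```python
import os


def get_dev_path(devices, device, mount_check=True):
    """Return the path to a device, or None if it fails the mount check."""
    path = os.path.join(devices, device)
    # Assumption: a configuration flag decides whether the check applies.
    if mount_check and not os.path.ismount(path):
        return None
    return path
```

With mount_check disabled this behaves like the unchecked path construction described earlier.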
Returns a DiskFile instance for an object at the given object_hash. Just in case someone thinks of refactoring, be sure DiskFileDeleted is not raised, but the DiskFile instance representing the tombstoned object is returned instead.
Raises DiskFileNotExist: if the object does not exist
A context manager that will lock on the device given, if configured to do so.
Raises ReplicationLockTimeout: if the lock on the device cannot be granted within the configured timeout
Yields tuples of (full_path, hash_only, timestamp) for object information stored for the given device, partition, and (optionally) suffixes. If suffixes is None, all stored suffixes will be searched for object hashes. Note that if suffixes is not None but empty, such as [], then nothing will be yielded.
Yields tuples of (full_path, suffix_only) for suffixes stored on the given device and partition.
Bases: object
Encapsulation of the WSGI read context for servicing GET REST API requests. Serves as the context manager object for the swift.obj.diskfile.DiskFile class’s swift.obj.diskfile.DiskFile.reader() method.
Note
The quarantining behavior of this method is considered implementation specific, and is not required of the API.
Note
The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.
Parameters: |
|
---|
Returns an iterator over the data file for range (start, stop)
Returns an iterator over the data file for a set of ranges
Close the open file handle if present.
For this specific implementation, this method will handle quarantining the file if necessary.
Bases: object
Encapsulation of the write context for servicing PUT REST API requests. Serves as the context manager object for the swift.obj.diskfile.DiskFile class’s swift.obj.diskfile.DiskFile.create() method.
Note
It is the responsibility of the swift.obj.diskfile.DiskFile.create() method context manager to close the open file descriptor.
Note
The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.
Parameters: |
|
---|
Finalize writing the file on disk.
For this implementation, this method is responsible for renaming the temporary file to the final name and directory location. This method should be called after the final call to swift.obj.diskfile.DiskFileWriter.write().
Parameters: metadata – dictionary of metadata to be associated with the object
Write a chunk of data to disk. All invocations of this method must come before the file is finalized on disk.
For this implementation, the data is written into a temporary file.
Parameters: chunk – the chunk of data to write as a string object
Returns: the total number of bytes written to an object
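The write path described for this implementation (write into a temporary file, then rename it to its final name) can be sketched as follows. SketchWriter, create, and finalize are hypothetical stand-ins, not Swift's classes or method names, and the sketch uses bytes because it targets Python 3.

```python
import os
import tempfile
from contextlib import contextmanager


class SketchWriter:
    """Hypothetical stand-in for a writer object; not Swift's class."""

    def __init__(self, fd, tmp_path, final_path):
        self._fd = fd
        self._tmp_path = tmp_path
        self._final_path = final_path
        self._written = 0
        self.finalized = False

    def write(self, chunk):
        """Write a chunk to the temporary file; return total bytes so far."""
        while chunk:
            count = os.write(self._fd, chunk)
            self._written += count
            chunk = chunk[count:]
        return self._written

    def finalize(self):
        """Flush to disk, then rename the temporary file into place."""
        os.fsync(self._fd)
        os.rename(self._tmp_path, self._final_path)
        self.finalized = True


@contextmanager
def create(final_path):
    """Create a temporary file next to final_path and yield a writer."""
    directory = os.path.dirname(final_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    writer = SketchWriter(fd, tmp_path, final_path)
    try:
        yield writer
    finally:
        os.close(fd)  # the context manager owns the descriptor
        if not writer.finalized and os.path.exists(tmp_path):
            os.unlink(tmp_path)  # discard writes that were never finalized
```

Keeping the temporary file in the destination directory means the rename stays on one file system, so readers see either the old state or the complete new file.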
Get a list of hashes for the suffix dir. do_listdir causes it to mistrust the hash cache for suffix existence at the (unexpectedly high) cost of a listdir. reclaim_age is just passed on to hash_suffix.
Parameters: |
|
---|---|
Returns: tuple of (number of suffix dirs hashed, dictionary of hashes)
Given a simple list of file names, determine the files to use.
Parameters:
- files – simple set of files as a Python list
- datadir – directory name the files are from, for convenience
Returns: a tuple of data, meta and ts (tombstone) files, in one of two states:
- ts_file is not None, data_file is None, meta_file is None: the object is considered deleted
- data_file is not None, ts_file is None: the object exists, and optionally has fast-POST metadata
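The two result states can be illustrated with a simplified selection routine. It assumes file names of the form timestamp plus a .data, .meta, or .ts extension, so that a reverse sort puts the newest file first. The function name pick_ondisk_files and the all-None result for an empty directory are hypothetical, and the real selection rules may differ in edge cases.

```python
def pick_ondisk_files(files):
    """Return (data_file, meta_file, ts_file) from a list of file names."""
    meta_file = None
    # Newest first, assuming names sort by their timestamp prefix.
    for name in sorted(files, reverse=True):
        if name.endswith(".ts"):
            # A tombstone newer than any data file: the object is deleted.
            return None, None, name
        if name.endswith(".meta"):
            if meta_file is None:
                meta_file = name  # keep only the newest metadata file
        elif name.endswith(".data"):
            # The newest data file wins; older files are ignored.
            return name, meta_file, None
    # Assumption: nothing usable on disk is reported as all None.
    return None, None, None
```

For example, a directory holding a data file and a newer .meta file yields both names, while a tombstone newer than the data file yields only the tombstone.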
List contents of a hash directory and clean up any old files.
Parameters: |
|
---|---|
Returns: list of files remaining in the directory, reverse sorted
Performs reclamation and returns an md5 of all (remaining) files.
Parameters: reclaim_age – age in seconds at which to remove tombstones
Raises: |
|
Invalidates the hash for a suffix_dir in the partition’s hashes file.
Parameters: suffix_dir – absolute path to suffix dir whose hash needs invalidating
Given a devices path (e.g. “/srv/node”), yield an AuditLocation for all objects stored under that directory if device_dirs isn’t set. If device_dirs is set, only yield AuditLocation for the objects under the entries in device_dirs. The AuditLocation only knows the path to the hash directory, not to the .data file therein (if any). This is to avoid a double listdir(hash_dir); the DiskFile object will always do one, so we don’t.
Parameters:
- device_dirs – a list of directories under devices to traverse
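A rough sketch of such a generator is shown below. It assumes an on-disk layout of devices/device/objects/partition/suffix/hash, which is not spelled out in the text above, and it yields plain tuples rather than AuditLocation objects. The names audit_locations and _subdirs are hypothetical.

```python
import os


def _subdirs(path):
    """Return (name, full_path) for each directory directly under path."""
    if not os.path.isdir(path):
        return []
    entries = ((name, os.path.join(path, name))
               for name in sorted(os.listdir(path)))
    return [(name, full) for name, full in entries if os.path.isdir(full)]


def audit_locations(devices, device_dirs=None):
    """Yield (hash_dir, device, partition) for every object hash directory."""
    # Fall back to every device directory when device_dirs is not set.
    for device in device_dirs or [name for name, _ in _subdirs(devices)]:
        # Assumption: object data lives under an "objects" directory.
        objects_dir = os.path.join(devices, device, "objects")
        for partition, part_path in _subdirs(objects_dir):
            for _suffix, suffix_path in _subdirs(part_path):
                for _hash, hash_path in _subdirs(suffix_path):
                    # Stop at the hash directory; its contents are not listed.
                    yield hash_path, device, partition
```

Stopping at the hash directory mirrors the point made above: listing its contents is left to whoever opens the object.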
In the case that a file is corrupted, move it to a quarantined area to allow replication to fix it.
Parameters:
- device_path – the path to the device the corrupted file is on
- corrupted_file_path – the path to the file you want quarantined
Returns: path (str) of the directory the file was moved to
Raises OSError: re-raises any exception from rename other than errno.EEXIST or errno.ENOTEMPTY
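The error handling described above might look roughly like this. The sketch makes several assumptions that the text does not confirm: the whole directory containing the corrupted file is moved, the destination is a directory named quarantined under the device, and a name collision is resolved by retrying under a unique suffix.

```python
import errno
import os
import uuid


def quarantine(device_path, corrupted_file_path):
    """Move the directory holding a corrupted file into a quarantine area."""
    from_dir = os.path.dirname(corrupted_file_path)
    # Assumption: quarantined data lives under <device>/quarantined/.
    to_dir = os.path.join(device_path, "quarantined",
                          os.path.basename(from_dir))
    os.makedirs(os.path.dirname(to_dir), exist_ok=True)
    try:
        os.rename(from_dir, to_dir)
    except OSError as err:
        if err.errno not in (errno.EEXIST, errno.ENOTEMPTY):
            raise  # anything else is unexpected; let the caller see it
        # Destination already taken: retry under a unique name.
        to_dir = "%s-%s" % (to_dir, uuid.uuid4().hex)
        os.rename(from_dir, to_dir)
    return to_dir
```

Once the directory is out of the way, replication can restore a good copy in the original location.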
Helper function to read the pickled metadata from an object file.
Parameters: fd – file descriptor or filename to load the metadata from
Returns: dictionary of metadata
Helper function to write pickled metadata for an object file.
Parameters: |
|
---|
Bases: swift.common.daemon.Daemon
Replicate objects.
Encapsulates most logic and data needed by the object replication process. Each call to .replicate() performs one replication pass. It’s up to the caller to do this in a loop.
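Since each replicate() call performs a single pass, the calling loop could be as simple as the sketch below. SketchReplicator is a hypothetical stand-in, and the interval, max_passes, and sleep parameters exist only to make the example testable. Only the replicate() name comes from the text.

```python
import time


class SketchReplicator:
    """Hypothetical stand-in; only the replicate() name comes from the text."""

    def __init__(self):
        self.passes = 0

    def replicate(self):
        self.passes += 1  # a real pass would sync partitions here


def run_once(replicator):
    """Perform exactly one replication pass."""
    replicator.replicate()


def run_forever(replicator, interval=30.0, max_passes=None, sleep=time.sleep):
    """Call replicate() repeatedly, pausing between passes."""
    done = 0
    while max_passes is None or done < max_passes:
        replicator.replicate()
        done += 1
        sleep(interval)  # assumption: a fixed pause between passes
```

Separating the pass from the loop lets the same object serve both a one-shot run and a long-running daemon.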
Check to see if the ring has been updated
Returns: boolean indicating whether or not the ring has changed
Returns a sorted list of jobs (dictionaries) that specify the partitions, nodes, etc to be synced.
In testing, the pool.waitall() call very occasionally failed to return. This is an attempt to make sure the replicator eventually finishes its replication pass even when that happens.
Loop that runs in the background during replication. It periodically logs progress.
Utility function that kills all coroutines currently running.
Run a replication pass
Uses rsync to implement the sync method. This was the first sync method in Swift.
Logs various stats for the currently running replication pass.
Synchronize local suffix directories from a partition with a remote node.
Parameters: |
|
---|---|
Returns: boolean indicating success or failure
High-level method that replicates a single partition.
Parameters: job – a dict containing info about the partition to be replicated
High-level method that replicates a single partition that doesn’t belong on this node.
Parameters: job – a dict containing info about the partition to be replicated
Bases: object
Sends REPLICATION requests to the object server.
These requests are eventually handled by ssync_receiver and full documentation about the process is there.
Establishes a connection and starts a REPLICATION request with the object server.
Closes down the connection to the object server once done with the REPLICATION request.
Handles the sender-side of the MISSING_CHECK step of a REPLICATION request.
Full documentation of this can be found at Receiver.missing_check().
Reads a line from the REPLICATION response body.
httplib has no readline and will block on read(x) until x is read, so we have to do the work ourselves. A bit of this is taken from Python’s httplib itself.
Sends a DELETE subrequest with the given information.
Sends a PUT subrequest for the url_path using the source df (DiskFile) and content_length.
Handles the sender-side of the UPDATES step of a REPLICATION request.
Full documentation of this can be found at Receiver.updates().
Bases: object
Handles incoming REPLICATION requests to the object server.
These requests come from the object-replicator daemon that uses ssync_sender.
The number of concurrent REPLICATION requests is restricted by use of a replication_semaphore and can be configured with the object-server.conf [object-server] replication_concurrency setting.
A REPLICATION request is really just an HTTP conduit for sender/receiver replication communication. The overall REPLICATION request should always succeed, but it will contain multiple requests within its request and response bodies. This “hack” is done so that replication concurrency can be managed.
The general process inside a REPLICATION request is:
- Initialize the request: Basic request validation, mount check, acquire semaphore lock, etc.
- Missing check: Sender sends the hashes and timestamps of the object information it can send, receiver sends back the hashes it wants (doesn’t have or has an older timestamp).
- Updates: Sender sends the object information requested.
- Close down: Release semaphore lock, etc.
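The acquire and release bracketing in the steps above can be sketched with a standard-library semaphore. This is a single-process illustration only: handle_replication and its return values are hypothetical, and rejecting a request when no slot is free (rather than waiting) is an assumption.

```python
import threading


def handle_replication(semaphore, steps):
    """Run the request steps only if a concurrency slot is free."""
    # Assumption: a request that cannot get a slot is turned away at once.
    if not semaphore.acquire(blocking=False):
        return "busy"  # hypothetical marker, not the real response
    try:
        for step in steps:
            step()  # e.g. the missing check, then the updates
        return "done"
    finally:
        semaphore.release()  # close down: always free the slot


# One slot, standing in for a concurrency setting of 1.
slots = threading.BoundedSemaphore(1)
print(handle_replication(slots, [lambda: None]))
```

Releasing in a finally block keeps a failed step from permanently consuming a slot.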
Basic validation of request and mount check.
This function will be called before attempting to acquire a replication semaphore lock, so contains only quick checks.
Handles the receiver-side of the MISSING_CHECK step of a REPLICATION request.
Receives a list of hashes and timestamps of object information the sender can provide and responds with a list of hashes desired, either because they’re missing or have an older timestamp locally.
The process is generally:
- Sender sends :MISSING_CHECK: START and begins sending hash timestamp lines.
- Receiver gets :MISSING_CHECK: START and begins reading the hash timestamp lines, collecting the hashes of those it desires.
- Sender sends :MISSING_CHECK: END.
- Receiver gets :MISSING_CHECK: END, responds with :MISSING_CHECK: START, followed by the list of hashes it collected as being wanted (one per line), :MISSING_CHECK: END, and flushes any buffers.
- Sender gets :MISSING_CHECK: START and reads the list of hashes desired by the receiver until reading :MISSING_CHECK: END.
The collection and then response is so the sender doesn’t have to read while it writes to ensure network buffers don’t fill up and block everything.
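The exchange above can be modelled on plain lists of lines. In this sketch both function names are hypothetical, each hash line is assumed to be the hash and the timestamp separated by whitespace, and timestamps are assumed to be fixed-width strings so that string comparison matches chronological order.

```python
def sender_missing_check(available):
    """Yield the lines a sender would emit for (hash, timestamp) pairs."""
    yield ":MISSING_CHECK: START"
    for object_hash, timestamp in available:
        yield "%s %s" % (object_hash, timestamp)
    yield ":MISSING_CHECK: END"


def receiver_missing_check(lines, local):
    """Collect wanted hashes from sender lines, then build the response."""
    lines = iter(lines)
    if next(lines) != ":MISSING_CHECK: START":
        raise ValueError("expected :MISSING_CHECK: START")
    wanted = []
    for line in lines:
        if line == ":MISSING_CHECK: END":
            break
        object_hash, timestamp = line.split()
        # Want it if we lack it or hold an older timestamp.
        if object_hash not in local or local[object_hash] < timestamp:
            wanted.append(object_hash)
    else:
        raise ValueError("missing :MISSING_CHECK: END")
    # Respond only after reading everything, as described above.
    return [":MISSING_CHECK: START"] + wanted + [":MISSING_CHECK: END"]
```

The receiver builds its whole response before sending anything, which is the collect-then-respond behavior the text calls out.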
Handles the UPDATES step of a REPLICATION request.
Receives a set of PUT and DELETE subrequests that will be routed to the object server itself for processing. These contain the information requested by the MISSING_CHECK step.
The PUT and DELETE subrequests are formatted pretty much exactly like regular HTTP requests, excepting the HTTP version on the first request line.
The process is generally:
- Sender sends :UPDATES: START and begins sending the PUT and DELETE subrequests.
- Receiver gets :UPDATES: START and begins routing the subrequests to the object server.
- Sender sends :UPDATES: END.
- Receiver gets :UPDATES: END and sends :UPDATES: START and :UPDATES: END (assuming no errors).
- Sender gets :UPDATES: START and :UPDATES: END.
If too many subrequests fail, as configured by replication_failure_threshold and replication_failure_ratio, the receiver will hang up the request early so as to not waste any more time.
At step 4, the receiver sends back an error if any subrequests failed without triggering the early hangup described above, so the sender knows the exchange as a whole was not entirely successful. The sender uses this, for example, to decide whether it can remove an out-of-place partition.
Object Server for Swift
Bases: object
Implements the WSGI application for the Swift Object Server.
Handle HTTP DELETE requests for the Swift Object Server.
Handle HTTP GET requests for the Swift Object Server.
Handle HTTP HEAD requests for the Swift Object Server.
Handle HTTP POST requests for the Swift Object Server.
Handle HTTP PUT requests for the Swift Object Server.
Handle REPLICATE requests for the Swift Object Server. This is used by the object replicator to get hashes for directories.
Sends or saves an async update.
Parameters: |
|
---|
Update the container when objects are updated.
Parameters: |
|
---|
Update the expiring objects container when objects are updated.
Parameters: |
|
---|
Utility method for instantiating a DiskFile object supporting a given REST API.
An implementation of the object server that wants to use a different DiskFile class would simply over-ride this method to provide that behavior.
Implementation specific setup. This method is called at the very end by the constructor to allow a specific implementation to modify existing attributes or add its own attributes.
Parameters: conf – WSGI configuration parameter
paste.deploy app factory for creating WSGI object server apps
Callback for swift.common.wsgi.run_wsgi during the global_conf creation so that we can add our replication_semaphore, used to limit the number of concurrent REPLICATION requests across all workers.
Parameters: |
|
---|
Bases: swift.common.daemon.Daemon
Update object information in container listings.
Get the container ring. Load it, if it hasn’t been yet.
If there are async pendings on the device, walk each one and update.
Parameters: device – path to device
Perform the object update to the container
Parameters: |
|
---|
Process the object information to be updated and update.
Parameters: |
|
---|
Run the updater continuously.
Run the updater once.