A cloud will always have bugs. Some of these will be security problems. For this reason, it is critically important to be prepared to apply security updates and general software updates. This involves smart use of configuration management tools, which are discussed below. This also involves knowing when an upgrade is necessary.
For announcements regarding security relevant changes, subscribe to the OpenStack Announce mailing list. The security notifications are also posted through the downstream packages for example through Linux distributions that you may be subscribed to as part of the package updates.
The OpenStack components are only a small fraction of the software in a cloud. It is important to keep up to date with all of these other components, too. While the specific data sources will be deployment specific, the key idea is to ensure that a cloud administrator subscribes to the necessary mailing lists for receiving notification of any related security updates. Often this is as simple as tracking an upstream Linux distribution.
Note | |
---|---|
OpenStack releases security information through two channels.
|
After you are notified of a security update, the next step is to determine how critical this update is to a given cloud deployment. In this case, it is useful to have a pre-defined policy. Existing vulnerability rating systems such as the common vulnerability scoring system (CVSS) v2 do not properly account for cloud deployments.
In this example we introduce a scoring matrix that places vulnerabilities in three categories: Privilege Escalation, Denial of Service and Information Disclosure. Understanding the type of vulnerability and where it occurs in your infrastructure will enable you to make reasoned response decisions.
Privilege Escalation describes the ability of a user to act with the privileges of some other user in a system, bypassing appropriate authorization checks. For example, a standard Linux user running code or performing an operation that allows them to conduct further operations with the privileges of the root users on the system.
Denial of Service refers to an exploited vulnerability that may cause service or system disruption. This includes both distributed attacks to overwhelm network resources, and single-user attacks that are typically caused through resource allocation bugs or input induced system failure flaws.
Information Disclosure vulnerabilities reveal information about your system or operations. These vulnerabilities range from debugging information disclosure, to exposure of critical security data, such as authentication credentials and passwords.
|
Attacker Position / Privilege Level |
|||
|
External |
Cloud User |
Cloud Admin |
Control Plane |
Privilege Elevation (3 levels) |
Critical |
n/a |
n/a |
n/a |
Privilege Elevation (2 levels) |
Critical |
Critical |
n/a |
n/a |
Privilege Elevation (1 level) |
Critical |
Critical |
Critical |
n/a |
Denial of Service |
High |
Medium |
Low |
Low |
Information Disclosure |
Critical / High |
Critical / High |
Medium / Low |
Low |
This table illustrates a generic approach to measuring the impact of a vulnerability based on where it occurs in your deployment and the effect. For example, a single level privilege escalation on a Compute API node potentially allows a standard user of the API to escalate to have the same privileges as the root user on the node.
We suggest that cloud administrators use this table as a model to help define which actions to take for the various security levels. For example, a critical-level security update might require the cloud to be upgraded on a specified time line, whereas a low-level update might be more relaxed.
You should test any update before you deploy it in a production environment. Typically this requires having a separate test cloud setup that first receives the update. This cloud should be as close to the production cloud as possible, in terms of software and hardware. Updates should be tested thoroughly in terms of performance impact, stability, application impact, and more. Especially important is to verify that the problem theoretically addressed by the update, such as a specific vulnerability, is actually fixed.
A production quality cloud should always use tools to automate configuration and deployment. This eliminates human error, and allows the cloud to scale much more rapidly. Automation also helps with continuous integration and testing.
When building an OpenStack cloud it is strongly recommended to approach your design and implementation with a configuration management tool or framework in mind. Configuration management allows you to avoid the many pitfalls inherent in building, managing, and maintaining an infrastructure as complex as OpenStack. By producing the manifests, cookbooks, or templates required for a configuration management utility, you are able to satisfy a number of documentation and regulatory reporting requirements. Further, configuration management can also function as part of your BCP and DR plans wherein you can rebuild a node or service back to a known state in a DR event or given a compromise.
Additionally, when combined with a version control system
such as Git or SVN, you can track changes to your environment
over time and re-mediate unauthorized changes that may occur.
For example, a nova.conf
file or other
configuration file falls out of compliance with your standard,
your configuration management tool can revert or replace the
file and bring your configuration back into a known state.
Finally a configuration management tool can also be used to
deploy updates; simplifying the security patch process. These
tools have a broad range of capabilities that are useful in this
space. The key point for securing your cloud is to choose a tool
for configuration management and use it.
There are many configuration management solutions; at the time of this writing there are two in the marketplace that are robust in their support of OpenStack environments: Chef and Puppet. A non-exhaustive listing of tools in this space is provided below:
Chef
Puppet
Salt Stack
Ansible
It is important to include Backup procedures and policies in the overall System Security Plan. For a good overview of OpenStack's Backup and Recovery capabilities and procedures, please refer to the OpenStack Operations Guide.
Ensure only authenticated users and backup clients have access to the backup server.
Use data encryption options for storage and transmission of backups.
Use a dedicated and hardened backup servers. The logs for the backup server must be monitored daily and accessible by only few individuals.
Test data recovery options regularly. One of the things that can be restored from secured backups is the images. In case of a compromise, the best practice would be to terminate running instances immediately and then relaunch the instances from the images in the secured backup repository.
Security auditing tools can complement the configuration management tools. Security auditing tools automate the process of verifying that a large number of security controls are satisfied for a given system configuration. These tools help to bridge the gap from security configuration guidance documentation (for example, the STIG and NSA Guides) to a specific system installation. For example, SCAP can compare a running system to a pre-defined profile. SCAP outputs a report detailing which controls in the profile were satisfied, which ones failed, and which ones were not checked.
Combining configuration management and security auditing tools creates a powerful combination. The auditing tools will highlight deployment concerns. And the configuration management tools simplify the process of changing each system to address the audit concerns. Used together in this fashion, these tools help to maintain a cloud that satisfies security requirements ranging from basic hardening to compliance validation.
Configuration management and security auditing tools will introduce another layer of complexity into the cloud. This complexity brings additional security concerns with it. We view this as an acceptable risk trade-off, given their security benefits. Securing the operational use of these tools is beyond the scope of this guide.