An often-cited reason for designing applications by using cloud patterns is the ability to scale out. That is: to add additional resources, as required. Contrast this strategy to the previous one of increasing capacity by scaling up the size of existing resources. To scale out, you must:
The Introduction to the fractals application architecture section describes how to build in a modular fashion, create an API, and other aspects of the application architecture. Now you will see why those strategies are so important. By creating a modular application with decoupled services, you can identify components that cause application performance bottlenecks and scale them out. Just as importantly, you can also remove resources when they are no longer necessary. It is very difficult to overstate the cost savings that this feature can bring, as compared to traditional infrastructure.
Of course, having access to additional resources is only part of the game plan; while you can manually add or delete resources, you get more value and more responsiveness if the application automatically requests additional resources when it needs them.
This section continues to illustrate the separation of services onto multiple instances and highlights some of the choices that we have made that facilitate scalability in the application architecture.
You will progressively ramp up to use up six instances, so make sure that your cloud account has the appropriate quota.
The previous section uses two virtual machines - one ‘control’ service and one ‘worker’. The speed at which your application can generate fractals depends on the number of workers. With just one worker, you can produce only one fractal at a time. Before long, you will need more resources.
Note
If you do not have a working application, follow the steps in Introduction to the fractals application architecture to create one.
To test what happens when the Fractals application is under load, you can:
Use SSH with the existing SSH keypair to log in to the
app-controller
controller instance.
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_CONTROLLER
Note
Replace IP_CONTROLLER
with the IP address of the
controller instance and USERNAME with the appropriate
user name.
Call the faafo
command-line interface to request the
generation of five large fractals.
$ faafo create --height 9999 --width 9999 --tasks 5
If you check the load on the worker, you can see that the instance is not doing well. On the single CPU flavor instance, a load average greater than 1 means that the server is at capacity.
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_WORKER uptime
10:37:39 up 1:44, 2 users, load average: 1.24, 1.40, 1.36
Note
Replace IP_WORKER
with the IP address of the worker
instance and USERNAME with the appropriate user name.
API load is a slightly different problem than the previous one regarding capacity to work. We can simulate many requests to the API, as follows:
Use SSH with the existing SSH keypair to log in to the
app-controller
controller instance.
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_CONTROLLER
Note
Replace IP_CONTROLLER
with the IP address of the
controller instance and USERNAME with the appropriate
user name.
Use a for loop to call the faafo
command-line interface to
request a random set of fractals 500 times:
$ for i in $(seq 1 500); do faafo --endpoint-url http://IP_CONTROLLER create & done
Note
Replace IP_CONTROLLER
with the IP address of the
controller instance.
If you check the load on the app-controller
API service
instance, you see that the instance is not doing well. On your single
CPU flavor instance, a load average greater than 1 means that the server is
at capacity.
$ uptime
10:37:39 up 1:44, 2 users, load average: 1.24, 1.40, 1.36
The sheer number of requests means that some requests for fractals might not make it to the message queue for processing. To ensure that you can cope with demand, you must also scale out the API capability of the Fractals application.
Go ahead and delete the existing instances and security groups that you created in previous sections. Remember, when instances in the cloud are no longer working, remove them and re-create something new.
for instance in conn.list_nodes():
if instance.name in ['all-in-one','app-worker-1', 'app-worker-2', 'app-controller']:
print('Destroying Instance: %s' % instance.name)
conn.destroy_node(instance)
for group in conn.ex_list_security_groups():
if group.name in ['control', 'worker', 'api', 'services']:
print('Deleting security group: %s' % group.name)
conn.ex_delete_security_group(group)
As you change the topology of your applications, you must update or create security groups. Here, you re-create the required security groups.
api_group = conn.ex_create_security_group('api', 'for API services only')
conn.ex_create_security_group_rule(api_group, 'TCP', 80, 80)
conn.ex_create_security_group_rule(api_group, 'TCP', 22, 22)
worker_group = conn.ex_create_security_group('worker', 'for services that run on a worker node')
conn.ex_create_security_group_rule(worker_group, 'TCP', 22, 22)
controller_group = conn.ex_create_security_group('control', 'for services that run on a control node')
conn.ex_create_security_group_rule(controller_group, 'TCP', 22, 22)
conn.ex_create_security_group_rule(controller_group, 'TCP', 80, 80)
conn.ex_create_security_group_rule(controller_group, 'TCP', 5672, 5672, source_security_group=worker_group)
services_group = conn.ex_create_security_group('services', 'for DB and AMQP services only')
conn.ex_create_security_group_rule(services_group, 'TCP', 22, 22)
conn.ex_create_security_group_rule(services_group, 'TCP', 3306, 3306, source_security_group=api_group)
conn.ex_create_security_group_rule(services_group, 'TCP', 5672, 5672, source_security_group=worker_group)
conn.ex_create_security_group_rule(services_group, 'TCP', 5672, 5672, source_security_group=api_group)
Define a short function to locate unused or allocate floating IPs. This saves a few lines of code and prevents you from reaching your floating IP quota too quickly.
def get_floating_ip(conn):
'''A helper function to re-use available Floating IPs'''
unused_floating_ip = None
for floating_ip in conn.ex_list_floating_ips():
if not floating_ip.node_id:
unused_floating_ip = floating_ip
break
if not unused_floating_ip:
pool = conn.ex_list_floating_ip_pools()[0]
unused_floating_ip = pool.create_floating_ip()
return unused_floating_ip
Before you scale out your application services, like the API service or the
workers, you must add a central database and an app-services
messaging
instance. The database and messaging queue will be used to track the state of
fractals and to coordinate the communication between the services.
userdata = '''#!/usr/bin/env bash curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \ -i database -i messaging ''' instance_services = conn.create_node(name='app-services', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[services_group]) instance_services = conn.wait_until_running([instance_services])[0][0] services_ip = instance_services.private_ips[0]
With multiple workers producing fractals as fast as they can, the system must be able to receive the requests for fractals as quickly as possible. If our application becomes popular, many thousands of users might connect to our API to generate fractals.
Armed with a security group, image, and flavor size, you can add multiple API services:
userdata = '''#!/usr/bin/env bash curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \ -i faafo -r api -m 'amqp://guest:guest@%(services_ip)s:5672/' \ -d 'mysql+pymysql://faafo:password@%(services_ip)s:3306/faafo' ''' % { 'services_ip': services_ip } instance_api_1 = conn.create_node(name='app-api-1', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[api_group]) instance_api_2 = conn.create_node(name='app-api-2', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[api_group]) instance_api_1 = conn.wait_until_running([instance_api_1])[0][0] api_1_ip = instance_api_1.private_ips[0] instance_api_2 = conn.wait_until_running([instance_api_2])[0][0] api_2_ip = instance_api_2.private_ips[0] for instance in [instance_api_1, instance_api_2]: floating_ip = get_floating_ip(conn) conn.ex_attach_floating_ip_to_node(instance, floating_ip) print('allocated %(ip)s to %(host)s' % {'ip': floating_ip.ip_address, 'host': instance.name})
These services are client-facing, so unlike the workers they do not use a message queue to distribute tasks. Instead, you must introduce some kind of load balancing mechanism to share incoming requests between the different API services.
A simple solution is to give half of your friends one address and half the other, but that solution is not sustainable. Instead, you can use a DNS round robin to do that automatically. However, OpenStack networking can provide Load Balancing as a Service, which Networking explains.
To increase the overall capacity, add three workers:
userdata = '''#!/usr/bin/env bash curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \ -i faafo -r worker -e 'http://%(api_1_ip)s' -m 'amqp://guest:guest@%(services_ip)s:5672/' ''' % {'api_1_ip': api_1_ip, 'services_ip': services_ip} instance_worker_1 = conn.create_node(name='app-worker-1', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[worker_group]) instance_worker_2 = conn.create_node(name='app-worker-2', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[worker_group]) instance_worker_3 = conn.create_node(name='app-worker-3', image=image, size=flavor, ex_keyname='demokey', ex_userdata=userdata, ex_security_groups=[worker_group])
Adding this capacity enables you to deal with a higher number of requests for fractals. As soon as these worker instances start, they begin checking the message queue for requests, reducing the overall backlog like a new register opening in the supermarket.
This process was obviously a very manual one. Figuring out that we needed more workers and then starting new ones required some effort. Ideally the system would do this itself. If you build your application to detect these situations, you can have it automatically request and remove resources, which saves you the effort of doing this work yourself. Instead, the OpenStack Orchestration service can monitor load and start instances, as appropriate. To find out how to set that up, see Orchestration.
In the previous steps, you split out several services and expanded capacity. To see the new features of the Fractals application, SSH to one of the app instances and create a few fractals.
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_API_1
Note
Replace IP_API_1
with the IP address of the first
API instance and USERNAME with the appropriate user name.
Use the faafo create
command to generate fractals.
Use the faafo list
command to watch the progress of fractal
generation.
Use the faafo UUID
command to examine some of the fractals.
The generated_by field shows the worker that created the fractal. Because multiple worker instances share the work, fractals are generated more quickly and users might not even notice when a worker fails.
root@app-api-1:# faafo list
+--------------------------------------+------------------+-------------+
| UUID | Dimensions | Filesize |
+--------------------------------------+------------------+-------------+
| 410bca6e-baa7-4d82-9ec0-78e409db7ade | 295 x 738 pixels | 26283 bytes |
| 66054419-f721-492f-8964-a5c9291d0524 | 904 x 860 pixels | 78666 bytes |
| d123e9c1-3934-4ffd-8b09-0032ca2b6564 | 952 x 382 pixels | 34239 bytes |
| f51af10a-084d-4314-876a-6d0b9ea9e735 | 877 x 708 pixels | 93679 bytes |
+--------------------------------------+------------------+-------------+
root@app-api-1:# faafo show d123e9c1-3934-4ffd-8b09-0032ca2b6564
+--------------+------------------------------------------------------------------+
| Parameter | Value |
+--------------+------------------------------------------------------------------+
| uuid | d123e9c1-3934-4ffd-8b09-0032ca2b6564 |
| duration | 1.671410 seconds |
| dimensions | 952 x 382 pixels |
| iterations | 168 |
| xa | -2.61217 |
| xb | 3.98459 |
| ya | -1.89725 |
| yb | 2.36849 |
| size | 34239 bytes |
| checksum | d2025a9cf60faca1aada854d4cac900041c6fa762460f86ab39f42ccfe305ffe |
| generated_by | app-worker-2 |
+--------------+------------------------------------------------------------------+
root@app-api-1:# faafo show 66054419-f721-492f-8964-a5c9291d0524
+--------------+------------------------------------------------------------------+
| Parameter | Value |
+--------------+------------------------------------------------------------------+
| uuid | 66054419-f721-492f-8964-a5c9291d0524 |
| duration | 5.293870 seconds |
| dimensions | 904 x 860 pixels |
| iterations | 348 |
| xa | -2.74108 |
| xb | 1.85912 |
| ya | -2.36827 |
| yb | 2.7832 |
| size | 78666 bytes |
| checksum | 1f313aaa36b0f616b5c91bdf5a9dc54f81ff32488ce3999f87a39a3b23cf1b14 |
| generated_by | app-worker-1 |
+--------------+------------------------------------------------------------------+
The fractals are now available from any of the app-api hosts. To verify, visit http://IP_API_1/fractal/FRACTAL_UUID and http://IP_API_2/fractal/FRACTAL_UUID. You now have multiple redundant web services. If one fails, you can use the others.
Note
Replace IP_API_1
and IP_API_2
with the
corresponding floating IPs. Replace FRACTAL_UUID with the UUID
of an existing fractal.
Go ahead and test the fault tolerance. Start deleting workers and API instances. As long as you have one of each, your application is fine. However, be aware of one weak point. The database contains the fractals and fractal metadata. If you lose that instance, the application stops. Future sections will explain how to address this weak point.
If you had a load balancer, you could distribute this load between the two different API services. You have several options. The Networking section shows you one option.
In theory, you could use a simple script to monitor the load on your workers and API services and trigger the creation of instances, which you already know how to do. Congratulations! You are ready to create scalable cloud applications.
Of course, creating a monitoring system for a single application might not make sense. To learn how to use the OpenStack Orchestration monitoring and auto-scaling capabilities to automate these steps, see Orchestration.
You should be fairly confident about starting instances and distributing services from an application among these instances.
As mentioned in Introduction to the fractals application architecture, the generated fractal images are
saved on the local file system of the API service instances. Because
you have multiple API instances up and running, the fractal images are
spread across multiple API services, which causes a number of
IOError: [Errno 2] No such file or directory
exceptions when
trying to download a fractal image from an API service instance that
does not have the fractal image on its local file system.
Go to Make it durable to learn how to use Object Storage to solve this problem in a elegant way. Or, you can proceed to one of these sections:
This file contains all the code from this tutorial section. This comprehensive code sample lets you view and run the code as a single script.
Before you run this script, confirm that you have set your authentication information, the flavor ID, and image ID.
# step-1
for instance in conn.list_nodes():
if instance.name in ['all-in-one','app-worker-1', 'app-worker-2', 'app-controller']:
print('Destroying Instance: %s' % instance.name)
conn.destroy_node(instance)
for group in conn.ex_list_security_groups():
if group.name in ['control', 'worker', 'api', 'services']:
print('Deleting security group: %s' % group.name)
conn.ex_delete_security_group(group)
# step-2
api_group = conn.ex_create_security_group('api', 'for API services only')
conn.ex_create_security_group_rule(api_group, 'TCP', 80, 80)
conn.ex_create_security_group_rule(api_group, 'TCP', 22, 22)
worker_group = conn.ex_create_security_group('worker', 'for services that run on a worker node')
conn.ex_create_security_group_rule(worker_group, 'TCP', 22, 22)
controller_group = conn.ex_create_security_group('control', 'for services that run on a control node')
conn.ex_create_security_group_rule(controller_group, 'TCP', 22, 22)
conn.ex_create_security_group_rule(controller_group, 'TCP', 80, 80)
conn.ex_create_security_group_rule(controller_group, 'TCP', 5672, 5672, source_security_group=worker_group)
services_group = conn.ex_create_security_group('services', 'for DB and AMQP services only')
conn.ex_create_security_group_rule(services_group, 'TCP', 22, 22)
conn.ex_create_security_group_rule(services_group, 'TCP', 3306, 3306, source_security_group=api_group)
conn.ex_create_security_group_rule(services_group, 'TCP', 5672, 5672, source_security_group=worker_group)
conn.ex_create_security_group_rule(services_group, 'TCP', 5672, 5672, source_security_group=api_group)
# step-3
def get_floating_ip(conn):
'''A helper function to re-use available Floating IPs'''
unused_floating_ip = None
for floating_ip in conn.ex_list_floating_ips():
if not floating_ip.node_id:
unused_floating_ip = floating_ip
break
if not unused_floating_ip:
pool = conn.ex_list_floating_ip_pools()[0]
unused_floating_ip = pool.create_floating_ip()
return unused_floating_ip
# step-4
userdata = '''#!/usr/bin/env bash
curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \
-i database -i messaging
'''
instance_services = conn.create_node(name='app-services',
image=image,
size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[services_group])
instance_services = conn.wait_until_running([instance_services])[0][0]
services_ip = instance_services.private_ips[0]
# step-5
userdata = '''#!/usr/bin/env bash
curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \
-i faafo -r api -m 'amqp://guest:guest@%(services_ip)s:5672/' \
-d 'mysql+pymysql://faafo:password@%(services_ip)s:3306/faafo'
''' % { 'services_ip': services_ip }
instance_api_1 = conn.create_node(name='app-api-1',
image=image,
size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[api_group])
instance_api_2 = conn.create_node(name='app-api-2',
image=image,
size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[api_group])
instance_api_1 = conn.wait_until_running([instance_api_1])[0][0]
api_1_ip = instance_api_1.private_ips[0]
instance_api_2 = conn.wait_until_running([instance_api_2])[0][0]
api_2_ip = instance_api_2.private_ips[0]
for instance in [instance_api_1, instance_api_2]:
floating_ip = get_floating_ip(conn)
conn.ex_attach_floating_ip_to_node(instance, floating_ip)
print('allocated %(ip)s to %(host)s' % {'ip': floating_ip.ip_address, 'host': instance.name})
# step-6
userdata = '''#!/usr/bin/env bash
curl -L -s http://git.openstack.org/cgit/openstack/faafo/plain/contrib/install.sh | bash -s -- \
-i faafo -r worker -e 'http://%(api_1_ip)s' -m 'amqp://guest:guest@%(services_ip)s:5672/'
''' % {'api_1_ip': api_1_ip, 'services_ip': services_ip}
instance_worker_1 = conn.create_node(name='app-worker-1',
image=image, size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[worker_group])
instance_worker_2 = conn.create_node(name='app-worker-2',
image=image, size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[worker_group])
instance_worker_3 = conn.create_node(name='app-worker-3',
image=image, size=flavor,
ex_keyname='demokey',
ex_userdata=userdata,
ex_security_groups=[worker_group])
# step-7
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License. See all OpenStack Legal Documents.