About Quality of Service (QoS)

This document introduces the concepts behind Quality of Service (QoS). For a description of how to use Symbian's QoS API, see About the QoS API.

As the demand for network services increases in scope and variety, network carriers need a means of sharing network resources among their subscribers. Some subscribers demand high bandwidth for services such as streaming video; others are satisfied with voice telephony and messaging. The service provider needs a means of dealing effectively with these competing demands and of charging for them appropriately, which in turn requires tools to limit the rate and volume of traffic entering and passing through the network. These tools need to be sophisticated enough to adapt to changes in the environment, so that the resources available to the network at any time can be allocated effectively.

The allocation of shared network resources, together with the tools used to manage it, is known as Quality of Service (QoS). QoS has to be implemented at every stage of the network for the overall QoS measurement and control to be effective. The aim of the QoS measurement is threefold:

The measurement must reflect the performance of the network as the end user perceives it.
The measurement must be useful for network management and the distribution of shared resources.
The measurement must accurately reflect the cost of carrying the traffic, so that any level of QoS can be charged for at a reasonable rate.

The General Characteristics of a Network

The Quality of Service (QoS) of any communications network is defined, in the most general sense, by three things: bandwidth, latency and reliability. Other defining factors, such as coverage, security and interoperability, are taken as given.

To put a human perspective on these three factors, imagine yourself in the following situations.

Lack of Latency

As the Spanish ambassador to the court of the English king, Henry VII, in 1495, you are expected to be in regular communication with your superiors in Madrid. Bandwidth is not a problem as long as your letters are light enough to be carried by a horse. Reliability is not too poor, despite the occasional messenger falling ill on the road or ship foundering. Latency is by far the biggest problem: the average letter takes three weeks or more, so it can easily be two months before you get a reply. This makes effective diplomacy a tortuous process at best.

Lack of Bandwidth

The telegraph and Morse code provided a dramatic reduction in latency, but at the cost of reduced bandwidth. To overcome this, the language of 'telegraphese' was invented as an early form of data compression. A journalist receiving the telegram "REPORT ADEN WARWISE" would know that it meant "File a report about the military and civilian preparedness in Aden for the imminent war". A journalist replying "ADEN UNWARWISE" might trigger a 1000 word editorial attacking the government for its lack of foresight. This ad hoc data compression (without error correction) was forced on users by the low bandwidth, and occasionally led to misunderstandings.

Lack of Reliability

Even if there is sufficient bandwidth and the network latency is satisfactory, poor reliability can make the service unacceptable. Chronically unreliable telephone systems in remote locations exhibit this problem: the bandwidth and latency are sufficient for the purpose, but getting a working line can mean waiting hours or even days. Even sophisticated telephone systems can suffer in this way. In the immediate aftermath of the 9/11 attacks on the World Trade Center, the telephone network was so swamped with call attempts that the network controllers had to deny dial tones (in random rotation) to up to a third of the lines.

Any QoS measurement must take a balanced account of all three factors.

Even though QoS is a technical solution, the emphasis on the human factor is important, particularly in the case of the Universal Mobile Telecommunications System (UMTS), because it is the end user who decides what QoS they want and whether they are getting what they asked for.

Traffic Classes

When implementing QoS within a UMTS network, it is useful to divide the kinds of traffic that can be carried into classes, each with its own characteristics. As well as bandwidth, latency and reliability, there are other practical characteristics that the network needs to adapt to if it is to carry the traffic effectively. Two of these are whether the traffic comes in bursts and whether the upstream and downstream traffic characteristics are symmetric or asymmetric.

For QoS purposes, the traffic carried by UMTS is divided into four classes: Conversational, Streaming, Interactive and Background.

The Conversational Traffic Class

This class of traffic is voice telephony. It is characterised by a fixed and relatively small bandwidth, requires low latency and is comparatively tolerant of transmission errors. The upstream and downstream data rates are symmetrical.

The Streaming Traffic Class

This class of traffic is continuous data such as streaming video. Some errors can be tolerated (for example, dropped lines in a single video frame). The data rates are high and generally asymmetrical.

The Interactive Traffic Class

This class of traffic handles user request and server response traffic such as web browsing. It requires medium bandwidth, must be reliable, and is bursty and asymmetric. Unlike the other classes, interactive traffic can have different priorities attached to it.

The Background Traffic Class

This class of traffic is low-volume, occasional traffic such as e-mail. A reasonable delay is acceptable to the user, but reliability must be high.

The following table illustrates, roughly, the characteristics of these classes:

Traffic Class	Required Bandwidth	Tolerable Latency	Required Reliability	Bursty?	Asymmetric?
Conversational	Low	Low	Low	No	No
Streaming	High	Medium	Medium	No	Yes
Interactive	Medium	Medium	High	Yes	Yes
Background	Low	High	High	Yes	Yes

QoS Attributes

This section describes the attributes that determine QoS. The example used is UMTS QoS, because UMTS can provide a guaranteed level of service as well as a maximum level of service. The upstream and downstream attributes exist independently of each other.

Attribute	Units	Description
Traffic Class	Enumerated	One of the four traffic classes described above.
Maximum Bitrate	kbps	The maximum bitrate. Traffic is conformant when it follows a token bucket algorithm (see below) in which the token rate equals the maximum bitrate and the bucket size equals the maximum Service Data Unit (SDU) size.
Guaranteed Bitrate	kbps	The guaranteed bitrate. Traffic is conformant when, provided that there is data to deliver, it follows a token bucket algorithm (see below) in which the token rate equals the guaranteed bitrate and the bucket size equals the maximum SDU size.
Delivery Order	Yes or No	Indicates whether UMTS shall provide in-sequence SDU delivery.
Maximum SDU Size	Octets	The maximum allowable size of an SDU.
SDU Format Information	Bits	A list of the possible exact sizes of SDUs.
SDU Error Ratio	Ratio	The fraction of SDUs lost or detected as erroneous. The SDU error ratio is defined only for conforming traffic.
Residual Bit Error Ratio	Ratio	The bit error ratio in the delivered SDUs.
Delivery of Erroneous SDUs	Yes or No	Indicates whether SDUs detected as erroneous should be delivered or discarded.
Transfer Delay	ms	The maximum delay for the 95th percentile of the distribution of delay for all SDUs delivered during the lifetime of the bearer service. The delay for an SDU is the time from the request to transfer the SDU at one Service Access Point (SAP) to its delivery at the other SAP.
Traffic Handling Priority	0 to 3	The relative importance of all SDUs belonging to the UMTS bearer compared with the SDUs of other bearers. 0 is the highest priority. The priority is only relevant for the Interactive traffic class.
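
As a rough illustration, the attributes in the table could be gathered into one record for each direction of traffic. The sketch below is written in Python and is not the Symbian QoS API: every class and field name is hypothetical, the values in the usage example are invented, and the check that the guaranteed bitrate does not exceed the maximum bitrate is an assumption rather than something stated in the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class TrafficClass(Enum):
    CONVERSATIONAL = "conversational"
    STREAMING = "streaming"
    INTERACTIVE = "interactive"
    BACKGROUND = "background"


@dataclass(frozen=True)
class QosAttributes:
    """Hypothetical record of the QoS attributes for one direction."""
    traffic_class: TrafficClass
    max_bitrate_kbps: int
    guaranteed_bitrate_kbps: int
    in_order_delivery: bool
    max_sdu_size_octets: int
    sdu_error_ratio: float
    residual_bit_error_ratio: float
    deliver_erroneous_sdus: bool
    transfer_delay_ms: int
    sdu_format_sizes_bits: Tuple[int, ...] = ()
    traffic_handling_priority: Optional[int] = None  # 0 (highest) to 3

    def __post_init__(self):
        # Assumption: a guaranteed rate above the maximum rate is meaningless.
        if self.guaranteed_bitrate_kbps > self.max_bitrate_kbps:
            raise ValueError("guaranteed bitrate exceeds maximum bitrate")
        if self.traffic_handling_priority is not None:
            # The priority is only relevant for the Interactive class.
            if self.traffic_class is not TrafficClass.INTERACTIVE:
                raise ValueError("priority applies only to Interactive traffic")
            if not 0 <= self.traffic_handling_priority <= 3:
                raise ValueError("priority must be in the range 0 to 3")


# Upstream and downstream attributes exist independently, so a flow
# would carry one record for each direction (values invented).
uplink = QosAttributes(TrafficClass.INTERACTIVE, 64, 0, False, 1500,
                       1e-4, 1e-5, False, 1000, traffic_handling_priority=0)
downlink = QosAttributes(TrafficClass.INTERACTIVE, 384, 0, False, 1500,
                         1e-4, 1e-5, False, 1000, traffic_handling_priority=0)
```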

Flow Management and the Token Bucket Algorithm

Suppose that a user is using a client application to connect to a network, and that the user has asked the client to deliver data at more than the rate agreed with the network. The client can do one of several things with the excess data:

Discard the excess data. In this case the network is not involved, and it is up to the client to decide whether the user is informed of this decision.
Slow the data rate down to the rate that has been agreed with the network. This is called 'traffic smoothing'. When the average data rate is within limits but occasional bursts of data exceed the agreed limits, it is called 'burst smoothing'. The client has an opportunity to inform the user that this is taking place. The client-network relationship is maintained at the agreed level, so there is no requirement for the network to tell the client of any degradation of the service.
Offer the data to the network, but mark some of it as excess so that the network server can choose whether to discard it. If the server is not under heavy load, it may choose to handle this data anyway. This technique is called 'traffic policing'. In this case the client-network agreement has been broken and, depending on the configuration, the network server may or may not tell the client of the consequences.
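
The difference between the three options can be sketched in a few lines of Python. This is a deliberately simplified model, not Symbian code: the function names are hypothetical, the agreed rate is reduced to a fixed budget of bits per time interval, and the smoothing function assumes that no single packet is larger than the budget.

```python
def discard_excess(packets, budget):
    """Option 1: send what fits within the budget and drop the rest."""
    sent, used = [], 0
    for size in packets:
        if used + size <= budget:
            sent.append(size)
            used += size
    return sent


def smooth(packets, budget):
    """Option 2: delay packets so that each interval stays within budget.

    Returns one list of packet sizes per interval, in sending order.
    """
    intervals, current, used = [], [], 0
    for size in packets:
        if current and used + size > budget:
            intervals.append(current)   # this interval is full, start the next
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        intervals.append(current)
    return intervals


def police(packets, budget):
    """Option 3: offer everything, marking packets beyond the budget.

    Returns (size, is_excess) pairs; the network decides what to do
    with the packets marked as excess.
    """
    marked, used = [], 0
    for size in packets:
        if used + size <= budget:
            used += size
            marked.append((size, False))
        else:
            marked.append((size, True))
    return marked


burst = [400, 400, 400, 400]            # bits offered in one interval
print(discard_excess(burst, 1000))      # [400, 400]
print(smooth(burst, 1000))              # [[400, 400], [400, 400]]
print(police(burst, 1000))
```

With discarding, half of the burst is lost; with smoothing, all of it is sent but takes two intervals; with policing, all of it is offered at once and the last two packets are flagged as excess.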

The main technique for traffic smoothing is the 'token bucket algorithm'.

Token Bucket Algorithm

It is necessary to regulate the data rate within a network to prevent bursts of data from overloading the system. In QoS terms, a client might have committed to (that is, paid for) a particular guaranteed and maximum bit rate. The token bucket algorithm is used to ensure that the client keeps within these limits.

An analogy is a parcel conveyor. The parcels have irregular sizes and arrive at random. To get onto the conveyor they have to fit through an aperture, and they must not touch another parcel, as illustrated below:

Parcel A is too big to get through the aperture and is discarded. Parcel B, arriving shortly after A, will fit through the aperture, but as there is another parcel already on the conveyor, parcel B has to wait. As the conveyor belt slowly moves to the right, enough space eventually opens up to fit parcel B on the conveyor. Parcel C, however, has to wait behind parcel B, and it has to wait longer than B did for enough space to open up for it.

The system provides a way of taking an unthrottled stream and converting it into a stream with a maximum throughput limit. Substitute packet for parcel, packet size for parcel size, token bucket size for aperture width and bit rate for conveyor speed, and you have the picture.

The token bucket algorithm uses the following definitions:

Term	Definition
b	The maximum allowable size of a single packet of data.
r	The maximum allowable bit rate.
t	The token bucket counter. On initialisation, t = b.

Conformance with the agreed level of service can be defined as: "Data is conformant if the amount of data submitted during any arbitrarily chosen time period T does not exceed (b+rT)".
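
For example, with b = 1000 bits and r = 1000 bits per second, a client may submit at most 1000 + (1000 x 0.5) = 1500 bits in any half-second period. The Python sketch below checks a recorded sequence of submissions against this definition. The function name and the representation of the submissions as (time, size) pairs are assumptions made for illustration. It is enough to test periods that start and end at a submission, because any longer period containing the same packets has a larger allowance.

```python
def is_conformant(arrivals, b, r):
    """Check the (b + rT) bound over every period between two submissions.

    arrivals: list of (time_in_seconds, size_in_bits), sorted by time.
    b: maximum packet size in bits; r: maximum bit rate in bits per second.
    """
    for i in range(len(arrivals)):
        total = 0
        for j in range(i, len(arrivals)):
            total += arrivals[j][1]
            period = arrivals[j][0] - arrivals[i][0]   # this is T
            if total > b + r * period:
                return False
    return True


# Two 1000-bit packets one second apart: 2000 <= 1000 + 1000 * 1.0
print(is_conformant([(0.0, 1000), (1.0, 1000)], b=1000, r=1000))   # True
# The same packets half a second apart: 2000 > 1000 + 1000 * 0.5
print(is_conformant([(0.0, 1000), (0.5, 1000)], b=1000, r=1000))   # False
```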

The token bucket counter is computed as follows:

As each packet arrives, t is increased by rT, where T is the time since the last packet was processed, up to a maximum of b. The decision as to whether to discard, queue or pass the packet on to the network is then made as follows:

Length of the first packet in the queue	Action
packet length > b	Discard the packet.
t < packet length <= b	Hold the packet in the queue.
packet length <= t	Forward the packet to the network and reduce t by the packet length.

If there are packets in the queue, then at regular intervals of length T the value of t is increased by rT, up to a maximum of b, and the test described in the table above is repeated.
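
The procedure above can be sketched as a small Python class. This is a minimal illustration rather than the Symbian implementation: the class and method names are hypothetical, the caller supplies the current time instead of the class reading a clock, and t is reduced by the length of each forwarded packet, which is the usual token bucket behaviour.

```python
from collections import deque


class TokenBucket:
    def __init__(self, b, r):
        self.b = b              # maximum packet size, also the bucket size (bits)
        self.r = r              # maximum bit rate (bits per second)
        self.t = b              # token bucket counter, initialised to b
        self.last = 0.0         # time of the last update (seconds)
        self.queue = deque()    # packets held back, oldest first
        self.discarded = 0      # count of packets larger than b

    def _refill(self, now):
        # Increase t by rT, where T is the time since the last update,
        # never letting it exceed b.
        self.t = min(self.b, self.t + self.r * (now - self.last))
        self.last = now

    def tick(self, now):
        """Apply the test to the head of the queue; return what was forwarded."""
        self._refill(now)
        forwarded = []
        while self.queue and self.queue[0] <= self.t:
            length = self.queue.popleft()
            self.t -= length    # forwarding a packet consumes tokens
            forwarded.append(length)
        return forwarded

    def submit(self, length, now):
        """Offer a packet of the given length (bits) at time `now`."""
        if length > self.b:
            self.discarded += 1         # can never conform: discard
        else:
            self.queue.append(length)   # hold until enough tokens exist
        return self.tick(now)


bucket = TokenBucket(b=1000, r=500)
print(bucket.submit(800, now=0.0))    # [800]  t falls from 1000 to 200
print(bucket.submit(600, now=0.0))    # []     600 > t, so the packet is held
print(bucket.submit(1200, now=0.0))   # []     larger than b, so discarded
print(bucket.tick(now=0.5))           # []     t is 450, still too small
print(bucket.tick(now=1.0))           # [600]  t reached 700
```

Passing the time in explicitly keeps the example deterministic; a real shaper would drive tick() from a timer.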

QoS Policies

The architecture of Symbian's QoS support allows QoS functionality to be provided to network flows on the basis of policies. There are three kinds of policy: Flowspec policies, Modulespec policies and Extension policies.

Policy	Description
Flowspec	Policies related to the QoS requirements of traffic flows. These can be manipulated by applications using the QoS API.
Modulespec	Policies that define the additional modules used to support QoS.
Extension	Policies used to provide module-specific data to additional modules. The QoS framework does not use extension policies itself; they are meant solely for the additional modules. Extension policies can be manipulated using QoS API extensions such as the UMTS API.
