eCos contains support for limited Symmetric Multi-Processing (SMP). This is only available on selected architectures and platforms. The implementation has a number of restrictions on the kind of hardware supported. These are described in the Section called SMP Support in Chapter 4.
The following sections describe the changes that have been made to the eCos kernel to support SMP operation.
The system startup sequence needs to be somewhat different on an SMP
system, although this is largely transparent to application code. The
main startup takes place on only one CPU, called the primary CPU. All
other CPUs, the secondary CPUs, are either placed in suspended state
at reset, or are captured by the HAL and put into a spin as they start
up. The primary CPU is responsible for copying the DATA segment and
zeroing the BSS (if required), calling HAL variant and platform
initialization routines and invoking constructors. It then calls
cyg_start
to enter the application. The
application may then create extra threads and other objects.
It is only when the application calls
cyg_scheduler_start
that the secondary CPUs are
initialized. This routine scans the list of available secondary CPUs
and invokes HAL_SMP_CPU_START
to start each
CPU. Finally it calls an internal function
Cyg_Scheduler::start_cpu
to enter the scheduler
for the primary CPU.
Each secondary CPU starts in the HAL, where it completes any per-CPU
initialization before calling into the kernel at
cyg_kernel_cpu_startup
. Here it claims the
scheduler lock and calls
Cyg_Scheduler::start_cpu
.
Cyg_Scheduler::start_cpu
is common to both the
primary and secondary CPUs. The first thing this code does is to
install an interrupt object for this CPU's inter-CPU interrupt. From
this point on the code is the same as for the single CPU case: an
initial thread is chosen and entered.
From this point on the CPUs are all equal, eCos makes no further distinction between the primary and secondary CPUs. However, the hardware may still distinguish between them as far as interrupt delivery is concerned.
To function correctly an operating system kernel must protect its vital data structures, such as the run queues, from concurrent access. In a single CPU system the only concurrent activities to worry about are asynchronous interrupts. The kernel can easily guard its data structures against these by disabling interrupts. However, in a multi-CPU system, this is inadequate since it does not block access by other CPUs.
The eCos kernel protects its vital data structures using the scheduler lock. In single CPU systems this is a simple counter that is atomically incremented to acquire the lock and decremented to release it. If the lock is decremented to zero then the scheduler may be invoked to choose a different thread to run. Because interrupts may continue to be serviced while the scheduler lock is claimed, ISRs are not allowed to access kernel data structures, or call kernel routines that can. Instead all such operations are deferred to an associated DSR routine that is run during the lock release operation, when the data structures are in a consistent state.
By choosing a kernel locking mechanism that does not rely on interrupt manipulation to protect data structures, it is easier to convert eCos to SMP than would otherwise be the case. The principal change needed to make eCos SMP-safe is to convert the scheduler lock into a nestable spin lock. This is done by adding a spinlock and a CPU id to the original counter.
The algorithm for acquiring the scheduler lock is very simple. If the scheduler lock's CPU id matches the current CPU then it can just increment the counter and continue. If it does not match, the CPU must spin on the spinlock, after which it may increment the counter and store its own identity in the CPU id.
To release the lock, the counter is decremented. If it goes to zero the CPU id value must be set to NONE and the spinlock cleared.
To protect these sequences against interrupts, they must be performed with interrupts disabled. However, since these are very short code sequences, they will not have an adverse effect on the interrupt latency.
Beyond converting the scheduler lock, further preparing the kernel for SMP is a relatively minor matter. The main changes are to convert various scalar housekeeping variables into arrays indexed by CPU id. These include the current thread pointer, the need_reschedule flag and the timeslice counter.
At present only the Multi-Level Queue (MLQ) scheduler is capable of supporting SMP configurations. The main change made to this scheduler is to cope with having several threads in execution at the same time. Running threads are marked with the CPU that they are executing on. When scheduling a thread, the scheduler skips past any running threads until it finds a thread that is pending. While not a constant-time algorithm, as in the single CPU case, this is still deterministic, since the worst case time is bounded by the number of CPUs in the system.
A second change to the scheduler is in the code used to decide when the scheduler should be called to choose a new thread. The scheduler attempts to keep the n CPUs running the n highest priority threads. Since an event or interrupt on one CPU may require a reschedule on another CPU, there must be a mechanism for deciding this. The algorithm currently implemented is very simple. Given a thread that has just been awakened (or had its priority changed), the scheduler scans the CPUs, starting with the one it is currently running on, for a current thread that is of lower priority than the new one. If one is found then a reschedule interrupt is sent to that CPU and the scan continues, but now using the current thread of the rescheduled CPU as the candidate thread. In this way the new thread gets to run as quickly as possible, hopefully on the current CPU, and the remaining CPUs will pick up the remaining highest priority threads as a consequence of processing the reschedule interrupt.
The final change to the scheduler is in the handling of timeslicing. Only one CPU receives timer interrupts, although all CPUs must handle timeslicing. To make this work, the CPU that receives the timer interrupt decrements the timeslice counter for all CPUs, not just its own. If the counter for a CPU reaches zero, then it sends a timeslice interrupt to that CPU. On receiving the interrupt the destination CPU enters the scheduler and looks for another thread at the same priority to run. This is somewhat more efficient than distributing clock ticks to all CPUs, since the interrupt is only needed when a timeslice occurs.
All existing synchronization mechanisms work as before in an SMP system. Additional synchronization mechanisms have been added to provide explicit synchronization for SMP, in the form of spinlocks.
The main area where the SMP nature of a system requires special attention is in device drivers and especially interrupt handling. It is quite possible for the ISR, DSR and thread components of a device driver to execute on different CPUs. For this reason it is much more important that SMP-capable device drivers use the interrupt-related functions correctly. Typically a device driver would use the driver API rather than call the kernel directly, but it is unlikely that anybody would attempt to use a multiprocessor system without the kernel package.
Two new functions have been added to the Kernel API
to do interrupt
routing: cyg_interrupt_set_cpu
and
cyg_interrupt_get_cpu
. Although not currently
supported, special values for the cpu argument may be used in future
to indicate that the interrupt is being routed dynamically or is
CPU-local. Once a vector has been routed to a new CPU, all other
interrupt masking and configuration operations are relative to that
CPU, where relevant.
There are more details of how interrupts should be handled in SMP systems in the Section called SMP Support in Chapter 13.