11 - Real-Time Operating Systems
11.1 Introduction
A real-time operating system (RTOS) for an embedded system simplifies the
design of real-time software by allowing the application to be divided into
multiple threads managed by the RTOS. The kernel of an embedded RTOS
needs to support multithreading, pre-emption, and thread priority. The RTOS
will also provide services to threads for communication, synchronization and
coordination. An RTOS is used for hard real-time systems, i.e. systems in which
threads must execute not only correctly but also in a timely fashion.
Operating systems for larger computers (such as the PC) are non-real-time
operating systems and usually provide a much larger range of application
services, such as memory management and file management which normally
do not apply to embedded systems.
11.1.1 Threads
A thread is a simple program that thinks it has the CPU all to itself. The design
process for a real-time application involves splitting the work to be done into
threads which are responsible for a portion of the problem. Each thread is
assigned a priority, its own set of CPU registers and its own stack area.
Each thread is typically an infinite loop that can be in one of four states:
READY, RUNNING, WAITING or INTERRUPTED.
A thread is READY when it can execute but its priority is less than that of the
currently running thread. A thread is RUNNING when it has control of the CPU.
A thread is WAITING when it suspends itself until a certain amount of time has
elapsed, or when it requires the occurrence of an event: waiting for an I/O
operation to complete, a shared resource to become available, a timing pulse to
occur, etc. Finally, a thread is INTERRUPTED when an interrupt has occurred and
the CPU is in the process of servicing the interrupt.
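The states can be illustrated with a small sketch. The enum and thread body
below are illustrative only; the names are assumptions, not a particular
kernel's API:

typedef enum
{
  READY,        /* can execute, but a higher priority thread has the CPU */
  RUNNING,      /* currently has control of the CPU */
  WAITING,      /* suspended until a delay expires or an event occurs */
  INTERRUPTED   /* preempted while the CPU services an interrupt */
} ThreadState;

/* Each thread is typically an infinite loop */
void TypicalThread(void)
{
  for (;;)
  {
    /* wait for an event or a delay: the thread is WAITING here */
    /* ... then do this thread's share of the work: the thread is RUNNING */
  }
}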
11.1.3 Kernel
The kernel is the part of the RTOS responsible for the management of threads
(allocating CPU time to them) and for communication between threads. The
fundamental service provided by the kernel is context switching: saving the
context (CPU registers) of the running thread and restoring the context of the
next thread to run.
11.1.4 Scheduler
The scheduler is the part of the kernel responsible for determining which thread
will run next. Most real-time kernels are priority based. Each thread is assigned
a priority based on its importance. Establishing the priority for each thread is
application specific. In a priority-based kernel, control of the CPU will always
be given to the highest priority thread ready to run. In a preemptive kernel,
when a thread makes a higher priority thread ready to run, the current thread is
pre-empted (suspended) and the higher priority thread is immediately given
control of the CPU. If an interrupt service routine (ISR) makes a higher priority
thread ready, then when the ISR is completed the interrupted thread is
suspended and the new higher priority thread is resumed.
[Figure: an ISR makes a High-Priority Thread ready to run; when the ISR
completes, the kernel resumes the high-priority thread rather than the
interrupted one.]
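To make the scheduling rule concrete, here is a minimal sketch of a
priority-based scheduler. All names (TCB, ThreadList, Scheduler_NextThread)
are illustrative assumptions, and larger numbers are taken to mean higher
priority:

typedef struct TCB
{
  struct TCB* next;   /* the kernel's linked list of all threads */
  int priority;       /* fixed, application-specific priority */
  int ready;          /* nonzero when the thread is in the READY state */
} TCB;

static TCB* ThreadList;

/* Control of the CPU is always given to the highest priority ready thread */
TCB* Scheduler_NextThread(void)
{
  TCB* best = NULL;

  for (TCB* t = ThreadList; t != NULL; t = t->next)
    if (t->ready && (best == NULL || t->priority > best->priority))
      best = t;
  return best;
}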
11.2 Reentrancy
A reentrant function can be used by more than one thread without fear of data
corruption. A reentrant function can be interrupted at any time and resumed at
a later time without loss of data. Reentrant functions either use local variables
(i.e., CPU registers or variables on the stack) or protect data when global
variables are used. An example of a reentrant function is shown below:
char* strcpy(char* dst, const char* src)
{
  char* ptr = dst;           /* local copy of dst, kept on this thread's stack */

  while (*dst++ = *src++)    /* copy up to and including the NUL terminator */
    ;
  return ptr;
}
Since copies of the arguments to strcpy() are placed on the thread's stack,
and the local variable is created on the thread's stack, strcpy() can be
invoked by multiple threads without fear that the threads will corrupt each
other's pointers.
swap() is a simple function that swaps the contents of its two arguments.
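The source describes swap() without showing it; a minimal reconstruction (an
assumption, for illustration) is:

int Temp;   /* global: this is what makes swap() non-reentrant */

void swap(int* x, int* y)
{
  Temp = *x;   /* if preempted here, another call to swap() clobbers Temp */
  *x = *y;
  *y = Temp;
}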
Since Temp is a global variable, if the swap() function gets preempted after
the first line by a higher priority thread which also uses the swap() function,
then when the low priority thread resumes it will use the Temp value left
behind by the high priority thread, and one of its own values is lost.
You can make swap() reentrant with one of the following techniques (the first
two are sketched below):
Declare Temp local to swap().
Disable interrupts before the operation and re-enable them afterwards.
Use a semaphore.
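A sketch of the first two techniques (illustrative, not from the source; the
function names are altered so both versions can be shown):

/* Technique 1: declare the temporary local, so each thread's copy
   lives on its own stack */
void swap_local(int* x, int* y)
{
  int temp = *x;
  *x = *y;
  *y = temp;
}

/* Technique 2: keep the global Temp, but disable interrupts around its use
   with the OS_EnterCritical()/OS_ExitCritical() services described below */
void swap_protected(int* x, int* y)
{
  OS_EnterCritical();
  Temp = *x;
  *x = *y;
  *y = Temp;
  OS_ExitCritical();
}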
11.3 Thread Priorities
Thread priorities are said to be static when the priority of each thread does not
change during the application's execution. Each thread is thus given a fixed
priority at compile time. All the threads and their timing constraints are known
at compile time in a system where priorities are static.
11.4 Mutual Exclusion
The most common methods for a thread to gain exclusive access to a shared
resource are:
disabling interrupts,
using semaphores.
11.4.1 Disabling Interrupts
The easiest and fastest way to gain exclusive access to a shared resource is by
disabling and re-enabling interrupts, as shown in the pseudocode:
Disable interrupts;
Access the resource (read/write from/to variables);
Reenable interrupts;
Kernels use this technique to access internal variables and data structures. In
fact, kernels usually provide two functions that allow you to disable and then
enable interrupts from your C code: OS_EnterCritical() and
OS_ExitCritical(), respectively. You need to use these functions in
tandem, as shown below:
void Function(void)
{
  OS_EnterCritical();
  /* You can access shared data in here */
  OS_ExitCritical();
}
You must be careful, however, not to disable interrupts for too long, because
doing so affects the response of your system to interrupts; this is known as
interrupt latency. You should consider this method only when changing or
copying a few variables.
If you use a kernel, you are basically allowed to disable interrupts for as much
time as the kernel does without affecting interrupt latency. Obviously, you
need to know how long the kernel will disable interrupts.
11.4.2 Semaphores
A semaphore is used to gain exclusive access to a shared resource or to signal
the occurrence of an event. When a semaphore is signalled and more than one
thread is waiting for it, the kernel gives the semaphore either to the highest
priority thread waiting (priority based) or to
the first thread that requested the semaphore (First In First Out).
Some kernels have an option that allows you to choose either method when the
semaphore is initialized. For the first option, if the readied thread has a higher
priority than the current thread (the thread releasing the semaphore), a context
switch occurs (with a preemptive kernel) and the higher priority thread resumes
execution; the current thread is suspended until it again becomes the highest
priority thread ready to run.
Listing 11.1 shows how you can share data using a semaphore. Any thread
needing access to the same shared data calls OS_SemaphoreWait(), and
when the thread is done with the data, the thread calls
OS_SemaphoreSignal(). Both of these functions are described later. You
should note that a semaphore is an object that needs to be initialized before it is
used; for mutual exclusion, a semaphore is initialized to a value of 1. Using a
semaphore to access shared data doesn't affect interrupt latency. If an ISR or
the current thread makes a higher priority thread ready to run while accessing
the shared data, the higher priority thread executes immediately.
OS_ECB* SharedDataSemaphore;
void Function(void)
{
  OS_ERROR error;
  error = OS_SemaphoreWait(SharedDataSemaphore, 0);  /* acquire; 0 = wait forever (signature assumed) */
  /* Access the shared data in here */
  OS_SemaphoreSignal(SharedDataSemaphore);           /* release */
}
Listing 11.1 Accessing shared data with a semaphore
Semaphores are especially useful when threads share I/O devices. Imagine
what would happen if two threads were allowed to send characters to a printer
at the same time. The printer would contain interleaved data from each thread.
For instance, the printout from Thread 1 printing "I am Thread 1!" and
Thread 2 printing "I am Thread 2!" could result in:
II  aamm  TThhrreeaadd  12!!
Figure 11.3 shows threads competing for a semaphore to gain exclusive access
to the printer. Note that the semaphore is represented symbolically by a key,
indicating that each thread must obtain this key to use the printer.
[Figure 11.3: THREAD 1 and THREAD 2 must each acquire the SEMAPHORE (the key)
before printing "I am Thread 1!" or "I am Thread 2!" on the PRINTER.]
The above example implies that each thread must know about the existence of
the semaphore in order to access the resource. There are situations when it is
better to encapsulate the semaphore. Each thread would thus not know that it is
actually acquiring a semaphore when accessing the resource. For example, the
UART port may be used by multiple threads to send commands and receive
responses from a PC:
[Figure 11.4: THREAD 1 and THREAD 2 both call Packet_Put(); the DRIVER
encapsulates the Semaphore that guards the UART.]
The function Packet_Put() is called with two arguments: the packet and a
timeout in case the device doesn't respond within a certain amount of time. The
pseudocode for this function is shown in Listing 11.2.
uint8_t Packet_Put(TPacket* packet, const uint16_t timeout)
{
  Acquire serial port's semaphore;
  Send packet to device;
  Wait for response (with timeout);
  Release semaphore;
  if (timed out)
    return (error code);
  else
    return (no error);
}
Listing 11.2 Encapsulating a semaphore
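Listing 11.2 can be made concrete. The sketch below reuses the
OS_SemaphoreWait()/OS_SemaphoreSignal() services from Listing 11.1;
UART_OutPacket() and UART_WaitResponse() are hypothetical driver helpers, and
the error codes are assumptions:

static OS_ECB* SerialSemaphore;   /* initialized to 1 by the driver init routine */

uint8_t Packet_Put(TPacket* packet, const uint16_t timeout)
{
  uint8_t error = 0u;                      /* 0 = no error */

  OS_SemaphoreWait(SerialSemaphore, 0);    /* suspend until the port is free */
  UART_OutPacket(packet);                  /* send the packet to the device */
  if (!UART_WaitResponse(timeout))         /* wait for a response, with timeout */
    error = 1u;                            /* 1 = timed out */
  OS_SemaphoreSignal(SerialSemaphore);     /* release the port */
  return error;
}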
Each thread that needs to send a packet to the serial port has to call this
function. The semaphore is assumed to be initialized to 1 (i.e., available) by the
communication driver initialization routine. The first thread that calls
Packet_Put() acquires the semaphore, proceeds to send the packet, and waits
for a response. If another thread attempts to send a command while the port is
busy, this second thread is suspended until the semaphore is released. The
second thread appears simply to have made a call to a normal function that will
not return until the function has performed its duty. When the semaphore is
released by the first thread, the second thread acquires the semaphore and is
allowed to use the serial port.
A counting semaphore is used when a resource can be used by more than one
thread at the same time. For example, a counting semaphore is used in the
management of a buffer pool as shown in Figure 11.5.
[Figure 11.5: THREAD 1 and THREAD 2 call Buffer_Request() and Buffer_Release()
on the buffer manager, whose counting semaphore is initialized to 10.]
Assume that the buffer pool initially contains 10 buffers. A thread would
obtain a buffer from the buffer manager by calling Buffer_Request().
When the buffer is no longer needed, the thread would return the buffer to the
buffer manager by calling Buffer_Release(). The pseudocode for these
functions is shown in Listing 11.3.
BUF* Buffer_Request(void)
{
  BUF* ptr;
  Acquire a semaphore;
  Disable interrupts;
  ptr = BufFreeList;
  BufFreeList = ptr->next;
  Enable interrupts;
  return (ptr);
}

void Buffer_Release(BUF* ptr)
{
  Disable interrupts;
  ptr->next = BufFreeList;
  BufFreeList = ptr;
  Enable interrupts;
  Release the semaphore;
}
Listing 11.3 Buffer management using a counting semaphore
The buffer manager will satisfy the first 10 buffer requests because there are 10
keys. When all keys are in use, a thread requesting a buffer is suspended
until a key becomes available. Interrupts are disabled to gain exclusive
access to the linked list (this operation is very quick). When a thread is finished
with the buffer it acquired, it calls Buffer_Release() to return the buffer
to the buffer manager; the buffer is inserted into the linked list before the
semaphore is released. By encapsulating the interface to the buffer manager in
Buffer_Request() and Buffer_Release(), the caller doesn't need to
be concerned with the actual implementation details.
Most kernels allow you to specify a timeout when acquiring a semaphore. This
feature allows a deadlock to be broken. If the semaphore is not available within
a certain amount of time, the thread requesting the resource resumes execution.
Some form of error code must be returned to the thread to notify it that a
timeout occurred. A return error code prevents the thread from thinking it has
obtained the resource. Deadlocks generally occur in large multithreading
systems, not in embedded systems.
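A sketch of acquiring a semaphore with a timeout; the service names and error
codes follow the style used above but are assumptions:

static OS_ECB* ResourceSemaphore;

OS_ERROR UseSharedResource(void)
{
  OS_ERROR error;

  error = OS_SemaphoreWait(ResourceSemaphore, 100);  /* wait at most 100 ticks */
  if (error == OS_ERROR_TIMEOUT)
    return error;          /* we do NOT own the resource, so do not touch it */

  /* access the shared resource here */

  OS_SemaphoreSignal(ResourceSemaphore);
  return OS_ERROR_NONE;
}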
11.5 Synchronization
A thread can be synchronized with an ISR (or another thread when no data is
being exchanged) by using a semaphore as shown in Figure 11.6.
[Figure 11.6: unilateral rendezvous. An ISR or a thread Signals the semaphore;
a THREAD Waits on it.]
Note that, in this case, the semaphore is drawn as a flag to indicate that it is
used to signal the occurrence of an event (rather than to ensure mutual
exclusion, in which case it would be drawn as a key). When used as a
synchronization mechanism, the semaphore is initialized to 0. Using a
semaphore for this type of synchronization is called a unilateral rendezvous. A
thread initiates an I/O operation and waits for the semaphore. When the I/O
operation is complete, an ISR (or another thread) signals the semaphore and the
thread is resumed.
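A minimal sketch of a unilateral rendezvous, reusing the hypothetical semaphore
services from the listings above (the semaphore is initialized to 0, meaning
the event has not yet occurred):

static OS_ECB* IoDoneSemaphore;

void IoThread(void)
{
  for (;;)
  {
    /* start an I/O operation, e.g. trigger a transfer */
    OS_SemaphoreWait(IoDoneSemaphore, 0);   /* suspend until the ISR signals */
    /* process the completed I/O */
  }
}

void IoCompleteISR(void)
{
  /* clear the interrupt source in hardware */
  OS_SemaphoreSignal(IoDoneSemaphore);      /* make the waiting thread ready */
}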
Depending on the application, more than one ISR or thread could signal the
occurrence of the event.
Two threads can synchronize their activities by using two semaphores, as shown
below. This is called a bilateral rendezvous.
[Figure 11.7: bilateral rendezvous. Each THREAD Signals one semaphore and
Waits on the other.]
For example, two threads are executing as shown in Listing 11.4. When the
first thread reaches a certain point, it signals the second thread (1) then waits
for a return signal (2). Similarly, when the second thread reaches a certain
point, it signals the first thread (3) and waits for a return signal (4). At this
point, both threads are synchronized with each other. A bilateral rendezvous
cannot be performed between a thread and an ISR because an ISR cannot wait
on a semaphore.
void Thread1(void)
{
  for (;;)
  {
    Perform operation 1;
    Signal thread #2;                (1)
    Wait for signal from thread #2;  (2)
    Continue operation 1;
  }
}

void Thread2(void)
{
  for (;;)
  {
    Perform operation 2;
    Signal thread #1;                (3)
    Wait for signal from thread #1;  (4)
    Continue operation 2;
  }
}
Listing 11.4 Bilateral rendezvous
When using global variables, each thread or ISR must ensure that it has
exclusive access to the variables. If an ISR is involved, the only way to ensure
exclusive access to the common variables is to disable interrupts. If two threads
are sharing data, each can gain exclusive access to the variables either by
disabling and enabling interrupts or with the use of a semaphore (as we have
seen). Note that a thread can only communicate information to an ISR by using
global variables. A thread is not aware when a global variable is changed by an
ISR, unless the ISR signals the thread by using a semaphore or unless the
thread polls the contents of the variable periodically. To correct this situation,
you should consider using either a message mailbox or a message queue.
11.6 Interthread Communication
Information can be communicated between threads through a message mailbox or a
message queue. A message mailbox holds a single message, typically a pointer to
an application-specific data structure, deposited by a thread or an ISR.
A waiting list is associated with each mailbox in case more than one thread
wants to receive messages through the mailbox. A thread desiring a message
from an empty mailbox is suspended and placed on the waiting list until a
message is received. Typically, the kernel allows the thread waiting for a
message to specify a timeout. If a message is not received before the timeout
expires, the requesting thread is made ready to run and an error code
(indicating that a timeout has occurred) is returned to it. When a message is
deposited into the mailbox, either the highest priority thread waiting for the
message is given the message (priority based) or the first thread to request a
message is given the message (First-In-First-Out, or FIFO). Figure 11.8 shows
a thread depositing a message into a mailbox. Note that the mailbox is
represented by an I-beam and the timeout is represented by an hourglass. The
number next to the hourglass represents the number of clock ticks the thread
will wait for a message to arrive.
[Figure 11.8: a THREAD POSTs a message into the Mailbox; another THREAD WAITs
for it with a 10-tick timeout.]
Besides the POST and WAIT services shown, kernels typically provide an ACCEPT
service: get a message from a mailbox if one is present, but do not suspend the
caller if the mailbox is empty. If the mailbox contains a message, the message
is extracted from the mailbox. A return code is used to notify the caller about
the outcome of the call.
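As a sketch, the three mailbox services might be used as follows; OS_MAILBOX,
OS_MailboxPost(), OS_MailboxWait() and OS_MailboxAccept() are assumed names,
not a particular kernel's API:

static OS_MAILBOX* TempMailbox;

void ProducerThread(void)
{
  static int32_t reading;                 /* the message must outlive the post */

  reading = 42;                           /* e.g. a sensor value */
  OS_MailboxPost(TempMailbox, &reading);  /* deposit a pointer-sized message */
}

void ConsumerThread(void)
{
  int32_t* msg;

  msg = OS_MailboxWait(TempMailbox, 10);  /* WAIT: suspend up to 10 clock ticks */
  if (msg == NULL)
  {
    /* timeout: no message arrived within 10 ticks */
  }

  msg = OS_MailboxAccept(TempMailbox);    /* ACCEPT: never suspends; NULL if empty */
}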
As with the mailbox, a waiting list is associated with each message queue, in
case more than one thread is to receive messages through the queue. A thread
desiring a message from an empty queue is suspended and placed on the
waiting list until a message is received. Typically, the kernel allows the thread
waiting for a message to specify a timeout. If a message is not received before
the timeout expires, the requesting thread is made ready to run and an error
code (indicating a timeout has occurred) is returned to it. When a message is
deposited into the queue, either the highest priority thread or the first thread to
wait for the message is given the message. Figure 11.9 shows an ISR (Interrupt
Service Routine) depositing a message into a queue. Note that the queue is
drawn in the same way as the mailbox; the ISR can only post to the queue, it
cannot wait on it.
[Figure 11.9: when an Interrupt occurs, the ISR POSTs a message into the Queue;
a THREAD WAITs for it with a 10-tick timeout.]
As with the mailbox, an ACCEPT service is typically provided: get a message
from a queue if one is present, but do not suspend the caller if the queue is
empty. If the queue contains a message, the message is extracted from the
queue. A return code is used to notify the caller about the outcome of the call.
11.7 Interrupts
An interrupt is a hardware mechanism used to inform the CPU that an
asynchronous event has occurred. When an interrupt is recognized, the CPU
saves all of its context (i.e., registers) and jumps to a special subroutine called
an Interrupt Service Routine, or ISR. The ISR processes the event, and upon
completion of the ISR, the program returns to the background (in a foreground /
background system), to the interrupted thread (with a non-preemptive kernel),
or to the highest priority thread ready to run (with a preemptive kernel).
[Figure 11.10: interrupt nesting. Interrupt 1 suspends the Thread to run ISR1;
Interrupt 2 and Interrupt 3 nest, running ISR2 and ISR3, before the Thread
resumes.]
Interrupt latency is defined as the time between the arrival of the interrupt
request and the start of the ISR:

Interrupt latency = maximum time interrupts are disabled
+ time to start executing the first instruction of the ISR

Interrupt response is defined as the time between the reception of the interrupt
and the start of the user code that handles the interrupt. The interrupt
response time accounts for all the overhead involved in handling an interrupt.
For a foreground / background system:

Interrupt response = interrupt latency + time to save the CPU's context
Interrupt recovery is defined as the time required for the processor to return to
the interrupted code. Interrupt recovery in a foreground / background system
simply involves restoring the processor's context and returning to the
interrupted thread. Interrupt recovery is given by Eq. (11.4):

Interrupt recovery = time to restore the CPU's context
+ time to execute the return-from-interrupt instruction (11.4)
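As a worked illustration with assumed figures (not measurements from any
particular processor): suppose interrupts are disabled for at most 50 us,
vectoring to the ISR takes 2 us, saving the CPU's context takes 8 us, restoring
it takes 8 us, and the return-from-interrupt instruction takes 1 us. Then:

Interrupt latency = 50 + 2 = 52 us
Interrupt response = 52 + 8 = 60 us
Interrupt recovery = 8 + 1 = 9 us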
Figure 11.11 and Figure 11.12 show the interrupt latency, response, and
recovery for a foreground / background system and a preemptive kernel,
respectively.
[Figure 11.11: interrupt latency, response and recovery for a foreground /
background system; the Background is interrupted and later resumed.]
You should note that for a preemptive kernel, the exit function either decides to
return to the interrupted thread (A) or to a higher priority thread that the ISR
has made ready to run (B). In the latter case, the execution time is slightly
longer because the kernel has to perform a context switch.
[Figure 11.12: interrupt latency, response and recovery for a preemptive
kernel; after interrupt recovery, execution resumes either in the interrupted
Thread 1 (A) or in a higher priority thread readied by the ISR (B).]
A clock tick is a special interrupt that occurs periodically. This interrupt can be
viewed as the system's heartbeat. The time between interrupts is application
specific and is generally between 1 and 200 ms. The clock tick interrupt allows
a kernel to delay threads for an integral number of clock ticks and to provide
timeouts when threads are waiting for events to occur. The faster the tick rate,
the higher the overhead imposed on the system.
All kernels allow threads to be delayed for a certain number of clock ticks. The
resolution of a delay is one clock tick; however, this does not mean that the
accuracy of the delay is one clock tick.
Figure 11.13 through Figure 11.15 are timing diagrams showing a thread
delaying itself for one clock tick. The shaded areas indicate the execution time
for each operation being performed. Note that the time for each operation
varies to reflect typical processing, which would include loops and conditional
statements (i.e., if/else, switch, and ?:). The processing time of the Tick
ISR has been exaggerated to show that it too is subject to varying execution
times.
[Figure 11.13: delaying a thread for one tick, case 1. Tick period 20 ms; the
delayed thread actually runs after t1 = 19 ms, t2 = 17 ms and t3 = 27 ms.]
Case 1 (Figure 11.13) shows a situation where higher priority threads and ISRs
execute prior to the thread, which needs to delay for one tick. The thread
attempts to delay for 20 ms but, because of its priority, it actually executes
at varying intervals. This causes the execution of the thread to jitter.
[Figure 11.14: delaying a thread for one tick, case 2. Tick period 20 ms;
actual delays t1 = 6 ms, t2 = 19 ms and t3 = 27 ms.]
Case 2 (Figure 11.14) shows a situation where the execution times of all higher
priority threads and ISRs are slightly less than one tick. If the thread delays
itself just before a clock tick, the thread will execute again almost immediately!
Because of this, if you need to delay a thread at least one clock tick, you must
specify one extra tick. In other words, if you need to delay a thread for at least
five ticks, you must specify six ticks!
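A sketch of this rule, assuming a hypothetical OS_ThreadDelay() service and
the 20 ms tick of the figures:

#define TICK_PERIOD_MS  20u

/* Delay for AT LEAST 'ms' milliseconds: round up to whole ticks, then add
   one tick because the first tick may arrive almost immediately. */
void DelayAtLeast(uint32_t ms)
{
  uint32_t ticks = (ms + TICK_PERIOD_MS - 1u) / TICK_PERIOD_MS;
  OS_ThreadDelay(ticks + 1u);
}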
[Figure 11.15: delaying a thread for one tick, case 3. Tick period 20 ms;
actual delays t1 = 40 ms and t2 = 26 ms.]
Case 3 (Figure 11.15) shows a situation in which the execution times of all
higher priority threads and ISRs extend beyond one clock tick. In this case, the
thread that tries to delay for one tick actually executes two ticks later and
misses its deadline. This might be acceptable in some applications, but in most
cases it isn't.
Avoid using floating-point maths, since it is expensive in both execution time
and stack usage (if you must use it, use single precision).
Because each thread runs independently of the others, it must be provided with
its own stack area (RAM). As a designer, you must determine the stack
requirement of each thread as closely as possible (this is sometimes a difficult
undertaking). The stack size must not only account for the thread requirements
(local variables, function calls, etc.), it must also account for maximum
interrupt nesting (saved registers, local storage in ISRs, etc.). Depending on the
target processor and the kernel used, a separate stack can be used to handle all
interrupt-level code. This is a desirable feature because the stack requirement
for worst-case interrupt nesting then does not need to be added to each
thread's stack.
Unless you have large amounts of RAM to work with, you need to be careful
how you use the stack space. To reduce the amount of RAM needed in an
application, you must be careful how you use each thread's stack for:
large arrays and structures declared locally,
nested function calls,
interrupt nesting,
library functions and their stack usage.
A sketch of allocating each thread's stack is shown below.
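This sketch shows per-thread stack allocation; OS_ThreadCreate() and OS_Start()
are assumed service names, and the sizes are illustrative:

#define CONTROL_STACK_SIZE  128u   /* words: sized for worst-case usage */
#define COMMS_STACK_SIZE    256u   /* comms thread needs room for packet buffers */

static uint32_t ControlStack[CONTROL_STACK_SIZE];
static uint32_t CommsStack[COMMS_STACK_SIZE];

void ControlThread(void);
void CommsThread(void);

void main(void)
{
  /* each thread gets its own stack area and a fixed priority */
  OS_ThreadCreate(ControlThread, 1, ControlStack, CONTROL_STACK_SIZE);
  OS_ThreadCreate(CommsThread,   2, CommsStack,   COMMS_STACK_SIZE);
  OS_Start();   /* hand control to the scheduler */
}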
You should consider using a real-time kernel if your application can afford the
extra requirements: extra cost of the kernel, more ROM/RAM, and 2 to 4
percent additional CPU overhead.