The Art of
ASSEMBLY LANGUAGE PROGRAMMING

Chapter Nineteen (Part 11)

Table of Content

Chapter Nineteen (Part 13)

CHAPTER NINETEEN:
PROCESSES, COROUTINES AND CONCURRENCY (Part 12)
19.5 - Synchronization
19.5.1 - Atomic Operations, Test & Set, and Busy-Waiting
19.5 Synchronization

Many problems occur in cooperative concurrently executing processes due to synchronization (or the lack thereof). For example, one process can produce data that other processes consume. However, it might take much longer for the producer to create than data than it takes for the consumer to use it. Some mechanism must be in place to ensure that the consumer does not attempt to use the data before the producer creates it. Likewise, we need to ensure that the consumer uses the data created by the producer before the producer creates more data.

The producer-consumer problem is one of several very famous synchronization problems from operating systems theory. In the producer-consumer problem there are one or more processes that produce data and write this data to a shared buffer. Likewise, there are one or more consumers that read data from this buffer. There are two synchronization issues we must deal with - the first is to ensure that the producers do not produce more data than the buffer can hold (conversely, we must prevent the consumers from removing data from an empty buffer); the second is to ensure the integrity of the buffer data structure by allowing access to only one process at a time.

Consider what can happen in a simple producer-consumer problem. Suppose the producer and consumer processes share a single data buffer structure organized as follows:

buffer          struct
Count           word    0
InPtr           word    0
OutPtr          word    0
Data            byte    MaxBufSize dup (?)
buffer          ends

The Count field specifies the number of data bytes currently in the buffer. InPtr points at the next available location to place data in the buffer. OutPtr is the address of the next byte to remove from the buffer. Data is the actual buffer array. Adding and removing data is very easy. The following code segments almost handle this job:

; Producer-     This procedure adds the value in al to the buffer.
;               Assume that the buffer variable MyBuffer is in the data segment.

Producer        proc    near
                pushf
                sti                             ;Must have interrupts on!
                push    bx

; The following loop waits until there is room in the buffer to insert
; another byte.

WaitForRoom:    cmp     MyBuffer.Count, MaxBufSize
                jae     WaitForRoom     

; Okay, insert the byte into the buffer.

                mov     bx, MyBuffer.InPtr
                mov     MyBuffer.Data[bx], al
                inc     MyBuffer.Count          ;We just added a byte to the buffer.
                inc     MyBuffer.InPtr          ;Move on to next item in buffer.

; If we are at the physical end of the buffer, wrap around to the beginning.

                cmp     MyBuffer.InPtr, MaxBufSize
                jb      NoWrap  
                mov     MyBuffer.InPtr, 0
NoWrap:
                pop     bx
                popf
                ret
Producer        endp


; Consumer-     This procedure waits for data (if necessary) and returns the
;               next available byte from the buffer.

Consumer        proc    near
                pushf                   ;Must have interrupts on!
                sti
                push    bx
WaitForData:    cmp     Count, 0        ;Is the buffer empty?
                je      WaitForData     ;If so, wait for data to arrive.

; Okay, fetch an input character

                mov     bx, MyBuffer.OutPtr
                mov     al, MyBuffer.Data[bx]
                dec     MyBuffer.Count
                inc     MyBuffer.OutPtr
                cmp     MyBuffer.OutPtr, MaxBufSize
                jb      NoWrap
                mov     MyBuffer.OutPtr, 0
NoWrap:
                pop     bx
                popf
                ret
Consumer        endp

The only problem with this code is that it won't always work if there are multiple producer or consumer processes. In fact, it is easy to come up with a version of this code that won't work for a single set of producer and consumer processes (although the code above will work fine, in that special case). The problem is that these procedures access global variables and, therefore, are not reentrant. In particular, the problem lies with the way these two procedures manipulate the buffer control variables. Consider, for a moment, the following statements from the Consumer procedure:

                dec     MyBuffer.Count

;       // Suppose an interrupt occurs here //

                inc     MyBuffer.OutPtr
                cmp     MyBuffer.OutPtr, MaxBufSize
                jb      NoWrap
                mov     MyBuffer.OutPtr, 0
NoWrap:

If an interrupt occurs at the specified point above and control transfers to another consumer process that reenters this code, the second consumer would malfunction. The problem is that the first consumer has fetched data from the buffer but has yet to update the output pointer. The second consumer comes along and removes the same byte as the first consumer. The second consumer then properly updates the output pointer to point at the next available location in the circular buffer. When control eventually returns to the first consumer process, it finishes the operation by incrementing the output pointer. This causes the system to skip over the next byte which no process has read. The end result is that two consumer processes fetch the same byte and then skip a byte in the buffer.

This problem is easily solved by recognizing the fact that the code that manipulates the buffer data is a critical region. By restricting execution in the critical region to one process at a time, we can solve this problem. In the simple example above, we can easily prevent reentrancy by turning the interrupts off while in the critical region. For the consumer procedure, the code would look like this:

; Consumer-     This procedure waits for data (if necessary) and returns the
;               next available byte from the buffer.

Consumer        proc    near
                pushf                   ;Must have interrupts on!
                sti
                push    bx
WaitForData:    cmp     Count, 0        ;Is the buffer empty?
                je      WaitForData     ;If so, wait for data to arrive.

; The following is a critical region, so turn the interrupts off.

                cli

; Okay, fetch an input character

                mov     bx, MyBuffer.OutPtr
                mov     al, MyBuffer.Data[bx]
                dec     MyBuffer.Count
                inc     MyBuffer.OutPtr
                cmp     MyBuffer.OutPtr, MaxBufSize
                jb      NoWrap
                mov     MyBuffer.OutPtr, 0
NoWrap:
                pop     bx
                popf                    ;Restore interrupt flag.
                ret
Consumer        endp

Note that we cannot turn the interrupts off during the execution of the whole procedure. Interrupts must be on while this procedure is waiting for data, otherwise the producer process will never be able to put data in the buffer for the consumer.

Simply turning the interrupts off does not always work. Some critical regions may take a considerable amount of time (seconds, minutes, or even hours) and you cannot leave the interrupts off for that amount of time. Another problem is that the critical region may call a procedure that turns the interrupts back on and you have no control over this. A good example is a procedure that calls MS-DOS. Since MS-DOS is not reentrant, MS-DOS is, by definition, a critical section; we can only allow one process at a time inside MS-DOS. However, MS-DOS reenables the interrupts, so we cannot simply turn off the interrupts before calling an MS-DOS function an expect this to prevent reentrancy.

Turning off the interrupts doesn't even work for the consumer/producer procedures given earlier. Note that interrupts must be on while the consumer is waiting for data to arrive in the buffer (conversely, the producers must have interrupts on while waiting for room in the buffer). It is quite possible for the code to detect the presence of data and just before the execution of the cli instruction, an interrupt transfers control to a second consumer process. While it is not possible for both processes to update the buffer variables concurrently, it is possible for the second consumer process to remove the only data value from the input buffer and then switch back to the first consumer that removes a phantom value from the buffer (and causes the Count variable to go negative).

One poorly thought out solution is to use a flag to control access to a critical region. A process, before entering the critical region, tests the flag to see if any other process is currently in the critical region; if not, the process sets the flag to "in use" and then enters the critical region. Upon leaving the critical region, the process sets the flag to "not in use." If a process wants to enter a critical region and the flag's value is "in use", the process must wait until the process currently in the critical section finishes and writes the "not in use" value to the flag.

The only problem with this solution is that it is nothing more than a special case of the producer/consumer problem. The instructions that update the in-use flag form their own critical section that you must protect. As a general solution, the in-use flag idea fails.

19.5.1 Atomic Operations, Test & Set, and Busy-Waiting

The problem with the in-use flag idea is that it takes several instructions to test and set the flag. A typical piece of code that tests such a flag would read its value and determine if the critical section is in use. If not, it would then write the "in-use" value to the flag to let other processes know that it is in the critical section. The problem is that an interrupt could occur after the code tests the flag but before it sets the flag to "in use." Then some other process can come along, test the flag and find that it is not in use, and enter the critical region. The system could interrupt that second process while it is still in the critical region and transfer control back to the first. Since the first process has already determined that the critical region is not in use, it sets the flag to "in use" and enters the critical region. Now we have two processes in the critical region and the system is in violation of the mutual exclusion requirement (only one process in a critical region at a time).

The problem with this approach is that testing and setting the in-use flag is not an uninterruptable (atomic ) operation. If it were, then there would be no problem. Of course, it is easy to make a sequence of instructions non-interruptible by putting a cli instruction before them. Therefore, we can test and set a flag in an atomic operation as follows (assume in-use is zero, not in-use is one):

                pushf
TestLoop:       cli                     ;Turn ints off while testing and
                cmp     Flag, 0         ; setting flag.
                je      IsInUse         ;Already in use?
                mov     Flag, 0         ;If not, make it so.
IsInUse:        sti                     ;Allow ints (if in-use already).
                je      TestLoop        ;Wait until not in use.
                popf

; When we get down here, the flag was "not in-use" and we've just set it
; to "in-us." We now have exclusive access to the critical section.

Another solution is to use a so-called "test and set" instruction - one that both tests a specific condition and sets the flag to a desired value. In our case, we need an instruction that both tests a flag to see if it is not in-use and sets it to in-use at the same time (if the flag was already in-use, it will remain in use afterward). Although the 80x86 does not support a specific test and set instruction, it does provide several others that can achieve the same effect. These instructions include xchg, shl, shr, sar, rcl, rcr, rol, ror, btc/btr/bts (available only on the 80386 and later processors), and cmpxchg (available only on the 80486 and later processors). In a limited sense, you can also use the addition and subtraction instructions (add, sub, adc, sbb, inc, and dec) as well.

The exchange instruction provides the most generic form for the test and set operation. If you have a flag (0=in use, 1=not in use) you can test and set this flag without messing with the interrupts using the following code:

InUseLoop:      mov     al, 0           ;0=In Use
                xchg    al, Flag
                cmp     al, 0
                je      InUseLoop

The xchg instruction atomically swaps the value in al with the value in the flag variable. Although the xchg instruction doesn't actually test the value, it does place the original flag value in a location (al) that is safe from modification by another process. If the flag originally contained zero (in-use), this exchange sequence swaps a zero for the existing zero and the loop repeats. If the flag originally contained a one (not in-use) then this code swaps a zero (in-use) for the one and falls out of the in use loop.

The shift and rotate instructions also act as test and set instructions, assuming you use the proper values for the in-use flag. With in-use equal to zero and not in-use equal to one, the following code demonstrates how to use the shr instruction for the test and set operation:

InUseLoop:      shr     Flag, 1         ;In-use bit to carry, 0->Flag.
                jnc     InUseLoop       ;Repeat if already in use.

This code shifts the in-use bit (bit number zero) into the carry flag and clears the in-use flag. At the same time, it zeros the Flag variable, assuming Flag always contains zero or one. The code for the atomic test and set sequences using the other shift and rotates is very similar and appears in the exercises.

Starting with the 80386, Intel provided a set of instructions explicitly intended for test and set operations: btc (bit test and complement), bts (bit test and set), and btr (bit test and reset). These instructions copy a specific bit from the destination operand into the carry flag and then complement, set, or reset (clear) that bit. The following code demonstrates how to use the btr instruction to manipulate our in-use flag:

InUseLoop:      btr     Flag, 0         ;In-use flag is in bit zero.
                jnc     InUseLoop

The btr instruction is a little more flexible than the shr instruction because you don't have to guarantee that all the other bits in the Flag variable are zero; it tests and clears bit zero without affect any other bits in the Flag variable.

The 80486 (and later) cmpxchg instruction provides a very generic synchronization primitive. A "compare and swap" instruction turns out to be the only atomic instruction you need to implement almost any synchronization primitive. However, its generic structure means that it is a little too complex for simple test and set operations. You will get an opportunity to design a test and set sequence using cmpxchg in the exercises.

Returning to the producer/consumer problem, we can easily solve the critical region problem that exists in these routines using the test and set instruction sequence presented above. The following code does this for the Producer procedure, you would modify the Consumer procedure in a similar fashion.

; Producer-     This procedure adds the value in al to the buffer.
;               Assume that the buffer variable MyBuffer is in the data segment.

Producer        proc    near
                pushf
                sti                             ;Must have interrupts on!

; Okay, we are about to enter a critical region (this whole procedure),
; so test the in-use flag to see if this critical region is already in use.

InUseLoop:      shr     Flag, 1
                jnc     InUseLoop

                push    bx

; The following loop waits until there is room in the buffer to insert
; another byte.

WaitForRoom:    cmp     MyBuffer.Count, MaxBufSize
                jae     WaitForRoom     

; Okay, insert the byte into the buffer.

                mov     bx, MyBuffer.InPtr
                mov     MyBuffer.Data[bx], al
                inc     MyBuffer.Count          ;We just added a byte to the buffer.
                inc     MyBuffer.InPtr          ;Move on to next item in buffer.

; If we are at the physical end of the buffer, wrap around to the beginning.

                cmp     MyBuffer.InPtr, MaxBufSize
                jb      NoWrap  
                mov     MyBuffer.InPtr, 0
NoWrap:
                mov     Flag, 1                 ;Set flag to not in use.
                pop     bx
                popf
                ret
Producer        endp

One minor problem with the test and set approach to protecting a critical region is that it uses a busy-waiting loop. While the critical region is not available, the process spins in a loop waiting for its turn at the critical region. If the process that is currently in the critical region remains there for a considerable length of time (say, seconds, minutes, or hours), the process(es) waiting to enter the critical region continue to waste CPU time waiting for the flag. This, in turn, wastes CPU time that could be put to better use getting the process in the critical region through it so another process can enter.

Another problem that might exist is that it is possible for one process to enter the critical region, locking other processes out, leave the critical region, do some processing, and then reenter the critical region all during the same time slice. If it turns out that the process is always in the critical region when the timer interrupt occurs, none of the other processes waiting to enter the critical region will ever do so. This is a problem known as starvation - processes waiting to enter the critical region never do so because some other process always beats them into it.

One solution to these two problems is to use a synchronization object known as a semaphore. Semaphores provide an efficient and general purpose mechanism for protecting critical regions. To find out about semaphores, keep reading...

Chapter Nineteen (Part 11)

Table of Content

Chapter Nineteen (Part 13)

Chapter Nineteen: Processes, Coroutines and Concurrency (Part 12)
29 SEP 1996