Windows System Programming Third Edition

Chapter 8. Thread Synchronization

Download 3.41 Mb.

Page	11/31
Date	31.07.2017
Size	3.41 Mb.
	#24970

1 ... 7 8 9 10 11 12 13 14 ... 31

Chapter 8. Thread Synchronization

Threads can simplify program design and implementation and also improve performance, but thread usage requires care to ensure that shared resources are protected against simultaneous modification and that threads run only when requested or required. This chapter shows how to use Windows' synchronization objectsCRITICAL_SECTIONs, mutexes, semaphores, and eventsto solve these problems and describes some of the problems, such as deadlocks and race conditions, that can occur when the synchronization objects are not used properly. Synchronization objects can be used to synchronize threads in the same process or in separate processes.

The examples illustrate the synchronization objects and discuss the performance impacts, both positive and negative, of different synchronization methods. The following chapters then show how to use synchronization to solve additional programming problems, improve performance, avoid pitfalls, and use more advanced features.

Thread synchronization is a fundamental and interesting topic, and it is essential in nearly all threaded applications. Nonetheless, readers who are primarily interested in interprocess communication, network programming, and building threaded servers can skip to Chapter 11 and return to Chapters 8 through 10 for background material as required.

The Need for Thread Synchronization

Chapter 7 showed how to create and manage worker threads, where each worker thread accessed its own resources. In the Chapter 7 examples, each thread processes a separate file or a separate area of storage, yet simple synchronization during thread creation and termination is still required. For example, the grepMT worker threads all run independently of one another, but the boss thread must wait for the workers to complete before reporting the results generated by the worker threads. Notice that the boss shares memory with the workers, but the program design assures that the boss will not access the memory until the worker terminates.

sortMT is slightly more complicated because the workers need to synchronize by waiting for adjacent workers to complete, and the worker threads are not allowed to start until the boss thread has created all the workers. As with grepMT, synchronization is achieved by waiting for one or more threads to terminate.

In many cases, however, it is necessary for two or more threads to coordinate execution throughout each thread's lifetime. For instance, several threads may access the same variable or set of variables, and this raises the issue of mutual exclusion. In other cases, a thread cannot proceed until another thread reaches a designated point. How can the programmer assume that two or more threads do not, for example, simultaneously modify the same global storage, such as the performance statistics? Furthermore, how can the programmer ensure that a thread does not attempt to remove an element from a queue before there are any elements in the queue?

Several examples illustrate situations that can prevent code from being thread-safe. (Code is thread-safe if several threads can execute the code simultaneously without any undesirable results.) Thread safety is discussed later in this chapter and the following chapters.

Figure 8-1 shows what can happen when two unsynchronized threads share a resource such as a memory location. Both threads increment variable N, but, because of the particular sequence in which the threads might execute, the final value of N is 5, whereas the correct value is 6. Notice that the particular result shown here is neither repeatable nor predictable; a different thread execution sequence could yield the correct results. Execution on an SMP system can aggravate this problem.

Figure 8-1. Unsynchronized Threads Sharing Memory

Critical Code Sections

Incrementing N with a single statement such as N++ is no better because the compiler will generate a sequence of one or more machine-level instructions that are not necessarily executed atomically as a single unit.

The core problem is that there is a critical section of code (the code that increments N in this example) such that, once a thread starts to execute the critical section, no other thread can be allowed to enter until the first thread exits from the code section. This critical section problem can be considered a type of race condition because the first thread "races" to complete the critical section before any other thread starts to execute the critical code section. Thus, we need to synchronize thread execution in order to ensure that only one thread at a time executes the critical section.

Defective Solutions to the Critical Section Problem

Similarly unpredictable results will occur with a code sequence that attempts to protect the increment with a polled flag.

while (Flag) Sleep (1000);

Flag = TRUE;

N++;

Flag = FALSE;

Even in this case, the thread could be preempted between the time Flag is tested and the time Flag is set to trUE; the first two statements form a critical code section that is not properly protected from concurrent access by two or more threads.

Another attempted solution to the critical section synchronization problem might be to give each thread its own copy of the variable N, as follows:

DWORD WINAPI ThFunc (TH_ARGS pArgs);

{ volatile DWORD N;

... N++; ...

}

This approach is no better, however, because each thread has its own copy of the variable on its stack, where it may have been required to have N represent, for example, the total number of threads in operation. Such a solution is necessary, however, in the case in which each thread needs its own distinct copy of the variable. This technique occurs frequently in the examples.

Notice that such problems are not limited to threads within a single process. They can also occur if two processes share mapped memory or modify the same file.

volatile Storage

Yet another latent defect exists even after we solve the synchronization problem. An optimizing compiler might leave the value of N in a register rather than storing it back in N. An attempt to solve this problem by resetting compiler optimization switches would impact performance throughout the code. The correct solution is to use the ANSI C volatile storage qualifier, which ensures that the variable will be stored in memory after modification and will always be fetched from memory before use. The volatile quantifier informs the compiler that the variable can change value at any time.

Interlocked Functions

If all we need is to increment, decrement, or exchange variables, as in this simple initial example, then the interlocked functions will suffice. The interlocked functions are simpler and faster than any of the alternatives and will not block the thread. The two members of the interlocked function family that are important here are InterlockedIncrement and InterlockedDecrement. They apply to 32-bit signed integers. These functions are of limited utility, but they should be used wherever possible.

The task of incrementing N in Figure 8-1 could be implemented with a single line:

InterlockedIncrement (&N);

N is a signed long integer, and the function returns its new value, although another thread could modify N's value before the thread that called InterlockedIncrement can use the returned value.

Be careful, however, not to call this function twice in succession if, for example, you need to increment the variable by 2. The thread might be preempted between the two calls. Instead, use the InterlockedExchangeAdd function described near the end of the chapter.

Local and Global Storage

Another requirement for correct thread code is that global storage not be used for local purposes. For example, the ThFunc function example presented earlier would be necessary and appropriate if each thread required its own separate copy of N. N might hold temporary results or retain the argument. If, however, N were placed in global storage, all processes would share a single copy of N, resulting in incorrect behavior no matter how well your program synchronized access. Here is an example of such incorrect usage. N should be a local variable, allocated on the thread function's stack.

DWORD N;

DWORD WINAPI ThFunc (TH_ARGS pArgs);

{

...

N = 2 * pArgs->Count; ...

}

Summary: Thread-Safe Code

Before we proceed to the synchronization objects, here are five initial guidelines to help ensure that the code will run correctly in a threaded environment.

Variables that are local to the thread should not be static and should be on the thread's stack or in a data structure or TLS that only the individual thread can access directly.
If a function is called by several threads and a thread-specific state value, such as a counter, is to persist from one function call to the next, store the state value in TLS or in a data structure dedicated to that thread, such as the data structure passed to the thread when it is created. Do not store the persistent value on the stack. Program 12-4 and 12-5 show the required techniques when building thread-safe DLLs.
Avoid race conditions such as the one that would occur in Program 7-2 (sortMT) if the threads were not created in a suspended state. If some condition is assumed to hold at a specific point in the program, wait on a synchronization object to ensure that, for example, a handle references an existing thread.
Threads should not, in general, change the process environment because that would affect all threads. Thus, a thread should not set the standard input or output handles or change environment variables. An exception would be the primary thread, which might make such changes before creating other threads.
Variables shared by all threads should be static or in global storage, declared volatile, and protected with the synchronization mechanisms that will be described next.

The next section discusses the synchronization objects. With that discussion, there will be enough to develop a simple producer/consumer example.

прямоугольник 1

Thread Synchronization Objects

Two mechanisms discussed so far allow processes and threads to synchronize with one another.

A thread running in a process can wait for another process to terminate, using ExitProcess, by waiting on the process handle using WaitForSingleObject or WaitForMultipleObjects. A thread can wait for another thread to terminate (ExitThread or return) in the same way.
File locks are specifically for synchronizing file access.

Windows provides four other objects designed for thread and process synchronization. Three of these objectsmutexes, semaphores, and eventsare kernel objects that have handles. Events are also used for other purposes, such as asynchronous I/O (Chapter 14).

The fourth object, the CRITICAL_SECTION, is discussed first. Because of their simplicity and performance advantages, CRITICAL_SECTIONs are the preferred mechanism when they are adequate for a program's requirements. There are some performance issues, however, which are described in Chapter 9.

Caution: There are risks inherent to the use of synchronization objects if they are not used properly. These risks, such as deadlocks, are described in this and subsequent chapters, along with techniques for developing reliable code. First, however, we'll show some synchronization examples in realistic situations.

Two other synchronization objects, waitable timers and I/O completion ports, are deferred until Chapter 14. Both these objects require the Windows asynchronous I/O techniques described in that chapter.

The CRITICAL_SECTION Object

A critical section, as described earlier, is a section of code that only one thread can execute at a time; more than one thread executing the critical section concurrently can result in unpredictable and incorrect results.

Windows provides the CRITICAL_SECTION object as a simple mechanism for implementing and enforcing the critical section concept.

CRITICAL_SECTION (CS) objects are initialized and deleted but do not have handles and are not shared by other processes. A variable should be declared to be of type CRITICAL_SECTION. Threads enter and leave a CS, and only one thread at a time can be in a specific CS. A thread can, however, enter and leave a specific CS at several places in the program.

To initialize and delete a CRITICAL_SECTION variable and its resources, use InitializeCriticalSection and DeleteCriticalSection, respectively.

VOID InitializeCriticalSection (