12    Developing Thread-safe Libraries

To support the development of multithreaded applications, the Digital UNIX operating system provides DECthreads, Digital's Multithreading Run-Time Library. The DECthreads interface is Digital UNIX's implementation of IEEE Standard 1003.1c-1995 threads (also referred to as POSIX 1003.1c threads).

In addition to an actual threading interface, the operating system also provides Thread-Independent Services (TIS). The TIS routines are an aid to creating thread-safe libraries (see Section 12.4.1).

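This chapter addresses the following topics:

Overview of thread support (Section 12.1)
Run-time library changes for POSIX conformance (Section 12.2)
Characteristics of thread-safe and reentrant routines (Section 12.3)
Writing thread-safe code (Section 12.4)
Building multithreaded applications (Section 12.5)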


12.1    Overview of Thread Support

A thread is a single, sequential flow of control within a program. Multiple threads execute concurrently and share most resources of the owning process, including the address space. By default, a process initially has one thread.

The purposes for which multiple threads are useful include:

You can also use multiple threads as an alternative approach to managing certain events. For example, you can use one thread per file descriptor in a process that otherwise might use the select() or poll() system calls to efficiently manage concurrent I/O operations on multiple file descriptors.
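The one-thread-per-descriptor approach can be sketched as follows. This is a minimal illustration that uses the standard POSIX 1003.1c calls pthread_create() and pthread_join(); the names fd_job, drain_fd, and drain_all, and the limit of 16 descriptors, are hypothetical and chosen only for the example.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical work item: one descriptor and the number of bytes
 * that its thread has consumed. */
struct fd_job {
    int    fd;
    size_t total;
};

/* Thread body: a plain blocking read() loop on a single descriptor.
 * No select() or poll() is needed because each thread waits on its
 * own descriptor. */
static void *drain_fd(void *arg)
{
    struct fd_job *job = arg;
    char buf[256];
    ssize_t n;

    while ((n = read(job->fd, buf, sizeof buf)) > 0)
        job->total += (size_t)n;
    return NULL;
}

/* Start one thread per descriptor and wait for all of them.
 * Returns 0 on success, -1 if njobs exceeds the sketch limit, or
 * the pthread_create() error number. */
int drain_all(struct fd_job *jobs, int njobs)
{
    pthread_t tid[16];      /* sketch limit: at most 16 descriptors */
    int i, err = 0, started = 0;

    if (njobs > 16)
        return -1;
    for (i = 0; i < njobs; i++) {
        jobs[i].total = 0;
        err = pthread_create(&tid[i], NULL, drain_fd, &jobs[i]);
        if (err != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return err;
}
```

Each thread blocks only on its own descriptor, so the code for one descriptor stays a simple sequential loop. The cost is one thread per descriptor, which may not suit applications with a very large number of descriptors.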

The components of the multithreaded development environment for the Digital UNIX system include the following:

For information on profiling multithreaded applications, see Section 8.14.


12.2    Run-Time Library Changes for POSIX Conformance

For releases of the DEC OSF/1 operating system (that is, for releases prior to Digital UNIX Version 4.0), a large number of separate reentrant routines (*_r routines) were provided to solve the problem of static data in the C run-time library (the first two problems listed in Section 12.3.1). The Digital UNIX operating system fixes the problem of static data in the non-reentrant versions of the routines by replacing the static data with thread-specific data. Except for a few routines specified by POSIX 1003.1c, all of the alternate routines are no longer required and are retained only for binary compatibility.

The following functions are the only alternate thread-safe routines that are specified by POSIX 1003.1c and need to be used when writing thread-safe code:
asctime_r* ctime_r* getgrgid_r*
getgrnam_r* getpwnam_r* getpwuid_r*
gmtime_r* localtime_r* rand_r*
readdir_r* strtok_r  

Starting with Digital UNIX Version 4.0, the interfaces flagged with an asterisk (*) in the preceding list have new definitions that conform to POSIX 1003.1c. The old versions of these routines can be obtained by defining the preprocessor symbol _POSIX_C_SOURCE with the value 199309L (which denotes POSIX 1003.1b conformance). The new versions of the routines are the default when compiling code under Digital UNIX Version 4.0 or later, but you must be certain to include the header files specified on the manpages for the various routines.
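The following sketch shows how three of the routines in the list are typically called with their POSIX 1003.1c signatures. In each case the caller supplies the storage that the non-reentrant counterpart would keep in static data. The wrapper names count_words, utc_year, and roll_die are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Count whitespace-separated words. strtok_r() keeps its scan
 * position in the caller-supplied `save` pointer rather than in
 * static data. The input buffer is modified, as with strtok(). */
int count_words(char *text)
{
    char *save;
    char *tok;
    int n = 0;

    for (tok = strtok_r(text, " \t\n", &save); tok != NULL;
         tok = strtok_r(NULL, " \t\n", &save))
        n++;
    return n;
}

/* Return the UTC year for time t. gmtime_r() writes its result into
 * the caller-supplied struct tm. Returns -1 on failure. */
int utc_year(time_t t)
{
    struct tm tmbuf;

    if (gmtime_r(&t, &tmbuf) == NULL)
        return -1;
    return tmbuf.tm_year + 1900;
}

/* Return a value from 1 to 6. rand_r() keeps the generator state in
 * the caller-supplied seed, so each thread can own its own sequence. */
int roll_die(unsigned int *seed)
{
    return rand_r(seed) % 6 + 1;
}
```

Because every call names its own buffer or seed, two threads can use these routines at the same time without sharing hidden state.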

For more information on programming with threads, see the Guide to DECthreads and cc(1), monitor(3), prof(1), and gprof(1).


12.3    Characteristics of Thread-Safe and Reentrant Routines

Routines within a library can be thread safe or not. A thread-safe routine is one that can be called concurrently from multiple threads without undesirable interactions between threads. A routine can be thread safe for either of the following reasons:

Reentrant routines do not share any state across concurrent invocations from multiple threads. A reentrant routine is the ideal thread-safe routine, but not all routines can be made to be reentrant.
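As an illustration of the difference, the following sketch contrasts a routine that returns a pointer to static storage with a reentrant counterpart that works only on its arguments. Both routines, label_static and label_r, are hypothetical and are not part of any system library.

```c
#include <stdio.h>
#include <stddef.h>

/* Not reentrant: every caller shares one static buffer, so a second
 * call (from any thread) overwrites the result of the first. */
const char *label_static(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Reentrant: all state lives in the arguments and on the stack of
 * the calling thread. */
char *label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "item-%d", id);
    return buf;
}
```

Two calls to label_static() return the same address, and the second call changes the string seen through the first result. Two calls to label_r() with different buffers cannot interfere with each other.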

Prior to Digital UNIX Version 4.0, many of the C run-time library (libc) routines were not thread safe, and alternate versions of these routines were provided in libc_r. Starting with Digital UNIX Version 4.0, all of the alternate versions formerly found in libc_r were merged into libc. If a thread-safe routine and its corresponding nonthread-safe routine had the same name, the nonthread-safe version was replaced. The thread-safe versions are modified to use Thread Independent Services (TIS) (see Section 12.4.1); this enables them to work in both single-threaded and multithreaded environments without extensive overhead in the single-threaded case.


12.3.1    Examples of Nonthread-safe Coding Practices

Some common practices that can prevent code from being thread safe can be found by examining why some of the libc functions were not thread safe prior to Digital UNIX Version 4.0:


12.4    Writing Thread-safe Code

When writing code that can be used by both single-threaded and multithreaded applications, it is necessary to code in a thread-safe manner. The following coding practices must be observed:


12.4.1    Using Thread Independent Services (TIS)

TIS is a package of routines provided by the C run-time library that can be used to write efficient code for both single-threaded and multithreaded applications. TIS routines can be used for handling mutexes, handling thread-specific data, and a variety of other purposes.

When used by a single-threaded application, these routines use simplified semantics to perform thread-safe operations for the single-threaded case. When DECthreads is present, the bodies of the routines are replaced with more complicated algorithms to optimize their behavior for the multithreaded case.

TIS is used within libc itself to allow a single version of the C run-time library to service both single-threaded and multithreaded applications. See the Guide to DECthreads and tis(3) for information on how to use this facility.


12.4.2    Using Thread-Specific Data

Example 12-1 shows how to use thread-specific data in a function that can be used by both single-threaded and multithreaded applications. For clarity, most error checking has been left out of the example.

Example 12-1: Threads Programming Example

#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <tis.h>

static pthread_key_t key;

void __init_dirname()
{
    tis_key_create(&key, free);
}

void __fini_dirname()
{
    tis_key_delete(key);
}

char *dirname(char *path)
{
    char *dir, *lastslash;

    /*
     * Assume key was set and get thread-specific variable.
     */
    dir = tis_getspecific(key);
    if (!dir) {     /* First time this thread got here. */
        dir = malloc(PATH_MAX);
        tis_setspecific(key, dir);
    }

    /*
     * Copy dirname component of path into buffer and return.
     */
    lastslash = strrchr(path, '/');
    if (lastslash) {
        memcpy(dir, path, lastslash-path);
        dir[lastslash-path] = '\0';     /* Terminate after the copied bytes. */
    } else
        strcpy(dir, path);
    return dir;
}

The following TIS routines are used in the preceding example:

tis_key_create
Generates a unique data key.

tis_key_delete
Deletes a data key.

tis_getspecific
Obtains the data associated with the specified key.

tis_setspecific
Sets the data value associated with the specified key.

The __init_ and __fini_ routines are used in the example to initialize and destroy the thread-specific data key. This operation is done only once, and these routines provide a convenient way of ensuring that this is the case, even if the library is loaded with dlopen(). See ld(1) for an explanation of how to use the __init_ and __fini_ routines.

Thread-specific data keys are a limited resource. A library that needs to create a large number of data keys should instead be written to create just one and to store all of the separate data items as a structure or an array of pointers pointed to by a single key.
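The single-key approach might look like the following sketch. It is written with the standard POSIX 1003.1c calls (pthread_key_create(), pthread_getspecific(), pthread_setspecific(), and pthread_once()) rather than the TIS routines, so that it can be built on any POSIX threads system. The structure lib_tsd, its members, and the function lib_state are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state for a library: several items that
 * would otherwise each need their own key. */
struct lib_tsd {
    int  last_error;
    long call_count;
    char scratch[64];
};

static pthread_key_t  lib_key;
static pthread_once_t lib_once = PTHREAD_ONCE_INIT;

/* Create the one key used by the whole library. free() is the
 * destructor, so the structure is released when a thread exits. */
static void lib_key_init(void)
{
    pthread_key_create(&lib_key, free);
}

/* Return the calling thread's structure, allocating it on first use.
 * Returns NULL if the allocation or the key update fails. */
struct lib_tsd *lib_state(void)
{
    struct lib_tsd *st;

    pthread_once(&lib_once, lib_key_init);
    st = pthread_getspecific(lib_key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st != NULL && pthread_setspecific(lib_key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}
```

A routine in the library would call lib_state() and then read or update whichever member it needs, for example lib_state()->call_count. Adding another per-thread item means adding a member to the structure, not creating another key.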


12.4.3    Using Mutex Locks to Share Data Between Threads

In some cases, using thread-specific data is not the correct way to convert static data into thread-safe code, for example, when a data object is meant to be shareable between threads (as in stdio streams within libc). Manipulating per-process resources is another case in which thread-specific data is inadequate. The following example shows how to manipulate per-process resources in a thread-safe fashion:

#include <pthread.h>
#include <string.h>
#include <tis.h>

/*
 * NOTE: The putenv() function would have to set and clear the
 * same mutex lock before it accessed the environment.
 */

extern char **environ;
static pthread_mutex_t environ_mutex = PTHREAD_MUTEX_INITIALIZER;

char *getenv(const char *name)
{
    char **s, *value;
    int len;

    tis_mutex_lock(&environ_mutex);
    len = strlen(name);
    for (s = environ; (value = *s) != NULL; s++)
        if (strncmp(name, value, len) == 0 && value[len] == '=') {
            tis_mutex_unlock(&environ_mutex);
            return &(value[len+1]);
        }
    tis_mutex_unlock(&environ_mutex);
    return (char *) 0L;
}

In the preceding example, note how the lock is set once (tis_mutex_lock) before accessing the environment and is unlocked exactly once (tis_mutex_unlock) before returning. In the multithreaded case, any other thread attempting to access the environment while the first thread holds the lock is blocked until the first thread performs the unlock operation. In the single-threaded case, no contention occurs unless an error exists in the coding of the locking and unlocking sequences.
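The same discipline applies to every routine that touches the shared object: a routine that modifies the data must take the same lock as the routines that read it. The following sketch shows a writer and a reader that share one mutex, with a small driver that exercises them from several threads. It uses the standard pthread_mutex_lock() and pthread_mutex_unlock() calls instead of the TIS equivalents; the counter and all of the hits_ names are hypothetical.

```c
#include <pthread.h>

#define HITS_PER_WORKER  10000
#define HITS_MAX_WORKERS 8

/* Per-process data shared by all threads, with its lock. */
static long hits;
static pthread_mutex_t hits_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer: add n under the lock and return the new total. */
long hits_add(long n)
{
    long now;

    pthread_mutex_lock(&hits_mutex);
    hits += n;
    now = hits;
    pthread_mutex_unlock(&hits_mutex);
    return now;
}

/* Reader: takes the same lock as the writer. */
long hits_get(void)
{
    long now;

    pthread_mutex_lock(&hits_mutex);
    now = hits;
    pthread_mutex_unlock(&hits_mutex);
    return now;
}

/* Thread body used by hits_stress(). */
static void *hits_worker(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < HITS_PER_WORKER; i++)
        hits_add(1);
    return NULL;
}

/* Run nthreads workers concurrently. Returns the total after all of
 * them finish, or -1 if the workers could not all be started. */
long hits_stress(int nthreads)
{
    pthread_t tid[HITS_MAX_WORKERS];
    int i, started = 0;

    if (nthreads < 1 || nthreads > HITS_MAX_WORKERS)
        return -1;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, hits_worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == nthreads ? hits_get() : -1;
}
```

Because every update happens while the lock is held, no increment is lost, regardless of how the threads are scheduled.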

If it is necessary for the lock state to remain valid across a fork() system call in multithreaded applications, it may be useful to create and register pthread_atfork() handler functions to acquire the lock prior to any fork() call, and to release it in both the child and parent after the fork() call. This guarantees that a fork operation is not done by one thread while another thread holds the lock. If the lock were held by another thread, it would end up permanently locked in the child because the fork operation produces a child with only one thread. In the case of an independent library, the call to pthread_atfork() can be done in an __init_ routine in the library. Unlike most pthread routines, the pthread_atfork() routine is available in libc and may be used by both single-threaded and multithreaded applications.
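A minimal sketch of this arrangement follows. The pthread_atfork() call and its three-handler signature are standard POSIX 1003.1c; the mutex res_mutex and all of the res_ function names are hypothetical. The routine res_fork_probe() is included only to demonstrate the effect and would not normally be part of a library.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lock protecting some per-process resource of the library. */
static pthread_mutex_t res_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): acquire the lock so that
 * no other thread can hold it at the moment of the fork. */
static void res_prepare(void) { pthread_mutex_lock(&res_mutex); }

/* Run after fork() in the parent and in the child: release the lock. */
static void res_parent(void)  { pthread_mutex_unlock(&res_mutex); }
static void res_child(void)   { pthread_mutex_unlock(&res_mutex); }

/* Register the handlers. Call once, for example from the library's
 * initialization routine. Returns 0 or an error number. */
int res_register_fork_handlers(void)
{
    return pthread_atfork(res_prepare, res_parent, res_child);
}

/* Returns 1 if the lock can be taken right now, 0 otherwise. */
static int res_lock_is_free(void)
{
    if (pthread_mutex_trylock(&res_mutex) != 0)
        return 0;
    pthread_mutex_unlock(&res_mutex);
    return 1;
}

/* Demonstration only: fork, and report whether the child found the
 * lock free. Returns 1 if it did, 0 if not, -1 on error. */
int res_fork_probe(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(res_lock_is_free() ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The prepare handler blocks until any other thread releases the lock, so the fork proceeds only when the forking thread itself owns it. The child then starts with a lock that its single thread can release.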


12.5    Building Multithreaded Applications

The compilation and linking of multithreaded applications differs from that of single-threaded applications in a few minor but important ways.


12.5.1    Compiling Multithreaded C Applications

Many system include files behave differently when they are included in the compilation of a multithreaded application. Whether the single-threaded or thread-safe include file behavior applies is determined by whether the _REENTRANT preprocessor symbol is defined. When the -pthread flag is supplied to the cc or c89 command, the _REENTRANT symbol is defined automatically; it is also defined if the pthread.h system include file is included. This include file must be the first file included in any application that uses the pthreads library, libpthread.so.

The -pthread flag has no other effect on the compilation of C programs. The reentrancy of the actual code generated by the C compiler is determined only by proper use of reentrant coding practices by the programmer and by use of only thread-safe support libraries, not by any special options.


12.5.2    Linking Multithreaded C Applications

To link a multithreaded C application, use the cc or c89 command with the -pthread flag. When linking, the -pthread flag has the effect of modifying the library search path in the following ways:

The -pthread flag does not modify the behavior of the linker in any other way. The reentrancy of the linked code is determined by use of proper programming practices in the original code, and by compiling and linking with the proper include files and libraries, respectively.


12.5.3    Building Multithreaded Applications in Other Languages

Not all compilers necessarily generate reentrant code; the definition of the language itself can make this difficult. It is also necessary for any run-time libraries linked with the application to be thread safe. For details on such matters, you should consult the manual for the compiler you are using.