POSIX Threads

Why Threads?

Threads are essentially a way for code to be executed in several places (pseudo-)concurrently. In other words, you can have a program "simultaneously" run, say, two different functions (I say "simultaneously" because unless you have a multi-processor system, they are of course not really simultaneous). "But wait," you say, "we learned how to do that in the first week!" Which is true, but threads offer several benefits over forking:

IMPORTANT THREAD NOTES

Not all functions can be used in multi-threaded programs! Functions come in two varieties: thread-safe, and not; using a call that isn't thread-safe in a multi-threaded program is a really good way to make really bad things happen. Before using a call, check the man pages for notes about thread safety. For example, the following is from man -s 3C rand (3C is the section for standard C library functions; system calls live in section 2):

The rand() is unsafe in multi-threaded applications. The rand_r() function is MT-Safe, and should be used instead. The srand() function is unsafe in multi-threaded applications.

Include the line #define _REENTRANT at the top of all your thread-using code, before any #include lines, otherwise you'll need to define that constant on the command line every time you compile your code, e.g. with -D_REENTRANT (see Monkey book pg. 332).

Few thread calls return -1 on error or set errno; most return 0 on success and an error number on failure. So be sure to check what the failure return for each call is, and use a descriptive printf instead of perror to report failures (since the returned int corresponds to the error type, you can print it out, or pass it to strerror to get a readable message).

Thread Basics

While threads look fairly complicated, they are actually much cleaner to use than many of the Unix C APIs, at least for simple tasks.

pthread_create (pg. 333)

The first thread call you will need is called, appropriately enough, pthread_create, and takes four parameters.

int pthread_create(pthread_t *new_thread_ID, const pthread_attr_t *attr,
void * (*start_func) (void *), void *arg);

The first is a pointer to the variable where you want to store the ID of the new thread you are creating (on our system pthread_t is just a typedef'd unsigned int, like size_t, but other systems define it differently, so treat it as an opaque type). As usual, you must allocate space for this variable yourself; the easiest way is to declare a pthread_t variable and pass its address, but if you really want to declare a pointer you must malloc space for it. Always pass a valid pointer, even if you don't care what the thread's ID is: POSIX does not define what happens if you pass NULL, and you will usually want the ID anyway so you can join the thread. (If you are making several threads, consider making an array to store them in instead of a lot of separate variables.)

The second lets you control the behavior of the thread in a variety of ways. Among other things, you can use it to make the thread detached (meaning it will exit cleanly without a join), change its priority, or change the thread scheduling algorithm. If you don't care about any of that, you can pass NULL, which means "use default behaviors" (listed on pg. 343).

The third LOOKS very ugly, but like signal it isn't that bad. Threads start in a specific function (unlike fork, which starts where you call it), and that function must have a special form: it must return a void pointer, and take exactly one void pointer as an argument (e.g. void * myfun(void *args)). This parameter is just a function pointer to a function with that prototype (e.g. ..., myfun, ...). If you want it to return something, return a pointer to the return value, cast as a void pointer. The value must outlive the function, so never point at an ordinary local variable: declare it static (fine if only one thread runs the function, since all threads would share the one static copy) or malloc it and have the joining thread free it. Similarly, if you want several arguments to the function, put them in a struct so you can pass them using a void-cast pointer to the struct.

The fourth parameter is simply a pointer to the arguments you want to pass to the function you gave in parameter three.

pthread_exit (pg. 335)

pthread_exit takes one argument, a void pointer, which is the status it will return to any thread waiting to pthread_join it (see below). You won't need to use this much (if ever), as it is implicitly called for you when the function the thread started in returns, with that function's return value as the status.

pthread_join (pg. 336)

This is the thread equivalent of wait. It takes two parameters: the ID of the thread to wait for, and the address of a void pointer (i.e. a void **) where it can store the status, which is the void pointer the thread returned or passed to pthread_exit. As usual, if you don't care about the status this parameter can be NULL.

Note: Join is more important than wait in that you must wait on all threads (unless they are created detached, or detached with pthread_detach), otherwise they won't release their system resources.

Thread Mutexes

Since threads share variables, there must of course be controls on the access to those variables. Luckily, threads have their own system of concurrency control (thread mutexes) which is much easier to use than semaphores.

Note: The main drawback to thread mutexes is that they should not ever be unlocked by a different thread than the one that locked them. This will give undefined behavior but no errors, so be careful. If you need to implement an algorithm that requires doing this, you'll need to use multi-threaded semaphores.

The pthread_mutex_t Type (pg. 358)

The thread mutex is just a type, like ints or chars, so there is no mucking about with semaphore descriptor numbers, strange calls to allocate them, etc. You just declare them. Thread mutexes: 1, semaphores: 0

pthread_mutex_init (pg. 361)

pthread_mutex_init takes two arguments: a pointer to the pthread_mutex_t variable to initialize, and a pointer to an attribute type with a long name (pthread_mutexattr_t). But usually, you don't have to care; pass it NULL, and it will initialize it with default settings, which makes it unlocked (which is usually what we want for a mutex for a shared variable). No nasty structs unless you want non-standard behavior. Thread mutexes: 2, semaphores: 0

Alternately, you can just do pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER to initialize it, and skip the function call altogether.

pthread_mutex_lock, _unlock, and _destroy (pg. 361)

Each of these takes a pointer to the pthread_mutex_t you want to lock, unlock, or destroy. I think these are pretty self-explanatory. Lock and unlock are like wait and signal for semaphores. Destroy is a standard cleanup function: call it once when everyone is done with the mutex. No need to write your own functions for wait and signal with structs of arrays for parameters. Thread mutexes: 3, semaphores: 0

Thread Condition Variables

But wait, there's more! Threads also have built in condition variables, which work like the wait and signal calls in the monitor pseudo-code from lecture. Not only do they let you easily wait for a condition to be satisfied before moving on, they aren't busy waits. So if you wait on a condition variable, you waste no processor time by periodically scanning the variable, and blocking other threads with the mutex for that variable. AND, the condition variables take care of releasing the mutex for the variable in question, then re-acquiring it when the thread is woken up with a condition signal. NOW how much would you pay?

Thread mutexes: 100,000...you get the idea.

So now that you know how amazingly great condition variables are, let's see how to use them.

The pthread_cond_t Type (pg. 373)

The condition variable is also just a simple typedef'd type.

pthread_cond_init (pg. 373)

This works just like pthread_mutex_init, or you can just set the pthread_cond_t to PTHREAD_COND_INITIALIZER directly.

pthread_cond_signal, _broadcast, _wait, and _timedwait (pg. 373)

pthread_cond_signal is as easy as can be; all it takes is a pointer to the condition variable to signal. _broadcast takes the same, but wakes up all threads waiting on that condition variable, instead of just one.

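pthread_cond_wait and _timedwait are only slightly more complex. Both take a pointer to the condition variable and a pointer to the mutex protecting the associated variable, and you must already hold that mutex when you call them. _timedwait also takes a pointer to a timespec struct (pg. 375) giving the absolute time at which to give up (not how long to wait), and returns ETIMEDOUT if that time passes. Both will, as I said above, release the mutex, then wait for a signal without using processor cycles, then re-acquire the mutex when they are woken up. All transparently to you. One catch: a thread can occasionally wake up without the condition actually being true, so always re-check the condition in a while loop around the wait.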

Thread Semaphores

If you need more flexibility than thread mutexes provide, multi-threaded semaphores are easy to use as well. (The semaphore code you learned before is not multi-thread safe.)

sem_init (pg. 391)

sem_init is very simple to use. It takes a pointer to the semaphore (sem_t) to initialize, then an int (0 if it will be used only by threads in this process, non-zero if it will be shared with other processes), and an unsigned int which is the starting value of the semaphore, since thread semaphores are counting semaphores.

sem_wait and sem_post (pgs. 392-4)

sem_wait and sem_post are the wait and signal functions, and both take only one argument: a pointer to the semaphore to wait or signal. Unlike the pthread calls, the semaphore functions do return -1 and set errno on failure.

Putting it all together

To see many of those functions in action, and how they interact, take a look at Steve Huwig's excellent example program, which solves the Cookie-Jar problem from 2002's concurrent algorithms homework (solution).

Other Thread Stuff

These notes only scratch the surface of threads. To learn more about thread attributes, or Thread Specific Data (which is a way of having data specific to one thread, instead of shared by all the threads in the process), or any number of other things, read the appropriate sections of chapter 11 in the Monkey book.

Stuart Morgan, 2003