Using Semaphores in C++

by Eddy Kim

Introduction
In this article I will describe an implementation of two C++ classes that are part of the synchronization section of my OS/2 C++ class library. The design came from many sources, but the two major sources of inspiration were the C++ Report and the ACE_wrappers [1] library by Douglas Schmidt.

I will assume that you are familiar with C++ and its features, although I will try to explain all uses of what may be non-obvious C++. Prior knowledge of semaphores is helpful but a short description will follow, including references.

Semaphores
If your first exposure to synchronization mechanisms was in another operating system such as UNIX, you may initially be confused by the supplied methods in OS/2. I will attempt to give a broad overview of common definitions and what you will encounter in OS/2.

In a pre-emptive multi-tasking and multi-threading operating system there is a need for synchronization primitives to prevent what is known as a race condition. A race condition is what happens when two or more threads/processes access and modify shared data or resources simultaneously, causing unpredictable and unrepeatable program behavior.

In OS/2 there are a number of synchronization methods available. One of them is the OS/2 Critical Section which is similar to the definition commonly found in UNIX textbooks. A critical section is a segment of code which must not be pre-empted or run by multiple threads concurrently. It is specified by a call to the functions DosEnterCritSec and DosExitCritSec. A critical section, when entered, prevents ALL other threads within the process from receiving time-slices, essentially disabling multi- threading within that process. This should only be used, if it must be used, when the code to be executed is very short, or the tradeoffs versus using other synchronization methods is too high. Another drawback of using a critical section is that it does not address inter-process synchronization. If inter-process synchronization is needed, public OS/2 semaphores must be used.

Example 1: Critical Section Pseudo-code. f {     // manipulate thread/process shared data DosEnterCritSec; // access shared data // The critical section prevents other // threads from accessing the shared // data, but not other processes. DosExitCritSec; } From [1]:

"A semaphore is an integer valued object manipulated by the kernel that has the following atomic operations defined for it:
 * Initialization of the semaphore to a non-negative value;
 * A P operation which decrements the value of the semaphore. If the value of the semaphore is less than 0 after decrementing its value, the process which performed the P goes to sleep;
 * A V operation which increments the value of the semaphore. If the value of the semaphore becomes greater than or equal to 0 as a result, one process which had been sleeping as the result of a P operation is woken up;
 * A conditional P operation, abbreviated CP, which decrements the value of the semaphore and returns an indication of true, if its value is greater than 0. If the value of the semaphore is less than or equal to 0, the value of the semaphore is unchanged and the return value is false."

E. W. Dijkstra, the originator of the concept, was Dutch, which accounts for the seemingly arbitrary mnemonic values P and V.

There are two types of semaphores available through the API. A Mutex (Mutual Exclusion) semaphore can define a critical section, but unlike the DosEnterCritSec, it does not block all threads in the process. When a thread/process attempts to acquire a mutex, if the mutex is already held by another thread/process, the acquiring thread/process is blocked and put to sleep. OS/2 allows a mutex to be private (intra- process) or public (inter-process) as well as giving the calling thread an option to block (sleep) or receive an error on attempting to acquire a held lock. Also note that OS/2 mutexes are recursive, while other systems may not offer recursive mutexes directly.

Example 2: Mutex Pseudo-code. F(HMTX MutexHandle) {     // Request the mutex. If another // thread/process already hold the mutex, // the thread requesting the mutex will // block (depending on the value of timeout). DosRequestMutex(MutexHandle, timeout); //  :      // manipulate shared data //  :      // Release the mutex DosReleaseMutexSem(MutexHandle); } An Event semaphore is used for signalling the state of a resource or event. There are several operations that one can perform on an event semaphore: post, reset, and wait. Posting causes all threads that were sleeping on the semaphore to unblock. Resetting the semaphore puts it into a state which allows subsequent threads calling Wait to block. The Wait call blocks the thread/process when the semaphore is in a reset state. The corresponding API functions are DosPostEventSem, DosResetEventSem, and DosWaitEventSem.

Applying some class
Why bother encapsulating the simple C API into a C++ class? There are several benefits: as well as some drawbacks:
 * Potentially simplifies the interface to the semaphores.
 * Allows automatic acquiring and releasing of semaphores.
 * The mechanics of using semaphores is encapsulated, reducing "coupling" of code. This allows code which is easier to write, maintain and reuse than functionally equivalent code which is tightly coupled.
 * Increased code size.
 * Higher overhead (can be reduced with inlining)

When using the standard C API, the semaphore is usually associated with shared data. When you have code that could potentially access the data concurrently, you call the API to serialize access. When you're done with the access, you call another function to release the semaphore and allow other threads to access the data. While not terribly complicated, there is a non-trivial amount of work for correct synchronization. Any logic flaws can result in deadlock which will stop execution of your program.

Instead of the error prone method of accessing the mutex through multiple function calls, why not let the compiler do the work for you? When C++ objects are created on the stack (automatic variables), the constructor is called. When they go out of scope the destructor is called. The mutex class described in this article consist of two parts; a semaphore and a lock. The lock automatically acquires and releases the semaphore through the constructor and destructor. Thus, there are no explicit calls made by the programmer. This can be especially useful when your code is employing C++ exceptions. By using these larger conceptual components, your code is simplified. As a testament to the utility of abstraction, I have an example in my own library.

Semaphores are used extensively throughout the library; the logging modules, the thread manager, and other modules use semaphores in their implementation. Initially the Mutex and Event classes were not related by derivation. The former encapsulated HMTX and the latter encapsulated HEV. As I was writing this article, I decided to start adding a MuxWait class for completeness. The result was a drastic redesign of the semaphore classes, but the semaphore lock and all other classes that used the semaphores were unchanged. Despite the fact that the internal representation and relationship of the semaphore classes changed, no client code needed to change because they depended on the behavior, not the implementation of the concept of a semaphore.

There is no parallel lock for the Event Semaphore, because the event semaphore is not used in the same manner as a mutex semaphore. Instead, you use the event semaphores to signal events to other threads or processes.

Description of classes
Base class - Semaphore class Semaphore { public: Semaphore(HSEM hSem=0, ULONG udd=0); Semaphore(const Semaphore& aSem); virtual ~Semaphore=0; HSEM GetHandle const; ULONG GetUDD const; void SetUDD(ULONG udd); protected: SEMRECORD sem_; // return a pointer to the semaphore handle HSEM* GetPHSEM; // return the semaphore handle HSEM GetHSEM const; // return a pointer to user defined data ULONG* GetPUDD; // return user defined data ULONG GetUDD const; }; This is the base class from which the Mutex and Event classes are derived. This is an abstract class. It cannot be instantiated due to the virtual destructor being declared as "pure". The reason for using a base class is that the MuxWait semaphore (not described in this article) can then have a uniform interface to the semaphores. The only public member functions of note are the accessors, which start with the prefixes Get and Set. The functionality provided by this base class is the accessing of the underlying handle used in the semaphore operations.

Mutex semaphore class
class Mutex { public: enum Openmode {OPEN, CREATE, DONTCARE }; enum Attr {SHARED=DC_SEM_SHARED, PRIVATE=0, DEFAULTATTR=SHARED }; enum InitState {OWNED=1, UNOWNED=0, DEFAULTSTATE=UNOWNED }; Mutex; Mutex(HMTX mtx); Mutex(const Mutex& mtx); Mutex(string& mtxname,          Openmode mode=DONTCARE,           ULONG attr=DEFAULTATTR,           BOOL32 state=DEFAULTSTATE); virtual ~Mutex; APIRET QueryOwner(PPID pid, PTID tid, PULONG count) const; APIRET Release; virtual APIRET Request(ULONG timeout=SEM_INDEFINITE_WAIT); private: Mutex operator=(const Mutex& mtx); };

There are 4 constructors;

Mutex;

Default - Creates an unnamed public semaphore.

Mutex(HMTX mtx);

Opens an existing mutex semaphore.

Mutex(const Mutex& mtx);

Mutex copy constructor - opens an existing mutex semaphore.

The two copy constructors call DosOpenMutexSem on the existing handle. This is to have reference counting semantics. If the new object simply copied the handle value directly, when the first object goes out of scope and destructs, the handle is no longer valid. The alternative to this is to use reference counting, but the increased complexity and performance degradation (a pointer dereference on every operation) dissuaded me from using this method. Mutex(string& mtxname,      Openmode mode=DONTCARE,       ULONG attr=DEFAULTATTR,       BOOL32 state=DEFAULTSTATE); This flexible constructor allows you to specify the mutex name, open mode, attributes and initial state. Here is some pseudo-code for the logic of this function (see the source for all the gory details): switch (openmode) {     case DONTCARE: if (openMutex) fails createMutex case OPEN: openMutex case CREATE: createMutex } Of note is the DONTCARE case. When creating the mutex using the open mode (which is the default), the object attempts to open the mutex, and if that fails with a not found error, it attempts to create a new mutex. This behavior is useful when two threads or processes need a mutex or event semaphore and you are not guaranteed which thread creates the semaphore.

virtual ~Mutex;

notice that the destructor is virtual. In the event that any classes derived from Mutex are created, deleting the derived class through the base class pointer will call the correct destructor. The destructor closes the mutex. This is a simple function since every constructor initializes the object into a valid state.

4 member functions:

APIRET QueryOwner(PPID pid, PTID tid, PULONG count) const;

This is the equivalent function for DosQueryMutexSem.

APIRET Release;

This Releases the semaphore

virtual APIRET Request(ULONG timeout=SEM_INDEFINITE_WAIT);

This attempts to acquire the semaphore. If the semaphore is already owned by another thread, the calling thread will block for timeout. The second parameter, if unspecified, will be set so that the thread will wait indefinitely.

Mutex Lock Class
class MutexLock { public: MutexLock(Mutex& mtx); MutexLock(Mutex* mtx); ~MutexLock; private: Mutex& mutex; operator=(const MutexLock&); }; This is the lock class, known as a "Guard" in ACE_wrappers parlance. It is used to automatically acquire and release locks on a mutex.

MutexLock(Mutex& mtx); MutexLock(Mutex* mtx);

These two constructors are identical in function, but accept either a reference or a pointer to a mutex.

~MutexLock;

The destructor releases the associated mutex.

Example 3. Function F(Mutex& m) { {          MutexLock lock(m); // do some calculations with the shared resource. }     /* Regardless of how this function exits, the lock will always be released (actually,     as long as the exit mechanism properly cleans      up the stack). This is especially important when exceptions are present. */ }

Event Semaphore Class
class Event { public: enum Openmode {OPEN, CREATE, DONTCARE }; enum Attr {SHARED=DC_SEM_SHARED, PRIVATE=0, DEFAULTATTR=SHARED }; enum InitState {OWNED=1, UNOWNED=0, DEFAULTSTATE=UNOWNED }; Event; Event(const Event&); Event(string& name,          Openmode mode=DONTCARE,           ULONG attr=DEFAULTATTR,           BOOL32 state=DEFAULTSTATE); virtual ~Event; APIRET Post; // causes blocked threads to execute virtual APIRET Wait(ULONG timeout=SEM_INDEFINITE_WAIT); APIRET Query(ULONG* postCount); APIRET Reset(ULONG* postCount=0); // threads that wait on a reset sem, block }; The event semaphore is used to signal events to other threads/processes. There is no "lock" class for it because the automatic Acquire/Release behavior does not apply to Events.

Event;

Default constructor that creates a shared, unnamed event semaphore.

Event(const Event&);

Copy constructor. Opens an existing event semaphore. Event(string& name,      Openmode mode=DONTCARE,       ULONG attr=DEFAULTATTR,       BOOL32 state=DEFAULTSTATE); Constructor. This is identical to the Mutex Semaphore constructor.

virtual ~Event;

Virtual destructor. Closes the semaphore.

Normally the event semaphore is in the reset state. In this state any threads calling the Wait function will wait for timeout to occur. When the desired event has occurred, you call the Post function to unblock the waiting threads.

APIRET Post; APIRET Reset(ULONG* postCount=0);

If you want to find out how many times the event semaphore has been posted since the last reset, pass the address of a ulong to the function, otherwise you may leave that parameter blank. This is similar to using the Query function. The difference is that you get a more accurate value. If the calls to reset and query are done separately, it is possible that you may lose some information (e.g. there is a post immediately before the reset).

virtual APIRET Wait(ULONG timeout=SEM_INDEFINITE_WAIT); APIRET Query(ULONG* postCount);

Use this function to find out how many time Post has been called since the last Reset.

You may have noticed that the Wait members for the classes are virtual. This was done to ensure that the correct function was called in derived mutex classes that use the WinWait* family of functions. The first version of this class did not have virtual Wait members, highlighting a shortcoming of this hybrid O-O language. If I had been unable to modify the source for the Semaphore classes, any classes deriving from the Semaphore would not have had correct polymorphic behavior.

Because the presented classes use constructors and destructors to perform significant operations, any errors that are encountered cannot be returned in the normal way to the caller. I use a combination of error logging and exceptions. To keep the sample code as simple as possible, I have left out error handling.

The classes also hide the assignment operator, mainly because when designing the classes, I asked "what happens when I assign one semaphore to another?" Based on the possible non-intuitive behaviors that could happen I decided to disallow assignment. If the client code wants a reference to an existing handle, it could always get a pointer or reference. If the client wants a new handle based on the mutex, it could use the copy constructor. The inclusion of an assignment operator would have me looking through the source to remember what the operation actually did.

Example
I have included a sample program that uses the three classes in example.cpp. It was written using Watcom C++ 10.5, but should be usable on any OS/2 C++ compiler with little modification. The only compiler switch I used was -bm, to use the multi-threaded Watcom libraries. The sample file shows what happens when a common resource (the terminal display) is shared among multiple threads.

When argc is 1, the five threads use a mutex to control access to the display. When argc is not 1, they all write to the screen at the same time. While the library handles the low level synchronization (because the threads are created with _beginthread and not DosCreateThread), we still need a mutex to control the access, otherwise a jumble of characters are seen.

The example program also uses an event semaphore to determine when thread 1 (the main thread that spawned the five workerThreads) can exit. Without waiting for the child threads, control would reach the end of main and the process would end prematurely.

Conclusion
The example implementation is a modified version of my own C++ class library. The missing features include Filename (specialized, low overhead strings), log/trace facility, and exception support. Additional classes I have not shown are a MuxWait semaphore that encapsulates the OS/2 API of the same name and WMutex and WEvent semaphores which use the WinWait* functions to block.

Possible additions to these classes I am considering are Reader/Writer Semaphores, and template classes that automatically associate a mutex with an integral type. If there is enough interest in continuing similar articles, I will follow up to this article in the coming months. Additionally, if there are any errors or suggestions, please let me know. Please refer to the contact information to reach me.

In the final analysis, there is a slight overhead in space and time over direct use of the API, but in most situations this is negligible and the benefits of simplified use (with the associated code reusability and maintainability) easily outweigh the disadvantages.