Gearing Up For Games - Part 3

Written by Michael T. Duffy

Introduction
Welcome to the third instalment of Gearing Up For Games. The past couple of months have been really busy for me, and in August I took a two week trip to Japan that took me completely away from the computer. Add to this the fact that I'm trying desperately to get my own game out before the end of the year, and you begin to see why I didn't get a chance to finish this article.

I had planned on covering threads and semaphores, and basic sprites in this article. It turns out that I have only had time to write about threads and semaphores. However, I wrote the code for the article before I started the text, so I finished all of the basic sprite code. Rather than pull the sprite code out, I have left it in. Those of you already familiar with sprites should be able to follow the code fairly easily. The next-to-last section of this article briefly describes how I approached sprites, though it is not meant to be an in-depth explanation. It will have to suffice for now.

Also, I should point out that my discussion of threads and semaphores is oriented towards games, and does not cover all aspects of threads or semaphores. A complete discussion is beyond the scope of this article, and I would suggest that you look into the reference documentation in the OS/2 toolkit (namely Control Program Guide and Reference) as well as a good book or two on OS/2 programming. I have found both Petzold's OS/2 Presentation Manager Programming (pub. Ziff-Davis Press) and Real World Programming for OS/2 2.11 from SAMS Publishing to be useful references. Each book covers some material that the other doesn't, so it may be a good idea to look at more than one explanation. The material presented in this article should be enough to give you a basic understanding of threads, and together with the accompanying sample code, it should let you use threads and semaphores in your own game programs.

Enough for excuses...time to get to the meat of the matter!

Threads
One of the advantages of an operating system like OS/2 is that the programmer has available to him or her a multitasking environment. This not only means that more than one program can be running concurrently, but also that different parts of the same program can execute concurrently.

In DOS, programmers did not have multitasking and instead relied on interrupt handlers and reprogramming the internal timer chip to achieve a similar effect, e.g. reading the keyboard or joystick while blitting to the screen. This is no longer necessary in OS/2, and threads are part of the reason.

What is a thread? A thread is a separate unit of execution within a program. Each thread has its own set of CPU register variables and its own stack. For the beginner, register variables are the variables used inside of the CPU to perform calculations and keep track of where the program is currently executing. The stack is where local variables are allocated and deallocated from when a new routine is entered. Threads are owned by a process, where a process is usually the program that you are running.

Multitasking is handled by the CPU, with the operating system telling the CPU how much time to give each thread. It basically works like this: the CPU executes a thread for a given amount of time, stopping the thread after that time has elapsed. The CPU stops the thread dead in its tracks, even if it is in the middle of executing a single line of C code, like usVariable1 = usVariable2. All of the variables in the CPU registers are stored. The OS then instructs the CPU which thread to run next, and the CPU loads its registers with the saved registers of the next thread, then executes that thread for a given amount of time, and so on. An operating system like OS/2 can determine how much CPU time a thread needs, and adjust the amount of time it gives a thread for its next execution. The OS can also suspend a thread if the CPU is needed elsewhere, such as reading keyboard, COM port, or disk drive data. This is the equivalent of the interrupts of the DOS days.

As a result, you never know when control will be given to or taken from your thread. You can indirectly affect how much time your thread is given by setting the priority level of the thread. A priority level basically tells the CPU how important your thread is compared to other threads. Threads that have a more important priority level are serviced before threads of lower priority. Threads of the same priority level are serviced in a round-robin fashion. For example, let's say you have two threads of the same priority, A and B, and one thread of a lower priority, C. Depending on the load level of the machine and other settings, these three threads might gain control in the order: A B A B C A B A B A B A B C A B C A B C. Since A and B are of the same priority, they will be serviced equally. Thread C will be serviced when A and B have been handled, and there is CPU time left over.

As mentioned before, OS/2 can adjust on the fly how much time each thread gets before the CPU stops it. How does OS/2 determine how much time a thread uses? A thread can signal that it has used enough CPU time for now in two ways. First, if you have a message queue then OS/2 knows when you're done processing because your message handling routine returns to the operating system once it has handled the message. Second, threads can be suspended, resumed, or paused (with DosSleep). This will be discussed shortly.

How are threads useful, and when do you use one?
Threads are needed when you must do two or more things at the same time. An important thing to remember about threads, however, is that on single-processor Intel-based machines, two threads don't really execute at the same time. The CPU runs one thread for a short time, switches to other threads, switches back to the first thread and so on. Even on machines with SMP (symmetric multiprocessing), often all of the threads of a given process are run on a single CPU. A CPU can only execute a certain number of instructions per second, and all threads take up a share of those instructions. It also takes time to switch from one thread to another, even though this amount of time is very, very short overall.

Threads are therefore useful when you need to do several tasks at basically the same time, and you don't want to have to worry about giving each task its share of the CPU. Instead, OS/2 will handle the thread management. Good examples of threads are the thread that handles the main message queue, a thread that blits images to the screen, a thread that loads/saves/sends/receives data in the background, a thread that mixes sound, and perhaps a thread that handles the artificial intelligence (AI) of the other opponents in a game. In your games you will always want to have at least two threads: one to handle the main message queue, and one to handle the game mechanics. The reason for this is that you should always spend as little time as possible in the message queue routine, otherwise you risk slowing down overall system performance. One of the flaws of OS/2 is that it has a single system message queue. Since the OS can't send out other messages until the message it just sent is processed, a greedy program that uses a lot of time between when it is sent a message and when it returns to the OS holds up other messages meant both for itself and for other programs. If the processing of a message will take a long time, it is better to instruct a second thread to handle the processing and then return to the OS. This way the second thread can perform the processing while the OS continues to send out other messages.

Threads are not very useful when you have a certain number of things that must be done in a certain order. For example, in an action game you may have several tasks to complete in a certain time frame, and one must follow the other. Every frame of a shooting game, you must read the joystick or keyboard, update the player's position, call the enemy AI routines and update their positions, move all missiles and bullets, and then draw the new graphics with the new positions to the display. Even though you could theoretically calculate player movement and enemy AI at the same time, the player only sees the finished screen and everything else must be completed before that screen is rendered. No matter how you arrange the above tasks, together all of the above tasks will require the same amount of CPU time before the screen can be blitted. Placing each task in a separate thread would not speed things up, and in fact things would slow down because you will have to synchronize the threads' access to variables, and the overhead of switching threads itself will slow things down some. The solution to the above problem is to place all of the tasks in the same thread.

Other games can have different tasks running at the same time. Consider a simulation or an adventure game. These kinds of games can have complex AI for the computer opponents, and the calculations for the AI take a long time. However, the player is not continually giving input and the opponents are not continually moving like in a fast-paced action game. The player might be studying graphics or maps, or even reading text while not inputting requests for new actions. Rather than having the computer wait for the player to make a move before deciding how the computer opponents would move (as a DOS game would do), the computer can decide its next move while the player is involved with activities that are not CPU-intensive. If the computer does not finish a task before the player makes a move, then it can continue to process until it does make a decision. Depending on the type of game, this would mean either stopping the game until the decision was made, or continuing with the previous action this turn until a new action is decided upon.

Types of Threads
There are two basic types of threads: message queue and non-message queue. The type is determined, obviously enough, by whether or not that thread has created a message queue. The type of thread you use will depend on your purpose. Aside from having a message queue and a message queue handle, message queue threads differ from non-message queue threads in a few significant ways. Non-message queue threads cannot do the following: create windows, contain window procedures, send messages to window procedures with WinSendMsg (although they can call WinPostMsg), or call functions that cause messages to be sent to window procedures. On the other hand, message queue threads should not be suspended or blocked, or else they will hold up the rest of the operating system. Also, it is important to note that you cannot use the message queue handle of another thread to call windowing procedures. The strengths and weaknesses of each type should be weighed carefully when organizing your program.

Using Threads
Threads can be created either through the functions in a standard library, such as C/C++'s _beginthread function, or through the OS/2 API. We will be using the latter approach since the OS/2 API allows more control over threads than the former approach. I suggest that you obtain a copy of IBM's Control Program Guide and Reference for a complete reference to the thread and semaphore API.

Note that when you use threads, you must compile with the switch that tells the compiler that you are working with a multithreaded program. This switch tells the compiler to use a special version of the standard libraries that are "multithread-safe". If two threads call the same or related functions in the standard C or C++ library at the same time, then there could be problems if this flag is not specified. With the Watcom C/C++ compiler, the flag is -bm.

To create a thread we call DosCreateThread. We tell this function what routine the thread is to run and give a single parameter to be passed to that routine, tell it what the stack size of the new thread is to be and whether or not to commit the pages of the stack, and whether the thread is to be executed immediately or created in a suspended state.

Once a thread is created, you should set its priority status with DosSetPriority. You will have to experiment to find out what priority levels to use for each of your threads. It is a good idea to have your message queue thread at a higher priority than your other threads so that the entire system will stay responsive.

A thread ends itself either by returning from its initial thread function, or by calling DosExit. A thread may kill another thread by calling DosKillThread, though it is often a better idea to signal a thread that it should end itself so that it may shut down what it is doing before it terminates.

Threads can also be temporarily stopped and restarted with the functions DosSuspendThread and DosResumeThread. While a thread is suspended, it does not take up any CPU time; it is simply not scheduled. When a message queue thread returns to the OS/2 kernel and there are no messages for it, OS/2 suspends the thread until new messages are available. Note that threads obviously cannot resume themselves since they are not running, though they can suspend themselves and other threads. If a thread is created in a suspended state, you must call DosResumeThread to begin its execution.

Another way to temporarily suspend a thread is to call DosSleep. This function takes a single parameter that tells the system how many milliseconds the thread should be suspended before it is resumed. Note that this cannot be used for exact timing purposes. As soon as the time runs out, the thread that called DosSleep will be considered again for CPU time slices, but other threads may execute before the recently resumed thread gets its turn, especially if the other threads have a higher priority. Calling DosSleep(0) causes the thread to release the remainder of its CPU time for the current time slice back to OS/2 so that it can be used elsewhere. This should be used in non-message queue threads where you don't want to hog CPU time and you don't have anything special to do. It is especially useful at the bottom of a loop where you have just finished a task and you don't need to start the next task immediately. This is not needed if you suspend the thread through other means.

Finally, a thread can suspend itself until another thread has finished executing. When DosWaitThread is called, a flag is passed that specifies whether the function should wait for end of execution or not. If DCWW_NOWAIT is specified, the function returns immediately and the error status tells whether or not the specified thread is still running. If DCWW_WAIT is specified, then the function does not return until the specified thread terminates. DosWaitThread is useful when thread A wants to terminate thread B. Thread A signals thread B that it should terminate. Thread A then calls DosWaitThread with the wait flag specified. Thread B receives the shutdown signal, cleans up what it needs to, and then exits gracefully. As soon as thread B exits, DosWaitThread returns and thread A can continue about its business.
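The signal-and-wait shutdown described above can be sketched in portable C. This is not OS/2 code: POSIX threads stand in for the OS/2 calls, with pthread_create roughly playing the role of DosCreateThread and pthread_join the role of DosWaitThread with the wait flag. The Worker structure, its fields, and the function names are all invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state shared by thread A (controller) and thread B (worker). */
typedef struct {
    atomic_bool stop_requested; /* set by A to ask B to finish        */
    int         units_done;     /* written only by B                  */
    bool        cleaned_up;     /* written only by B, just before exit */
} Worker;

/* Thread B: work until asked to stop, then clean up and return. */
void *worker_main(void *arg)
{
    Worker *w = arg;
    do {
        w->units_done++;       /* stand-in for one unit of real work     */
        sched_yield();         /* give up the rest of this time slice    */
    } while (!atomic_load(&w->stop_requested));
    w->cleaned_up = true;      /* stand-in for releasing resources       */
    return NULL;               /* returning from the function ends B     */
}

/* Thread A: start B, signal it to stop, then wait for it to exit.
   Returns 0 on success, -1 if the thread could not be created or joined. */
int run_and_stop_worker(Worker *w)
{
    pthread_t tid;

    atomic_init(&w->stop_requested, false);
    w->units_done = 0;
    w->cleaned_up = false;

    if (pthread_create(&tid, NULL, worker_main, w) != 0)
        return -1;
    atomic_store(&w->stop_requested, true);  /* the shutdown signal      */
    if (pthread_join(tid, NULL) != 0)        /* blocks until B has exited */
        return -1;
    return 0;
}
```

Because thread A only reads units_done and cleaned_up after the join has returned, it never sees them half-updated. The stop flag is the only value both threads touch while B is running, which is why it alone is atomic.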

As far as creating, executing, suspending, and resuming threads goes, that's all there is to it. To communicate amongst different threads and to coordinate their use of global variables and memory, you will have to rely on semaphores.

Semaphores
A semaphore is like a traffic cop that keeps threads from crashing into each other. Semaphores are needed because when the CPU switches from one thread to another, it may interrupt the running thread anywhere. For example, thread A might be in the middle of updating a structure. It gets halfway through the update, and then it is interrupted by the CPU and thread B gets control. Thread B tries to use the information in the structure, but only half of it is correct. Imagine if the structure contained a forward and backward linked list, and only one of the two pointers were updated. Behaviour of the program would be unpredictable and buggy. It gets even worse. Actions we consider atomic in C may actually be compiled down into several assembly language instructions. Consider the expression ++usVar1; The compiler might break it down into three steps: load usVar1 from memory into a register, increment the register, and store the register back to usVar1.

Figure 1) Assembler for ++usVar1

What happens if thread A is interrupted by thread B in the middle of the command, and thread B tries something simple like incrementing usVar1? Remember that each thread has its own set of CPU register variables.

Figure 2) Accessing usVar1 by multiple threads

Although at the end of two ++usVar1 instructions it should go from 0 to 2, the final value comes out to be 1. This is obviously not good.
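The lost increment can be replayed deterministically in plain C, without any real threads. In this sketch each simulated thread is just a structure holding its own private "register", and the three steps of the increment are separate functions so that they can be interleaved by hand. The names ThreadRegs, step_load, step_inc and step_store are made up for the example; only usVar1 comes from the discussion above.

```c
/* Each simulated thread has its own private "register". */
typedef struct {
    unsigned short reg;
} ThreadRegs;

unsigned short usVar1;  /* the variable shared by both threads */

void step_load(ThreadRegs *t)  { t->reg = usVar1; }  /* memory -> register */
void step_inc(ThreadRegs *t)   { t->reg++; }         /* increment register */
void step_store(ThreadRegs *t) { usVar1 = t->reg; }  /* register -> memory */

/* Thread A is interrupted right after its load; B then runs to completion. */
unsigned short interleaved_increments(void)
{
    ThreadRegs a = {0}, b = {0};

    usVar1 = 0;
    step_load(&a);    /* A reads 0, then loses the CPU             */
    step_load(&b);    /* B reads 0                                 */
    step_inc(&b);
    step_store(&b);   /* usVar1 is now 1                           */
    step_inc(&a);     /* A resumes with its stale copy of 0        */
    step_store(&a);   /* usVar1 is 1 again: one increment was lost */
    return usVar1;
}

/* The same two increments with no interruption, for comparison. */
unsigned short sequential_increments(void)
{
    ThreadRegs a = {0}, b = {0};

    usVar1 = 0;
    step_load(&a); step_inc(&a); step_store(&a);
    step_load(&b); step_inc(&b); step_store(&b);
    return usVar1;
}
```

Calling interleaved_increments() returns 1, while sequential_increments() returns 2. The only difference between the two is where thread A gets interrupted.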

What you want to do to solve this problem is to make sure your thread is not interrupted, in the middle of using a variable, structure, or other piece of data, by another thread that uses the same data. This can be accomplished in two ways. One way is to call DosEnterCritSec before the calculation, and call DosExitCritSec afterwards. This stops thread switching to every other thread in your process between the two function calls. All of them: the thread that services your message queue, the thread that mixes your sound, and any other thread that has nothing to do with the data you are protecting. Use of these two functions may be fine for simple operations such as a single increment or decrement of a variable, but should never be used for processing that takes longer than just a few assembly language instructions. If you hold up your other threads for too long, your program stops responding to messages and you prevent other updates that need to be done on a regular basis. Besides, you are only really worried about threads A and B accessing a given piece of data.

A much better solution is to use semaphores. Semaphores come in three varieties: mutually-exclusive (mutex), event, and multiple-wait. Mutex semaphores are like a hall pass; only one thread may have the hall pass at any one time, and the others must wait until the pass is returned. Event semaphores are like traffic lights where you can have a stop and a go status. Multiple-wait semaphores are like a collection of the other two types; you have several hall passes or traffic lights to consider at the same time.

Semaphores may also be private or shared, and named or unnamed. Private semaphores may only be used by a single process; shared semaphores may be used by several processes. Other processes may obtain the handle of a semaphore in a couple of ways. The handle of the semaphore can be passed in a message. If the semaphore is named, then the other process may use a known name to request the handle from the operating system.

The programming samples in Gearing Up For Games will use only private, unnamed semaphores, so only they will be covered. Also, we will not be using multiple wait semaphores. For information on named and unnamed, private and shared, and multiple wait semaphores check out the OS/2 Toolkit documentation, or a good third party book on OS/2 programming.

Mutex Semaphores
Mutex semaphores allow two or more threads to take turns in accessing information. First, the mutex semaphore must be created with DosCreateMutexSem. Often it is convenient to do this in the main thread during initialization before any threads are even created. When the semaphore is no longer needed, DosCloseMutexSem is called. This can also be done in the main thread during cleanup.

When a thread wants to access a piece of information that two or more threads use, it calls DosRequestMutexSem. If this function returns without error, then the thread "owns" the semaphore. When this function is called, a timeout value is also specified. The timeout value is the number of milliseconds that DosRequestMutexSem will wait for the semaphore to become available. If this time runs out, the function returns with a timeout error. The timeout value may also be given the values of SEM_IMMEDIATE_RETURN and SEM_INDEFINITE_WAIT. An immediate return request means that the function will not wait for the semaphore at all. An indefinite wait request means DosRequestMutexSem will not return until it has gained ownership of the semaphore, and this is the setting we will use.

When the thread is finished accessing the shared data, it calls DosReleaseMutexSem. This allows another thread to have access to the data. If two or more threads are waiting for a semaphore when it is released, the thread with the highest priority gains control. Note that an internal counter is kept for each semaphore that is incremented upon a request and decremented upon a release. The semaphore only becomes available to other threads when this counter returns to zero, so you must call DosReleaseMutexSem once for each time that you called DosRequestMutexSem.
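The request counter can be illustrated with a portable stand-in. The sketch below is not OS/2 code: it uses a POSIX mutex of type PTHREAD_MUTEX_RECURSIVE, which keeps a similar per-owner lock count, in place of an OS/2 mutex semaphore. All function and variable names are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

pthread_mutex_t mtxNested;  /* stand-in for an OS/2 mutex semaphore */

/* Runs in a second thread: report whether the mutex is free right now. */
void *try_take(void *arg)
{
    int *got_it = arg;

    if (pthread_mutex_trylock(&mtxNested) == 0) {
        *got_it = 1;
        pthread_mutex_unlock(&mtxNested);
    } else {
        *got_it = 0;                      /* still owned by the first thread */
    }
    return NULL;
}

/* Returns 1 if another thread could take the mutex, 0 if not, -1 on error. */
int other_thread_can_take(void)
{
    pthread_t tid;
    int got_it = -1;

    if (pthread_create(&tid, NULL, try_take, &got_it) != 0)
        return -1;
    pthread_join(tid, NULL);
    return got_it;
}

/* Request twice, then release one step at a time, probing after each. */
int nested_request_demo(int *after_one_release, int *after_two_releases)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (pthread_mutex_init(&mtxNested, &attr) != 0)
        return -1;
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mtxNested);       /* first request:  count is 1 */
    pthread_mutex_lock(&mtxNested);       /* second request: count is 2 */
    pthread_mutex_unlock(&mtxNested);     /* count is 1: still owned    */
    *after_one_release = other_thread_can_take();
    pthread_mutex_unlock(&mtxNested);     /* count is 0: now available  */
    *after_two_releases = other_thread_can_take();

    pthread_mutex_destroy(&mtxNested);
    return 0;
}
```

After a single release the probing thread is still refused. Only the second release, which matches the second request, frees the mutex for it.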

Let's consider two threads, A and B, trying to use the variable ulVar1. During setup DosCreateMutexSem is called and the handle is stored in hmtxSem1. Threads A and B run. Thread A comes to a place where it needs ulVar1, so it calls DosRequestMutexSem with the semaphore handle stored in hmtxSem1. No other thread owns this semaphore, so A gains ownership. It can now safely access ulVar1. While thread A is using ulVar1, thread B comes to a place where it needs to access the same variable. Thread B calls DosRequestMutexSem with hmtxSem1 and an indefinite wait specified. However, since A already owns the semaphore, thread B blocks. When thread A finishes accessing ulVar1, it calls DosReleaseMutexSem. Now thread B's call to DosRequestMutexSem returns and thread B can safely access ulVar1. Thread B must release the semaphore when it is done accessing ulVar1.
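The same walkthrough can be tried out in portable C. Again this is only a stand-in, not OS/2 code: a POSIX mutex named mtxVar1 plays the part of hmtxSem1, pthread_mutex_lock corresponds loosely to DosRequestMutexSem with an indefinite wait, and pthread_mutex_unlock to DosReleaseMutexSem. The loop count and function names are arbitrary choices for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

#define INCREMENTS_PER_THREAD 100000

pthread_mutex_t mtxVar1 = PTHREAD_MUTEX_INITIALIZER;  /* role of hmtxSem1 */
unsigned long   ulVar1;                               /* shared variable  */

/* Body of both thread A and thread B. */
void *increment_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&mtxVar1);    /* request: blocks while the other
                                            thread owns the mutex          */
        ulVar1++;                        /* safe: we are the only owner    */
        pthread_mutex_unlock(&mtxVar1);  /* release: the other may proceed */
    }
    return NULL;
}

/* Runs threads A and B to completion and returns the final ulVar1.
   Returns 0 if a thread could not be created. */
unsigned long run_two_incrementers(void)
{
    pthread_t a, b;

    ulVar1 = 0;
    if (pthread_create(&a, NULL, increment_many, NULL) != 0)
        return 0;
    if (pthread_create(&b, NULL, increment_many, NULL) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return ulVar1;
}
```

With the lock and unlock calls in place no increment is lost, so the result is exactly twice INCREMENTS_PER_THREAD. Removing them makes the final value unpredictable, for the reason shown earlier with ++usVar1.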

Mutex semaphores can also be used when one thread wants to stop another thread, but wants to let the second thread finish what it is doing. An example of this is pausing a game. The main loop of the game thread looks something like this:

Figure 3) Processing loop for a game thread

Now the player presses a button that pauses the game. The button press is intercepted by the message processing routine in the main thread. It requests ownership of hmtxPause, and chances are that it will get it. The game thread will finish blitting the frame that it is working on, and before it starts the next frame, it will request hmtxPause. Since this semaphore is already owned by the main thread, the game thread will block and the game will be paused; new positions will not be calculated, and new screens will not be blitted. Note that this means that if the game is in a window and the window needs to be updated, then you will have to blit the last completed frame elsewhere. Most likely this will be in the handling of the WM_PAINT message. When the player presses the button to unpause the game, the message handler in the main thread receives the request. If the game is paused, then the main thread releases hmtxPause, and the game thread may continue. Be certain to check whether the game is paused or unpaused before requesting or releasing the semaphore, because you only want to have one request active at a time.
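Here is a rough, portable sketch of the pause mechanism, with a POSIX mutex named mtxPause standing in for hmtxPause. It is a guess at the general shape of such a loop rather than a copy of the sample code: the game thread requests the mutex at the top of every frame and gives it straight back, so it only blocks while the main thread is holding it. The frame counter, FRAMES_TO_RUN, and the function names are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define FRAMES_TO_RUN 3

pthread_mutex_t mtxPause = PTHREAD_MUTEX_INITIALIZER;  /* role of hmtxPause */
atomic_int      frames_drawn;

/* Game thread: check the pause mutex before starting each frame. */
void *game_thread(void *arg)
{
    (void)arg;
    while (atomic_load(&frames_drawn) < FRAMES_TO_RUN) {
        pthread_mutex_lock(&mtxPause);       /* blocks here while paused    */
        pthread_mutex_unlock(&mtxPause);     /* not paused: hand it back    */
        atomic_fetch_add(&frames_drawn, 1);  /* stand-in for update + blit  */
    }
    return NULL;
}

/* Main thread: pause, start the game thread, look at the frame count while
   paused, then unpause. Returns the final frame count, or -1 on error. */
int pause_demo(int *frames_while_paused)
{
    pthread_t tid;
    struct timespec nap = { 0, 50 * 1000 * 1000 };  /* 50 milliseconds */

    atomic_store(&frames_drawn, 0);
    pthread_mutex_lock(&mtxPause);           /* pause before the game starts */
    if (pthread_create(&tid, NULL, game_thread, NULL) != 0) {
        pthread_mutex_unlock(&mtxPause);
        return -1;
    }
    nanosleep(&nap, NULL);                   /* game thread cannot get past
                                                its lock during this time   */
    *frames_while_paused = atomic_load(&frames_drawn);
    pthread_mutex_unlock(&mtxPause);         /* unpause                     */
    pthread_join(tid, NULL);
    return atomic_load(&frames_drawn);
}
```

No frames are drawn while the main thread owns the mutex, and all of them are drawn once it is released. Note that the same thread that locked the mutex also unlocks it, just as the main thread both requests and releases the pause semaphore in the description above.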

Finally, you can use the function DosQueryMutexSem to find out what (if any) process and thread owns the semaphore, as well as find out how many outstanding requests for the semaphore exist (including the request made by the thread that currently owns the semaphore).

Event Semaphores
Event semaphores are useful for one thread to let another know that something has happened. They are also useful because they can be hooked up to timers, and the timer can use the semaphore to let a thread know when a certain time period has passed.

Like mutex semaphores, event semaphores can be private or shared, named or unnamed. You create an event semaphore with DosCreateEventSem, and when you are finished with it you destroy it with DosCloseEventSem. This is usually done in the main thread during setup and cleanup respectively. If you wish to hook a timer up to the semaphore, this can be done right after it is created with a call to DosStartTimer.

Event semaphores have two states: posted and reset. You post to an event semaphore when an event occurs, and you reset it when you want to clear it and wait for the next event. The number of times an event semaphore is posted to is recorded and given upon request.

You post an event semaphore with a call to DosPostEventSem. If you have hooked the event semaphore up to a timer, then the timer will automatically post the semaphore after the requested period of time. You clear the semaphore with DosResetEventSem, and this call will return the number of times the semaphore was posted since the last reset.

Although you can use DosQueryEventSem to find out how many times a semaphore has been posted, most often you will check an event semaphore with DosWaitEventSem. DosWaitEventSem blocks a thread like DosRequestMutexSem, and it acts much the same way. You provide a timeout value, SEM_INDEFINITE_WAIT, or SEM_IMMEDIATE_RETURN. If you specify an indefinite wait, the function will return immediately if the semaphore has already been posted, or it will block the calling thread until the semaphore is posted to by another thread or a timer. After DosWaitEventSem returns, you will probably want to reset the semaphore immediately so that you know if any new posts have been made between this and the next call to DosWaitEventSem. DosResetEventSem clears the number of posts to 0.
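To make the post, wait, and reset behaviour concrete, here is a small portable imitation of an event semaphore built from a POSIX mutex and condition variable. It is only a model of the behaviour described above, not the OS/2 implementation, and it leaves out timeouts entirely. The EventSem type and the event_* function names are invented for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Hypothetical event semaphore: "posted" whenever post_count is non-zero. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  posted;
    unsigned long   post_count;   /* posts since the last reset */
} EventSem;

int event_init(EventSem *ev)
{
    ev->post_count = 0;
    if (pthread_mutex_init(&ev->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&ev->posted, NULL) != 0) {
        pthread_mutex_destroy(&ev->lock);
        return -1;
    }
    return 0;
}

void event_destroy(EventSem *ev)
{
    pthread_cond_destroy(&ev->posted);
    pthread_mutex_destroy(&ev->lock);
}

/* Record one more post and wake every thread that is waiting. */
void event_post(EventSem *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->post_count++;
    pthread_cond_broadcast(&ev->posted);
    pthread_mutex_unlock(&ev->lock);
}

/* Indefinite wait: returns at once if already posted, otherwise blocks. */
void event_wait(EventSem *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (ev->post_count == 0)
        pthread_cond_wait(&ev->posted, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

/* Clear the semaphore and return how many posts arrived since last reset. */
unsigned long event_reset(EventSem *ev)
{
    unsigned long n;

    pthread_mutex_lock(&ev->lock);
    n = ev->post_count;
    ev->post_count = 0;
    pthread_mutex_unlock(&ev->lock);
    return n;
}
```

Waiting does not clear the semaphore in this model. A thread that wants to catch the next event has to call event_reset itself, and the value returned tells it how many posts it missed while it was busy.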

Often you will use event semaphores and timers together. There are only a limited number of timers available in OS/2. These timers are hooked up not to the Programmable Interval Timer (PIT) that is usually reprogrammed in DOS games, but rather to a different hardware timer that I believe is also used for managing multitasking. Although you request the time between timer posts in milliseconds, time is actually handled in units of clock ticks, the length of which varies depending on the computer system. For Intel-based systems, the resolution of this timer is about 32 milliseconds, and this gives a rate of about 31 or 32 posts per second.

Before requesting a timer, you will want to make sure there are still some available. The number of available timers can be found with a call to WinQuerySysValue to find the value of SV_CTIMERS. If no timers are available, you should exit the game with an error message. You request a timer with a call to DosStartTimer after you have created the event semaphore. You pass the handle of the event semaphore to this function, along with a millisecond count. Now the event semaphore will be posted every x milliseconds, with x being the value you specified in the call to DosStartTimer. When you are done with the timer, you call DosStopTimer to stop it and release it for another program to use.

You can use event semaphores and timers to regulate the speed of your game. Let's consider an action game. You want the game to run at a constant frame rate regardless of the speed of the computer it is running on. Since the timer runs at around 32 posts per second, we will target our frame rate to this speed. For faster frame rates we need a higher resolution timer. I believe that at least two such timers currently exist; one is available on the DevCon Device Driver Kit and is designed for use by device drivers, and the other timer is accessed through the two DosTmr* functions. The DosTmr* functions require 64-bit division and multiplication to use. I have not used either of these timers yet, however, so I will not cover them.

To achieve a constant frame rate, you check your timer at the top of your main game loop. If the timer interval has passed, you reset the timer and continue on. Otherwise you wait until the required interval has passed. With a timer and an event semaphore, you know the correct interval has passed when the semaphore is posted. With a timer that only allows you to read its value and doesn't generate interrupts, post messages, or post semaphores (such as the DosTmr* functions), you will need to check the elapsed time yourself. If enough time has passed, update your timer variables and continue. If not, you will have to continue to poll the timer since you don't have a function that blocks the current thread. In order to keep the polling loop from burning up CPU time when it doesn't need to, call DosSleep(0) after every unsuccessful check.
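The polling approach for a read-only timer might look something like the following portable sketch. It is not OS/2 code: the POSIX call clock_gettime stands in for whatever high resolution timer you read, and sched_yield stands in for DosSleep(0). The function names and the idea of returning the elapsed time are choices made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <time.h>

/* Current time in whole milliseconds, from a clock that never runs backwards. */
long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Runs `frames` iterations of a game loop, starting no more than one frame
   per `interval_ms`. Returns the elapsed time in milliseconds. */
long long run_paced_frames(int frames, long long interval_ms)
{
    long long start = now_ms();
    long long next_frame = start;   /* earliest start time of the next frame */

    for (int i = 0; i < frames; i++) {
        while (now_ms() < next_frame)
            sched_yield();          /* unsuccessful check: give the CPU away */
        next_frame += interval_ms;  /* update the timer variable             */

        /* ... read input, move everything, draw and blit the frame ... */
    }
    return now_ms() - start;
}
```

Advancing next_frame by a fixed interval, rather than setting it to the current time plus the interval, keeps the average frame rate steady even when a single check is late. The price is that after a long stall the loop will run a few frames back to back to catch up, which may or may not be what you want in a game.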

With the standard timer and event semaphore, our main game loop for an action game looks something like this:

Figure 4) Processing loop revisited

One final note about blocking the game thread with either an event or a mutex semaphore: when you request that the game thread end (by setting bEndGameThread to true), you must make sure the game thread is not blocked by either a pause or a speed-regulating semaphore. You must therefore release any mutex semaphores that the main thread owns, and post any event semaphores that the game thread is waiting on.

Basic Sprites
I have not had time to write the Basic Sprite explanation for this article, but I have included the code. I implemented sprite drawing in a very simple way. All sprites are stored in a PCX file. This file is read into a canvas, and the canvas is passed to the sprite class. Each sprite is defined as a rectangular region of the canvas. When the sprite is drawn, the sprite is copied pixel by pixel from the sprite canvas over to the destination canvas. If the pixel's value is greater than or equal to 248, then the pixel is not written, thus making those pixels clear since the background remains. This makes the top 8 colours clear.
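The pixel-by-pixel copy with clear colours can be sketched in a few lines of C. This is not the sprite class from the accompanying archive: the Canvas structure here is a made-up minimal one (one byte per pixel, rows stored one after another), and draw_sprite is a hypothetical name. Only the rule that values of 248 and above are skipped comes from the description above.

```c
#include <stddef.h>

#define FIRST_CLEAR_COLOUR 248   /* palette entries 248..255 are not drawn */

/* Hypothetical 8-bit canvas: one byte per pixel, rows stored top to bottom. */
typedef struct {
    unsigned char *pixels;
    int            width;
    int            height;
} Canvas;

/* Copies the w x h sprite found at (sx, sy) in src to (dx, dy) in dst,
   leaving the destination untouched wherever the sprite pixel is clear.
   No clipping is done: both rectangles must lie inside their canvases. */
void draw_sprite(const Canvas *src, int sx, int sy, int w, int h,
                 Canvas *dst, int dx, int dy)
{
    for (int y = 0; y < h; y++) {
        const unsigned char *s =
            src->pixels + (size_t)(sy + y) * src->width + sx;
        unsigned char *d =
            dst->pixels + (size_t)(dy + y) * dst->width + dx;

        for (int x = 0; x < w; x++) {
            if (s[x] < FIRST_CLEAR_COLOUR)
                d[x] = s[x];     /* solid pixel: overwrite the background */
        }
    }
}
```

The test on every pixel is what makes this method slow compared to encoded sprites, since the inner loop cannot simply copy a whole row at once.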

The background is stored in its own PCX file and read into its own canvas. To erase the sprites, I just copy part of the background canvas over to the display canvas. This covers everything that was written last time, and then the new sprites are written directly to the display canvas. This display canvas is then blitted to the screen.

Sprites are referenced with a book number and a sprite number. A book is a collection of several sprites, up to 65,535 sprites per book. There can be up to 255 books of sprites. You would probably want to have a book of sprites for the player, a book for each of the enemies, and a book for missiles and explosions. Each book will contain all of the frames of animation for that object.

When we get into more advanced sprites, we will actually encode them from a canvas and store the encoded data. Then we can discard the canvas the sprites were encoded from, thus saving memory. Furthermore, encoded sprites will display much, much faster than the current method. Also we will store the books of encoded sprites to disk in library files. I will be explaining library files in the future when I cover organizing game data.

The sprite class has a number of routines which are really just place holders for more advanced features. If a function doesn't seem to have a use now then just ignore it; we will come back to it in the future.

The Canvas Class
One small addition has been made to the canvas class as well. I have added the routine CopyDirtyRec, and this copies a rectangle from one canvas to another. This routine is used to copy part of the background canvas to the display canvas when the display canvas needs to be erased. The code of this routine is a good demonstration of how to move canvas data around, and some of the speedups that canvases allow.
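For readers without the archive at hand, a rectangle copy of this kind might look roughly like the sketch below. It is not the CopyDirtyRec routine itself: the Canvas structure is a made-up minimal one, copy_rect is a hypothetical name, and the clipping rules are my own assumption. It does show the main speedup, which is that each row of the rectangle can be moved with a single memcpy instead of a per-pixel loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 8-bit canvas: one byte per pixel, rows stored top to bottom. */
typedef struct {
    unsigned char *pixels;
    int            width;
    int            height;
} Canvas;

/* Copies the rectangle at (x, y) of size w x h from src to the same position
   in dst, e.g. from the background canvas to the display canvas. The
   rectangle is clipped to the area that both canvases have in common. */
void copy_rect(const Canvas *src, Canvas *dst, int x, int y, int w, int h)
{
    int max_w = src->width  < dst->width  ? src->width  : dst->width;
    int max_h = src->height < dst->height ? src->height : dst->height;

    if (x < 0) { w += x; x = 0; }          /* clip against the left edge   */
    if (y < 0) { h += y; y = 0; }          /* clip against the top edge    */
    if (x + w > max_w) w = max_w - x;      /* clip against the right edge  */
    if (y + h > max_h) h = max_h - y;      /* clip against the bottom edge */
    if (w <= 0 || h <= 0)
        return;                            /* nothing left to copy         */

    for (int row = 0; row < h; row++) {
        size_t src_off = (size_t)(y + row) * src->width + x;
        size_t dst_off = (size_t)(y + row) * dst->width + x;

        memcpy(dst->pixels + dst_off, src->pixels + src_off, (size_t)w);
    }
}
```

Erasing a sprite is then just a call to copy_rect with the rectangle the sprite covered on the previous frame.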

Those Silly Bugs
The MAKEFILE that accompanied my last article appears to have had a slight bug in it. In calling the linker, I neglected to put the full path name before the mmpm2.lib file. Therefore libr mmpm2.lib should read libr c:\toolkit\lib\mmpm2.lib or add whatever path you have to the lib directory of the OS/2 toolkit.

Aside from that, I don't think there were any other bugs in the article for EDM/2 3-7. Am I mistaken?

News and other Nonsense
By the time you read this, the Entertainment Toolkit for OS/2 may already be out on DevCon 8. It was not yet available at press time, so I am unable to cover anything on it with any certainty. I will be writing about it in the near future.

If you haven't visited it yet, I'd like to suggest that you drop by IBM's Games Home Page. The URL is http://www.austin.ibm.com/os2games and you will need a Web browser to look at it. IBM seems to be using this as their main notification to the public (that's us) of game related information, so it makes good sense to check it out every week or so for new stuff. Recently the beta documentation for the Entertainment Toolkit has been placed on this site, as well as the joystick drivers and documentation and other stuff.

While we're on the topic of Web pages, I have finally put one up of my own. I hope to eventually have online versions of my EDM/2 articles up there, and I may even be placing code and text on the web page before it sees print in EDM/2. The page is really, really sparse right now, but it should grow as I find time to work on it. I also have a number of OS/2 game related links on my page already that you may want to check out.

This month I also did things a little bit differently in that I didn't distribute duplicates of code from previous articles when the code didn't change at all. The README.1ST file in the code archive will tell you which files you need from the last article. Simply place those files in the directory with the new code, and run the makefile (if of course you are using the Watcom compiler, otherwise the makefile will not work).

Next month I hope to cover sprites in depth, including how to construct, store, and display Run Length Encoded (RLE) sprites, and how to clip sprites. If I have time, I may show how dirty rectangles can be used in some cases to speed up sprite updating, though with a full scrolling background dirty rectangles don't do us much good. Also I should be able to cover DIVE's full-screen abilities in the 320x200x256 colour mode, since they are a breeze to implement.

As always, I welcome comments. You can reach me at mduffy@ionet.net, or find me lurking on comp.os.os2.games among other places. Until next time...