Ultra-Fast Audio for Games and Multimedia

by Linden deCarmo

Speed kills! While this statement applies to many aspects of life, it definitely is not true for games and multimedia programs. Games must have instantaneous digital audio response – any operating system overhead can make them unplayable. In this article, we'll examine the new Direct Audio RouTines (DART) that let programs use high-speed audio while remaining compatible with existing OS/2 multimedia applications.

Before DART, multimedia applications used playlists to obtain the best performance (playlists stream digital audio directly from user-allocated memory to the audio device). While they are excellent for simple memory playback or record, playlists are not the ideal solution for programs that have time-critical performance requirements or need precise buffer flow control for the following reasons:
 * Memory copies - Playlists currently copy information from an application's buffer to internal OS/2 multimedia buffers that have been locked down (locked buffers are required by the audio device driver).
 * Thread overhead - Playlists use at least two threads to transfer buffers from the application to the audio device (see Figure 1).
 * Device independent layers - Playback and record commands must be processed by the Media Device Manager (MDM) and Audio MCI Driver (ADMC) before actual data movement begins.
 * Notification speed – Playlists notify applications when buffers are processed via Presentation Manager (PM) messages.



Figure 1. Architectural Overview of Playlists. The Memory Stream Handler (MSH) retrieves the buffer from application memory and passes it to the Sync-Stream Manager (SSM). SSM then forwards the buffer to the Audio Stream Handler (ADSH). ADSH in turn gives the buffer to the audio device to play or record.

Since DART has a minimal code path from an application to the audio driver, it is able to achieve performance levels unattainable with playlists. In fact, DART offers the following advantages over previous methods:
 * Pre-allocation of user memory - Because DART allows applications to lock down buffers, there is no need to copy audio data to or from user memory before sending it to the audio driver.
 * No threads - DART does not create streaming threads on the application's behalf; applications manage their own threads to keep the program responsive.
 * Function pointers rather than APIs - For time-critical functions (such as reading data from or writing data to the audio device), applications use function pointers rather than the longer code path of an API, which has several device-independent layers. (Note: The function pointer is still device independent.)
 * Time-critical notifications - Applications are notified as each buffer is consumed or filled via high-priority threads and direct function callbacks.
 * Tailorable number of buffers – Programs can allocate an arbitrary number of buffers of a user-definable size.



Figure 2. Code Path for DART. Applications send buffer(s) directly to the Amp-Mixer device, which in turn forwards the buffer to the Vendor Specific Driver (VSD) for playback or recording.

Note: A Vendor Specific Driver (VSD) is the lowest device independent layer in OS/2 multimedia. For more information on VSDs, see the article titled "OS/2 Enters the DSP Age" in Volume 5 of The Developer Connection News on your accompanying Developer Connection for OS/2 CD-ROMs.

To support DART, the MCI Amp-Mixer driver (which is responsible for interfacing to the audio device) handles the following MCI commands:
 * MCI_MIXSETUP sets up the device in the correct mode (for example, PCM, MPEG audio, or MIDI).
 * MCI_BUFFER allocates or deallocates memory for use with the audio device.
 * MCI_STOP stops playback or recording.
 * MCI_PAUSE pauses playback or recording.
 * MCI_RESUME resumes playback or recording.
 * MCI_STATUS, with the MCI_STATUS_POSITION flag, returns the current position of the device.

MCI_MIXSETUP initializes the Amp-Mixer device for a specific mode (for example, 8 bits per sample, 11.025 kHz, stereo). It also gives the mixer device a function pointer to call for event notifications (that is, this function is called when buffers are empty or full, or when an error has occurred). If the call is successful, the mixer device returns two function pointers (one for reading data from the audio device and the other for writing data to it).

    // mix_setup informs the mixer device of the entry point
    // to report buffers being read or written.
    // we will also need to tell the mixer which media type
    // we will be streaming. In this case, we'll use
    // MCI_DEVTYPE_WAVEFORM_AUDIO, but we could use MIDI
    // if that's what we wish to do.

    memset( &MixSetupParms, '\0', sizeof( MCI_MIXSETUP_PARMS ) );

    MixSetupParms.ulBitsPerSample = 16;
    MixSetupParms.ulFormatTag     = MCI_WAVE_FORMAT_PCM;
    MixSetupParms.ulSamplesPerSec = 22050;
    MixSetupParms.ulChannels      = 2;      /* Stereo */
    MixSetupParms.ulFormatMode    = MCI_PLAY;
    MixSetupParms.ulDeviceType    = MCI_DEVTYPE_WAVEFORM_AUDIO;

    // the mixer will inform us of entry points to
    // read/write buffers to and also give us a
    // handle to use with these entry points

    MixSetupParms.pmixEvent = MyEvent;

    rc = mciSendCommand( usDeviceID,
                         MCI_MIXSETUP,
                         MCI_WAIT | MCI_MIXSETUP_INIT,
                         ( PVOID ) &MixSetupParms,
                         0 );

''Sample Code 1. Using MCI_MIXSETUP to prepare the audio device for 16-bit, 22.05 kHz stereo mode''

After MCI_MIXSETUP has been successfully called, you can use MCI_BUFFER to allocate or deallocate memory for communication with the audio driver. Due to device driver restrictions, each of these buffers is limited to 64KB on Intel machines (this restriction will not apply to OS/2 Warp Connect (PowerPC Edition)).

    MCI_MIX_BUFFER  MyBuffers[ MAX_BUFFERS ];

    BufferParms.ulNumBuffers = 40;
    BufferParms.ulBufferSize = 4096;
    BufferParms.pBufList     = MyBuffers;

    rc = mciSendCommand( usDeviceID,
                         MCI_BUFFER,
                         MCI_WAIT | MCI_ALLOCATE_MEMORY,
                         ( PVOID ) &BufferParms,
                         0 );

    if ( ULONG_LOWD( rc) != MCIERR_SUCCESS )
    {
        printf( "Error allocating memory.  rc is : %d", rc );
        exit ( 1 );
    }

    // MCI driver will return the number of buffers it
    // was able to allocate
    // it will also return the size of the information
    // allocated with each buffer.

    ulNumBuffers = BufferParms.ulNumBuffers;

    // fill each buffer from the open file before playback starts
    for ( ulLoop = 0; ulLoop < ulNumBuffers; ulLoop++ )
    {
        rc = mmioRead ( hmmio,
                        MyBuffers[ ulLoop ].pBuffer,
                        MyBuffers[ ulLoop ].ulBufferLength );

        if ( !rc )
        {
            // nothing was read; exit with a non-zero status
            exit( 1 );
        }

        // tag the buffer so the event routine can identify it
        MyBuffers[ ulLoop ].ulUserParm = ulLoop;
    }

''Sample Code 2. Using MCI_BUFFER to allocate memory''
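The numbers in the samples above translate directly into playback time, which is useful when choosing a buffer size and count. The following is a minimal sketch of that arithmetic in portable C; `PcmFormat`, `pcm_bytes_per_sec`, and `pcm_bytes_to_usec` are hypothetical helpers written for this illustration and are not part of the OS/2 multimedia toolkit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a PCM mode; the fields mirror the
 * ulSamplesPerSec, ulBitsPerSample, and ulChannels values that the
 * article places in MCI_MIXSETUP_PARMS. */
typedef struct {
    uint32_t samples_per_sec;
    uint32_t bits_per_sample;
    uint32_t channels;
} PcmFormat;

/* Bytes of audio the device consumes (or produces) per second. */
uint32_t pcm_bytes_per_sec(const PcmFormat *fmt)
{
    assert(fmt->bits_per_sample % 8 == 0);
    return fmt->samples_per_sec * fmt->channels * (fmt->bits_per_sample / 8);
}

/* Playing time of `bytes` bytes of audio, in microseconds (rounded down). */
uint64_t pcm_bytes_to_usec(const PcmFormat *fmt, uint64_t bytes)
{
    uint32_t rate = pcm_bytes_per_sec(fmt);
    assert(rate > 0);
    return bytes * 1000000ULL / rate;
}
```

For 16-bit, 22.05 kHz stereo, the device moves 88,200 bytes per second, so one 4096-byte buffer holds roughly 46 milliseconds of audio and forty of them hold a little under 1.9 seconds.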

Once the device has been set up and memory has been allocated, the caller can use the mixRead or mixWrite function pointers obtained via MCI_MIXSETUP to communicate with the Amp-Mixer driver. One important performance enhancement this interface provides is the ability to send multiple buffers to the device with a single call. This allows applications to pool buffers at the driver and minimize the likelihood of data underruns or overruns should the application thread not receive enough time slices.

    // we can write multiple buffers at once,
    // just tell the mixer how many buffers
    // we are sending.

    MixSetupParms.pmixWrite( MixSetupParms.ulMixHandle,
                             MyBuffers,
                             40 );

''Sample Code 3. Sending multiple digital audio buffers with one call''
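How many buffers to pool at the driver depends on how long the application thread might go without running. A rough estimate, sketched here in portable C, divides the data the device consumes during the longest expected stall by the buffer size. `buffers_for_stall` is a hypothetical helper for this illustration, not a DART or MCI call, and the stall length is something you would have to measure or guess for your own program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of buffers of buffer_bytes each that must
 * already be queued at the driver to keep it supplied while the
 * application thread is stalled for stall_msec milliseconds. */
uint32_t buffers_for_stall(uint32_t bytes_per_sec,
                           uint32_t buffer_bytes,
                           uint32_t stall_msec)
{
    assert(buffer_bytes > 0);

    /* Bytes the device consumes during the stall, rounded up. */
    uint64_t bytes = ((uint64_t)bytes_per_sec * stall_msec + 999) / 1000;

    /* Round up to a whole number of buffers. */
    return (uint32_t)((bytes + buffer_bytes - 1) / buffer_bytes);
}
```

For example, 16-bit, 22.05 kHz stereo audio consumes 88,200 bytes per second, so riding out a 200-millisecond stall with 4096-byte buffers takes at least five queued buffers.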

As the device produces or consumes data, the mixer device calls the application through the function pointer given to it during the MCI_MIXSETUP call (pmixEvent). This callback reports the filling or emptying of buffers as well as error conditions. Each buffer that is returned to the application has a time stamp (in milliseconds) attached so the program can determine the current time of the device.

    // The mixer device will report buffer completion
    // to the entry point specified under MIX_SETUP
    // in this case, we specified MyEvent
    //
    // Note: this event comes on a HIGH priority thread
    // (i.e. this is NOT the application thread).
    // it is not a good idea to do a lot of work here
    // or wait on semaphores etc., since the system
    // will get bogged down

    LONG APIENTRY MyEvent ( ULONG            ulStatus,
                            PMCI_MIX_BUFFER  pBuffer,
                            ULONG            ulFlags )
    {
        ULONG ulSearchLoop = 0;
        static LONG lDebug = 0;

        // on input, we will receive three parameters:
        // ulStatus. Detailed error message:
        //  can contain one of the following values:
        //    ERROR_DEVICE_UNDERRUN or ERROR_DEVICE_OVERRUN
        // pBuffer. Buffer that was returned. NOTE: this can be NULL if
        // only an error gets returned.
        // ulFlags: can contain one or more of the following values:
        //  MIX_STREAM_ERROR
        //  MIX_READ_COMPLETE
        //  MIX_WRITE_COMPLETE

        switch( ulFlags )
        {
        case MIX_STREAM_ERROR | MIX_READ_COMPLETE :   // error occurred in device
        case MIX_STREAM_ERROR | MIX_WRITE_COMPLETE:   // error occurred in device
            {
                if ( ulStatus == ERROR_DEVICE_UNDERRUN )
                {
                    // handle ERROR_DEVICE_UNDERRUN or OVERRUN here
                }
                // note: no break--example code below should fall through
                // and always be executed.
            }
        case MIX_READ_COMPLETE :            // for recording
        case MIX_WRITE_COMPLETE:            // for playback
            ulBufferCount++;
            if ( ulBufferCount >= (ulNumBuffers * ulTotalLoops) )
            {
                // all buffers processed: wake the application thread
                DosPostEventSem (hEventSem);
            }
            else
            {
                // hand the buffer straight back to the device
                MixSetupParms.pmixWrite( MixSetupParms.ulMixHandle,
                                         pBuffer,
                                         1 );
            }

            // debug output only; a real event routine should avoid
            // slow calls such as printf on this high priority thread
            printf( "Buffer # : %d time : %d\n",
                    pBuffer->ulUserParm,
                    pBuffer->ulTime );

            if ( ( lDebug + 1) != pBuffer->ulUserParm)
            {
                lDebug = pBuffer->ulUserParm;
            }
            lDebug = pBuffer->ulUserParm;

        } // switch ulFlags

        // the routine is declared LONG, so it must return a value;
        // TRUE is assumed here
        return( TRUE );

    } // MyEvent

''Sample Code 4. How to process events with DART''
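The re-queue pattern in the event routine can be tried out away from OS/2. The sketch below is a portable C simulation under stated assumptions: `MockDevice`, `mock_write`, `mock_play_one`, `my_event`, and `run_playback` are all hypothetical stand-ins invented for this illustration. They mimic only the flow of control (write a pool of buffers, receive one completion per buffer, write the buffer back until a target count is reached) and say nothing about how the real Amp-Mixer is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4   /* the article's sample uses 40 buffers */

typedef struct { unsigned long user_parm; } MockBuffer;
typedef void (*MockEventFn)(MockBuffer *buf);

/* Hypothetical device: a FIFO of queued buffers plus an event callback. */
typedef struct {
    MockBuffer *queue[POOL_SIZE];
    size_t      head;    /* index of the oldest queued buffer */
    size_t      count;   /* number of buffers currently queued */
    MockEventFn on_complete;
} MockDevice;

MockDevice    g_dev;
unsigned long g_completed;   /* plays the role of ulBufferCount */
unsigned long g_target;      /* ulNumBuffers * ulTotalLoops */
int           g_done;        /* stands in for posting the event semaphore */

/* Queue n buffers with one call, as pmixWrite does in the article. */
void mock_write(MockDevice *dev, MockBuffer *bufs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(dev->count < POOL_SIZE);
        dev->queue[(dev->head + dev->count) % POOL_SIZE] = &bufs[i];
        dev->count++;
    }
}

/* Consume the oldest buffer and report it; returns 0 if nothing is queued. */
int mock_play_one(MockDevice *dev)
{
    if (dev->count == 0)
        return 0;
    MockBuffer *buf = dev->queue[dev->head];
    dev->head = (dev->head + 1) % POOL_SIZE;
    dev->count--;
    dev->on_complete(buf);
    return 1;
}

/* Event routine: count the completion, then signal or re-queue. */
void my_event(MockBuffer *buf)
{
    g_completed++;
    if (g_completed >= g_target)
        g_done = 1;
    else
        mock_write(&g_dev, buf, 1);
}

/* Play the whole pool total_loops times; returns completions seen. */
unsigned long run_playback(unsigned long total_loops)
{
    static MockBuffer pool[POOL_SIZE];

    assert(total_loops > 0);
    g_dev = (MockDevice){0};
    g_dev.on_complete = my_event;
    g_completed = 0;
    g_done = 0;
    g_target = POOL_SIZE * total_loops;

    for (size_t i = 0; i < POOL_SIZE; i++)
        pool[i].user_parm = i;

    mock_write(&g_dev, pool, POOL_SIZE);      /* prime the device */
    while (!g_done && mock_play_one(&g_dev))  /* the "device" runs */
        ;
    return g_completed;
}
```

Calling `run_playback(3)` with four buffers produces twelve completions. Because every completion except the last writes its buffer back, the mock device always has buffers waiting, and three are still queued when the target is reached.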

MCI_STOP, MCI_PAUSE, and MCI_RESUME are used to stop, pause, or resume the audio device, respectively. MCI_STOP and MCI_PAUSE can only be sent to the Amp-Mixer device after mixRead or mixWrite has been called. Likewise, MCI_RESUME will only work after MCI_PAUSE has been sent. (Note: You should stop the device with MCI_STOP after your program has completed data transfers to the Amp-Mixer device. If you don't use MCI_STOP, the device might pause noticeably the next time it is started.)

If your program needs more precise timing information than the time stamp returned with each buffer provides, you can use MCI_STATUS with the MCI_STATUS_POSITION flag to retrieve the current time of the device in MMTIME units.
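MMTIME units are 1/3000 of a second, so a position reported in MMTIME usually has to be converted before it can be compared with the millisecond time stamps on the buffers. The helpers below are a minimal sketch of those conversions in portable C; their names are hypothetical and chosen for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define MMTIME_UNITS_PER_SEC 3000u   /* one MMTIME unit is 1/3000 second */

/* Convert MMTIME units to milliseconds, rounded to the nearest unit. */
uint32_t mmtime_to_msec(uint32_t mmtime)
{
    return (uint32_t)(((uint64_t)mmtime + 1) / 3);
}

/* Convert milliseconds to MMTIME units (three units per millisecond). */
uint32_t msec_to_mmtime(uint32_t msec)
{
    assert(msec <= UINT32_MAX / 3);   /* result must fit in 32 bits */
    return msec * 3;
}

/* Byte offset into a PCM stream that corresponds to an MMTIME position. */
uint64_t mmtime_to_pcm_bytes(uint32_t mmtime, uint32_t bytes_per_sec)
{
    return (uint64_t)mmtime * bytes_per_sec / MMTIME_UNITS_PER_SEC;
}
```

For instance, a position of 1500 MMTIME units is half a second, which for 16-bit, 22.05 kHz stereo audio (88,200 bytes per second) corresponds to byte offset 44,100.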

In addition to digital audio playback and record, you can use DART for MIDI playback and record. All of the principles described above work with MIDI; the only thing that must change is the MCI_MIXSETUP call. Sample Code 5 illustrates how to prepare the device for MIDI information. The required MIDI data format is described in the OS/2 Multimedia Programming Reference (which is part of the Developer's Toolkit for OS/2 Warp).

    // mix_setup informs the mixer device of the entry point
    // to report buffers being read or written.
    // we will also need to tell the mixer which media type
    // we will be streaming. In this case, we'll use
    // MCI_DEVTYPE_SEQUENCER (MIDI) rather than
    // MCI_DEVTYPE_WAVEFORM_AUDIO.

    memset( &MixSetupParms, '\0', sizeof( MCI_MIXSETUP_PARMS ) );

    MixSetupParms.ulBitsPerSample = 0;
    MixSetupParms.ulFormatTag     = 0;
    MixSetupParms.ulSamplesPerSec = 0;
    MixSetupParms.ulChannels      = 0;
    MixSetupParms.ulFormatMode    = MCI_PLAY;
    MixSetupParms.ulDeviceType    = MCI_DEVTYPE_SEQUENCER;

    // the mixer will inform us of entry points to
    // read/write buffers to and also give us a
    // handle to use with these entry points.

    //  MixSetupParms.ulMixHandle =  filled in
    //  MixSetupParms.pmixWrite =;   filled in
    //  MixSetupParms.pmixRead =;    filled in

    MixSetupParms.pmixEvent = MyEvent;

    rc = mciSendCommand( usDeviceID,
                         MCI_MIXSETUP,
                         MCI_WAIT | MCI_MIXSETUP_INIT,
                         ( PVOID ) &MixSetupParms,
                         0 );

    if ( rc )
    {
        printf("Mix setup failed: rc %d", ULONG_LOWD(rc) );
        exit(1);
    }

''Sample Code 5. Using DART for MIDI''

OS/2 Multimedia Compatibility
Because DART uses the existing MCI interface to open the Amp-Mixer device and process MCI messages, DART applications can share audio devices with any other OS/2 multimedia application simply by processing the MM_MCIPASSDEVICE message. (For more information on the MM_MCIPASSDEVICE message, see the article titled "Maximizing the Audio Support in OS/2 2.1" in Volume 2 of The Developer Connection News.)

Performance Considerations
One critical implementation detail that playlists handle for you, and that DART does not, is the prevention of audio breakup under heavy load. With playlists, both the memory stream handler and audio stream handler threads run as TIME_CRITICAL threads. As a result, even when the system is heavily burdened, these threads are always scheduled, ensuring that the audio device is never starved for data (that is, it never underruns). By contrast, DART relies on the application to control the priority of the thread that feeds the Amp-Mixer. To ensure that audio does not break up when using DART, the application might have to raise or otherwise tailor that thread's priority.

A second factor that can affect performance is the priority of the thread that calls the application event routine (this thread's priority is fixed at TIME_CRITICAL + 3). Once the event routine is called, the application should not perform any lengthy operation (such as going into a loop), because overall system performance will suffer.
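One common way to keep an event routine short is to have it record only which buffer completed and leave the real work to an ordinary application thread. The following is a minimal sketch of that idea using a single-producer, single-consumer ring buffer in portable C11. It is not taken from the DART sample: `CompletionRing`, `ring_push`, and `ring_pop` are hypothetical names, and `<stdatomic.h>` is a modern C facility assumed here for illustration rather than something the OS/2 compilers of the time provided.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 64u   /* must be a power of two */

/* Hypothetical queue of completed buffer numbers. Exactly one thread
 * (the event routine) pushes and exactly one thread (the application)
 * pops, so no lock or semaphore wait is needed in the callback. */
typedef struct {
    uint32_t    slots[RING_SLOTS];
    atomic_uint head;   /* total pushes so far (written by producer) */
    atomic_uint tail;   /* total pops so far (written by consumer) */
} CompletionRing;

void ring_init(CompletionRing *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Called from the event routine; never blocks. Returns false if full. */
bool ring_push(CompletionRing *r, uint32_t buffer_number)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;                       /* consumer has fallen behind */

    r->slots[head % RING_SLOTS] = buffer_number;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Called from the application thread. Returns false if empty. */
bool ring_pop(CompletionRing *r, uint32_t *buffer_number)
{
    assert(buffer_number != NULL);

    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;                       /* nothing completed yet */

    *buffer_number = r->slots[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

In this arrangement the event routine would call `ring_push` with the buffer's user parameter and return at once, while the application thread drains the ring with `ring_pop` and does the slower work (refilling buffers from disk, updating the display) at its own priority.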

Note: To try DART, install the beta DART sample that is on disc 3 of your accompanying Developer Connection for OS/2 CD-ROMs in TOOLKIT\BETA\SAMPLES\ENTOOLKT\AUDIO\DAUDIO\*. Two system DLLs, AMPMXMCD.DLL and AUDIOSH.DLL, must be updated before DART applications can work. If these DLLs are not installed, you will receive MCIERR_UNSUPPORTED_FUNCTION when you try to use the new messages.

Summary
Because it has a shorter code path and an improved notification mechanism, DART can noticeably improve the audio responsiveness of your application. If you are willing to manage low-level details such as buffers and thread priorities rather than relying on the system to handle them, the results should be well worth it.