UsingThreads:SynchronizationTimings

Latest revision as of 02:50, 21 February 2020

All the thread-related material I've come across on the 'net mentions that the OS/2 API DosEnterCritSec() is slow and should not be used. Here I present my own timings of it, compared to DosRequestMutexSem().

The Timing Method

First, let me introduce the method I used for timing the system calls: DosTmrQueryTime(). This system call returns a snapshot of the high-resolution timer. This is a 64-bit value, taken directly from the high-resolution timer device (or so I believe). It is returned in a QWORD structure. Given a QWORD value named time, I use the following to get it into OpenWatcom's 64-bit long long:

unsigned long long int timer_snapshot =
  ( static_cast< unsigned long long int >( time.ulHi ) << 32 ) | time.ulLo;
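
The same shift-and-or can be tried without an OS/2 toolkit at hand. What follows is only a minimal sketch: QwordLike and to_ticks are hypothetical names, and QwordLike merely mimics the ulLo and ulHi fields of QWORD so that the example compiles without os2.h.

```cpp
#include <cstdint>

// Hypothetical stand-in for the toolkit's QWORD (assumption: two unsigned
// 32-bit halves, low half first, named as in the snippet above).
struct QwordLike {
    std::uint32_t ulLo;  // low 32 bits of the timer value
    std::uint32_t ulHi;  // high 32 bits of the timer value
};

// Combine the two halves into one 64-bit tick count. The high half is
// widened to 64 bits BEFORE the shift; shifting a 32-bit value by 32
// would not give the intended result.
constexpr std::uint64_t to_ticks(const QwordLike& q) {
    return (static_cast<std::uint64_t>(q.ulHi) << 32) | q.ulLo;
}

// Compile-time check: the high half lands in bits 32..63.
static_assert(to_ticks(QwordLike{0x2u, 0x1u}) == 0x100000002ULL,
              "high half must occupy the upper 32 bits");
```

Wrapping the conversion in one small function also avoids repeating the cast at every place a snapshot is read.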

To time a function, a snapshot is taken just before, and just after the function call, like so:

QWORD start, end;

DosTmrQueryTime( &start );
DosEnterCritSec();
DosTmrQueryTime( &end );

Now, the idea is to measure the time the system takes to perform DosEnterCritSec() relative to DosRequestMutexSem(), not the real time elapsed. To do this, I took 300 samples of each system call and compared the smallest value seen for each. Taking the minimum filters out samples that were inflated by interrupts or a task switch landing between the two snapshots. Since I'm measuring relative speed, the actual frequency of the high-resolution timer is irrelevant.
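
The "smallest of N samples" idea can be sketched on its own, independent of OS/2. This is an illustration under stated assumptions rather than the code used for the measurements: min_elapsed_ticks is a hypothetical helper, and std::chrono::steady_clock stands in for DosTmrQueryTime() so that the sketch builds with any C++11 compiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Run `call` `samples` times, bracketing each run with two clock snapshots,
// and return the smallest elapsed tick count that was observed.
// Assumption: steady_clock replaces the OS/2 high-resolution timer here.
template <typename Call>
std::uint64_t min_elapsed_ticks(int samples, Call call) {
    assert(samples > 0);
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int i = 0; i < samples; ++i) {
        const auto start = std::chrono::steady_clock::now();
        call();  // the operation being timed
        const auto end = std::chrono::steady_clock::now();

        // steady_clock never runs backwards, so the difference is non-negative.
        const auto elapsed = static_cast<std::uint64_t>((end - start).count());
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

It would be called as, for example, min_elapsed_ticks(300, some_function), once per operation to be compared. The returned value is in the clock's own ticks, which is enough when only the ratio between two operations matters.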

The Timings

Here is the code I used to make the measurements:

// file: perf.c++
#include <iostream>
#include <limits>

#define INCL_DOSPROFILE
#define INCL_DOSPROCESS
#define INCL_DOSSEMAPHORES
#include <os2.h>

int main( int argc, char *argv[] )
{
  unsigned long long int critical_min = std::numeric_limits< unsigned long long int >::max();
  unsigned long long int mutex_min = std::numeric_limits< unsigned long long int>::max();

  for ( int i = 0; i < 300; i++ )
  {

    // Critical section timer
    QWORD start, end;
    DosTmrQueryTime( &start );
    DosEnterCritSec();
    DosTmrQueryTime( &end );
    DosExitCritSec();

    unsigned long long int long_start =
      ( static_cast< unsigned long long int >( start.ulHi ) << 32 ) | start.ulLo;
    unsigned long long int long_end =
      ( static_cast< unsigned long long int >( end.ulHi ) << 32 ) | end.ulLo;

    unsigned long long int current_crit = long_end - long_start;

    if ( critical_min > current_crit )
      critical_min = current_crit;

    // Mutex timer

    HMTX mutex;
    DosCreateMutexSem( NULL, &mutex, 0, FALSE );

    DosTmrQueryTime( &start );
    DosRequestMutexSem( mutex, SEM_INDEFINITE_WAIT );
    DosTmrQueryTime( &end );
    DosReleaseMutexSem( mutex );
    DosCloseMutexSem( mutex );

    long_start = ( static_cast< unsigned long long int >( start.ulHi ) << 32 ) 
                 | start.ulLo;
    long_end = ( static_cast< unsigned long long int >( end.ulHi ) << 32 )
               | end.ulLo;

    unsigned long long int current_mutex = long_end - long_start;
    if ( mutex_min > current_mutex )
      mutex_min = current_mutex;

  }

  std::cout << "DosEnterCritSec(): " << critical_min << std::endl
            << "DosRequestMutexSem(): " << mutex_min << std::endl;

  return 0;
}

This can be compiled with OpenWatcom 1.4 or later, like so:

>wcl386 -cc++ -bm "perf.c++"

Or with GCC like so:

>g++ -Zmt "perf.c++" -o perf.exe

On my eComStation 1.2MR system, this gives the following output (in timer ticks) most of the time:

DosEnterCritSec(): 13
DosRequestMutexSem(): 12

I have also seen the following numbers:

DosEnterCritSec(): 14
DosRequestMutexSem(): 12

And:

DosEnterCritSec(): 12
DosRequestMutexSem(): 12

So my sample size of 300 is probably too small, or there may always be some fluctuation in the timings. Also, it does not seem to matter how many threads there are in the process; the timings are always the same. It is left as an exercise for the reader to test with more than one thread.
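
As a starting point for that exercise, here is a rough portable sketch of the experiment's shape: keep some extra threads alive, then sample an uncontended lock from the main thread. It is not OS/2 code. std::thread, std::mutex and std::chrono::steady_clock are stand-ins I have swapped in for OS/2 threads, DosRequestMutexSem() and DosTmrQueryTime(), and min_lock_ticks is a hypothetical name. An OS/2 version would have to replace all three.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <mutex>
#include <thread>
#include <vector>

// Smallest observed cost of an uncontended lock() while `extra_threads`
// mostly idle threads exist in the process.
// Assumption: portable stand-ins only, not the OS/2 calls timed above.
std::uint64_t min_lock_ticks(int extra_threads, int samples) {
    assert(extra_threads >= 0 && samples > 0);

    std::atomic<bool> stop(false);
    std::vector<std::thread> idlers;
    for (int i = 0; i < extra_threads; ++i) {
        idlers.emplace_back([&stop] {
            // Stay alive, mostly asleep, until the measurement is done.
            while (!stop.load())
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
        });
    }

    std::mutex m;
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int i = 0; i < samples; ++i) {
        const auto start = std::chrono::steady_clock::now();
        m.lock();  // only the acquisition is timed, as in perf.c++
        const auto end = std::chrono::steady_clock::now();
        m.unlock();

        const auto elapsed = static_cast<std::uint64_t>((end - start).count());
        if (elapsed < best)
            best = elapsed;
    }

    stop.store(true);  // let the idle threads finish
    for (auto& t : idlers)
        t.join();
    return best;
}
```

Comparing min_lock_ticks(0, 300) with, say, min_lock_ticks(4, 300) gives the same kind of with-and-without-threads comparison described above. Note that the extra threads never touch the mutex, so this only tests whether their mere existence changes the cost.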

Conclusion

Almost all literature that goes into the OS/2 threading API states that using DosEnterCritSec() will kill performance, indicating that the system call itself is expensive. The timings above show that it is not so: the call costs about the same as DosRequestMutexSem(), at least with later OS/2 kernels.

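However, it is true that using it is probably overkill, since it stops every other thread in the process from running rather than only those competing for one resource. So think very carefully before using it in your project.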
