Porting from Unix platforms

From EDM2

Introduction

This article covers the basic steps for porting programs written for Unix platforms to OS/2.

Only command-line programs are covered, since porting the X Window interface is a separate, second step. If you are working on a GUI program, consider using the EverBlue or the wxWidgets (formerly wxWindows) project. The steps described here probably still apply to your project.

System requirements

The most common environment for porting Unix programs is the EMX/GCC compiler and tools. It is based on GCC 2.8.1 and provides most of the required tools.

A newer environment is the InnoTek GCC compiler, based on GCC 3.2.2 for the libc 0.5 level and on GCC 3.3.5 for the libc 0.6 release. This environment is considered the preferable starting point.

To create an EMX basic environment you need to download the following files:

For an advanced environment, you also need:

  • GNU automake ([automake-1_6_2.zip])
  • GNU autoconf ([autoconf213.zip])
  • GNU m4 command processor ([gnum4.zip])

To use InnoTek GCC (recommended), also get:

Get the above packages and install them following the included instructions. I will add a scheme here sometime. Since you can use the GNU utilities with different environments, I suggest putting the EMX core files into a separate directory tree, so that you can use the utilities and other tools with InnoTek GCC without too much trouble.

Install compilers and tools on the same drive as your projects: this will make your life easier, since Unix doesn't know about drive letters and most makefiles and scripts specify root as the starting point (e.g. /bin/sh). Having different drive letters could therefore require some effort in editing scripts or makefiles. One solution for different drives may be the Toronto Virtual File System (TVFS) driver, which is IBM Employee Written Software (EWS) and can be found here:

  • [tvfs211.zip] - TVFS 2.11

Getting the source code

The first step is to get the program source code. I can't tell you where to find it, but SourceForge is a good place to start :-)

Usually source code comes as a 'tarball': that means all files are packed into a single file with tar (extension .tar) and most of the time also compressed (extension .tar.gz or .tgz).

Untar the compressed tar archive with 'tar xzf place_here_my_tar.gz' and you will get all the sources extracted into a subdirectory (usually a tar file doesn't write files to the current directory).
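
Before extracting, it can be worth checking that the archive really unpacks into its own subdirectory. The following is only a sketch: tar_top_level is a hypothetical helper name, and it assumes a tar that understands the z flag (as the command above already does).

```shell
#!/bin/sh
# List the distinct top-level entries of a gzip-compressed tarball.
# A single entry means the archive unpacks into its own subdirectory.
tar_top_level() {
    tar tzf "$1" | sed -e 's|^\./||' -e 's|/.*||' -e '/^$/d' | sort -u
}

# Usage (the archive name is a placeholder):
#   tar_top_level place_here_my_tar.gz
```

If more than one name is printed, create a fresh directory and extract the archive there.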

If someone worked on the same program in the past, you can review the old work and use the old patches with your version. Keep in mind that old patches cannot always be applied to current source code, so a manual review is better in such cases.

Configuring the scripts

Since Unix is not a single platform but a wide range of systems (such as Linux, the various BSDs, Solaris, AIX), most Unix programs come with a so-called 'configure' script. This is a shell script that must be run to detect the current system configuration, so that the correct compiler switches, library paths and tools are used.

When configure is missing, you should modify the makefile to add the OS/2-specific changes.

The configure script is usually very long, and it takes a long time to execute. When the script has been created by a recent autoconf release (2.57 or later), it will work under OS/2 with minor modifications or none at all. Run it with

sh -c ./configure

Since default switches are not always optimal for the OS/2 environment, I use a customized startup script named configure.os2:

#! /bin/sh
CFLAGS="-s -Zomf -O3 -march=pentium -mcpu=pentium3" \
CXXFLAGS="-s -Zomf -O3 -march=pentium -mcpu=pentium3" \
LDFLAGS="-s -Zmap -Zhigh-mem -Zomf -Zexe" \
LN_CP_F="cp.exe" \
RANLIB="echo" \
AR="emxomfar" \
./configure --with-libdb-prefix=e:/usr/local/berkeleydb.4.2 --prefix=/usr/local/bogofilter

I place this script in the same directory as configure, so I run it with

sh -c ./configure.os2

My personal script sets up my preferred makefile parameters: which libraries to link (like the socket library), where to place the installed binaries (--prefix) and where to find optional libraries (--with-libdb-prefix in the example above).

A good way to run configure scripts is to use a config.site file: every autoconf-generated configure script loads it automatically when CONFIG_SITE points to it, and it adjusts some variables so that the script is parsed properly.

# This file is part of UnixOS/2. It is used by every autoconf generated
# configure script if you set CONFIG_SITE=%UNIXROOT%/etc/unixos2/config.site.
# You can add your own cache variables at the end of this file.
#
echo "Loading UnixOS/2 config.site"
#
LIBS="-lsocket" 
#
# Executables end on ".exe". You may add other executable extensions.
# Supported since autoconf 2.53 (?).
test -n "$ac_executable_extensions" || ac_executable_extensions=".exe"
#
# Replace all '\' by '/' in your PATH environment variable
ux2_save_IFS="$IFS"
IFS="\\"
ux2_temp_PATH=
for ux2_temp_dir in $PATH; do
  IFS="$ux2_save_IFS"
  if test -z "$ux2_temp_PATH"; then
    ux2_temp_PATH="$ux2_temp_dir"
  else
    ux2_temp_PATH="$ux2_temp_PATH/$ux2_temp_dir"
  fi
done
export PATH="$ux2_temp_PATH"
unset ux2_temp_PATH
unset ux2_temp_dir
unset ux2_save_IFS

You can place this file into \emx\share and add SET CONFIG_SITE=x:\emx\share\config.site to your environment setup command file.
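
The loop in config.site rewrites the backslashes in PATH. For a single string the same idea can be sketched with tr; to_slashes is a hypothetical helper name and is not part of UnixOS/2.

```shell
#!/bin/sh
# Convert every backslash in a path-like string to a forward slash.
# printf '%s' is used so that the backslashes in the argument are
# passed through untouched instead of being read as escape sequences.
to_slashes() {
    printf '%s\n' "$1" | tr '\\' '/'
}

# Usage:
#   to_slashes 'c:\emx\bin'
```

This can be handy when a makefile variable or a path passed to configure still contains OS/2-style separators.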

At least two things can stop your script from running: the required files are too old, or you are missing some tools or libraries.

Required files are too old

Your scripts are too old to execute properly under OS/2, or they don't recognize OS/2 as a possible target. You need to replace config.sub, config.guess, mkinstalldirs with more recent versions (I get them from my other projects, so I can't suggest where to get them).

The configure script has been generated with an older autoconf/automake release: you must rebuild the script using autoconf and/or automake. In the same directory as configure there is a file named configure.in: this is the input script for autoconf, which reads the .in file and writes a new configure file. The new file is supposed to execute better, but this doesn't mean it will complete all the required steps. Run

sh -c autoconf

in your current directory to rebuild.

Once autoconf completes its task, compare the new script with the old one: most of the time the new script is slightly longer (because of new stuff), so if your new script is much shorter, you missed something during the build. In my experience, this is related to missing .m4 files or to a different run command (e.g. Berkeley DB requires running the s_conf script to properly rebuild configure).
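
The length comparison can be sketched as a small shell function. This is only an illustration: compare_configure is a hypothetical name, and it assumes you saved a copy of the original script before rebuilding.

```shell
#!/bin/sh
# Compare the line counts of the old and the regenerated script.
# Arguments: old script, new script.
# Returns non-zero when the new script is shorter than the old one.
compare_configure() {
    # The arithmetic expansion strips any padding that wc may print.
    old_lines=$(( $(wc -l < "$1") ))
    new_lines=$(( $(wc -l < "$2") ))
    echo "old: $old_lines lines, new: $new_lines lines"
    if [ "$new_lines" -lt "$old_lines" ]; then
        echo "warning: new script is shorter, check for missing .m4 files"
        return 1
    fi
    return 0
}

# Usage:
#   cp configure configure.orig
#   sh -c autoconf
#   compare_configure configure.orig configure
```

A shorter script is only a hint that something went wrong, not proof, so look at the autoconf messages as well.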

You are missing something

Here the range of problems is quite wide: from 'unable to find a compiler' to 'you don't have xxx installed'.

When configure can't find your compiler, the script stops almost at the first steps; you probably need to update the files (see above). This can also be related to conflicting switches in configure.os2 (if used), like -Zomf.

When you don't have something installed, the solution can be different: if it turns out that you need another package to compile, like Berkeley DB, search for it, download it and start porting it. Once you have the required files installed, go back to your main project. Most of the time the required libraries already exist for OS/2, so it is only a matter of downloading them, placing them somewhere and telling configure the correct path for them (like --with-libdb-prefix above).

The best way to discover problems is to check the config.log file: every check done by configure is logged in this file; look near the end of the file, or search the text using the failing test as the key. Maybe the check failed because of a wrong library order: libraries must be specified in the correct order, since every function missing from one library is searched for in the libraries that follow it; e.g. the syslog library must be placed before the socket library (-lsyslog -lsocket).
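
Searching config.log by hand gets tedious, so here is a small sketch that prints each matching line together with the five lines that follow it (usually the compiler command and its error messages). find_check is a hypothetical helper name; the key is matched as plain text, not as a pattern.

```shell
#!/bin/sh
# Print every log line containing the key, plus the 5 lines after it,
# each prefixed with its line number.
# Arguments: search key, log file (defaults to config.log).
find_check() {
    awk -v key="$1" '
        index($0, key) { remaining = 6 }
        remaining > 0  { print NR ": " $0; remaining-- }
    ' "${2:-config.log}"
}

# Usage (the key is whatever test configure reported as failing):
#   find_check "checking for syslog"
```

The printed line numbers make it easy to jump to the same place in an editor.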

Under EMX, it may happen that some compiler flags are incompatible with configure: I found -Zomf to conflict in some cases.

If you still get problems, consider rebuilding the configure script.

As I wrote, the script takes a long time to execute: for every needed tool, header, function or library it runs the compiler to verify its presence. This is a compile/link/execute sequence for every test.

Once completed, the configure script will write a config.status file; this is a shell script used to update all project files: makefiles at least, but also a config.h can be written (this header is supposed to contain constants for existing and non-existing functions in your environment), or dependent files can be created.

Compiling the code

When the above steps have completed successfully, you are ready to run make. There are many make programs available for OS/2, like (GNU) make, smake, dmake and imake: GNU make usually works with most projects, but you can find projects specifically designed for a particular make program (e.g. Star backup requires smake to build).

I'm using GNU make 3.81rc2 for my development, but 3.76 also worked well in my environment. I mention this because a faulty make can break your build system: you can get problems like broken rules, non-working makefiles or missing builds. Since make uses the shell to perform many tasks, the shell is also important. Place a copy of sh.exe into your /bin directory: most shell scripts require /bin/sh, so with one in place you will not need to edit every single shell script.
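
Before starting a long build, a quick check that the tools can be found on the PATH saves some time. This is a sketch under the assumption that your shell provides the standard command -v builtin; check_tools is a hypothetical name.

```shell
#!/bin/sh
# Report which of the named tools can be found on PATH.
# Returns non-zero when at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_tools sh make gcc
```

Running make --version afterwards tells you which make release is actually being picked up.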

Finally you are ready to type

make

to compile your source code. Please remember to read INSTALL (or similar files) to discover possible additional steps.

Here, too, you have a wide choice of problems: missing libraries, wrong paths, broken shell commands, wrong compiler flags, code using functions not tested in configure.

Inspect the code and correct your files. If you are about to change a makefile, remember that Makefile.am and/or Makefile.in are used by automake/configure to write it: the next time you run configure, you will lose all your changes, so apply the changes to these files as well.

If your code uses some functions you want to skip, you can check config.h and undefine some constants, or guard the code with a preprocessor conditional on __EMX__ (#ifndef __EMX__ ... #endif to skip it on OS/2, #ifdef __EMX__ ... #endif for OS/2-only replacements).

If you need to change flags or libraries, instead of editing all makefiles, you can edit config.status in your root directory and run it again as

sh -c ./config.status

This will rebuild all makefiles, config.h, dependent files and possibly other files. Note: this will erase your changes to makefiles and other generated files.

Running tests

Complex projects usually have a set of scripts to verify the proper program execution: this is a good way to discover problems in the ported code. In most projects (check the project documentation) make check will run the tests. Here the shell integration is fundamental: without a good shell, you are likely to fail test execution.

Tests usually compare the program output with a pre-recorded output: since Unix uses LF as the line terminator instead of the OS/2 CR LF sequence, the check can fail simply because the file lengths differ. So convert your input/output data sets to CR LF (e.g. zip -m -LL xx * & unzip xx & del xx).
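
As an alternative illustration, the conversion and a line-ending-tolerant comparison can be sketched with standard tools. to_crlf and same_text are hypothetical helper names; the sketch assumes awk, tr and cmp are available in your shell environment.

```shell
#!/bin/sh
# Filter: rewrite every line of standard input with a CR LF ending.
# An existing CR is removed first, so already converted text is unchanged.
to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Compare two text files while ignoring carriage returns, so the
# CR LF and LF versions of the same text count as equal.
# Returns 0 when the texts match.
same_text() {
    st_base=${TMPDIR:-/tmp}/same_text.$$
    tr -d '\r' < "$1" > "$st_base.a"
    tr -d '\r' < "$2" > "$st_base.b"
    st_status=0
    cmp -s "$st_base.a" "$st_base.b" || st_status=$?
    rm -f "$st_base.a" "$st_base.b"
    return "$st_status"
}

# Usage (file names are placeholders):
#   to_crlf < expected.lf > expected.crlf
#   same_text expected.lf actual.out && echo "outputs match"
```

Which side you convert depends on the test suite: sometimes it is easier to patch the test script to strip carriage returns than to convert every data file.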

Installing the binaries

Binaries can be installed with

make install

Older makefiles can install the dummy executable (e.g. mysqld instead of mysqld.exe). Stripping debug information will fail when -Zomf has been used (strip recognizes only a.out format). Remember to use --prefix when running configure: otherwise files will be copied to their default location, and this is not good until you get a fully working port.

Submitting patch file

Once you have reached the end of your porting job, you should submit your changes to the main development team. If you worked on a GPL/LGPL licensed project, your changes must be made available to anyone who requests them. In any case, you are not required to send them to the main team, but you should think about the future of the port: maybe at some point you will no longer be involved in this project, or someone else will want to cooperate and join your effort. Also, including the OS/2-specific changes upstream will make porting the next project release a lot easier (maybe a simple configure/make/make install sequence).

A GNU diff in unified format is a good way to send patches. Usually I make a copy of the original source files and I place them at the same level of the development tree. E.g.

...\mysql\MySQL-4.1.7
...\mysql\MySQL-4.1.7.0 (this is a copy)

To get a difference file, enter the mysql directory and run

diff -urBbi MySQL-4.1.7.0 MySQL-4.1.7 > patch-4.1.7

Flags: -u unified diff format, -r recurse into directories, -b ignore changes in the amount of white space (carriage returns included), -B ignore blank lines, -i ignore case.
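
One detail worth knowing when scripting this step: diff exits with status 1 when the trees differ, which is exactly the expected case here. The sketch below wraps the command above so that only a real error (status above 1) is reported as a failure; make_patch is a hypothetical helper name.

```shell
#!/bin/sh
# Create a unified diff between a pristine copy and the working tree.
# Arguments: original tree, modified tree, output patch file.
# Returns 0 when the patch was written, whether or not the trees differ.
make_patch() {
    mp_status=0
    diff -urBbi "$1" "$2" > "$3" || mp_status=$?
    [ "$mp_status" -le 1 ]
}

# Usage, from the directory that contains both trees:
#   make_patch MySQL-4.1.7.0 MySQL-4.1.7 patch-4.1.7
# The patch can later be applied from inside a pristine tree with:
#   patch -p1 < ../patch-4.1.7
```

The -p1 option tells patch to drop the leading directory name from each file path, so the patch applies no matter what the receiving tree is called.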

Edit the patch-4.1.7 file, remove personal changes and send it to the team together with a description of your work: a simple list with file name, functions changed and why is usually enough to understand your work.

If the project has some platform specific readme files, create a readme.os2 (or similar) and send it with your patch file.

Remember also to update the project documentation to reflect the new status.

Conclusions

The task of porting a new project from scratch to the OS/2 platform can require a lot of time and work, but most of the time it is a lot easier than described. Having some old code to start from is also very useful and makes life happier.

But the end result will keep you happy, and probably many other OS/2 users around the world too!

My porting projects are mainly driven by personal or customer requirements.

As a starting point, I suggest a simple project: simple means few source files and few directories.

But don't be worried by bigger projects: porting to other platforms is often already supported by the core team, so most platform-dependent changes are confined to a few files.

If you need help, consider joining the netlabs IRC channel/mailing list or the Unix OS/2 mailing list.

Resources