Porting from Unix platforms
Revision as of 18:00, 28 November 2004
Introduction
This article will cover the basic steps for porting programs written for the Unix platform.
Only command-line programs are covered, since porting the X Window interface is a separate, second step. If you are working on a GUI program, consider using the EverBlue Project or the wxWidgets (formerly wxWindows) project. Keep in mind that the steps described here probably still apply to your project.
System requirements
The most common environment for porting Unix programs is the EMX/GCC compiler and tools. It is based on gcc 2.8.1 and provides most of the required tools.
A newer environment is the Innotek gcc compiler, based on gcc 3.2.2 at the libc 0.5 level and on gcc 3.3.5 for the libc 0.6 release (now in beta test). It is still in development, but I consider it the preferable starting point.
To create an EMX basic environment you need to download the following files:
- Files required for developing programs with emx (part 1)
- Files required for developing programs with emx (part 2)
- The GNU C compiler, the GNU debugger, and other tools (part 1)
- The GNU C compiler, the GNU debugger, and other tools (part 2)
- Additional files for GCC required for compiling C++ programs
- libg++ 2.8.1.1a
- emxgnu.inf (emxgnu.doc in OS/2 .inf format)
- gnudiff.zip GNU diff v2.7.1 file differencer
- gnufutil.zip GNU file utilities v3.13
- gnugrep.zip GNU grep/egrep/fgrep 2.0
- gnututil.zip GNU text utilities v1.19
- gnupatch.zip GNU patch v2.5
- GNU Make 3.81rc2
- A shell (e.g. pdksh)
- GNU awk 3.0.6
- GNU sed 3.0.2
- GNU tar
- gzip 1.2.4
For an advanced environment, you also need:
- GNU Automake
- GNU Autoconf
- GNU M4 command processor
To use Innotek gcc (recommended), also get:
- Innotek GCC 3.2.2/LibC 0.5.1 runtime
- GCC for OS/2 Beta 4
- GCC for OS/2 Beta 4 CSD1
- GCC 3.3.5/LibC 0.6 beta 1
Get the above packages and install them following the included instructions. I will add a scheme here sometime. Since the GNU utilities can be used with different environments, I suggest putting the EMX core files into a separate directory tree, so that you can use the utilities and other tools with Innotek gcc without too much trouble.
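After installing, a quick check that the tools are reachable through PATH can save time later. The following is only a minimal sketch: missing_tools is a made-up helper name, and the tool list should be adjusted to what you actually installed.

```shell
# Print the names of the given tools that cannot be found on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return $rc
}

# Example call with some of the tools from the lists above.
missing_tools gcc make sh awk sed tar gzip diff patch m4 \
  || echo "some tools are missing, install them first"
```

The function only looks at PATH; it does not verify versions or that the tools work together.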
Install compilers and tools on the same drive as your projects: this will make your life easier, since Unix doesn't know about drive letters and most makefiles and scripts use root as the starting point (e.g. /bin/sh). Having different drive letters could therefore require some effort in editing scripts or makefiles. One solution for different drives may be the Toronto Virtual File System (TVFS) driver, which is IBM Employee Written Software (EWS) and can be found here:
Getting the source code
The first step is to get the program source code. I can't tell you where to find it, but SourceForge is a good place to start :-)
Usually source code comes as a 'tarball': all files are packed into a single file with tar (extension .tar), and most of the time the file is also compressed (extension .tar.gz).
Untar the tarball with 'tar xzf place_here_my_tar.gz' and you will get all the sources extracted into a subdirectory (a tarball usually doesn't write files directly into the current directory).
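Before extracting, it can help to list the archive and confirm that everything really sits under a single top-level directory. The sketch below builds a tiny sample tarball first so that it can run on its own; the names demo-1.0 and demo-1.0.tar.gz are invented for illustration.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample tarball (hypothetical project "demo-1.0").
mkdir demo-1.0
echo 'int main(void) { return 0; }' > demo-1.0/main.c
tar czf demo-1.0.tar.gz demo-1.0
rm -r demo-1.0

# List the contents first: every entry should start with the same directory.
tar tzf demo-1.0.tar.gz

# Count the distinct top-level names; 1 means a single subdirectory.
top_levels=$(tar tzf demo-1.0.tar.gz | cut -d/ -f1 | sort -u | wc -l | tr -d ' ')
echo "top-level entries: $top_levels"

# Extract; the sources end up in ./demo-1.0
tar xzf demo-1.0.tar.gz
```

If the count is larger than 1, create a directory yourself and extract inside it.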
If someone worked on the same program in the past, you can review the old work and use the old patches with your version. Keep in mind that old patches cannot always be applied to current source code, so a manual review is better in such cases.
Configuring the scripts
Since Unix is not a single platform but a wide range of systems (Linux, the various *BSD, Solaris, AIX, ...), most Unix programs come with a so-called 'configure' script. This is a shell script that must be run to detect the current system configuration, so that the correct compiler switches, library paths and tools are used.
When configure is missing, you have to modify the makefile yourself to add the OS/2-specific changes.
The configure script is usually very long, and it takes a long time to execute. When the script has been created by a recent autoconf release (2.57 or later), it will work under OS/2 with minor modifications or none at all. Run it with
sh -c ./configure
Since the default switches are not always optimal for the OS/2 environment, I use a customized startup script named configure.os2:
#! /bin/sh
CFLAGS="-s -Zomf -O3 -march=pentium -mcpu=pentium3" \
CXXFLAGS="-s -Zomf -O3 -march=pentium -mcpu=pentium3" \
LDFLAGS="-s -Zmap -Zhigh-mem -Zomf -Zexe" \
LN_CP_F="cp.exe" \
RANLIB="echo" \
AR="emxomfar" \
./configure --with-libdb-prefix=e:/usr/local/berkeleydb.4.2 --prefix=/usr/local/bogofilter
I place this script in the same directory as configure, and run it with
sh -c ./configure.os2
My personal script sets up my preferred makefile parameters and tells configure which libraries to link (like the socket library), where to place the installed binaries (--prefix) and where to find optional libraries (--with-libdb-prefix in the example above).
Another good way to pass settings makes use of a config.site file: configure loads this script automatically at startup, and it adjusts some variables for proper script parsing.
# This file is part of UnixOS/2. It is used by every autoconf generated
# configure script if you set CONFIG_SITE=%UNIXROOT%/etc/unixos2/config.site.
# You can add your own cache variables at the end of this file.
#
echo "Loading UnixOS/2 config.site"
#
LIBS="-lsocket" 
#
# Executables end on ".exe". You may add other executable extensions.
# Supported since autoconf 2.53 (?).
test -n "$ac_executable_extensions" || ac_executable_extensions=".exe"
#
# Replace all '\' by '/' in your PATH environment variable
ux2_save_IFS="$IFS"
IFS="\\"
ux2_temp_PATH=
for ux2_temp_dir in $PATH; do
  IFS="$ux2_save_IFS"
  if test -z "$ux2_temp_PATH"; then
    ux2_temp_PATH="$ux2_temp_dir"
  else
    ux2_temp_PATH="$ux2_temp_PATH/$ux2_temp_dir"
  fi
done
export PATH="$ux2_temp_PATH"
unset ux2_temp_PATH
unset ux2_temp_dir
unset ux2_save_IFS
You can place this file into \emx\share and add SET CONFIG_SITE=x:\emx\share\config.site to your environment setup command file.
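To see what the PATH loop in config.site is meant to achieve, the same backslash-to-slash replacement can be tried on a sample value. This is a sketch only: to_slashes is a hypothetical helper, the sample path is invented, and it uses tr instead of the IFS loop shown above.

```shell
# Convert every backslash in the argument to a forward slash.
to_slashes() {
  printf '%s\n' "$1" | tr '\\' '/'
}

# Invented OS/2-style PATH value; single quotes keep the backslashes literal.
sample='c:\emx\bin;c:\usr\bin'
to_slashes "$sample"
```

Note that only the backslashes change; the drive letters and the ';' separator are left alone, as in the original loop.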
There are at least two causes that can stop the configure script: the required files are too old, or you are missing some tools or libraries.
Required files are too old
Your scripts are too old to execute properly under OS/2, or they don't recognize OS/2 as a possible target. You need to replace config.sub, config.guess and mkinstalldirs with more recent versions (I take them from my other projects, so I can't suggest where to get them).
The configure script may also have been generated with an older autoconf/automake release: in that case you must rebuild it using autoconf and/or automake. In the same directory as configure there is a file named configure.in: this is the input script for autoconf, which reads the .in file and writes a new configure file. The new file is supposed to execute better, but that doesn't mean it will complete all the required steps. Run
sh -c autoconf
in your current directory to rebuild.
Once autoconf completes its task, compare the new script with the old one: most of the time the new script is slightly longer (because of new material), so if your new script is much shorter, you missed something during the rebuild. In my experience, this is caused by missing .m4 files or by a project-specific rebuild command (e.g. Berkeley DB requires running its s_conf script to rebuild configure properly).
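The length comparison is easy to script. What follows is a minimal sketch, assuming you saved the shipped script before regenerating it; check_regenerated is a made-up helper name and configure.orig is an assumed name for the saved copy.

```shell
# Compare the line counts of two scripts and warn when the new one is shorter.
# Usage: check_regenerated OLD NEW
check_regenerated() {
  old_lines=$(wc -l < "$1" | tr -d ' ')
  new_lines=$(wc -l < "$2" | tr -d ' ')
  echo "old: $old_lines lines, new: $new_lines lines"
  if [ "$new_lines" -lt "$old_lines" ]; then
    echo "warning: $2 is shorter than $1, check for missing .m4 files"
    return 1
  fi
  return 0
}

# Intended use (left as comments because it needs a real source tree):
#   cp configure configure.orig
#   ... regenerate configure with autoconf ...
#   check_regenerated configure.orig configure
```

A shorter script is only a hint, not proof of a broken rebuild, so still run the new configure and read its output.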
You are missing something
Here the range of problems is quite wide: from 'unable to find a compiler' to 'you don't have xxx installed'.
When configure can't find your compiler, the script stops almost at the first steps; you probably need to update the files (see above). The failure can also be related to conflicting switches in configure.os2 (if used), like -Zomf.
When something is not installed, the solution varies: if it turns out that you need another package to compile, like Berkeley DB, search for it, download it and start porting it. Once you have the required files installed, go back to your main project. Most of the time the required libraries already exist for OS/2, so it is only a matter of downloading them, placing them somewhere and telling configure the correct path (like --with-libdb-prefix above).
The best way to discover problems is to check the config.log file: every check done by configure is logged there. Look near the end of the file, or search the text using the name of the failing test as the key. The check may have failed because of a wrong library order: the linker looks up each unresolved function in the libraries that follow on the command line, so a library must be listed before the libraries it depends on; e.g. the syslog library must be placed before the socket library (-lsyslog -lsocket).
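Searching config.log by the name of the failing test might look like the sketch below. The log contents are invented to keep the example self-contained (a real config.log is much longer), and find_check is a hypothetical helper name.

```shell
# Print every line of a configure log that mentions KEY, with line numbers.
# Usage: find_check LOGFILE KEY
find_check() {
  grep -n -- "$2" "$1"
}

# Tiny stand-in for a real config.log (contents invented for this sketch).
log=$(mktemp)
cat > "$log" <<'EOF'
configure:2101: checking for gethostbyname
configure:2130: gcc -o conftest conftest.c -lsocket
configure:2133: $? = 0
configure:2210: checking for syslog
configure:2239: gcc -o conftest conftest.c -lsocket -lsyslog
configure:2242: $? = 1
EOF

# Show where the syslog check starts and which command line it used.
find_check "$log" syslog
```

In this invented log the failing command lists -lsocket before -lsyslog, which is the kind of ordering problem described above.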
Under EMX, it may happen that some compiler flags are incompatible with configure: I found -Zomf to conflict in some cases.
If you still get problems, consider rebuilding the configure script.
As I wrote, script execution is really long: for every needed tool, header, function or library the script will execute the compiler to verify its presence. This is a compile/link/execute sequence for every test.
Once completed, the configure script will write a config.status file; this is a shell script used to update all project files: makefiles at least, but also a config.h can be written (this header is supposed to contain constants for existing and non-existing functions in your environment), or dependent files can be created.
Compiling the code
When the above steps have completed successfully, you are ready to run make. There are many make programs available for OS/2, like (GNU) make, smake, dmake and imake: GNU make usually works with most projects, but some projects are specifically designed for another make program (e.g. Star backup requires smake to build).
I'm using GNU make 3.81rc2 for my development, but 3.76 also worked well in my environment. I mention this because a faulty make can break your build system: you can get problems like broken rules, non-working makefiles or missing builds. Since make uses the shell to perform many tasks, the shell is important too. Place a copy of sh.exe into your /bin directory: most shell scripts require /bin/sh, so with one in place you will not need to edit every single shell script.
Finally you are ready to type
make
to compile your source code. Please remember to read INSTALL (or similar files) to discover possible additional steps.
Here, too, there is a wide choice of problems: missing libraries, wrong paths, broken shell commands, wrong compiler flags, code using functions not tested in configure.
Inspect the code and correct your files. If you are about to change a makefile, remember that makefile.am and/or makefile.in are used by automake/configure to write the makefile: the next time you run configure, you will lose all your changes, so apply the changes to these files as well.
If your code uses some functions you want to skip, you can check config.h and undefine some constants, or wrap the code in #ifndef __EMX__/#endif.
If you need to change flags or libraries, instead of editing all makefiles, you can edit config.status in your root directory and run it again as
sh -c config.status
This will rebuild all makefiles, config.h, dependent files and possibly others. Note: this will erase your changes to makefiles and other generated files.
Running tests
Complex projects usually have a set of scripts to verify the proper program execution: this is a good way to discover problems in the ported code. In most projects (check the project documentation)
make check
will run the tests. Here the shell integration is fundamental: without a good shell, you are likely to fail test execution.
Usually the tests compare the program output with a pre-recorded output: since Unix uses LF as the line terminator instead of the OS/2 CR LF sequence, a check can fail simply because the files differ in length. So convert your input/output data set to CR LF (e.g. zip -m -LL xx * & unzip xx & del xx).
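When zip is not at hand, the same conversion can be sketched with awk, which is already in the tool list. This is only an illustration: to_crlf is a made-up helper name, and how awk treats text-mode output under OS/2 has not been checked here.

```shell
# Rewrite stdin with CR LF line endings.
# A CR that is already present at the end of a line is not doubled.
to_crlf() {
  awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Show the bytes produced for a two-line LF-terminated input.
printf 'one\ntwo\n' | to_crlf | od -c
```

For a real test suite, write the converted data to a new file and move it over the original only after checking the result.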
Installing the binaries
Binaries can be installed with
make install
Older makefiles can install the dummy executable (e.g. mysqld instead of mysqld.exe). Stripping debug information will fail when -Zomf has been used (strip recognizes only the a.out format). Remember to use --prefix when running configure: otherwise the files will be copied to their default location, which is not a good idea until you have a fully working port.
Submitting patch file
Once you have reached the end of your porting job, you should submit your changes to the main development team. If you worked on a GPL/LGPL licensed project, your changes must be made available to anyone who requests them. In any case, you are not required to send them to the main team, but you should think about the future of the port: maybe at some point you will no longer be involved in the project, or someone else will want to cooperate and join your effort. Also, having the OS/2-specific changes included upstream makes porting the next project release a lot easier (maybe a simple configure/make/make install sequence).
A GNU diff in unified format is a good way to send patches. I usually make a copy of the original source files and place it at the same level as the development tree. E.g.
...\mysql\MySQL-4.1.7
...\mysql\MySQL-4.1.7.0 (this is a copy)
To get a difference file, enter mysql directory and run
diff -urBbi MySQL-4.1.7.0 MySQL-4.1.7 > patch-4.1.7
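flags: -u unified diff format, -r recurse into directories, -b ignore changes in the amount of white space (which also hides trailing carriage returns), -B ignore changes that only add or remove blank lines, -i ignore case.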
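The same workflow can be tried on a throwaway tree before running it on a large project. The sketch below uses invented directory and file names (demo-1.0.0 for the pristine copy, demo-1.0 for the working tree) that mirror the layout above.

```shell
# Work in a scratch directory.
work=$(mktemp -d)
cd "$work"

# Pristine copy and working tree (hypothetical names).
mkdir demo-1.0.0 demo-1.0
printf 'hello\nworld\n' > demo-1.0.0/greet.txt
printf 'hello\nOS/2 world\n' > demo-1.0/greet.txt

# diff exits with status 1 when the trees differ, which is expected here,
# so the status is captured instead of being treated as a failure.
diff_status=0
diff -urBbi demo-1.0.0 demo-1.0 > patch-1.0 || diff_status=$?

echo "diff exit status: $diff_status"
cat patch-1.0
```

Reading the resulting patch before sending it is the easiest way to spot personal changes that should not go upstream.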
Edit the patch-4.1.7 file, remove personal changes and send it to the team together with a description of your work: a simple list with file name, functions changed and why is usually enough to understand your work.
If the project has some platform specific readme files, create a readme.os2 (or similar) and send it with your patch file.
Remember also to update the project documentation to reflect the new status.
Conclusions
Porting a new project from scratch to the OS/2 platform can require a lot of time and work, but most of the time the task is a lot easier than described here. Having some old code to start from is also very useful and makes life happier.
But the ending result will keep you happy, and probably also many other OS/2 users around the world!
My porting projects are mainly driven by personal or customer requirements.
As a starting point, I suggest a simple project: simple means few source files and a few directories.
But don't be scared of bigger projects: many times porting to other platforms is already supported by the core team, so most platform-dependent changes are confined to a few files.
If you need help, consider joining the netlabs IRC channel/mailinglist or the Unix OS/2 mailing list.