The HPFS FAQ

By Les Bell

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Introduction
Since I can't resist answering questions about HPFS in various news groups, I've finally taken the time to collate some of the most commonly-asked questions and answers. Further questions (and even answers) are welcome by email to lesbell@lesbell.com.au. All contributions will be credited.

What Is the High Performance File System?
HPFS is OS/2's preferred, native file system or disk format.

When OS/2 1.0 first shipped, in the rush to get it out the door, Microsoft took developers off one part of the operating system and put them to work on other components. The part that took the schedule hit was HPFS, and so OS/2 1.0 shipped with the same disk format and file system code as DOS - the FAT (File Allocation Table) File System. Performance was less than impressive, and did not help OS/2 win customers over DOS.

In OS/2 1.2, however, Microsoft and IBM finally caught up and introduced support for Installable File Systems. An IFS is a special kind of dynamic link library which provides support for a disk format and hooks into the kernel's file system router, through the IFS= statement in CONFIG.SYS.

Several installable file systems are available for OS/2:
 * HPFS - the High Performance File System
 * CDFS - the CD-ROM File System
 * NETWKSTA - the LAN Server remote file system
 * HPFS386 - the LAN Server Advanced network version of HPFS
 * HPOFS - the High Performance Optical File System
 * SRVIFS - the CID remote installation thin file system

HPFS provides a variety of advantages over the FAT file system.

What are the Benefits of HPFS over FAT?

 * Support for long file names - up to 254 characters in length
 * upper and lower case - HPFS preserves case, but is not case sensitive
 * extended character sets
 * Native (non-fragile) support for EAs: FAT is just _too_ fragile to support this, and the workplace shell depends on it heavily.
 * Higher performance. FAT degrades rapidly as drive size goes up. HPFS degrades a lot less. FAT degrades rapidly with lots of files in directories, HPFS degrades a lot less. Try unzipping an icon file with say 3,000 files in it to see the difference: FAT: 15 mins to get half-way through, HPFS: 3 mins till done!
 * Much greater integrity: signatures at the beginning of system structure sectors, forwards and backwards links in fnode trees. CHKDSK (run repeatedly at /F:2) can repair almost all damage to HPFS systems. On FAT, you need to buy additional utilities, and even then, there are no guarantees.
 * Much better granularity: FAT allocates clusters which can be very large (64 KBytes max), giving rise to wasted 'slack space'. HPFS allocates sectors only: much less wasted space.
 * Much less fragmentation, although correctly-written OS/2 applications cause less fragmentation on FAT, too.
 * Not compatible with Win95: you don't have to worry about what Win95 might be doing to your file system.
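
The "preserves case, but is not case sensitive" behaviour in the list above can be illustrated with a toy model. This is only a sketch: the class name is hypothetical, and plain upper-casing is an assumption standing in for whatever case folding the real file system applies.

```python
MAX_NAME_LENGTH = 254  # limit quoted in the list above


class CasePreservingDirectory:
    """Toy directory: stores names as typed, matches them ignoring case."""

    def __init__(self):
        self._entries = {}  # case-folded key -> name exactly as created

    def create(self, name):
        if len(name) > MAX_NAME_LENGTH:
            raise ValueError("name longer than 254 characters")
        key = name.upper()  # assumption: simple upper-casing as the fold
        if key in self._entries:
            raise FileExistsError(self._entries[key])
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling, or None if no entry matches."""
        return self._entries.get(name.upper())


directory = CasePreservingDirectory()
directory.create("Read Me First.txt")
print(directory.lookup("READ ME FIRST.TXT"))  # prints "Read Me First.txt"
```

Creating "READ ME FIRST.TXT" afterwards would fail in this model, because it names the same entry.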

What are the Disadvantages of HPFS?

 * Requires 300 KBytes RAM for code, plus more for its cache memory.
 * Requires third-party drivers to access under DOS and Linux. But these are inexpensive.

Why Does HPFS Perform So Well?
Before looking at HPFS's internals, it's probably a good idea to see why FAT performs so badly. First, the critical data structures - the root directory and file allocation table - are on the outermost cylinders of the disk, but your most recently-created (and most-frequently-accessed) files are on the innermost cylinders (your disk is almost full, right?). This forces a lot of head movement, which is the slowest part of disk access.

Then there's the directory structures themselves. Directory entries are simply organised as a linear list, so that just about the best search strategy is to start at the beginning and search forward until you either find what you want or hit the end of the list. And it's even worse: if you give a command, like CHKDSK, even if the first entry in a directory is CHKDSK.EXE, the command processor must still search the entire directory in case there is a CHKDSK.COM there, which would take precedence.

HPFS is a much better design. First, the disk is divided up into bands, 16 megabytes across, with the directory structures and allocation bitmaps for each band located in the centre. This means that the content of a file is located at most 8 MB distant on the disk surface from the structures which control access to it. And the root directory is located in the central band, further helping to reduce head movement.

The directories themselves consist of B-trees. This means that searching for a file is done by conducting a modified binary search strategy, with significantly fewer disk accesses than the linear searching of FAT. This is also why, when you issue a DIR command against an HPFS drive, the resultant listing is alphabetically sorted: the code recursively traverses the tree, so the natural output order is alphabetically sorted, unlike the disorganised jumble produced by the DIR command on a FAT drive.
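
To get a feel for the difference between the two search strategies, here is a rough sketch that counts name comparisons for a linear scan and for a plain binary search over a sorted list. It is not HPFS's actual B-tree code: a real directory block holds many entries, and the figures below count in-memory comparisons, not disk accesses. The file names are made up.

```python
def linear_probes(names, target):
    """Entries examined by a front-to-back scan (the FAT-style search)."""
    for count, name in enumerate(names, start=1):
        if name == target:
            return count
    return len(names)  # not found: every entry was examined


def binary_probes(sorted_names, target):
    """Entries examined by a binary search over a sorted list."""
    low, high, probes = 0, len(sorted_names), 0
    while low < high:
        mid = (low + high) // 2
        probes += 1
        if sorted_names[mid] == target:
            return probes
        if sorted_names[mid] < target:
            low = mid + 1
        else:
            high = mid
    return probes


# 3,000 hypothetical icon files, zero-padded so they sort in numeric order.
names = [f"FILE{i:04d}.ICO" for i in range(3000)]
print(linear_probes(names, "FILE2999.ICO"))  # 3000
print(binary_probes(names, "FILE2999.ICO"))  # never more than 12 for 3000 names
```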

In addition, HPFS uses a more complex caching algorithm than FAT. All of this adds up to a substantial performance advantage for HPFS, especially on large drives, or where there are lots of files in a single directory. In fact, in early benchmark tests I conducted using Clipper on OS/2 2.0 betas (with an Adaptec 1.1 driver!) I was never able to make HPFS slower than 40% faster than FAT with the same cache size.

Do I Need to Defragment My HPFS Drives?
The basic answer here is 'not really'. HPFS does a pretty good job of keeping files from getting fragmented, and because of its design, it's not really a big deal if some files are fragmented. However, if you are doing a lot of updates and appends on random access files, or if a particular heavily-used file becomes fragmented, then defragging is a good idea. In some cases, just copying the file between drives and back again may fix the problem. In others, running a defrag program may be the best approach. The major defragmentation utilities available are:
 * The Graham Utilities for OS/2
 * Gammatech Utilities for OS/2

Gordon Letwin, the original architect of HPFS, wrote an interesting short study on fragmentation of HPFS drives.

What Is the Maximum Drive Size for HPFS (and FAT)?
FAT has a maximum drive size of 2 GBytes. The maximum for HPFS is currently 64 GBytes, although the current 16-bit version of CHKDSK is not able to deal with drives larger than 16 GB.

However, you should note that the FAT filesystem allocates storage space in units called clusters, which are groups of adjacent sectors. The larger the drive, the larger the cluster, so that for a drive in the range of 1 - 2 GB, a cluster can typically consist of 64 sectors or 32 KB (worst case, PC DOS 4.0: 128 sectors or 64 KB). Since each file typically ends a little shy of half-way through its last cluster, this means that there is a little over half a cluster (typically 16 KB) of slack space associated with each file. For this reason, large drives which will carry lots of files are usually partitioned into multiple logical drives when used with the FAT file system, leading to complications in storage management. HPFS, by contrast, always allocates sectors, with approximately half a sector of slack space per file in all cases, so that there is no need to partition large drives (at least, for storage efficiency reasons).
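
The slack space arithmetic is easy to check. The sketch below compares the wasted space for a handful of made-up file sizes with 32 KB FAT clusters and 512-byte HPFS sectors. It only counts the unused tail of the last allocation unit and ignores any per-file overhead from the file system's own structures.

```python
SECTOR = 512
FAT_CLUSTER = 64 * SECTOR  # 64 sectors = 32 KB, the typical case in the text


def slack(file_size, unit):
    """Bytes left unused in the last allocation unit of a file."""
    return (-file_size) % unit


def total_slack(sizes, unit):
    return sum(slack(size, unit) for size in sizes)


sizes = [1, 700, 5000, 40000]  # hypothetical file sizes in bytes
print(total_slack(sizes, FAT_CLUSTER))  # 118139
print(total_slack(sizes, SECTOR))       # 1403
```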

But FDISK Won't Let Me Create a Drive Larger than 512MB!

Actually, the limit is not 512 MB, it's 1024 cylinders, and is caused by a limitation in your machine's BIOS. The BIOS INT 13H subfunction 08H, which returns the drive geometry information, places only ten bits of cylinder count information in the CH and CL registers. Thus, the BIOS - and consequently, the bootstrap loader - can 'see' only the first 1024 cylinders of a drive. Any bootable operating system partitions must be placed entirely within the first 1024 cylinders of a drive, and FDISK will not let you create a single partition that breaks this rule (annoyingly, it won't tell you what the problem is, it just greys out the menu option).
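
For the curious, the ten-bit limit can be worked through in a few lines. The sketch below unpacks the cylinder count from CH and CL (eight bits in CH, the top two bits in CL, with the remaining six bits of CL holding the sector count) and then multiplies out a drive's capacity. The 16 heads and 63 sectors per track are assumptions, chosen as a common IDE geometry, and the function names are made up.

```python
def decode_cx(ch, cl):
    """Split the CH/CL register pair into (max cylinder number, max sector number)."""
    cylinder = ((cl & 0xC0) << 2) | ch  # top two bits of CL + all of CH = 10 bits
    sector = cl & 0x3F                  # low six bits of CL
    return cylinder, sector


def chs_capacity(cylinders, heads, sectors_per_track, sector_size=512):
    """Capacity in bytes of a drive with the given geometry."""
    return cylinders * heads * sectors_per_track * sector_size


print(decode_cx(0xFF, 0xFF))                 # (1023, 63): cylinders 0..1023
print(chs_capacity(1024, 16, 63) // 2**20)   # 504 (megabytes of 2**20 bytes)
```

With that assumed geometry, 1024 cylinders works out to 528,482,304 bytes, which is where the "roughly 512 MB" figure comes from.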

This problem applies to IDE drives in particular. SCSI drives do not have the problem, and it should not happen with E-IDE (ATA-2) drives either. However, E-IDE relies upon various BIOS extensions, and could run into trouble with OS/2. For further details, see Rod Smith's Large Hard Drives in OS/2 page.

What Is the Maximum File Size for HPFS (and FAT)?
The maximum file size for FAT is 2 GB, limited by the drive size. On HPFS, the maximum file size is also 2 GB, due to the 32-bit argument passed to the DosChgFilePtr API, which is the OS/2 equivalent to lseek. However, there is an undocumented DosFsCtl subfunction (FSCTL_SECTORIO - 0x9014), which allows a switch to sector-based (rather than byte-based) addressing, and thereby allows access to much larger files. To make sure that the programmer understands the implications of what he is doing, he is required to provide, as one of the parameters, a pointer to a password string. This API is known informally as the DEADFACE API.
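
The 2 GB figure falls straight out of the arithmetic for a signed 32-bit byte offset. The second calculation below is purely illustrative: it assumes that the same 32-bit quantity is reinterpreted as a count of 512-byte sectors, which is a guess at how sector-based addressing extends the range, not a description of the undocumented API.

```python
SECTOR_SIZE = 512
MAX_SIGNED_32 = 2**31 - 1  # largest positive value of a signed 32-bit offset

byte_limit = MAX_SIGNED_32                  # byte-based addressing
sector_limit = MAX_SIGNED_32 * SECTOR_SIZE  # assumption: offset counts sectors

print(byte_limit)             # 2147483647, just under 2 GB
print(sector_limit // 2**30)  # 1023, i.e. just under 1024 GB
```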

How Does HPFS Have Greater Integrity than FAT?
The FAT filesystem has very little redundancy to support data recovery. Yes, there are two FATs, but these simply guard against the (very low) possibility of a bad sector making the first FAT unreadable: if the in-memory FAT is corrupted, then it gets written out twice, making recovery impossible.

HPFS is much smarter. First, every sector used in the filesystem structures has a signature, between two and eight bytes in length, at the beginning, which uniquely identifies it. For example, the Super Block, in the 16th sector of the drive, has a signature of "49 E8 95 F9 C5 E9 53 FA", so that it can be identified regardless of its position. Directory blocks start with a signature of "AE 0A E4 77", while fnodes for used directory entries have a signature of "AE 0A E4 F7". This feature makes it possible for CHKDSK to recognise filesystem data structures even on a badly corrupted drive and perform appropriate recovery.
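
The idea of recognising structures by their signatures can be sketched as a simple sector scan. The signature bytes are the ones quoted above; everything else (the function names, the 512-byte sector size, and the synthetic 20-sector image with the Super Block planted at zero-based sector 16) is an assumption made for illustration. This is not how CHKDSK is implemented.

```python
SECTOR_SIZE = 512

# Signature bytes as quoted in the text above.
SIGNATURES = {
    bytes.fromhex("49E895F9C5E953FA"): "super block",
    bytes.fromhex("AE0AE477"): "directory block",
    bytes.fromhex("AE0AE4F7"): "fnode",
}


def identify(sector):
    """Return the structure type whose signature starts this sector, or None."""
    for signature, kind in SIGNATURES.items():
        if sector.startswith(signature):
            return kind
    return None


def scan_image(image):
    """List (sector number, structure type) for every recognised sector."""
    found = []
    for number in range(len(image) // SECTOR_SIZE):
        start = number * SECTOR_SIZE
        kind = identify(image[start:start + SECTOR_SIZE])
        if kind is not None:
            found.append((number, kind))
    return found


# Synthetic 20-sector "disk" (not real HPFS data) with two signatures planted.
image = bytearray(20 * SECTOR_SIZE)
image[16 * SECTOR_SIZE:16 * SECTOR_SIZE + 8] = bytes.fromhex("49E895F9C5E953FA")
image[18 * SECTOR_SIZE:18 * SECTOR_SIZE + 4] = bytes.fromhex("AE0AE4F7")
print(scan_image(image))  # [(16, 'super block'), (18, 'fnode')]
```

Because the scan does not rely on any pointers being intact, it finds the structures wherever they happen to sit.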

In addition, the FAT filesystem has only forward links in the FAT, and no reverse links either in the FAT or between clusters and the FAT. HPFS, by contrast, has both forward and backward pointers between the various directory blocks and other structures.

For these reasons, CHKDSK can recover most filesystem problems without the need for additional 'disk maintenance' or 'crash recovery' software. However, because CHKDSK performs recovery in a forward direction down the links, it may need to be run several times in order to perform a full recovery. The guideline is to run CHKDSK repeatedly to recover disk errors, and every time a message is produced saying that something was repaired, run CHKDSK again until there are no more indications of repairs.
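
The "run it again until it stops repairing" guideline amounts to a simple loop. In the sketch below, run_pass is a hypothetical callable that you would supply yourself: it should perform one CHKDSK pass by whatever means you have and return True if that pass reported a repair. Nothing here parses real CHKDSK output.

```python
def repair_until_clean(run_pass, max_passes=10):
    """Call run_pass() until it reports no repairs; return the passes made."""
    for passes in range(1, max_passes + 1):
        if not run_pass():
            return passes  # this pass found nothing left to repair
    raise RuntimeError("still repairing after %d passes" % max_passes)


# Stand-in for real passes: two that repair something, then a clean one.
outcomes = iter([True, True, False])
print(repair_until_clean(lambda: next(outcomes)))  # 3
```

The max_passes cap is a safety net added for the sketch, so that a drive which never comes up clean does not loop forever.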

The presence of the signatures in filesystem structures is also why CHKDSK is able to recognise and recover such structures in apparently unused parts of the disk, particularly following a 'quick' format. See the warning below.

Can DOS Applications Access HPFS Drives?
Yes: DOS applications make file system calls like open, close, read and write through DOS's INT 21H API. They neither know nor care what disk format DOS is actually managing; were it not for this, DOS applications would not be able to work with the wide range of floppy and hard disk formats, not to mention the magneto-optical and floptical drives on the market today. And of course, DOS applications are able to access files located on network servers, including Netware, which has its own weird format, and LAN Server, which supports HPFS.

When a DOS application running under OS/2 makes a file system call, this is intercepted by the Virtual DOS Machine, which then traps into OS/2 protected mode and the OS/2 kernel satisfies the request. Since it's the OS/2 kernel doing the actual work, the requested file can reside on any file system supported by OS/2 - including HPFS, HPOFS, and network redirected drives.

Can DOS Utilities Access HPFS Drives?
DOS utility programs for disk management generally cannot be used under OS/2. Such utilities use DOS interrupts INT 25H and INT 26H to read or write absolute sector locations on the disk surface. In a VDM under OS/2, INT 25H reads are allowed, but INT 26H writes to absolute sector locations are intercepted and not allowed. This is because it is dangerous in a multitasking environment to allow, e.g. a defragmentation utility to move a file while that file is in use. However, INT 26H writes to absolute sector locations on a floppy are allowed, in order to support certain copy protection mechanisms.

As a consequence, utilities such as Norton Disk Doctor, PC Tools, etc. cannot be used to perform defragmentation, file undeletion or other tasks which require writing to the disk.

In addition, while DOS utilities know or are able to calculate the locations of system structures such as the root directory on a FAT file system, they have absolutely no knowledge of the layout of the components of HPFS. Consequently, about the only things one can do with (e.g.) Norton Disk Doctor on an HPFS drive are a) identify it as HPFS (from the boot sector signature) and b) read sectors based on their absolute location, with no support for formatting their contents or identifying them.

Can Windows Programs Access HPFS Drives?
Yes: in this respect Windows programs are just DOS programs, and use the same file system APIs. Windows applications running under OS/2 using WIN-OS/2 function perfectly, and in many cases will perform better (especially databases and other disk-bound programs).

Can DOS Access HPFS Drives?
It depends. OS/2 allows the user to run genuine DOS in a virtual machine, by setting the DOS_STARTUP_DRIVE DOS Setting to either a floppy drive (A:), a DOS logical drive (C:) or a Virtual Machine Boot file (e.g. D:\VMBS\DOS5.VMB). This is done a) to support DOS applications that look for DOS data structures at specific locations, b) to support block-oriented device drivers (e.g. network drivers like Lantastic) or c) for software development and testing purposes.

When running under OS/2, the DOS CONFIG.SYS should be edited to load the device driver FSFILTER.SYS. This acts like a network redirector, allowing all access to FAT drives to proceed normally but redirecting HPFS access through the OS/2 kernel which is running in the background.

However, when DOS is booted on the 'bare metal' of the machine, OS/2 is no longer running, and FSFILTER.SYS has no effect. In this case, the user should obtain a shareware utility called AMOS310.ZIP or its equivalent. This program allows files to be read from HPFS drives, and in the registered version it can also write to HPFS drives.

Can I Mix FAT and HPFS Drives on My Machine?
Yes, you can use a mixture of FAT and HPFS logical drives. However, there are some considerations to bear in mind.

First, with versions of DOS before 4.01, FAT drives placed after an HPFS logical drive will not be seen. That is, the following scheme will not work:

+------------------+----------------+-------------------+---------------------+
| C: FAT (DOS 4.0) | D: HPFS (OS/2) | E: FAT (DOS Apps) | F: HPFS (OS/2 Apps) |
+------------------+----------------+-------------------+---------------------+

This is because the E: drive follows an HPFS drive D:, and DOS 4.0 will not see it.

With DOS 4.01 and later, FAT logical drives located after HPFS drives will be seen, but there is a second problem. Because both operating systems allocate drive letters to logical drives in sequence, but DOS cannot use HPFS drives, the drive lettering will change between operating systems. For example, under OS/2, the previous drive layout will look like this:

+------------------+----------------+-------------------+---------------------+
| C: FAT (DOS 5.0) | D: HPFS (OS/2) | E: FAT (DOS Apps) | F: HPFS (OS/2 Apps) |
+------------------+----------------+-------------------+---------------------+

But when the machine is rebooted under DOS, it will look like this:

+------------------+----------------+-------------------+---------------------+
| C: FAT (DOS 5.0) | NV HPFS (OS/2) | D: FAT (DOS Apps) | NV HPFS (OS/2 Apps) |
+------------------+----------------+-------------------+---------------------+

NV = Not Visible

Notice that the DOS Apps drive is D: under DOS, but E: under OS/2. The shifting of drive letters can be annoying at best, and will break drive letter assumptions in batch files, programs' .INI files and other places.

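For this reason, the simplest configuration is to place all FAT drives before all HPFS drives:

+------------------+-------------------+----------------+---------------------+
| C: FAT (DOS 5.0) | D: FAT (DOS Apps) | E: HPFS (OS/2) | F: HPFS (OS/2 Apps) |
+------------------+-------------------+----------------+---------------------+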
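
The drive letter shuffle can be simulated in a few lines. This is a simplified model that hands out letters from C: in partition order to every drive the operating system understands. Real drive lettering has more rules (multiple physical disks, primary versus logical partitions), so treat it as an illustration of the layouts discussed here only. The function name is made up.

```python
def assign_letters(volumes, supported):
    """Give each (file system, label) volume a letter, or 'NV' if not visible."""
    letters = iter("CDEFGHIJKLMNOPQRSTUVWXYZ")
    result = []
    for file_system, label in volumes:
        letter = next(letters) + ":" if file_system in supported else "NV"
        result.append((letter, label))
    return result


mixed = [("FAT", "DOS"), ("HPFS", "OS/2"), ("FAT", "DOS Apps"), ("HPFS", "OS/2 Apps")]
fat_first = [("FAT", "DOS"), ("FAT", "DOS Apps"), ("HPFS", "OS/2"), ("HPFS", "OS/2 Apps")]

for layout in (mixed, fat_first):
    print("OS/2:", assign_letters(layout, {"FAT", "HPFS"}))
    print("DOS: ", assign_letters(layout, {"FAT"}))
```

In the mixed layout the DOS Apps drive is E: under OS/2 and D: under DOS. With the FAT drives first, it is D: under both.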

How Do I Undelete Files?
First of all, you don't use your DOS utilities for the purpose - they won't work. Instead, you need an undelete command for HPFS, such as those in the Gammatech Utilities or Graham Utilities. Alternatively, there is an IBM employee-written-software program called File Phoenix which will do the job.

But be advised that recovery is not a sure thing. Under DOS, if you don't explicitly write to the disk surface, the former file's location will not be overwritten, and the file can be recovered. But under OS/2, although you may act promptly to recover the file, the space which it occupied may already have been reused by the system, for swap file growth, INI files, temp files, or data saved from other applications.

How Do I Install Both Win95 and OS/2 on a New Computer?
Start by removing all partitions on the hard disk (back up data if necessary). Partition the drive so that you have a C: drive big enough for Win95 and an extended partition big enough for OS/2 and data, plus 2 MB or so of free space at the end for Boot Manager.

Install Win95 first, into the C: drive.

Then install OS/2. Start by rejecting C: as the target drive for the install, so that it then loads FDISK. Install Boot Manager, then create other drives as required.

If your C: drive is big enough for Win95 and all your apps, then install OS/2 into the D: drive and make it HPFS. If you need a D: drive for Win95 apps/data, then create it and then create an E: drive for OS/2. Mark D: or E: as installable, then exit FDISK, reboot and resume the install.

Problems? Don't install OS/2 first, then Win95, as you'll get snotty remarks from MS about OS/2. Put Boot Manager at the end of the disk if you can, but no more than 1024 cylinders into the drive. Remember to add Win95 to the Boot Manager menu after installing OS/2.

Try to create all the drives required as you install OS/2. Adding them later can push your CD-ROM drive letter up, and since OS/2 remembers the letter of the drive it was installed from (to later install printer drivers, etc.) it misplaces the CD-ROM.

I believe there may be some issues relating to Win95's use of long filenames on the C: drive, but have not experienced them myself as I don't run Win95 and can't imagine ever needing to. Main point: read the OS/2 Warp user manual before installing, as well as the README on diskette 1. There's a lot of information there.

How Do I Format a Drive to HPFS?
Assuming there is no data in the existing drive that you want to retain (or that you have already backed it up), then from the command line, give the command:

FORMAT d: /FS:HPFS

where d: is the drive you want to format. Notice that by default, this performs a quick (/Q) format on a hard disk, and does not perform any kind of surface media test. In addition, should you subsequently run CHKDSK /F:3, you may put the file system into an inconsistent state. To avoid these problems, use the /L parameter for a long format:

FORMAT d: /FS:HPFS /L

Alternatively, you can open your Drives folder, select the drive, pop up its menu and choose "Format disk...". Enter the volume label and select "HPFS", then press the "Format" button. Note that this performs a quick format.

Can I Format a Floppy to HPFS?
Huh? The terms 'floppy disk' and 'high performance' just cannot appear in the same sentence!

By definition, HPFS does not support removable media. In the case of floppy disks, the use of FAT as the only officially supported format ensures interchangeability with other operating systems. If you were able to format a floppy to HPFS, some idiot Windows user would only pick it up, try to use it, decide it was unformatted and reformat it, losing your data.

However, with the advent of Iomega Zip and Syquest drives of acceptable performance and much higher capacity, HPFS is much more attractive for these forms of removable media. Some users have reported success with using HPOFS, the High Performance Optical File System. Others are using a utility called HPFSRem.

Can I Convert a FAT Drive to HPFS?
There are two techniques for converting an existing drive from FAT to HPFS. The first (called the BFBI technique) consists of three steps:
 * Back up all files on the drive
 * Reformat the drive, using the /FS:HPFS option
 * Restore the files

Note that in order to be successful, the backup and restore utility used must be capable of correctly dealing with extended attributes, i.e. it must be an OS/2-specific backup utility such as Back Again/2 or BakupWiz.

The alternative technique is to use a utility designed for partition and logical drive management such as Powerquest's Partition Magic. This utility is able to resize and move partitions, and most importantly, to convert a logical drive from FAT to HPFS.

What Are the CHKDSK Options for HPFS?
The CHKDSK options are:

The most important CHKDSK option is the /F:n parameter, where n is the recovery level.

Warning! If you use CHKDSK /F:3 after formatting a drive using FORMAT /Q (the default), CHKDSK will attempt to recover files which existed prior to the reformat. Generally speaking, you should never use CHKDSK /F:3.

What Does "The Disk is In Use Or Locked By Another Process" Mean?
This occurs when you try to run CHKDSK with a /F parameter at a level greater than zero, and there are open files on the disk. The error message looks like this:

The current hard disk drive is: D:
The type of file system for the disk is HPFS.
The HPFS file system program has been started.
SYS0108: The disk is in use or locked by another process.

In order to perform analysis of the drive, CHKDSK has to open the entire drive for exclusive use as though it was a file. It cannot do this if there are any open files on the disk. Reasons for open files include:
 * Applications running which have opened files
 * The swap file being located on the drive
 * Spool files being located on the drive
 * System or application DLLs or workplace shell class DLLs being on the drive

The simplest fix for these problems is to shut down the system and reboot from a set of 'Utility Diskettes'. These can be made using the 'Create Utility Diskettes' object in the 'System Setup' folder. A better alternative, however, is to boot from a second, small, logical drive on your hard disk which contains a maintenance copy of OS/2 that boots to an OS/2 command prompt. This can be made manually, by copying the files from the utility diskettes or by using a utility like BOOTOS2.ZIP. After booting from a drive other than the main system drive, and not running any applications, you should be able to CHKDSK the drive.

If you do not have a maintenance partition or a set of utility diskettes, then you can boot from the original installation diskettes and use Escape or F3 to get to a command prompt.

What is the 'HPFS File System Program'?
Because CHKDSK has to be able to cope with installable file systems, including those which may not even have been developed yet, it does not, in itself, have the ability to analyse volumes mounted using IFS. FORMAT, likewise, must be able to format media where installable filesystems support that ability.

Consequently, the CHKDSK and FORMAT programs are simply skeletons which call routines in a utility DLL (dynamic link library) which must be supplied with each IFS driver. This DLL has the same root name as the IFS driver, so that the one for HPFS is called UHPFS.DLL and for the CD-ROM IFS it is UCDFS.DLL. When the DLL is loaded and initialised, it produces the message

The HPFS file system program has been started.

which appears as part of the CHKDSK output.

This is why, if you are making up your own set of utility diskettes, you need not just CHKDSK.COM, but also UHPFS.DLL.

What's HPFS386?
HPFS386 is the 32-bit version of HPFS which is installed on servers running IBM LAN Server Advanced and Warp Server Advanced (the Entry level versions do not include HPFS386).

The benefits of HPFS386 over ordinary HPFS are:
 * Larger cache sizes
 * Direct connection between the filesystem and the network
 * Local security (need an account to access files, even at the machine)

Note that despite the 386 in the name, the 32-bitness of HPFS386 is not terribly significant. File system drivers spend a lot of their time waiting for the disk controller to return sectors from the drive, and 32-bit code doesn't wait any faster than 16-bit code!

HPFS386 includes a SMB (Server Message Block protocol) server for file system requests. This means that incoming file system requests are routed directly from the NETBIOS driver in ring zero to the HPFS386 SMB server, also in ring zero, without passing through the higher levels of LAN Server at ring three. The elimination of the ring transitions provides a considerable performance improvement.

Local security means that, even to access files while sitting in front of the machine, you must be logged in. All access permissions for users and groups are embedded in the file system in a similar fashion to EAs, and if you do not have the appropriate access permissions, access will be denied.

Generally, HPFS386 is not particularly useful on stand-alone machines or workstations. Besides, the price of a LAN Server Advanced licence is prohibitive for a single user.