Getting Object-Oriented with DB2 for OS/2 V2.1

From EDM2
Jump to: navigation, search

Reprint Courtesy of International Business Machines Corporation, © International Business Machines Corporation

By Rick Weaver

With DB2 for OS/2 V2.1, IBM has greatly enhanced its premier 32-bit database system. Along with many functional and performance enhancements to the database engine, the new version of DB2 is poised to better support and promote object-oriented concepts and object-oriented development.

What are some of these object-oriented extensions? How can they be used? This article shows how these powerful extensions can be implemented in your environment.

DB2 for OS/2 V2.1 comes with several new features that enhance and support object orientation and object-oriented development. These features include large object support, user-defined functions, user-defined data types, and C++ support. Each of these features is discussed in detail in this article.

CHECK Example

Throughout the article, a simple example illustrates these object-oriented concepts. The example requires a table that contains information about a checking account. The CHECK table must store the following data elements:

  • Account number
  • Check number
  • Check payee
  • Check date
  • Cleared status
  • Scanned image of the check (75 KB bitmap)

The examples in this article use the CHECK table as a model.

Large Object Support

Before DB2 for OS/2 V2.1, the maximum length of a row without a LONG VARCHAR was limited to 4096 bytes. If a LONG VARCHAR was used, the maximum size was 32768 bytes. This was limiting if you wanted to store large data elements such as images, sounds, binary executables, or other data elements larger than 32768 bytes.

In this check example, the check's scanned image is larger than 32 KB, so if you were to create the check table with DB2/2 V1.x, you would probably create it using the data definition language (DDL) shown in Figure 1.

CREATE TABLE CHECK (ACCT_NUM CHAR(16) NOT NULL,
	CHECK_# INTEGER NOT NULL, 
	PAID_TO VARCHAR(50) NOT NULL, 
	CHK_DATE DATE NOT NULL, 
	CLEARED INTEGER NOT NULL, 
	CHECK_IMAGE_PATH VARCHAR(254) NOT NULL)

Figure 1. DDL for CHECK Example Using DB2/2 V1.1

In CHECK_IMAGE_PATH, you can then store a pointer (a fully qualified path) to the bitmap image, but not the bitmap itself, in the database.

Although this design is functional, it does open up some interesting considerations:

  • When performing a database backup, the backup does not store the bitmap with the database.
  • What if the bitmap were moved to another location?
  • What process synchronizes the location of the bitmap with the CHECK_IMAGE_PATH column?

In DB2 for OS/2 V2.1, all the above limitations no longer exist, because the new version of DB2 now stores objects up to 2 GB in the database. Binary large objects (BLOBs), character large objects (CLOBs), and double-byte character large objects (DBCLOBs) can all be stored.

If you want to implement the same check example in DB2 for OS/2 V2.1, the DDL to create the table is shown in Figure 2.

CREATE TABLE CHECK (ACCT_NUM CHAR(16) NOT NULL,
	CHECK_# INTEGER NOT NULL, 
	PAID_TO VARCHAR(50) NOT NULL, 
	CHK_DATE DATE NOT NULL, 
	CLEARED INTEGER NOT NULL, 
	CHECK_IMAGE BLOB(75K) LOGGED COMPACT)

Figure 2. DDL for CHECK Example Using DB2 for OS/2 V2.1

You can then store all the data elements inside the database and eliminate any extra files needed to support large objects. This reduces the database administration as well as the programming effort to maintain the database.

To LOG or NOT LOG

As shown in Figure 2, the

LOGGED 
option on an LOB logs changes to the
CHECK_IMAGE 
column in the database transaction log. If you use very large LOB elements, then logging to the transaction log may be extremely expensive and create additional input/output (I/O). If you choose the
NOT LOGGED 
option, place the log in a separate table space for recovery (and performance). The
COMPACT 
option specifies that minimal disk space will be used for storing LOBs.

DB2 frees any extra disk pages in the last group used by a LOB value. However, storing data this way may penalize performance if the LOB is updated with a larger size.

Making Your Own Functions?

DB2 for OS/2 V2.1 introduces a powerful way to expand Structured Query Language (SQL) functionality with user-defined functions (UDFs). User-defined functions let you write scalar functions in an external 3GL programming language (the programming language must conform to the ANSI C standard for the calling and linkage conventions), and to invoke the UDF via SQL, a trigger, or a view definition. This opens the door to many possibilities:

  • UDFs can send an e-mail message when a new row has been added to a queue table.
  • UDFs can upload a text file to a mainframe when an SQL statement is issued.
  • UDFs can perform any function that the standard SQL function repository does not support.

The only limitations to UDFs are that they must not contain any SQL, and they must return a scalar (single) return value.

Creating a UDF

Back to the check example, create a UDF that sends an e-mail message and updates the CLEARED column with the UDF's return code. First, register the function to the database by issuing the CREATE FUNCTION DDL shown in Figure 3.

CREATE FUNCTION EMAIL (INT) RETURNS INT
	EXTERNAL NAME 'C:\C\UDF\UDF!email'
	LANGUAGE C
	PARAMETER STYLE DB2SQL
	VARIANT NOT FENCED
	NO SQL EXTERNAL ACTION

Figure 3. Registering a UDF with the Database

UDF Parameters

In Figure 3, the first line of the CREATE FUNCTION DDL defines the UDF function name known by DB2, the parameter(s) passed to the function, and the datatype returned by the UDF. The EXTERNAL NAME is the name of the DLL (in this case, UDF.DLL) and !email refers to the entry point (function name) in the UDF.DLL. The entry point must be exported in the .DEF file when building the UDF DLL. Additionally, the entry point name and the UDF external name must both have the same case.

The LANGUAGE C line is mandatory for any UDF declaration; it denotes the calling convention and linkage option that the user-defined function must use. Any UDF must conform to the ANSI C prototype for it to function properly as a UDF.

The PARAMETER STYLE DB2SQL, another mandatory clause, specifies the conventions for passing parameters to and from external functions. The DB2SQL value reflects the ISO/ANSI draft standard as of September, 1994.

VARIANT specifies that the UDF may or may not return the same value if the same parameter values are passed to the UDF. NOT VARIANT functions will return the same value if the same parameter values are passed to the UDF.

After completely debugging a UDF, you can configure the UDF to run within the DB2 engine. If a function is NOT FENCED, then it runs within the DB2 engine and is potentially much faster than a FENCED UDF. A FENCED function is safer, because it runs separately outside the DB2 engine, which reduces the risk of database engine failure due to a UDF.

The NO SQL clause is also mandatory, because a UDF cannot execute SQL statements internally. The EXTERNAL ACTION clause specifies that the !email function is going to perform an action outside the domain of DB2 for OS/2 (e.g., sending an e-mail message).

Writing and Calling a UDF

The second step is to write the UDF in a 3GL language such as C. Figure 4 illustrates sample source code for an e-mail function.

/*UDF to send an email message*/
#include 
#include 
#include 
void SQL_API_FN email
 (
	short *input,	/*ptr to input argument*/
	short *output,	/*ptr to output argument*/
	short *input_ni,	/*ptr to input null indicator*/
	short *output_ni,	/*ptr to output null indicator*/ 
	char sqlstate[6],	/*sqlstate*/
	char fname[28],	/*fully qualified function name*/
	char finst[19],	/*function specific name*/
	char msg[71])	/*msg text buffer*/
 {
	int rc=0;	/*return code*/
	char emailstr[80];
	char user[9]="XXXXXXXX";

	/*Send E-MAIL message, return 0 if successful, return 1 if unsuccessful*/
	rc=emailstd(user,emailstr);
	*output=rc;	/*returns rc back to calling SQL statement.*/
	return;
 }

Figure 4. Sample Source Code for an E-Mail Function

Once compiled into a DLL, the UDF can be invoked through the SQL statement shown in Figure 5.

UPDATE CHECK
SET CLEARED=email(CHECK_#)
WHERE CHECK_#=:check_num;

Figure 5. SQL Statement to Invoke the UDF

In Figure 5, the SQL statement sets the CLEARED flag in the CHECK table to the return code of the e-mail application programming interface (API) call, as passed by the output variable shown in Figure 4.

UDFs can also be overloaded. Overloaded UDFs simply mean that you can have multiple UDF functions with the same name. If the UDF is overloaded, the arguments or data types must be different so that DB2 can determine which UDF to call.

UDFs can also be registered to other source functions. If you want to use the built-in function avg with a new User-Defined Type (UDT) (discussed later), then a UDF can be created which will use the source average function. The DDL to create a source function would look like this (all on one line):

CREATE FUNCTION
	avg(newdatatype)
	RETURNS newdatatype
	SOURCE SYSIBM.AVG(INT)

Unlimited Typ(ing)

User-Defined Types (UDTs) support the creation of non-standard data types that are derived from existing built-in DB2 data types. Once created, the UDT can then be used in DDL to create tables, triggers, and so on. UDTs provide data abstraction from the base data type, which promotes encapsulation and provides a foundation for future object-oriented extensions.

UDTs can only be created over the built-in DB2 data types. If the UDT is not created upon a BLOB, CLOB, DBCLOB, LONG VARCHAR, or LONG VARGRAPHIC, then DB2 can create comparison functions on the new data type if the WITH COMPARISONS clause is specified in the CREATE DISTINCT TYPE statement. These comparison functions include: =, >=, >, <, and <=. Additionally, casting functions will be created to cast the UDT back to its base data type, and the base data type back to the UDT.

For the BLOB, CLOB, LONG VARCHAR, and LONG VARGRAPHIC data types, you must create UDFs to support comparison with those data types.

In the check example, if you want to create a UDT BITMAP in place of the BLOB data type, issue the following DDL statement:

CREATE DISTINCT TYPE

BITMAP AS BLOB(75K) Once the data type is created, the DDL to create the new check table is shown in Figure 6.

CREATE TABLE CHECK (ACCT_NUM CHAR(16) NOT NULL,
	CHECK_# INTEGER NOT NULL, 
	PAID_TO VARCHAR(50) NOT NULL, 
	CHK_DATE DATE NOT NULL, 
	CLEARED INTEGER NOT NULL, 
	CHECK_IMAGE BITMAP LOGGED COMPACT)

Figure 6. DDL for Creating a UDT BITMAP

If you create a UDT of Boolean (BOOL) to represent the CLEARED column name (CREATE DISTINCT TYPE BOOL AS INTEGER WITH COMPARISONS), and then want to issue an SQL statement:

SELECT * FROM CHECK

WHERE CLEARED = 1 an error would occur.

Since the comparison is between a BOOL and an INT (integer), the numeric 1 must be cast into a BOOL data type for the SQL statement to work. (DB2 cannot compare two different data types without a cast.) To fix the SQL statement, do the following:

SELECT * FROM CHECK

WHERE CLEARED = BOOL(1)

C++ Support

If you were developing in C++ with prior versions of DB2/2, it was difficult to embed SQL in an application, because the DB2/2 V1.x precompiler did not support or allow SQL to be embedded within a C++ module. The solution for Version 1.x C++ applications was to build function wrappers in C, embed the SQL in C code, and invoke those functions from C++.

In DB2 for OS/2 V2.1, the precompiler can now preprocess a C++ program. Within a C++ program, you can embed host variables in a class called data members and can embed SQL statements within member functions.

A sample of using C++ with SQL is shown in Figure 7. For any LOB object data member, you should make sure that the data member is declared as a pointer to the LOB (as shown by the new operator in the constructor). If the object data member is declared as a normal variable, the memory to store the object is allocated on the stack and can cause stack overflow errors. Allocating the memory for the LOB on the heap through the new operator eliminates this problem and is much more efficient.

CHECK.HPP

//Check class definition
class CHECK
 {
	EXEC SQL BEGIN DECLARE SECTION;
	char acct_num[17];
	long check_num;
	char paid_to[51];
	char chk_date[12];
	long cleared;
	sql type is blob image(75k) *check_image;
	EXEC SQL END DECLARE SECTION;
	public:
	...
	check(char *, long, char *, char *, long, PBYTE, long);  //Constructor
 ~check(void); //Destructor
 };

CHECK.SQC

...
EXEC SQL INCLUDE 'CHECK.HPP';

void check::check(char *acct, long ch_num, char *pay, char
*chk_dt, long cl, PBYTE image, long imagelen) :
check_num(ch_num), cleared(cl), check_image(new check_image_t)
 {
	strcpy(acct_num,acct);
	strcpy(paid_to,pay);
	strcpy(chk_date,b);
	check_image->length=imagelen; 
	memcpy(check_image->data,image,imagelen);
	
	EXEC SQL INSERT INTO CHECK VALUES(:acct_num, :check_num,
:paid_to, :check_date, :cleared, :*check_image);
...
 }

check::~check(void)
 {
	delete check_image;
 }

Figure 7. Using C++ in SQL

DB2 will use this pointer only to resolve references to host variables. A host variable cannot be referenced by explicitly qualifying an object instance (i.e.,

SELECT  CLEARED FROM CHECK WHERE CHECK_# = :my_obj.check_num

).

As with C programs, all host variables must be unique within a C++ module, even if the variables are local to a class or function.

Get Object-Oriented!

With its large object support, user-defined functions, user-defined types, and C++ support, the DB2 for OS/2 V2.1 product is poised to greatly enhance your object-oriented development effort. Exploitation of each of these extensions, along with with the addition of triggers in the new DB2, opens a new world of increased functionality and productivity for DB2 for OS/2 developers, administrators, and end users.