An Introduction to C++ Programming - Part 6/13

From EDM2
Revision as of 18:50, 3 December 2018 by Ak120 (Talk | contribs)

Written by Björn Fahller

Inheritance

Introduction

This month's article is dedicated to the buzzword "inheritance." Inheritance is a way of expressing similarity. As an example, we all have an idea of what a chair is, even though there are many totally different kinds of chairs. This similarity can be expressed with inheritance. Before delving into inheritance, though, let's finish where we left off last month, and take care of formatting output for our own types.

What do we want?

We're now facing a tough situation. There are a number of formatting parameters, and we must know how to handle every single one of them. For example, say we're to print our Range left aligned with a field width of 20 characters, a padding character of '.' and showing also the positive sign. How do we do this? To begin with, what appearance do we want? One thing is for sure, whoever wants to print our Range with that formatting expects it to be valid for the Range itself, and not for the first '[' or the upper bound of the Range only. In other words, we must see to it that all of our range gets a field width of 20 characters, left aligned. Of course, since we cannot really know in advance how wide the upper and lower limits will be when printed, we cannot solve the problem by simply setting a width for each part. Tough indeed. OK, let's do second best. Whatever the width is set to, we'll use it for both limits, giving the upper and lower limit equal space.

  ostream& operator<<(ostream& os, const Range& r)
  {
    if (!os.opfx()) return os;

    int width=os.width(); // get current width setting.
    os << setw(1) << '[' << setw(width) << r.upperBound()
       << ',' << setw(width) << r.lowerBound() << ']';
    os.osfx();
    return os;
  }

In the above we make use of the fact that width is cleared after printing. Not optimal, given our wishes, but it's OK. Now for the topic of this month.
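For the curious, here is one possible way to get closer to the original wish. It is only a sketch, written in standard C++ rather than the older iostream library used in this article, and the "Range" struct with public "lower" and "upper" members is a hypothetical stand-in for the real class. The idea is to format the whole range into a temporary string stream first, and then insert the finished string, so that the width, padding character and alignment apply to the range as one field.

```cpp
#include <cassert>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the Range class from the previous part.
struct Range {
    int lower;
    int upper;
};

// Format the whole range into a temporary buffer, then insert the finished
// string so that width, fill and alignment apply to all of it at once.
std::ostream& operator<<(std::ostream& os, const Range& r)
{
    std::ostringstream buf;
    buf.flags(os.flags());      // carry over showpos, number base and so on
    buf << '[' << r.lower << ',' << r.upper << ']';
    return os << buf.str();     // padded once, as a single field
}

// Demonstration helper: left aligned, width 20, '.' as padding character,
// positive sign shown.
std::string formatExample(const Range& r)
{
    std::ostringstream out;
    out << std::left << std::setfill('.') << std::showpos
        << std::setw(20) << r;
    return out.str();
}
```

Calling "formatExample" for a range from 1 to 5 should give "[+1,+5]" followed by 13 dots. The price is an extra temporary string for every print, which is probably acceptable for most uses.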

Inheritance

Let's have a look at a classic problem, with a classic Object Oriented solution. We've been contracted by Big Company, to write software for their staff related issues. At Big Company, we find managers, engineers, secretaries and project leaders (hmm, looks like where I work, except for the lack of marketers.) Managers manage a number of employees and have access to a secretary. A project leader reports to the manager responsible for his project. An engineer works on a project and thus reports to a project leader. Every engineer also has a manager. Now, how do we model this?

We recognise one thing for sure. Every employee has a name and a manager. Now let's make a simple minded attempt:

  class Manager {
  public:
    Manager(const char* aName,
            const Manager* manager,
            Secretary* aSecretary);
    const Manager* manager(void) const;
    const char* name(void) const;
    Secretary* secretary(void);
  private:
    char* theName;
    const Manager* theManager;
    Secretary* theSecretary;
  };

  class Engineer {
  public:
    Engineer(const char* aName,
             const Manager* aManager,
             const ProjectLeader* aProjectLeader);
    const Manager* manager(void) const;
    const ProjectLeader* projectLeader(void) const;
    const char* name(void) const;
  private:
    char* theName;
    const Manager* theManager;
    const ProjectLeader* theProjectLeader;
  };

  class Secretary {
  public:
    Secretary(const char* aName,
              const Manager* aManager);
    const Manager* manager(void) const;
    const char* name(void) const;
  private:
    char* theName;
    const Manager* theManager;
  };

  class ProjectLeader {
  public:
    ProjectLeader(const char* aName,
                  const Manager* aManager);
    const char* name(void) const;
    const Manager* manager(void) const;
  private:
    char* theName;
    const Manager* theManager;
  };

One problem here should be apparent. All employees will have identical code for handling name and manager. That's bad. Duplicated code is always bad. What's worse, this problem itself will duplicate. Everything we want to be able to do to any employee must be written for all four employee types. Sure, a template can help, but it will not be the solution. What if we could instead express, within the programming language, that we have something called an employee, and that employees have a name and a manager? What if we could then say that a manager is an employee, but with a secretary? Likewise, we could model that an engineer is an employee, but with the extra that they have a project leader. The case for the secretary and project leader is of course analogous.

This is what inheritance is all about. We create a class employee, with all the things that are common to all kinds of employees, and then we let the other classes inherit from it, and add only the extras.

An example

We can write the Employee class like this:

  class Manager; // Forward declaration

  class Employee {
  public:
    Employee(const char* aName, const Manager* aManager);
    const char* name(void) const;
    const Manager* manager(void) const;
  private:
    const char* theName;
    const Manager* theManager;
  };

On the line marked "Forward declaration" we say that there is a class called Manager. That's really all we say. Since we've said that there is such a class, we are allowed to use pointers to the class, and that's needed in the Employee class. Since there's a circular dependency between Employee and Manager, a forward declaration is needed. Note that it is not possible to instantiate an object of a forward declared class. The class must be defined before you can instantiate objects of it, but you can declare and define pointers and references to forward declared types. You can even declare functions accepting and returning objects of forward declared types, but you can neither define nor call such a function before the type's definition is known.
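The rules for forward declared types can be sketched in a few lines. This is a simplified variant of the staff classes, assuming "std::string" for the name instead of "const char*", and the function "copyOf" is a hypothetical helper that exists only to show the point about declaring functions.

```cpp
#include <cassert>
#include <string>

class Manager;                       // forward declaration: only the name is known

// Allowed with an incomplete type: a function declaration that accepts and
// returns objects of that type. It cannot be defined or called yet.
Manager copyOf(Manager m);

class Employee {
public:
    Employee(const std::string& aName, const Manager* aManager)
        : theName(aName), theManager(aManager) {}
    const std::string& name() const { return theName; }
    const Manager* manager() const { return theManager; }
private:
    std::string theName;
    const Manager* theManager;       // a pointer to an incomplete type is fine
};

// Manager m("x", 0);                // would not compile here: incomplete type

class Manager : public Employee {    // from here on the type is complete
public:
    Manager(const std::string& aName, const Manager* aManager)
        : Employee(aName, aManager) {}
};

// Only now, with Manager fully defined, can copyOf be defined and called.
Manager copyOf(Manager m)
{
    return m;
}
```

If you remove the comment marks from the line that tries to create a "Manager" too early, the compiler should refuse it, complaining about an incomplete type.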

Now let's define the manager class, by inheriting from Employee:

  class Secretary; // Forward declaration

  class Manager : public Employee {
  public:
    Manager(const char* aName,
            const Manager* aManager,
            Secretary* aSecretary = 0);
    void setSecretary(Secretary* aSecretary);
    Secretary* secretary(void);
  private:
    Secretary* theSecretary;
  };

Can you feel something cool going on here? When we declare Manager as "public Employee", we say that a manager is, for all intents and purposes, an employee. Everything you can do to an employee, you can do to a manager (OK, so the model isn't 100% realistic.) Everything that is public in "Employee" is public in "Manager" as well (except for the constructor and a few other special member functions.) That is, it's legal to call the member function "name" for a manager, and when you do, it's the member function defined in "Employee" that is executed. In other words, what we have done is to define a new type "Manager" that is an extension of the type "Employee." Cool eh? The member function "setSecretary" is needed, since the first manager would otherwise never be assigned a secretary (you can have a manager without a secretary, but not a secretary without a manager, thus when the company first starts as a one-man business, there is no secretary, but one is hired later.) Let's add the Secretary, Engineer and ProjectLeader and then implement them all:

  class Secretary : public Employee
  {
  public:
    Secretary(const char* aName,
              const Manager* aManager);
  };

  class ProjectLeader : public Employee
  {
  public:
    ProjectLeader(const char* aName,
                  const Manager* aManager);
  };

  class Engineer : public Employee
  {
  public:
    Engineer(const char* aName,
             const Manager* aManager,
             const ProjectLeader* aProjectLeader);
    const ProjectLeader* projectLeader(void) const;
  private:
    const ProjectLeader* theProjectLeader;
  };

  Employee::Employee(const char* aName,
                     const Manager* aManager)
    : theName(aName),
      theManager(aManager)
  {
  }

  const char* Employee::name(void) const
  {
    return theName;
  }

  const Manager* Employee::manager(void) const
  {
    return theManager;
  }

  Secretary::Secretary(const char* aName,
                       const Manager* aManager)
    : Employee(aName, aManager) //****
  {
  }

  ProjectLeader::ProjectLeader(const char* aName,
                               const Manager* aManager)
    : Employee(aName, aManager) //****
  {
  }

  Engineer::Engineer(const char* aName,
                     const Manager* aManager,
                     const ProjectLeader* aProjectLeader)
    : Employee(aName, aManager), //****
      theProjectLeader(aProjectLeader)
  {
  }

  const ProjectLeader* Engineer::projectLeader(void) const
  {
    return theProjectLeader;
  }

  Manager::Manager(const char* aName,
                   const Manager* aManager,
                   Secretary* aSecretary)
    : Employee(aName, aManager), //****
      theSecretary(aSecretary)
  {
  }

  void Manager::setSecretary(Secretary* aSecretary)
  {
    theSecretary = aSecretary;
  }

  Secretary* Manager::secretary(void)
  {
    return theSecretary;
  }

The "Employee" implementation is familiar. The constructors of the other classes are the only somewhat odd thing. In the initialiser list of the constructors we call the constructor of "Employee" on the lines marked //****. Since a Manager (for example) is an Employee, the employee side of the Manager must be constructed, and it is done with an explicit call to the constructor of the ancestor.

This is actually all there is to it. Here's proof!

  int main(void)
  {
    Manager CEO("Big Boss", 0);
    Secretary sec1st("1st secretary", &CEO);
    CEO.setSecretary(&sec1st);

    Secretary shared("shared secretary", &CEO);

    Manager middle1("Medium boss 1", &CEO, &shared);
    Manager middle2("Medium boss 2", &CEO, &shared);


    ProjectLeader p("Proj1", &middle1);
    Engineer e("Eng", &CEO, &p); // Managed by CEO, but
                                  // work on a project
                                  // controlled by p.

    const Manager* pm = CEO.manager();
    cout << "CEO is :" << CEO.name()
         << " whose manager is "
         << (pm ? pm->name() : "nobody") << endl;
    cout << "The secretary of CEO is "
         <<  CEO.secretary()->name() << endl;
    cout << CEO.secretary()->name() << "'s manager is "
         << CEO.secretary()->manager()->name() << endl;

    cout << "The name of middle1 is: " << middle1.name()
         << " whose secretary is "
         << middle1.secretary()->name() << endl;
    cout << "The manager of "
         << middle1.secretary()->name() << " is "
         << middle1.secretary()->manager()->name() << endl;
    cout << p.name() << " is a project leader managed by "
         << p.manager()->name() << endl;
    cout << e.name() << " is an engineer managed by "
         << e.manager()->name()
         << " and works on a project controlled by "
         << e.projectLeader()->name() << endl;
  }

When executed, I get this output:

  [d:\cppintro\lesson6]staff.exe
  CEO is :Big Boss whose manager is nobody
  The secretary of CEO is 1st secretary
  1st secretary's manager is Big Boss
  The name of middle1 is: Medium boss 1 whose secretary is shared secretary
  The manager of shared secretary is Big Boss
  Proj1 is a project leader managed by Medium boss 1
  Eng is an engineer managed by Big Boss and works on a project controlled by Proj1

Not bad eh? Inheritance is a good way of expressing commonality. To make it even neater, let's create a print operator for "Employee" and use that in "main".

  ostream& operator<<(ostream& os, const Employee& e)
  {
    if (!os.opfx())
      return os;

    // Write to the stream we were given, not to cout.
    os << '"' << e.name() << "\" is managed by ";
    const Manager* pm = e.manager();
    if (pm) {
      os << '"' << pm->name() << '"';
    } else {
      os << "nobody";
    }
    os.osfx();

    return os;
  }

  int main(void)
  {
    Manager CEO("Big Boss", 0);
    Secretary sec1st("1st secretary", &CEO);
    CEO.setSecretary(&sec1st);

    Secretary shared("shared secretary", &CEO);

    Manager middle1("Medium boss 1", &CEO, &shared);
    Manager middle2("Medium boss 2", &CEO, &shared);


    ProjectLeader p("Proj1", &middle1);
    Engineer e("Eng", &CEO, &p); // Managed by CEO, but
                                  // work on a project
                                  // controlled by p.

    const Manager* pm = CEO.manager();
    cout << "CEO is :" << CEO.name()
         << " whose manager is "
         << (pm ? pm->name() : "nobody") << endl;
    cout << "The secretary of CEO is "
         <<  CEO.secretary()->name() << endl;
    cout << CEO.secretary()->name() << "'s manager is "
         << CEO.secretary()->manager()->name() << endl;

    cout << "The name of middle1 is: " << middle1.name()
         << " whose secretary is "
         << middle1.secretary()->name() << endl;
    cout << "The manager of "
         << middle1.secretary()->name() << " is "
         << middle1.secretary()->manager()->name() << endl;
    cout << p.name() << " is a project leader managed by "
         << p.manager()->name() << endl;
    cout << e.name() << " is an engineer managed by "
         << e.manager()->name()
         << " and works on a project controlled by "
         << e.projectLeader()->name() << endl;


    cout << endl;

    cout << CEO << endl;
    cout << sec1st << endl;
    cout << middle1 << endl;
    cout << middle2 << endl;
    cout << shared << endl;
    cout << p << endl;
    cout << e << endl;
  }

The output now becomes:

  [d:\cppintro\lesson6]staff2.exe
  CEO is :Big Boss whose manager is nobody
  The secretary of CEO is 1st secretary
  1st secretary's manager is Big Boss
  The name of middle1 is: Medium boss 1 whose secretary is shared secretary
  The manager of shared secretary is Big Boss
  Proj1 is a project leader managed by Medium boss 1
  Eng is an engineer managed by Big Boss and works on a project controlled by Proj1

  "Big Boss" is managed by nobody
  "1st secretary" is managed by "Big Boss"
  "Medium boss 1" is managed by "Big Boss"
  "Medium boss 2" is managed by "Big Boss"
  "shared secretary" is managed by "Big Boss"
  "Proj1" is managed by "Medium boss 1"
  "Eng" is managed by "Big Boss"

Can you see what happens here? As mentioned last month, "operator<<" is a function, with the syntax of an operator. This function is defined for "const Employee&" only, and it works with "Secretary" and "Manager" as well. Since "Manager" and "Secretary" publicly inherit from "Employee", they can be used as "Employee", so a reference to an "Employee" can legally refer to a "Secretary" or a "Manager."

While this is neat, it's not over by a long shot.

Virtual functions

Since the different classes hold somewhat different information (the derived classes are more specialised, so they hold more specific information,) it would be nice if we could see the differences when printing. One way of doing this is, of course, to define operator<< for all classes, but that's cheating. We'll do better than that by using object orientation, or more specifically, something called "dynamic binding" which is very central to object orientation. Say we use the template stack from part 4, and instantiate a stack of pointers to employees. A pointer to an employee can actually point to a secretary, a manager, a project leader, an engineer, or some other weird kind of employee we haven't yet defined, say a human resources person or (shudder) a marketer. Still, if we wanted to print the employees pointed to by the stack, wouldn't it be neat if we could see exactly what there was to see, for example that the employee happened to be an engineer, and allow us to see the engineer's project leader? Hold on tight now, here comes a mini example showing exactly that kind of thing:

  #include <iostream.h>

  class A
  {
  public:
    virtual void print(ostream&); //** 1
  };

  void A::print(ostream& os)
  {
    os << "A";
  }

  class B : public A
  {
  public:
    virtual void print(ostream&); //** 2
  };

  void B::print(ostream& os)
  {
    os << "B";
  }

  class C : public A
  {
  public:
    void print(ostream&); //** 3
  };

  void C::print(ostream& os)
  {
    os << "C";
  }

  class D : public A
  {                    //** 4
  };

  class E : public B
  {
  public:
    virtual void print(ostream&);
  };

  void E::print(ostream& os)
  {
    os << "E : public ";
    B::print(os); //** 5
  }

  ostream& operator<<(ostream& os, A& a)
  {
    a.print(os);
    return os;
  }

  int main()
  {
    A a;
    B b;
    C c;
    D d;
    E e;
    cout << a << endl;
    cout << b << endl;
    cout << c << endl;
    cout << d << endl;
    cout << e << endl;
    return 0;
  }

When executed, this program displays:

  [d:\cppintro\lesson6]virt.exe
  A
  B
  C
  A
  E : public B

How did this work? Let's first have a look at the marked lines in the source code. At **1, we declare the member function "A::print" as "virtual." "virtual" means that if the function is overridden by a descendant (any of the other classes in the example,) and the member function is called on an object of the descendant's class (say B,) but through a pointer or reference to the ancestor (that is A,) it's the function of the descendant (say B again) that is to be called.

At **2 this kind of overriding takes place the way I think it should be done. As can be seen at **3, the keyword "virtual" is not needed when overriding (if a member function is virtual in an ancestor, it automatically becomes virtual in the descendants.) I still think it's a good idea to have the keyword there, because it makes the intention clearer.

At **4 no function is overridden, so if "print" is called on "d", it's A::print() that's executed. It is, however, possible to inherit from D and override "print", and it would behave as in the other examples. There's no way to "unvirtualise" a member function.

At **5 the "print" of the immediate ancestor (B) is called.

With the help of the above, let's analyse the program execution.

  • "cout << a" creates a reference to "a" and calls "print" on it. Pretty straightforward.
  • "cout << b" creates a reference to "b" (but the reference is an "A&") and calls "print" on it. Since the object referenced really is a "B", and the member function "print" is virtual and overridden for class "B", it's "B::print" that's called.
  • "cout << c" creates a reference to "c" (an "A&" to "c") and calls "print" on it. The situation is the same as for "b".
  • "cout << d" does likewise, but since there is no "D::print", it's "A::print" that's called.
  • "cout << e" calls "print" for an "A&" to "e", and since class "E" overrides "print", it's that "print" that's called. It writes "E : public" and then calls the "print" of "B".
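The remark about inheriting from "D" can be tried out with a small sketch. It is written in standard C++, the class "F" and the helper "printed" are hypothetical additions that do not appear in the program above, and "A" has been given a virtual destructor for good measure.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class A {
public:
    virtual ~A() {}
    virtual void print(std::ostream& os) { os << "A"; }
};

class D : public A {      // no print of its own: A::print is inherited
};

class F : public D {      // hypothetical class, not part of the article
public:
    virtual void print(std::ostream& os) { os << "F"; }  // still virtual
};

// Calls print through an A&, just like operator<< does in the example above,
// and returns what was printed.
std::string printed(A& a)
{
    std::ostringstream os;
    a.print(os);
    return os.str();
}
```

Calling "printed" with a "D" should give "A", while calling it with an "F" should give "F", even though "D" sits between "A" and "F" and never mentions "print" at all.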

Are you ready for something mind-stretching? With the aid of the above, you hardly ever need a "switch" statement. As a matter of fact, whenever you have a "switch" statement in C++, think carefully if the problem couldn't be solved with inheritance and virtual functions instead. Usually the answer is not only yes, but it even makes for a solution that's easier to understand.
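To make the claim about "switch" a little more concrete, here is a small sketch with both variants side by side. All names in it ("Kind", "Staff", "titleBySwitch", "titleOf" and so on) are made up for this illustration and are not part of the staff example.

```cpp
#include <cassert>
#include <string>

// Variant 1: a type tag and a switch. Every new kind of staff means finding
// and editing every switch like this one.
enum Kind { ENGINEER, SECRETARY };

std::string titleBySwitch(Kind k)
{
    switch (k) {
    case ENGINEER:  return "engineer";
    case SECRETARY: return "secretary";
    }
    return "unknown";
}

// Variant 2: a virtual function. A new kind of staff is a new class, and
// titleOf below is left untouched.
class Staff {
public:
    virtual ~Staff() {}
    virtual std::string title() const { return "employee"; }
};

class EngineerStaff : public Staff {
public:
    virtual std::string title() const { return "engineer"; }
};

class SecretaryStaff : public Staff {
public:
    virtual std::string title() const { return "secretary"; }
};

std::string titleOf(const Staff& s)
{
    return s.title();     // bound at run time to the right class
}
```

Both variants give the same answers today. The difference shows the day a marketer is hired: the first variant needs a new enumerator and a visit to every "switch", while the second needs one new class.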

Note the differences between this virtual function call, or dynamic binding as it is also called, and templates. Templates generate code at compile time, fixed code, in several instances. Here there is only one "operator<<", it's not a template. It calls the virtual function, which dynamically, at run-time, is bound to a function of the object referred to.
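A tiny sketch may help in seeing the difference. The names ("Base", "Derived", "Unrelated", "viaVirtual" and "viaTemplate") are all hypothetical and chosen for this illustration only.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string who() const { return "Base"; }
};

class Derived : public Base {
public:
    virtual std::string who() const { return "Derived"; }
};

// A class with a matching member function but no relation to Base.
class Unrelated {
public:
    std::string who() const { return "Unrelated"; }
};

// Dynamic binding: one function, compiled once. Which who() runs is decided
// at run time, and only classes inheriting from Base are accepted.
std::string viaVirtual(const Base& b)
{
    return b.who();
}

// Template: the compiler generates one function per type T it is used with.
// Any type with a suitable who() is accepted, related to Base or not.
template <class T>
std::string viaTemplate(const T& t)
{
    return t.who();
}
```

Note that "viaVirtual" refuses an "Unrelated" at compile time, since an "Unrelated" is not a "Base", while "viaTemplate" happily accepts it by generating one more function.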

Now that you've seen this, it's time for a Very Important Rule. Whenever you use inheritance, make sure you *always* declare the destructor of the base class "virtual." Here's a mini example showing you why:

  #include <iostream.h>

  class A
  {
  public:
    ~A();
  };

  class AA : public A
  {
  public:
    ~AA();
  };

  class B
  {
  public:
    virtual ~B();
  };

  class BB : public B
  {
  public:
    virtual ~BB();
  };

  A::~A() { cout << "~A" << endl; }
  AA::~AA() { cout << "~AA" << endl; }
  B::~B() { cout << "~B" << endl; }
  BB::~BB() { cout << "~BB" << endl; }

  int main()
  {
    A* pa1 = new A;
    A* pa2 = new AA;
    B* pb1 = new B;
    B* pb2 = new BB;
    delete pa1;
    cout << "--" << endl;
    delete pa2;
    cout << "--" << endl;
    delete pb1;
    cout << "--" << endl;
    delete pb2;
    cout << "--" << endl;
    return 0;
  }

The execution results in:

  [d:\cppintro\lesson6]virt2
  ~A
  --
  ~A
  --
  ~B
  --
  ~BB
  ~B
  --

As you can see, the destructor for "AA" is never called. The reason is that we're dealing with pointers to the base classes only. When delete is called on a pointer, the destructor to run is chosen from the type of the pointer, unless that destructor is declared "virtual." Only a virtual destructor makes delete start with the destructor of the most derived class, as it should. (Formally, deleting a derived object through a pointer to a base class without a virtual destructor is undefined behaviour; skipping "~AA" is merely what happened here.)

The above result also gives a reason to switch to the next issue with inheritance.

Construction and Destruction

Let's revisit the old "Tracer" class from part 2. It looks like this:

  class Tracer
  {
  public:
    Tracer(const char* tracestring);
    ~Tracer(); // destructor
  private:
    const char* string;
  };

  Tracer::Tracer(const char* tracestring)
    : string(tracestring)
  {
    cout << "+ " << string << endl;
  }

  Tracer::~Tracer()
  {
    cout << "- " << string << endl;
  }

With the aid of the tracer, we can see what happens with object construction and destruction when inheritance is used. Let's go for an example right away:

  class A : public Tracer
  {
  public:
    A(const char* name1, const char* name2)
     : Tracer(name1),
       trc(name2) { cout << "A" << endl;}
    virtual ~A() { cout << "~A" << endl;}
  private:
    Tracer trc;
  };

  class B : public A
  {
  public:
    B(const char* n1, const char* n2, const char* n3)
      : A(n1, n2), trc(n3) { cout << "B" << endl;}
    virtual ~B() { cout << "~B" << endl;}
  private:
    Tracer trc;
  };

  int main(void)
  {
    cout << "creating an A" << endl;
    A a("A-ancestor", "A-component");
    {
      cout << "creating a B" << endl;
      B b("B-A-ancestor", "B-A-component", "B-component");
      cout << "destroying a B" << endl;
    }
    cout << "destroying an A" << endl;
    return 0;
  }

Execution gives me this printout:

  creating an A
  + A-ancestor
  + A-component
  A
  creating a B
  + B-A-ancestor
  + B-A-component
  A
  + B-component
  B
  destroying a B
  ~B
  - B-component
  ~A
  - B-A-component
  - B-A-ancestor
  destroying an A
  ~A
  - A-component
  - A-ancestor

An analysis shows that when creating an object, the first thing to happen is that the base class part is built: its own ancestor first, then its data members, and then the constructor body of the base class is executed. After that, the data members of the derived class are created, followed by the execution of its constructor body, and so it goes towards the most derived class. The last thing to be executed is the constructor body of the most derived class. This is out of necessity. When the constructor body for the most derived class executes, everything it might need access to (data members, as well as the inherited parts,) is already constructed and legal to use. Note an implication of this: It's not a very good idea to call virtual functions in a constructor. As a matter of fact, a virtual function called from within a constructor doesn't have its "virtuality." The call is bound to the version belonging to the class whose constructor is running, never to an override in a more derived class, since that part of the object doesn't exist yet. As usual in C++, destruction is in exactly the reverse order of construction.
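The remark about virtual functions in constructors can be checked with a small sketch. The classes "Base" and "Derived" and their member functions are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    Base()
    {
        // A virtual call from inside the constructor. At this point only the
        // Base part of the object exists, so Base::name is the one called.
        seenInCtor = name();
    }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
    const std::string& nameSeenInConstructor() const { return seenInCtor; }
private:
    std::string seenInCtor;
};

class Derived : public Base {
public:
    virtual std::string name() const { return "Derived"; }
};
```

For a "Derived" object, "nameSeenInConstructor" should return "Base", while a call to "name" after construction, even through a "Base&", should return "Derived".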

Hmm... There's a lot more to say on the topic, but I think I'll save some for next month.

Oh, OK, one last thing. WARNING!!! *Never* use public inheritance as a way of reusing code. Public inheritance models "is-a" relationships only. If you use public inheritance for the purpose of reusing code, you're creating a maintenance nightmare for yourself, as well as conceptual havoc in your design. Please, please, take note of this. It's probably the most frequently committed sin in C++ and any other object oriented programming language, and it brings you nothing but trouble. Why? Even if your intention is code-reuse only, you will in fact, whether you like it or not, get an "is-a" relationship with public inheritance. Let's say that we in the staff example defined the class "Project", and we know that all projects are named. Let's also say that to make life easy, we re-use code from the "Employee" class, by publicly inheriting from it. Now we'll be able to do amazing things with the projects! A project suddenly has a manager, and it can be passed to every function that expects an "Employee", our "operator<<" included. Hardly what we wanted. Public inheritance is for "is-a" relationships only.
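If the name handling really is worth reusing, one possible alternative is to put it in a small class of its own and let both "Employee" and "Project" have one as a data member. The sketch below assumes standard C++11 (for "static_assert"), a much simplified "Employee", and the hypothetical names "NameHolder" and "greet". It is one way of doing it, not the only one.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper holding the code worth reusing.
class NameHolder {
public:
    explicit NameHolder(const std::string& aName) : theName(aName) {}
    const std::string& get() const { return theName; }
private:
    std::string theName;
};

class Employee {
public:
    explicit Employee(const std::string& aName) : theName(aName) {}
    const std::string& name() const { return theName.get(); }
private:
    NameHolder theName;     // an employee has a name
};

class Project {
public:
    explicit Project(const std::string& aName) : theName(aName) {}
    const std::string& name() const { return theName.get(); }
private:
    NameHolder theName;     // a project has a name too, but is no employee
};

// Works for employees only.
std::string greet(const Employee& e)
{
    return "Hello, " + e.name();
}

// The compiler confirms that no "is-a" relationship sneaked in.
static_assert(!std::is_base_of<Employee, Project>::value,
              "a Project must not be an Employee");
```

With this design, trying to pass a "Project" to "greet" should be rejected by the compiler, which is exactly the protection that public inheritance would have thrown away.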

Exercises

  • What's the difference between inheritance and templates?
  • Say we state firm pre and post conditions for a virtual function. In what way, if any, may the pre and post conditions for an override in a derived class differ from those in the base class? (This truly requires some thought.)
  • Experiment with the constructor/destructor tracer and exceptions. What happens?
  • Expand the employee example such that the operator<<(ostream&, const Employee&) prints more detailed data depending on the kind of employee. You're not allowed to use templates.
  • Why is it important to declare destructors virtual?
  • In what way can dynamic binding replace switch statements?
  • The word "public" when inheriting suggests that there might be other kinds of inheritance. What might those be, and what would the difference be?
  • If we have a class A, and a class B that publicly inherits from A, an instance of A can call a member function of B. How?
  • An often heard prejudice that's totally wrong, is that virtual function calls are slow. Where do you think this idea stems from, and why is it wrong?

Recap

For being such a seemingly small topic, lots of new and fairly advanced things have been seen:

  • Public inheritance can be used to extend existing types, such that the extension can still be used just like the type being extended from.
  • Public inheritance models "is-a" relationships (and "is-a" relationships only.)
  • Dynamic binding is a way to call a function that is determined by the run-time type of the object referred (or pointed) to.
  • Dynamic binding can often replace switch statements.

Coming up

Next month we'll dive a little deeper and have a look at other kinds of inheritance.

As always, send me e-mail at once, stating your opinions, desires, questions and (of course) answers to this month's exercises!