Wade Not in Unknown Waters. Part Four

Andrey Karpov

This time we will discuss virtual inheritance in C++ and find out why one should be very careful using it. See other articles of this series: N1, N2, N3.

Initialization of Virtual Base Classes

First, let's see how classes are laid out in memory without virtual inheritance. Have a look at this code fragment:

class Base { ... };
class X : public Base { ... };
class Y : public Base { ... };
class XY : public X, public Y { ... };

It's pretty clear: members of the non-virtual base class 'Base' are laid out like ordinary data members of a derived class. As a result, the 'XY' object contains two independent 'Base' subobjects. Here is a scheme to illustrate that:

Figure 1. Multiple non-virtual inheritance.
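
The duplication can also be observed directly in code. Below is a minimal sketch, not taken from the original article: the member 'value' and the helpers 'BaseViaX', 'BaseViaY' and 'Demo' are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Stand-ins for the article's classes; the member 'value' is an assumption.
struct Base { int value = 0; };
struct X : Base {};
struct Y : Base {};
struct XY : X, Y {};

// Address of the 'Base' subobject reached through 'X'.
Base *BaseViaX(XY &obj) { return static_cast<X *>(&obj); }
// Address of the 'Base' subobject reached through 'Y'.
Base *BaseViaY(XY &obj) { return static_cast<Y *>(&obj); }

void Demo()
{
  XY obj;
  // Base *b = &obj;  // would not compile: the conversion is ambiguous
  BaseViaX(obj)->value = 1;
  BaseViaY(obj)->value = 2;
  assert(BaseViaX(obj) != BaseViaY(obj));  // two separate subobjects
  assert(BaseViaX(obj)->value == 1);       // the write through 'Y' did not touch it
  assert(BaseViaY(obj)->value == 2);
  assert(sizeof(XY) >= 2 * sizeof(Base));  // room for both copies
}
```

Calling 'Demo()' from 'main' should pass all the checks. Note that 'obj.value' on its own would be rejected by the compiler as ambiguous, because there are two candidates.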

When we deal with virtual inheritance, an object of a virtual base class is included in the object of a derived class only once. Figure 2 shows the structure of the 'XY' object in the code fragment below.

class Base { ... };
class X : public virtual Base { ... };
class Y : public virtual Base { ... };
class XY : public X, public Y { ... };

Figure 2. Multiple virtual inheritance.
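
The same check can be repeated for the virtual case. This is again only a sketch under assumptions: the member 'value' and the helper functions are hypothetical and do not come from the article.

```cpp
#include <cassert>

// Stand-ins for the article's classes; the member 'value' is an assumption.
struct Base { int value = 0; };
struct X : virtual Base {};
struct Y : virtual Base {};
struct XY : X, Y {};

// Address of the 'Base' subobject reached through 'X'.
Base *BaseViaX(XY &obj) { return static_cast<X *>(&obj); }
// Address of the 'Base' subobject reached through 'Y'.
Base *BaseViaY(XY &obj) { return static_cast<Y *>(&obj); }

void Demo()
{
  XY obj;
  Base *b = &obj;   // unambiguous now: there is only one 'Base'
  obj.value = 42;   // direct member access is unambiguous as well
  assert(BaseViaX(obj) == BaseViaY(obj));  // both paths reach the same subobject
  assert(b == BaseViaX(obj));
  assert(BaseViaY(obj)->value == 42);
}
```

Both paths lead to one and the same 'Base' subobject, so a write made through any of them is visible through the others.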

Memory for the shared subobject 'Base' is most likely to be allocated at the end of the 'XY' object. The exact layout depends on the compiler. For example, the classes 'X' and 'Y' may store pointers to the shared 'Base' object. But as far as I understand, this practice is out of use nowadays. A reference to the shared subobject is more commonly implemented as an offset or as information stored in the virtual function table.

Only the "most derived" class 'XY' knows exactly where the subobject of the virtual base class 'Base' is to be allocated. That's why it is the most derived class that is responsible for initializing all the subobjects of virtual base classes.

'XY' constructors initialize the 'Base' subobject and the pointers to it in 'X' and 'Y'. After that, the remaining members of the classes 'X', 'Y' and 'XY' are initialized.

Once the 'XY' constructor has initialized the 'Base' subobject, the 'X' and 'Y' constructors are not allowed to re-initialize it. How this is enforced depends on the compiler. For example, it can pass a special additional argument to the 'X' and 'Y' constructors to tell them not to initialize the 'Base' class.

Now for the most interesting part, which causes much confusion and a lot of mistakes. Have a look at the following constructors:

X::X(int A) : Base(A) {}
Y::Y(int A) : Base(A) {}
XY::XY() : X(3), Y(6) {}

What number will the base class's constructor take as an argument: 3 or 6? Neither!

The 'XY' constructor does initialize the virtual subobject 'Base', but it does so implicitly: 'Base' is not mentioned in the initializer list of 'XY', so the default constructor of 'Base' is called.

When the 'XY' constructor then calls the 'X' and 'Y' constructors, they don't re-initialize 'Base'. That's why the 'Base(A)' initializers in 'X' and 'Y' are ignored, and neither argument ever reaches 'Base'.
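
Here is a small sketch of this behavior together with one possible fix. The member 'value', the default constructor of 'Base', the class 'XYFixed' and the function 'Demo' are assumptions added for illustration; the article does not show them.

```cpp
#include <cassert>

struct Base {
  int value;                          // hypothetical member, used to observe the result
  Base() : value(0) {}                // default constructor, assumed for illustration
  explicit Base(int a) : value(a) {}
};
struct X : virtual Base { explicit X(int a) : Base(a) {} };
struct Y : virtual Base { explicit Y(int a) : Base(a) {} };

// As in the article: 'Base' is not mentioned, so Base() is used.
struct XY : X, Y { XY() : X(3), Y(6) {} };

// One possible fix: the most derived class names the virtual base explicitly.
struct XYFixed : X, Y { XYFixed() : Base(3), X(3), Y(6) {} };

void Demo()
{
  assert(XY().value == 0);       // neither 3 nor 6
  assert(XYFixed().value == 3);  // the argument given by the most derived class
  assert(X(3).value == 3);       // here 'X' itself is the most derived class
}
```

The last check shows that 'Base(A)' in 'X' is not dead code: it is used whenever an 'X' object is created on its own. If 'Base' had no default constructor, the 'XY' constructor from the article would simply fail to compile.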

Troubles with virtual base classes don't end here. Besides constructors, there are also assignment operators. The standard leaves it unspecified whether an assignment operator generated by the compiler assigns a subobject of a virtual base class once or more than once. So, you just don't know how many times the 'Base' object will be copied.

If you implement your own assignment operator, make sure it prevents multiple copying of the 'Base' object. The following code fragment is incorrect:

XY &XY::operator =(const XY &src)
{
  if (this != &src)
  {
    X::operator =(src);
    Y::operator =(src);
    ....
  }
  return *this;
}

This code leads to double copying of the 'Base' object. To avoid this, we should add special functions to the 'X' and 'Y' classes that copy only their own members and skip the members of 'Base'. The contents of the 'Base' class are then copied just once, directly in the assignment operator of 'XY'. This is the fixed code:

XY &XY::operator =(const XY &src)
{
  if (this != &src)
  {
    Base::operator =(src);
    X::PartialAssign(src);
    Y::PartialAssign(src);
    ....
  }
  return *this;
}

This code will work well, but it still doesn't look nice and clear. That's the reason why programmers are advised to avoid multiple virtual inheritance.
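
For completeness, here is what a full version of this scheme might look like, with the source object 'src' passed to every call. It is only a sketch: the data members 'b', 'x', 'y' and 'xy', the counter 'assignCount' and the function 'Demo' are hypothetical and serve only to make the number of 'Base' copies observable.

```cpp
#include <cassert>

struct Base {
  int b = 0;
  int assignCount = 0;  // hypothetical counter: how many times this object was assigned to
  Base &operator =(const Base &src) { b = src.b; ++assignCount; return *this; }
};

struct X : virtual Base {
  int x = 0;
  // Copies only the members declared in 'X' and leaves 'Base' alone.
  void PartialAssign(const X &src) { x = src.x; }
  X &operator =(const X &src)
  { Base::operator =(src); PartialAssign(src); return *this; }
};

struct Y : virtual Base {
  int y = 0;
  // Copies only the members declared in 'Y' and leaves 'Base' alone.
  void PartialAssign(const Y &src) { y = src.y; }
  Y &operator =(const Y &src)
  { Base::operator =(src); PartialAssign(src); return *this; }
};

struct XY : X, Y {
  int xy = 0;
  XY &operator =(const XY &src)
  {
    if (this != &src)
    {
      Base::operator =(src);   // 'Base' is copied exactly once
      X::PartialAssign(src);
      Y::PartialAssign(src);
      xy = src.xy;
    }
    return *this;
  }
};

void Demo()
{
  XY from, to;
  from.b = 1; from.x = 2; from.y = 3; from.xy = 4;
  to = from;
  assert(to.b == 1 && to.x == 2 && to.y == 3 && to.xy == 4);
  assert(to.assignCount == 1);  // a single copy of the shared 'Base'
}
```

Replacing the three calls in 'XY::operator =' with 'X::operator =(src)' and 'Y::operator =(src)' would raise the counter to 2, which is exactly the double copying described above.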

Virtual Base Classes and Type Conversion

Because of the specifics of how virtual base classes are allocated in memory, you can't perform type conversions like this one:

Base *b = Get();
XY *q = static_cast<XY *>(b); // Compilation error
XY *w = (XY *)(b); // Compilation error

A persistent programmer, though, will achieve that by employing the operator 'reinterpret_cast':

XY *e = reinterpret_cast<XY *>(b);

However, the result will hardly be of any use. The address of the beginning of the 'Base' object will be interpreted as the beginning of the 'XY' object, which is quite a different thing. See Figure 3 for details.

The only way to perform this type conversion correctly is to use the operator dynamic_cast, which requires 'Base' to be polymorphic, that is, to have at least one virtual function. But using dynamic_cast too often makes the code smell.

Figure 3. Type conversion.
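
A minimal sketch of the dynamic_cast approach is given below. It assumes that 'Base' has a virtual destructor (the article does not show the class body), and the member 'id', the helper 'ToXY' and the function 'Demo' are hypothetical names.

```cpp
#include <cassert>

struct Base { virtual ~Base() {} };  // polymorphic, as dynamic_cast requires
struct X : virtual Base {};
struct Y : virtual Base {};
struct XY : X, Y { int id = 7; };    // 'id' is an assumption for the demo

// Returns the 'XY' object that 'b' belongs to, or nullptr if there is none.
XY *ToXY(Base *b) { return dynamic_cast<XY *>(b); }

void Demo()
{
  XY obj;
  Base *b = &obj;             // derived-to-base conversion is always allowed
  assert(ToXY(b) == &obj);    // the run-time check finds the enclosing 'XY'
  assert(ToXY(b)->id == 7);

  X onlyX;                    // an 'X' that is not part of any 'XY'
  assert(ToXY(&onlyX) == nullptr);
}
```

Unlike reinterpret_cast, dynamic_cast adjusts the pointer using run-time type information and returns a null pointer when the object is not actually an 'XY', so the result should always be checked before use.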

Should We Abandon Virtual Inheritance?

I agree with many authors that one should avoid virtual inheritance, as well as ordinary multiple inheritance, whenever possible.

Virtual inheritance causes trouble with object initialization and copying. Since it is the "most derived" class which is responsible for these operations, it has to know all the intimate details of the structure of its base classes. This creates a more complex dependency between the classes, which complicates the project structure and forces you to make additional changes in all those classes during refactoring. All this leads to new bugs and makes the code less readable.

Troubles with type conversions may also be a source of bugs. You can partly solve the issues by using the dynamic_cast operator. But it has a run-time cost, and if you have to use it too often in your code, it means that your project's architecture is probably very poor. A project can almost always be structured without multiple inheritance. After all, many other languages have no such exotic features, and that doesn't prevent programmers writing code in these languages from developing large and complex projects.

We cannot insist on abandoning virtual inheritance completely: it may be useful and convenient at times. But always think twice before making a heap of complex classes. Growing a forest of small classes with shallow hierarchy is better than handling a few huge trees. For example, multiple inheritance can in most cases be replaced by object composition [2].
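
To show what replacing multiple inheritance with composition may look like, here is a short sketch. All the classes in it ('Logger', 'Counter', 'Service') are hypothetical and invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical building block: keeps a list of messages.
class Logger {
public:
  void Log(const std::string &msg) { lines_.push_back(msg); }
  std::size_t Count() const { return lines_.size(); }
private:
  std::vector<std::string> lines_;
};

// Hypothetical building block: hands out increasing numbers.
class Counter {
public:
  int Next() { return ++value_; }
private:
  int value_ = 0;
};

// Instead of 'class Service : public Logger, public Counter',
// the parts are held as ordinary data members.
class Service {
public:
  int Handle(const std::string &request)
  {
    log_.Log(request);
    return counter_.Next();
  }
  std::size_t Logged() const { return log_.Count(); }
private:
  Logger log_;
  Counter counter_;
};

void Demo()
{
  Service s;
  assert(s.Handle("first") == 1);
  assert(s.Handle("second") == 2);
  assert(s.Logged() == 2);
}
```

With this design 'Service' exposes only the operations it chooses to, and there are no shared base subobjects whose initialization or copying has to be coordinated.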

Good Sides of Multiple Inheritance

OK, we now understand and agree with the criticism of multiple virtual inheritance and multiple inheritance as such. But are there cases when it can be safe and convenient to use?

Yes, I can name at least one: mix-ins. If you don't know what they are, see the book "Enough Rope to Shoot Yourself in the Foot" [3].

A mix-in class doesn't contain any data. All its functions are usually pure virtual. It has no constructor, and even if it has one, the constructor doesn't do anything. This means that no trouble will occur when creating or copying objects of these classes.

If a base class is a mix-in class, assignment is harmless. Even if the base subobject is copied many times, it doesn't matter: there is no data to copy, so the compiler will most likely optimize these copies away.
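
Here is a sketch of a class built from two data-free mix-ins of this kind. The names 'Printable', 'Resettable', 'Tally' and 'Demo' are hypothetical, and the virtual destructors are my own addition, made so that objects can be safely deleted through a pointer to a mix-in.

```cpp
#include <cassert>
#include <string>

// Hypothetical mix-ins: no data members, only pure virtual functions.
struct Printable {
  virtual ~Printable() {}
  virtual std::string Print() const = 0;
};
struct Resettable {
  virtual ~Resettable() {}
  virtual void Reset() = 0;
};

// All the state lives in the concrete class.
class Tally : public Printable, public Resettable {
public:
  void Add(int n) { value_ += n; }
  std::string Print() const override { return std::to_string(value_); }
  void Reset() override { value_ = 0; }
private:
  int value_ = 0;
};

void Demo()
{
  Tally a;
  a.Add(5);
  Tally b;
  b = a;                      // the mix-in parts have nothing to copy
  assert(b.Print() == "5");
  b.Reset();
  assert(b.Print() == "0");
  assert(a.Print() == "5");   // the source object is unaffected
}
```

The compiler-generated assignment operator is sufficient here: only 'value_' carries state, so it makes no difference how the mix-in subobjects are assigned.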

References

  1. Stephen C. Dewhurst. "C++ Gotchas: Avoiding Common Problems in Coding and Design". Addison-Wesley Professional. 352 pages; illustrations. ISBN-13: 978-0321125187. (See gotchas 45 and 53.)
  2. Wikipedia. Object composition.
  3. Allen I. Holub. "Enough Rope to Shoot Yourself in the Foot". (You can easily find it on the Internet. Start reading at section 101 and further on.)