Bjarne Stroustrup talks with Bill Venners about using multiple inheritance and pure abstract classes, multi-paradigm programming, and the technique of resource acquisition is initialization.
Bjarne Stroustrup is the designer and original implementer of C++. He is the author of numerous papers and several books, including The C++ Programming Language (Addison-Wesley, 1985-2000) and The Design and Evolution of C++ (Addison-Wesley, 1994). He took an active role in the creation of the ANSI/ISO standard for C++ and continues to work on the maintenance and revision of that standard. He is currently the College of Engineering Chair in Computer Science Professor at Texas A&M University.
On September 22, 2003, Bill Venners met with Bjarne Stroustrup at the JAOO conference in Aarhus, Denmark. In this interview, which is being published in multiple installments on Artima.com, Stroustrup gives insights into C++ best practice.
Bill Venners: I programmed almost exclusively in C++ for about five years, 1991 to 1996. In those days, I thought the sole purpose of multiple inheritance was to let me inherit data and functions, both virtual and implemented, from multiple superclasses. I never imagined using what's now called an interface in Java, an abstract class that has no data and only pure virtual functions. I just never thought about using multiple inheritance that way, and neither did the C++ programmers with whom I worked. These days you seem to be recommending abstract classes a lot. Was the usefulness of multiply inheriting pure interfaces something that was discovered with experience, or did we just somehow miss the message about abstract classes?
Bjarne Stroustrup: I had a lot of problems explaining that to people and never quite understood why it was hard to understand. From the first days of C++, there were classes with data and classes without data. The emphasis in the old days was building up from a root with stuff in it, but there were always abstract base classes. In the mid to late eighties, they were commonly called ABCs (Abstract Base Classes): classes that consisted only of virtual functions. In 1987, I supported pure interfaces directly in C++ by saying a class is abstract if it has a pure virtual function, which is a function that must be overridden. Since then I have consistently pointed out that one of the major ways of writing classes in C++ is without any state, that is, just an interface.
From a C++ view there's no difference between an abstract class and an interface. Sometimes we use the phrase "pure abstract class," meaning a class that exclusively has pure virtual functions (and no data). It is the most common kind of abstract class. When I tried to explain this I found I couldn't effectively get the idea across until I introduced direct language support in the form of pure virtual functions. Since people could put data in the base classes, they sort of felt obliged to do so. People built the classic brittle base classes and got the classic brittle base class problems, and I couldn't understand why people were doing it. When I tried to teach the idea with abstract base classes directly supported in C++, I had more luck, but many people still didn't get it. I think it was a major failure in education on my part. I didn't imagine the problem well. That actually matches some of the early failures of the Simula community to get crucial new ideas across. Some new ideas are hard to get across, and part of the problem is a lot of people don't want to learn something genuinely new. They think they know the answer. And once we think we know the answer, it's very hard to learn something new. Abstract classes were described, with several examples, in The C++ Programming Language, Second Edition, in 1991, but unfortunately not used systematically throughout the book.
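As a minimal sketch of what a "pure abstract class" looks like in code (the names Shape, Square, and describe are illustrative assumptions, not taken from the interview, and the defaulted virtual destructor is a common convention rather than something Stroustrup mentions here):

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A pure abstract class: no data members, only pure virtual functions
// (plus a virtual destructor so objects can be deleted through the interface).
// Shape is a hypothetical name chosen for this sketch.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;        // "= 0" marks the function pure virtual
    virtual std::string name() const = 0;   // every concrete class must override it
};

// The concrete class holds the state and overrides each pure virtual function.
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double area() const override { return side_ * side_; }
    std::string name() const override { return "square"; }
private:
    double side_;
};

// Code written against the interface never depends on the implementation's data.
std::string describe(const Shape& s) {
    return s.name() + " with area " + std::to_string(s.area());
}

// The compiler enforces the rule: a class with a pure virtual function is abstract.
static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");
static_assert(!std::is_abstract<Square>::value, "Square can be instantiated");
```

Because Shape carries no state, changing how Square stores its side does not affect any code that only uses Shape, which is one way to sidestep the brittle base class problem described above.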
Bill Venners: What is the advantage of using pure abstract classes? What is an appropriate situation to use them versus the more general case of multiple inheritance?
Bjarne Stroustrup: The obvious good case is multiple interfaces, single implementation. That's a very common one. For example, your system may have a notion of persistence. It may also have a notion of iteration. Both persistence and iteration can be provided by an interface in the form of an abstract class. Now, if you want to provide a persistent container, you can derive it from both the persistence abstract class and the abstract class providing iteration services. That's the form of multiple inheritance that Java and C# have adopted.
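The persistent-container case might be sketched as follows. The interface names Persistent and Iterable, the class PersistentIntList, and the comma-separated save format are all assumptions made for illustration; the interview does not specify any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical persistence interface: no data, only pure virtual functions.
class Persistent {
public:
    virtual ~Persistent() = default;
    virtual std::string save() const = 0;
};

// Hypothetical iteration interface.
class Iterable {
public:
    virtual ~Iterable() = default;
    virtual std::size_t size() const = 0;
    virtual int at(std::size_t i) const = 0;
};

// Multiple interfaces, single implementation: all state lives here.
class PersistentIntList : public Persistent, public Iterable {
public:
    void add(int v) { items_.push_back(v); }

    // Serializes the elements as comma-separated text (format is an assumption).
    std::string save() const override {
        std::ostringstream out;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            if (i != 0) out << ',';
            out << items_[i];
        }
        return out.str();
    }

    std::size_t size() const override { return items_.size(); }
    int at(std::size_t i) const override {
        assert(i < items_.size());
        return items_[i];
    }
private:
    std::vector<int> items_;
};

// Each client depends on only the interface it needs.
int sum(const Iterable& seq) {
    int total = 0;
    for (std::size_t i = 0; i < seq.size(); ++i) total += seq.at(i);
    return total;
}

std::string store(const Persistent& p) { return p.save(); }
```

The same object can be handed to sum as an Iterable and to store as a Persistent, and neither function knows the other interface exists.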
Another common use of multiple inheritance is when you just want to combine a couple of classes that you happen to have. They don't have complicated semantics, so here the use of multiple inheritance is just a convenience. You could instead use delegation, that is, you could have a member that's a pointer to the real object and provide functions that indirect to that object. That is fine, but each time you add a new function to the class you indirect to, you have to add a new function to the class that indirects. That's a pain in the neck, an indirect way of expressing the idea, and a maintenance hazard. The final case is where you need to inherit state from two classes. If those classes are complicated or if the semantics of the two classes interact, you can easily get a mess. But you try to minimize such cases. You try not to overuse inheritance. When you do use inheritance, you try not to overuse multiple inheritance. And when you do use multiple inheritance, you try to avoid the more complicated variations. Always, you try to model the problem as directly and simply as possible, but no simpler.
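To make the trade-off concrete, here is a small sketch that combines two simple classes both ways. Named, Counter, NamedCounter, and DelegatingNamedCounter are hypothetical classes invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two small, independent classes "that you happen to have" (hypothetical).
class Named {
public:
    explicit Named(std::string n) : name_(std::move(n)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

class Counter {
public:
    void increment() { ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

// Multiple inheritance as a convenience: every public member of both
// bases is available without writing anything further.
class NamedCounter : public Named, public Counter {
public:
    explicit NamedCounter(std::string n) : Named(std::move(n)) {}
};

// Delegation: hold pointers to the real objects and forward each call.
// Every function added to Named or Counter later needs a new forwarder here.
class DelegatingNamedCounter {
public:
    DelegatingNamedCounter(Named* named, Counter* counter)
        : named_(named), counter_(counter) {
        assert(named != nullptr && counter != nullptr);
    }
    const std::string& name() const { return named_->name(); }
    void increment() { counter_->increment(); }
    int count() const { return counter_->count(); }
private:
    Named* named_;      // not owned by this class
    Counter* counter_;  // not owned by this class
};
```

Both versions behave the same from the caller's side; the difference is the three hand-written forwarding functions, which are the maintenance cost the paragraph above describes.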
People quite correctly say that you don't need multiple inheritance, because anything you can do with multiple inheritance you can also do with single inheritance. You just use the delegation trick I mentioned. Furthermore, you don't need any inheritance at all, because anything you do with single inheritance you can also do without inheritance by forwarding through a class. Actually, you don't need any classes either, because you can do it all with pointers and data structures. But why would you want to do that? When is it convenient to use the language facilities? When would you prefer a workaround? I've seen cases where multiple inheritance is useful, and I've even seen cases where quite complicated multiple inheritance is useful. Generally, I prefer to use the facilities offered by the language to doing workarounds.
One of the ways we deal with the more complicated cases is composition using templates. You have templates that take several parameters. These parameters tend to be totally independent classes: they implement abstractions that you can combine, but they don't depend on each other. Only the deriving class depends on all of them. Sometimes it's convenient to do the composition inside a template in terms of inheritance, and sometimes it isn't (so you use membership or pointers to separate objects). Here's an example where you sometimes inherit state from multiple classes. You can get an allocator object that knows how to deal with memory. You can get an accessor object that knows how to access memory once it's given. Then you use those as pieces to implement, say, a multiplication function of a matrix class. You have state from (at least) two sources, but there are no problems of the kind that people who worry about multiple inheritance tend to talk about. Basically, the situations that work are the ones you can explain in fairly simple terms.
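One possible reading of the allocator/accessor example is sketched below. HeapStorage, RowMajor, Matrix, and multiply are names invented here, and the choice of private inheritance and row-major layout is an assumption; the interview only describes the idea in words.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "allocator" piece: knows how to obtain memory. Its state is the storage.
class HeapStorage {
public:
    explicit HeapStorage(std::size_t n) : cells_(n, 0.0) {}
    double* data() { return cells_.data(); }
    const double* data() const { return cells_.data(); }
private:
    std::vector<double> cells_;
};

// Hypothetical "accessor" piece: knows how to index memory once it is given.
// Its state is the dimensions. It knows nothing about HeapStorage.
class RowMajor {
public:
    RowMajor(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {}
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    std::size_t offset(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return r * cols_ + c;
    }
private:
    std::size_t rows_;
    std::size_t cols_;
};

// Only the deriving class depends on both pieces; it inherits state from each.
template <class Storage, class Access>
class Matrix : private Storage, private Access {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : Storage(rows * cols), Access(rows, cols) {}
    using Access::rows;
    using Access::cols;
    double& operator()(std::size_t r, std::size_t c) {
        return this->data()[this->offset(r, c)];
    }
    double operator()(std::size_t r, std::size_t c) const {
        return this->data()[this->offset(r, c)];
    }
};

// Multiplication is written purely in terms of the composed class.
template <class S, class A>
Matrix<S, A> multiply(const Matrix<S, A>& a, const Matrix<S, A>& b) {
    assert(a.cols() == b.rows());
    Matrix<S, A> result(a.rows(), b.cols());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t j = 0; j < b.cols(); ++j)
            for (std::size_t k = 0; k < a.cols(); ++k)
                result(i, j) += a(i, k) * b(k, j);
    return result;
}
```

Swapping in a different storage or layout class changes one template argument and leaves multiply untouched, which is why the two state-carrying bases do not interfere with each other.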
Bill Venners: Another thing I don't remember hearing about when I was programming in C++ is multi-paradigm programming: programming using multiple styles. You seem to be talking about this a lot recently. What programming styles does C++ support, and what is the benefit of combining multiple styles in the same program?
Bjarne Stroustrup: Multi-paradigm programming is not a new thing. However, it is a relatively new way of talking about things. Just as I was basically unsuccessful teaching people about classes used as interfaces as opposed to stateful objects, I had some trouble explaining how to use different paradigms. But if you look in my first book on C++, it says: we support traditional C style programming, and we do it better than C; we support data abstraction; and we support object-oriented programming. Data abstraction was basically the way you wrote code in Ada and a whole slurry of other languages. Data abstraction is very good for dealing with high performance numerical problems where you have lots of relatively standard concepts, such as complex number, vector, and upper-diagonal matrix, that must be efficiently implemented to be useful and that can be represented as relatively independent classes.
Bill Venners: Data abstraction is free-standing classes without inheritance?
Bjarne Stroustrup: Basically, yes. Object-oriented programming is where you use class hierarchies, as first done by Simula. When you look at my earlier writings, there was usually a section under data abstraction that said, "and we need to parameterize our containers with the element type and do operations on parameterized containers." This is what later grew to be generic programming. Then object-oriented programming came along, and most people's emphasis and attention shifted to class hierarchies. In the C++ world, generic programming slowly emerged from data abstraction over the late '80s and '90s. Generic programming is now so prominent, and we know so much more about it, that I describe it separately.
Bill Venners: And generic programming is when I write code to a generic type T that gets plugged in later?
Bjarne Stroustrup: Yes. When you say, "template type T," that is really the old mathematical, "for all T." That's the way it's considered. My very first paper on "C with Classes" (that evolved into C++) from 1981 mentioned parameterized types. There, I got the problem right, but I got the solution totally wrong. I explained how you can parameterize types with macros, and boy that was lousy code. Fortunately, it's more important to have the right problem than to have the right solution, because at least when you have the right problem you can eventually solve it. We understand how to use parameterization so much better now than I did then. I mean, I could only glimpse part of the solution and part of the importance of the problem. Fortunately, I glimpsed enough. Today, we casually do things that would have been almost impossible in C++ in the eighties or nineties, before the generic style was first directly supported in C++ by templates.
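The contrast between macro-based and template-based parameterization can be sketched roughly as follows. This is not a reconstruction of the code in the 1981 paper; DEFINE_CLAMP and clamp_to are hypothetical names chosen only to show the two styles side by side.

```cpp
#include <cassert>
#include <string>

// Macro-based parameterization: each type needs its own explicit expansion,
// and the generated name is built by token pasting. A type spelled
// std::string cannot be used directly, because clamp_std::string is not
// a valid identifier.
#define DEFINE_CLAMP(T) \
    inline T clamp_##T(T v, T lo, T hi) { return v < lo ? lo : (hi < v ? hi : v); }

DEFINE_CLAMP(int)
DEFINE_CLAMP(double)

// Template-based parameterization: one definition "for all T" that supports
// operator<. The compiler instantiates it on demand and type-checks each use.
template <class T>
const T& clamp_to(const T& v, const T& lo, const T& hi) {
    assert(!(hi < lo));  // precondition: the range must not be inverted
    return v < lo ? lo : (hi < v ? hi : v);
}
```

With the template, clamp_to works for int, double, std::string, or any user-defined type with operator<, with no per-type boilerplate.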
So the notion of multi-paradigm programming was there from the beginning. This is why I say, "C++ supports object-oriented programming," usually adding "and it supports it better than some languages." I don't say, "C++ is an object-oriented programming language." I never had the idea that there was just one right way of writing code. From the very beginning, the notion of using different styles or paradigms was there. I usually list C-style programming, data abstraction, object-oriented programming, and generic programming as styles directly supported by C++. And from the very beginning, there were examples that used combinations of the styles. I'm just emphasizing multi-paradigm programming more strongly now. I think I'm better at teaching it. I'm better at getting the ideas across or maybe the community has simply matured to the point where it's easier to explain the need for multiple styles. However, there is still much to get across to the C++ community, especially in the area of how the different styles can be used in combination to create the best, most efficient, most maintainable code.
Apropos to this, I had a nice experience reading a review of the third edition of my book, The C++ Programming Language, a few years ago. A reviewer, I think it was Al Stevens, said the third edition read much better than the original edition. However, to check, he went back and read the original to see if it really was as bad as he remembered it to be. His conclusion was that the first edition wasn't that bad. He thought now that it was very clear. But when the first edition came out, he had written a review that said it was almost incomprehensible. Ideas mature. And the community matures partly through reasonably pioneering works that are hard to understand at the time. If you go back to the roots of C++, you will find things considered very hard that are now considered obvious, and you have trouble understanding why people had problems with them. I don't quite understand why I couldn't teach those ideas, but I too have learned a lot since then.
Bill Venners: Another technique I never heard about when I was programming C++ is "resource acquisition is initialization." Could you explain that technique with respect to memory management, resource management, and exception safety?
Bjarne Stroustrup: If I create 10,000 objects and have pointers to them, I need to delete those 10,000 objects, not 9,999, and not 10,001. I don't know how to do that. If I have to handle the 10,000 objects directly, I'm going to screw up. This is what I meant when I said earlier [in Part I] that if you use new the way you used malloc, you're going to get in trouble. So, quite a long time ago I thought, "Well, but I can handle a low number of objects correctly." If I have a hundred objects to deal with, I can be pretty sure I have correctly handled 100 and not 99. If I can get the number down to 10 objects, I start getting happy. I know how to make sure that I have correctly handled 10 and not just 9.
For example, a container is a systematic way of dealing with objects. Some of the things you can have in a container are pointers to other objects, but the container's constructors and destructor can take care of contained objects. The name of the game here is to get allocation out of the way so you don't see it. If you don't directly allocate something, you don't directly deallocate it, because whoever owns it deals with it. The notion of ownership is central. A container may own its objects because they are stored directly. A container of pointers either owns the pointed-to objects or it doesn't. If it contains 1000 pointers, there's only one decision to make. Does it own them, or doesn't it? That's a 1000 to 1 reduction in complexity. As you apply this technique recursively and in as many places as you can, allocation and deallocation disappear from the surface level of your code.
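A small sketch of the "one ownership decision" idea follows. It uses std::unique_ptr, which postdates this 2003 interview, as a modern stand-in for a container that owns the objects its pointers refer to; Tracked, OwningPool, and live_inside_scope are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type that counts how many instances are alive,
// so the effect of ownership can be observed.
class Tracked {
public:
    Tracked() { ++live_; }
    Tracked(const Tracked&) { ++live_; }
    ~Tracked() { --live_; }
    static int live() { return live_; }
private:
    static int live_;
};
int Tracked::live_ = 0;

// A container of pointers that owns the pointed-to objects.
// The ownership decision is made once, here, not once per element.
class OwningPool {
public:
    Tracked& create() {
        items_.push_back(std::unique_ptr<Tracked>(new Tracked));
        return *items_.back();
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<std::unique_ptr<Tracked>> items_;
};

// Creates n objects and reports how many are alive while the pool exists.
// No delete appears anywhere: the pool's destructor releases every object
// when the function returns.
int live_inside_scope(int n) {
    assert(n >= 0);
    OwningPool pool;
    for (int i = 0; i < n; ++i) pool.create();
    return Tracked::live();
}
```

Calling live_inside_scope(10000) creates 10,000 objects and destroys exactly 10,000, without the caller ever counting them.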
The next thing you notice is that this is a general notion of a resource. How do you manage a file? In the old way you have a file pointer. You initialize the file pointer by doing open, and you have to remember to do a close. Well, as I just said you shouldn't have naked pointers with allocations sitting around, and an open operation really is an allocation of a file handle. So instead, we create a resource object for a file, a file handle. Initialization of a file handle opens the file. If the constructor successfully opens the file, the destructor will eventually close it again. So "resource acquisition is initialization" is a consequence of the view of hiding allocation and deallocation by dealing with it in constructors and destructors of classes. The "resource acquisition is initialization" technique, sometimes called RAII, is a clumsy name for a central concept. The "resource acquisition is initialization" technique also happens to be necessary for exception handling, because a major part of exception handling is keeping your program in a reasonable state. That means not leaking resources or having broken invariants, so it fundamentally involves the same resource management problem. Again, the main tool for resource management is constructors and destructors.
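The file handle described above could look something like this minimal sketch built on the C standard library's fopen and fclose. The class name FileHandle and the helpers write_line and read_line are assumptions for illustration, not code from the interview.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Resource object for a file: the constructor acquires, the destructor releases.
class FileHandle {
public:
    FileHandle(const std::string& path, const char* mode)
        : file_(nullptr) {
        assert(mode != nullptr);
        file_ = std::fopen(path.c_str(), mode);
        // If the open fails, no object is created, so no destructor will run.
        if (!file_) throw std::runtime_error("cannot open " + path);
    }
    ~FileHandle() { std::fclose(file_); }   // runs on normal exit and during unwinding

    // Copying would lead to closing the same file twice, so it is disabled.
    FileHandle(const FileHandle&) = delete;
    FileHandle& operator=(const FileHandle&) = delete;

    std::FILE* get() const { return file_; }
private:
    std::FILE* file_;
};

// No explicit close: the file is closed when 'out' goes out of scope.
void write_line(const std::string& path, const std::string& text) {
    FileHandle out(path, "w");
    std::fputs(text.c_str(), out.get());
    std::fputc('\n', out.get());
}

// Reads the first line of the file, without the trailing newline.
std::string read_line(const std::string& path) {
    FileHandle in(path, "r");
    char buffer[256];
    if (!std::fgets(buffer, sizeof buffer, in.get())) return "";
    std::string line(buffer);
    if (!line.empty() && line.back() == '\n') line.pop_back();
    return line;
}
```

Every exit path from write_line and read_line, including one taken because of an exception, closes the file, because closing is tied to the lifetime of the FileHandle object.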
Bill Venners: So exception safety means that if an exception flies up through my class, I clean up, close any open resources that need to be closed on the way out, and ensure my invariants are still true on the way out?
Bjarne Stroustrup: That's right. That's basically it. There's almost a whole theory about this, but people can read about that in Appendix E of The C++ Programming Language, Third Edition. If they have a third edition without an Appendix E, they should upgrade to a modern version. And if they're really stone broke and can't afford to upgrade, they can download Appendix E from my homepages. But basically, if your C++ book doesn't have a section on exception safety, it's time to upgrade.
Exceptions signal that something bad (or at least unanticipated) has happened, and you want somebody to help you out of the mess. For that to work, before you get out by throwing an exception, you have to make sure you've cleaned up your local mess. That is, you don't just allocate an object on the heap and then throw an exception, because that would "leak" that object. If you allocated it, you must either delete it or transfer its ownership to someone else before throwing the exception. The exception then goes up the call chain, and at each level the function has to make sure that it releases any resources it has acquired. If you don't use "resource acquisition is initialization", you have to write a try block, you have to provide a catch-everything clause, you have to do whatever cleanup has to be done and then rethrow. It's like writing finally blocks in Java. If you forget to write a finally block you have a bug. You have to get it right every single time throughout the code, and I don't think that's likely. It's a bug source. The simple and manageable way to ensure exception safety is to use "resource acquisition is initialization".
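The two approaches can be compared in a short sketch. Buffer, risky, manual_cleanup, raii_cleanup, and leaked_after are hypothetical names, and std::unique_ptr (standardized after this interview) stands in for any resource-owning class.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical heap-allocated resource that counts live instances.
struct Buffer {
    Buffer() { ++live; }
    ~Buffer() { --live; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    static int live;
};
int Buffer::live = 0;

// Stands in for any operation that may throw.
void risky(bool fail) {
    if (fail) throw std::runtime_error("something bad happened");
}

// Without RAII: a try block, a catch-everything clause, cleanup, and a rethrow.
// The delete has to be written twice, and forgetting either one is a bug.
void manual_cleanup(bool fail) {
    Buffer* b = new Buffer;
    try {
        risky(fail);
    } catch (...) {
        delete b;   // clean up the local mess before passing the exception on
        throw;      // rethrow to the caller
    }
    delete b;       // the normal path needs its own cleanup
}

// With RAII: the owner's destructor runs on both the normal and exceptional path.
void raii_cleanup(bool fail) {
    std::unique_ptr<Buffer> b(new Buffer);
    risky(fail);
}

// Runs 'work', swallows the expected exception, and reports leaked Buffers.
int leaked_after(void (*work)(bool), bool fail) {
    assert(work != nullptr);
    try {
        work(fail);
    } catch (const std::runtime_error&) {
        // The exception has propagated; cleanup should already have happened.
    }
    return Buffer::live;
}
```

Both functions are leak-free as written, but raii_cleanup stays correct if more throwing calls are added to its body, while manual_cleanup relies on every new path being covered by hand.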
Bjarne Stroustrup is author of The C++ Programming Language, which is available on Amazon.com at:
Bjarne Stroustrup is author of The Design and Evolution of C++, which is available on Amazon.com at:
Bjarne Stroustrup's home page:
Bjarne Stroustrup's page about the C++ Programming Language:
Publications by Bjarne Stroustrup:
Interviews with Bjarne Stroustrup:
Bjarne Stroustrup's FAQ:
Bjarne Stroustrup's C++ Style and Technique FAQ:
Bjarne Stroustrup's C++ Glossary:
Libsigc++ Callback Framework for C++:
C++ Boost, peer-reviewed portable C++ source libraries:
Al Stevens' review of The C++ Programming Language, by Bjarne Stroustrup: