Sponsored Link •
Bill Venners: C++ culture is concerned with efficiency. Is there a lot of premature optimization going on? And how do we know the difference between early optimization that's premature versus early optimization that's prudent?
Bjarne Stroustrup: Some parts of the C++ community are concerned with efficiency. Some of them, I think, are concerned for good reasons, others just because they don't know any better. They have a fear of inefficiency that's not quite appropriate. But certainly there's an efficiency concern, and I think there are two ways of looking at it. The way I would look at efficiency is this: I would like to know that my abstractions could map in a reasonable way to the machine, and I would like to have abstractions that I can understand.
If I want to do linear algebra, I want a matrix class. If I want to do graphics, I want a graphics class. If I want to do string manipulation, I want a string class. The first thing I do is raise the level of abstraction to a suitable level. I'm using these fairly simple examples, because they're the most common and the easiest to talk about. The next thing I look out for is not to have an N2 or N3 algorithm where I don't need it. I don't go to the web for information if I have the information locally. I don't go to the disk if I have a cached version in memory. I've seen people using modeling tools that ended up writing to the disk twice to write two fields into a record. Avoid such algorithms. I think this is prudent up front design-level optimization, which is the kind of thing you should be concerned with.
Now, once you have a reasonably modeled world, with reasonably high level of
abstraction, you start optimizing, and that sort of late optimization is
reasonable. What I don't like is when people, who out of fear of high level features
and fear of abstraction, start using a very restricted subset of the language or
avoid good libraries in favor of their own hand-crafted code.
They deal with bytes where they could just as well deal with objects. They deal
with arrays because they fear that a
vector or a map class will be too expensive
for them. Then, they end up writing more code, code that can't be
understood later. That's a problem, because in any big system you'll have to
analyze it later and figure out where you got it wrong.
You also try to have higher abstractions so you can measure something concrete. If you use a map, you may find that it's too expensive. That's quite possible. If you have a map with a million elements, there's a good chance it could be slow. It's a red black tree. In many cases, you can replace a map with a hashtable if you need to optimize. If you only have 100 elements, it won't make any difference. But with a million elements, it can make a big difference.
Now, if you've hacked at all at the lowest level, even once, you won't really know what you have. Maybe you knew your data structure was a map, but more likely it was an ad hoc map-like data structure. Once you realize that the ad hoc data structure didn't behave correctly, how do you know which one you can replace it with? You're working at such a low level that it's hard to get ideas. And then finally, if you've written an ad hoc data structure, you may have operations scattered all over your program. That's not uncommon with a random data structure. There's not a fixed set of operations you use to manipulate it, sometimes data is access directly from user code "for efficiency". In that case, your profiler won't show you where the bottleneck is, because you have scattered the code across the program. Conceptually the bottleneck belongs to something, but you didn't have the concept, or you didn't represent the concept directly. Your tools therefore cannot show you that this concept is what caused your problem. If something isn't in the code directly, no tool can tell you about that something by its proper name.
Come back Monday, February 23 for the next installment of this conversation with Bjarne Stroustrup. If you'd like to receive a brief weekly email announcing new articles at Artima.com, please subscribe to the Artima Newsletter.
Bjarne Stroustrup is author of The C++ Programming Language, which
is available on Amazon.com at:
Bjarne Stroustrup is author of The Design and Evolution of C++, which
is available on Amazon.com at:
Bjarne Stroustrup's home page:
Bjarne Stroustrup's page about the C++ Programming Language:
Preface to Third Edition where Bjarne talks about Programming is Understanding:
Publications by Bjarne Stroustrup:
Interviews with Bjarne Stroustrup:
Bjarne Stroustrup's FAQ:
Bjarne Stroustrup's C++ Style and Technique FAQ:
Bjarne Stroustrup's C++ Glossary:
Libsigc++ Callback Framework for C++:
C++ Boost, peer-reviewed portable C++ source libraries:
Al Stevens' review of The C++ Programming Language, by Bjarne Stroustrup: