|
|
|
Advertisement
|
Bill Venners: C++ culture is concerned with efficiency. Is there a lot of premature optimization going on? And how do we know the difference between early optimization that's premature versus early optimization that's prudent?
Bjarne Stroustrup: Some parts of the C++ community are concerned with efficiency. Some of them, I think, are concerned for good reasons, others just because they don't know any better. They have a fear of inefficiency that's not quite appropriate. But certainly there's an efficiency concern, and I think there are two ways of looking at it. The way I would look at efficiency is this: I would like to know that my abstractions could map in a reasonable way to the machine, and I would like to have abstractions that I can understand.
If I want to do linear algebra, I want a matrix class. If I want to do graphics, I want a graphics class. If I want to do string manipulation, I want a string class. The first thing I do is raise the level of abstraction to a suitable level. I'm using these fairly simple examples, because they're the most common and the easiest to talk about. The next thing I look out for is not to have an N2 or N3 algorithm where I don't need it. I don't go to the web for information if I have the information locally. I don't go to the disk if I have a cached version in memory. I've seen people using modeling tools that ended up writing to the disk twice to write two fields into a record. Avoid such algorithms. I think this is prudent up front design-level optimization, which is the kind of thing you should be concerned with.
Now, once you have a reasonably modeled world, with reasonably high level of
abstraction, you start optimizing, and that sort of late optimization is
reasonable. What I don't like is when people, who out of fear of high level features
and fear of abstraction, start using a very restricted subset of the language or
avoid good libraries in favor of their own hand-crafted code.
They deal with bytes where they could just as well deal with objects. They deal
with arrays because they fear that a vector or a map class will be too expensive
for them. Then, they end up writing more code, code that can't be
understood later. That's a problem, because in any big system you'll have to
analyze it later and figure out where you got it wrong.
You also try to have higher abstractions so you can measure something concrete. If you use a map, you may find that it's too expensive. That's quite possible. If you have a map with a million elements, there's a good chance it could be slow. It's a red black tree. In many cases, you can replace a map with a hashtable if you need to optimize. If you only have 100 elements, it won't make any difference. But with a million elements, it can make a big difference.
Now, if you've hacked at all at the lowest level, even once, you won't
really know what you have. Maybe you knew your data structure was a map, but more likely it was
an ad hoc map-like data structure. Once you realize that the ad hoc data
structure didn't behave correctly, how do you know which one you can replace it
with? You're working at such a low level that it's hard to get ideas. And then
finally, if you've written an ad hoc data structure, you may have operations
scattered all over your program. That's not uncommon with a random data
structure. There's not a fixed set of operations you use to manipulate it, sometimes data is access directly from user code "for efficiency". In
that case, your profiler won't show you where the bottleneck is, because you
have scattered the code across the program. Conceptually the bottleneck belongs
to something, but you didn't have the concept, or you didn't represent the
concept directly. Your tools therefore cannot show you that this concept is
what caused your problem. If something isn't in the code directly, no tool can
tell you about that something by its proper name.
Come back Monday, February 23 for the next installment of this conversation with Bjarne Stroustrup. If you'd like to receive a brief weekly email announcing new articles at Artima.com, please subscribe to the Artima Newsletter.
Have an opinion about the programming techniques presented in this article?
Discuss this article in the Articles Forum topic,
Abstraction and Efficiency.
Resources
Bjarne Stroustrup is author of The C++ Programming Language, which
is available on Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/0201700735/
Bjarne Stroustrup is author of The Design and Evolution of C++, which
is available on Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/0201543303/
Bjarne Stroustrup's home page:
http://www.research.att.com/~bs/homepage.html
Bjarne Stroustrup's page about the C++ Programming Language:
http://www.research.att.com/~bs/C++.html
Preface to Third Edition where Bjarne talks about Programming is Understanding:
http://www.research.att.com/~bs/3rd_pref.html
Publications by Bjarne Stroustrup:
http://www.research.att.com/~bs/papers.html
Interviews with Bjarne Stroustrup:
http://www.research.att.com/~bs/interviews.html
Bjarne Stroustrup's FAQ:
http://www.research.att.com/~bs/bs_faq.html
Bjarne Stroustrup's C++ Style and Technique FAQ:
http://www.research.att.com/~bs/bs_faq2.html
Bjarne Stroustrup's C++ Glossary:
http://www.research.att.com/~bs/glossary.html
Libsigc++ Callback Framework for C++:
http://libsigc.sourceforge.net/
C++ Boost, peer-reviewed portable C++ source libraries:
http://www.boost.org/
Boost.Threads:
http://www.boost.org/libs/thread/doc/
Al Stevens' review of The C++ Programming Language, by Bjarne Stroustrup:
http://www.ercb.com/feature/feature.0021.html
|
Sponsored Links
|