The impact of the Pareto principle in optimization
by Jim Xochellis, December 30, 2009
Preface

Although the Pareto principle is frequently mentioned in software optimization discussions, [2] the way this principle affects the optimization process is usually left obscure. Hence, I considered it worthwhile to devote this brief discussion to the impact of the Pareto principle on software optimization.

Basic theory and practice

The Pareto principle [1] states that, in general, roughly 80% of the effects come from 20% of the causes, and is hence also known as the 80-20 rule. Applied to software optimization, [2] this principle suggests that 80% of the resources are typically consumed by 20% of the operations. More specifically, regarding the execution speed of a software entity, the Pareto principle suggests that 80% of the execution time is usually spent executing no more than 20% of the code.

The validity of the Pareto principle in software optimization is practically indisputable. Anyone with even limited experience in optimization will acknowledge that a very small percentage of the overall code is almost always responsible for most of the consumption of system resources. The Pareto principle applies so well to speed optimization that there are even cases in which almost 90% of the execution time is spent executing only 10% of the code (the 90-10 rule). Furthermore, as we will see in the following paragraphs, some well-known optimization rules and practices, which are frequently mentioned in related discussions, are in fact consequences of this important principle.

The positive impact of the Pareto principle

The obvious outcome of the Pareto principle is that not all parts of the implementation code of a typical software entity are equally responsible for the consumption of system resources; only a small portion (10%-20%) of the overall code is actually performance critical. However, the full power of this principle cannot be appreciated unless we study its three most notable consequences, which are frequently regarded as optimization rules and practices [5,6] (see also figure 1):

  • It is good practice to profile [3] before optimizing. [6] According to the Pareto principle, most of the implementation code is usually more or less irrelevant to the overall software performance, except for some small code portions (10%-20%) which consume most (80%-90%) of the system resources. Consequently, in order to save ourselves from useless optimizations, which cost a lot of time, produce bugs and reduce maintainability, we have to profile first and judiciously optimize only the code portions which have proven to be performance critical. (A minimal measurement sketch follows this list.)

  • It is often preferable to optimize when the implementation is complete and functional. [5] It is much easier to make accurate performance measurements and effectively locate the performance bottlenecks [4] when the implementation is complete and functional. Thanks to the Pareto principle, the critical code is usually relatively small in size, hence a limited rewrite of the bottlenecks is not expected to cost as much as prematurely optimizing a much larger portion of code. This practice is also known as: make it work first, optimize later.

  • Well-designed code is usually much easier to optimize. A good software design helps us both locate the performance bottlenecks and improve small portions of code without affecting the rest of the program. On the other hand, a poor software design will probably reduce the positive impact of the Pareto principle by increasing the undesirable side-effects of the performance modifications, and will eventually make the optimization procedure disproportionately difficult in relation to the relatively small size of the critical code. In the words of Martin Fowler: "Well-factored software is easier to tune". [7]

Figure 1: Pareto consequences
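To make the profile-first practice concrete, here is a minimal, self-contained C++ timing sketch using std::chrono. The phases being measured are hypothetical stand-ins for the stages of a real program; in practice a dedicated profiler such as gprof or Callgrind would normally be used instead, but the discipline is the same.

    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <vector>

    // Run a callable once and return the elapsed wall-clock milliseconds.
    template <typename F>
    double time_ms(F&& f) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(t1 - t0).count();
    }

    int main() {
        std::vector<double> v(1000000);
        volatile double sink = 0.0;  // keeps the work observable to the compiler

        // Hypothetical program phases, timed separately; whichever one
        // dominates is the small portion of code worth optimizing.
        double fill_ms = time_ms([&] { std::iota(v.begin(), v.end(), 0.0); });
        double sum_ms  = time_ms([&] {
            double sum = 0.0;
            for (double x : v) sum += x * x;  // the likely hot spot
            sink = sum;
        });

        std::printf("fill: %8.3f ms\n", fill_ms);
        std::printf("sum:  %8.3f ms\n", sum_ms);
    }

The point is not the timing harness itself but the habit it encodes: measure each candidate first, then spend optimization effort only where the numbers say it matters.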

The fact alone that the Pareto principle enables and supports the above fundamental optimization rules and practices is already quite remarkable. However, the overall impact of the Pareto principle goes far beyond these three consequences. Thanks to the validity of this principle, it is possible to design software solutions without having performance considerations and restrictions constantly in mind. It is also possible for software designers and developers to often favor clarity, flexibility, simplicity, maintainability, reusability and other important qualities over performance and efficiency. Consequently, thanks to this important principle, both the complexity and the cost of producing quality software have been significantly moderated.

Finally, last but not least, the Pareto principle and its consequences provide a very good defence against premature optimization, [2,8] a particularly bad habit of software developers who care a lot about performance. A good understanding of this principle virtually eliminates any temptation to optimize too early, or to unnecessarily optimize non-critical parts of the code.

Misconceptions and limitations

It is quite natural to expect that a principle as powerful and indisputable as the Pareto principle will cause some exaggerations and misconceptions along with its positive impact. In particular, at least one common exaggeration and two common misconceptions seem to be related to the Pareto principle:

Exaggeration: It is always easy to optimize a complete implementation.
Misconception 1: There is no need at all to care about performance during the development.
Misconception 2: Designing for performance is completely useless.

The logical path which leads to the above false conceptions is quite short and simple (see also figure 2). Since the Pareto principle applies generally well to optimization, it is practically guaranteed that a small percentage of the overall code (10%-20%) will almost always be responsible for most of the consumption of system resources. Hence, it is tempting to assume that it will always be convenient and effective to optimize a software entity after it has been fully implemented. This in turn leads to the false conclusion that there is no need at all to care about performance when implementing the software, and that designing for performance is completely useless!

Figure 2: Common Pareto exaggerations and misconceptions

The major logical fault which enables the above inaccurate conceptions is the assumption that the optimization of a fully implemented software entity will always be easy and effective, just because the Pareto principle applies. Of course, as we have already discussed, this principle is an essential precondition for successful optimization at the end of the software development cycle; but is it sensible to consider the validity of a precondition a guarantee of a successful outcome? It is important to understand that the Pareto principle gives us no guarantees regarding the effort required to improve the software performance; it only states that the performance critical code is relatively small in size, and nothing more.

More specifically, there are several limitations which may, in practice, reduce the positive impact of the Pareto principle and make the optimization procedure disproportionately difficult in relation to the relatively small size of the critical code. To obtain a more concrete and practical understanding of these limitations, it is worth discussing some indicative examples:

  • The performance critical code already performs too well to be significantly improved. In such cases we should concentrate our efforts on reducing the use of the critical code, instead of actually improving its performance. This sometimes requires difficult design changes, or even architectural changes, which in turn greatly increase the complexity and the side-effects of the optimization. Consequently, this is a good example of why the effort required to improve the performance of a software entity is not necessarily proportional to the size of the critical code.

  • The performance critical code, even if it is small in size, may be distributed across many places. In such cases, solving the performance problems will probably require a large number of changes, which may produce many side-effects and increase instability. Scattered performance bottlenecks can be particularly frequent in poorly designed software entities.

  • The improvement of the performance critical code may cause side-effects in a much larger portion of the code. This is the main reason why, as we have already discussed, well-designed code is easier to optimize. However, in the real world software design is often imperfect, and this is a fact we have to take into account.

  • The overall software performance is too poor to become acceptable by merely improving the critical code. When the 80-20 rule applies, it is mathematically impossible to make the overall performance more than five times better by improving only the critical 20% of the code. Likewise, when the 90-10 rule applies, the best we can possibly do by improving only the critical 10% of the code is to make the overall performance up to ten times better. (A short calculation illustrating these ceilings follows this list.) These limits may seem very generous at first glance, but in practice the feasible performance improvements will usually be considerably smaller than the theoretical best cases. Consequently, it is not reasonable to assume that any software, regardless of its initial state, can easily achieve acceptable performance just because the Pareto principle applies. If we have neglected performance too much while building the software, our product will probably require extensive modifications in order to perform satisfactorily. Speaking metaphorically, if you have originally designed an elephant, it will be extremely difficult to optimize it into a cheetah at the final stages of its implementation!
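These ceilings are an instance of Amdahl's law: if a fraction p of the total running time is spent in the critical code, and that code is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p/s), which approaches 1 / (1 - p) as s grows without bound. With p = 0.8 the limit is 5, and with p = 0.9 it is 10. A minimal C++ sketch of the calculation (the function name is illustrative, not from any particular library):

    #include <cstdio>

    // Amdahl's law: overall speedup when a fraction p of the running time
    // is spent in code that we manage to speed up by a factor s.
    double overall_speedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    int main() {
        // 80-20 rule: even making the critical 20% of the code ten times
        // faster stays well under the theoretical 5x ceiling.
        std::printf("80-20, s = 10:   %5.2fx\n", overall_speedup(0.80, 10.0));
        std::printf("80-20, s -> inf: %5.2fx\n", 1.0 / (1.0 - 0.80));  // 5.00x
        // 90-10 rule: the corresponding ceiling is 10x.
        std::printf("90-10, s -> inf: %5.2fx\n", 1.0 / (1.0 - 0.90));  // 10.00x
    }

Note that a realistic, finite speedup of the hot code (s = 10 above) yields only about 3.57x overall, which is exactly why the theoretical best cases are rarely reached in practice.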

It should be obvious by now that the Pareto principle cannot guarantee an easy and effective optimization of a fully implemented software entity. Consequently, any exaggerations and misconceptions which ignore this fact are not only wrong, but also potentially dangerous for the quality of the software. In summary, like most things in life, the positive impact of the Pareto principle has its limits, which software developers should acknowledge and respect. The Pareto principle is a really bad excuse for neglecting software performance during the design or implementation phases of development. Overestimating this important principle can be at least as wrong as ignoring or underestimating it.

Conclusion

The Pareto principle plays an important role in software optimization, and a solid understanding of it is essential for software developers who deal with optimization tasks. Being able to take advantage of the positive consequences of the Pareto principle, while avoiding the dangerous misconceptions which surround it, is a very useful optimization skill, and one which can be significantly improved with experience and practice. However, it is not uncommon even for experienced developers to underestimate the limitations of the Pareto principle. Unfortunately, the very fact that these limitations can be inconvenient also makes them difficult to acknowledge, since software developers are often prone to underestimate and overlook whatever seems inconvenient to them.

References

  1. Wikipedia: Pareto principle
    http://en.wikipedia.org/wiki/Pareto_principle
  2. Wikipedia: Program optimization
    http://en.wikipedia.org/wiki/Optimization_(computer_science)
  3. Wikipedia: Profiling
    http://en.wikipedia.org/wiki/Profiling_(computer_programming)
  4. Wikipedia: Performance bottlenecks
    http://en.wikipedia.org/wiki/Program_optimization#Bottlenecks
  5. Optimize Later
    http://c2.com/cgi/wiki?OptimizeLater
  6. Profile Before Optimizing
    http://c2.com/cgi/wiki?ProfileBeforeOptimizing
  7. Tuning Performance and Process: Creating Tunable Software
    http://www.artima.com/intv/tunableP.html
  8. Premature Optimization
    http://c2.com/cgi/wiki?PrematureOptimization
  9. My blog posts on Codeproject
