The Artima Developer Community

Analyzing a Web-Based Performance Problem
by Arash Barirani and Jeffrey Blake
May 10, 2004

A Sample Strategy Guide: A Step-by-Step Approach

In the rest of this article, we outline a series of steps that you can use to help solve a crash or lockup problem.

  1. Identify the problem:
    • Classify the performance issue: crash, lockup, or slow-down. Why? Because each scenario has specific symptoms that will give you specific hints as to the cause of the problem.
    • Repeat the problem: If the problem only happens in the production environment, make every effort to reproduce it in a controlled environment, such as the Quality Assurance (QA) or development system. Why? Because being able to repeat the problem lets you understand and test it. You can try different cause-and-effect scenarios that will eventually lead to the actual cause of the problem, and you avoid using the production machine for troubleshooting.
    • Log key system information: Tracking the vital signs of a system, such as its memory usage, CPU utilization, and disk I/O performance, is always helpful in finding out how efficiently your system operates. Collect this information before and after the problem occurs, as this may give you some clues. You may need to turn logging on for a while before the problem occurs, so watch out for the system log getting too big. Don't forget to timestamp any information that you log.
    • Chart or graph all your data: Use a spreadsheet package to display the data you collect in order to understand the behavior of your system. Finding a visual pattern in the occurrence of your performance problem is a key starting point.
    • Add custom trace statements: If you cannot repeat the problem, you must start inserting your own debugging information:
      • Make sure timing and other relevant diagnostic information are included.
      • Do active tracing of key information at all times.
      • Take full snapshots of the system information when possible.
      • Be aware that active logging will take disk space and full-time logging will degrade system performance.
    • Look back at the latest system changes: If the problem occurred suddenly, retrace your steps to see if that gives you any clues. Sometimes unforeseen or undocumented simple changes can cause problems. For example, an unplanned backup starting at midnight on a server machine dedicated to a specific application process may well cause poor response times, as the backup program would hog the server's CPU.
    • Start with the software changes first: Begin your troubleshooting, if possible, with the software changes that could make the most impact, as software is for the most part easier to change than hardware (especially if you have to order and install the hardware).
  2. Include all parties involved: In an all-out systems approach you will need to include the network team, systems operations (systems administrators and database administrators (DBAs)), QA, and the development team. This has to happen in the early stages of troubleshooting the performance problem, even if those parties do not have an active role at the start.
  3. Write down the strategic plan.
    • Look at the entire system first and then point to the component that is most likely to buy you the largest bang for the buck when you optimize it.
    • Avoid guesswork, and make sure you look at the impact of any changes you make on other parts of the system.
    • Write down a list of solution options and mark them as short-term or long-term and see which one fits your case better.
    • If you come across a situation that you cannot troubleshoot in the context of the entire application, extract the key components and build a simple application from them. This way you will have an easier time understanding the problem and an easier time resolving it.
    • Write down a backup plan in case your first set of solutions doesn't solve the performance problem. (For example, a complete system rebuild in case some parts of your system were corrupted.)
  4. Keep an accurate and detailed log of all the changes and all the steps taken. This document becomes invaluable when you try multiple changes, as you want to minimize your test scenarios and avoid repeating any previous step.
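
The logging advice in step 1 (timestamp everything, watch the size of the log, and include timing in custom trace statements) could be sketched roughly as follows. This is a minimal Python sketch rather than a prescribed implementation: the names make_trace_logger, log_vital_signs, traced, and the "perf_trace" logger are hypothetical, the size limits are arbitrary, and disk usage plus process CPU time merely stand in for whatever vital signs matter on your system.

```python
import logging
import logging.handlers
import shutil
import time
from contextlib import contextmanager


def make_trace_logger(path, max_bytes=1_000_000, backups=3):
    """Build a logger that timestamps every record and caps the file size."""
    logger = logging.getLogger("perf_trace")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    # Rotate the file once it reaches max_bytes so the log cannot grow unbounded.
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    # %(asctime)s puts a timestamp at the start of every line.
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_vital_signs(logger, disk_path="."):
    """Record one sample: disk usage and CPU time used by this process."""
    usage = shutil.disk_usage(disk_path)
    logger.info(
        "vitals disk_used_pct=%.1f cpu_seconds=%.3f",
        100.0 * usage.used / usage.total,
        time.process_time(),  # CPU time of this process, not the whole machine
    )


@contextmanager
def traced(logger, label):
    """Custom trace statement: log how long the wrapped block takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info("trace %s elapsed_ms=%.2f", label, elapsed_ms)
```

Calling log_vital_signs(logger) on a schedule and wrapping suspect code in "with traced(logger, 'some_label'):" produces timestamped key=value lines that can later be loaded into a spreadsheet for charting.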

Conclusion

Performance is a system-wide issue that spans the application software, the network, and the underlying computing hardware. A solid performance strategy and a planned effort coordinated with the operations staff, QA, development, and network domain experts are key elements in resolving system performance issues and creating an optimized, well-tuned web application. Since software changes are often the easiest to make, a good approach is to start with software code or architectural changes and only later proceed to bigger and costlier changes to the network or the computing hardware.

About the Authors

Arash Barirani and Jeffrey Blake are developers and consultants who help clients solve performance problems of web-based applications.

Resources

1. MQSeries is a messaging platform from IBM, now called WebSphere MQ:
http://www-306.ibm.com/software/integration/wmq/

Jack Shirazi runs a website devoted to Java performance tuning:
http://www.javaperformancetuning.com/

Copyright © 1996-2014 Artima, Inc. All Rights Reserved.