ATG Test Strategy

An application test strategy for ATG Dynamo applications

Document Information

Title: ATG Test Strategy
Subject: An application test strategy for ATG Dynamo applications
Author(s): Güray Sen
Comments:

Document Version

v11@10/02/2003

Copyright

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

No Warranty

This document is provided 'as is' without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement.

The contents of this publication could contain technical inaccuracies or typographical errors. Changes are periodically added to the information herein; these changes will be incorporated in the new editions of the publication. Spindrift may make improvements and/or changes in the publication and/or product(s) described in the publication at any time without notice.

Limitation of Liability

In no event will Spindrift be liable for direct, indirect, special, incidental, economic, cover, or consequential damages arising out of the use of or inability to use this documentation even if advised of the possibility of such damages.

Contact

Web
www.spindriftgroup.com

Contents

1 Introduction
    1.1 Document Goals
    1.2 Related Documents
2 Background
    2.1 Abstract
    2.2 Performance Load Testing - A Common Misconception
3 Common testing strategy for ATG applications
    3.1 The Methodology
    3.2 Benchmarking Site Performance
    3.3 Simulating Peak Load
    3.4 Examining Performance Characteristics Under Load
    3.5 Types of Performance Tests
        3.5.1 JVM Dumps
        3.5.2 Garbage Collection and Memory Sizing
        3.5.3 Isolating Application Memory Leaks
        3.5.4 Dynamo Performance Monitor
        3.5.5 DRP Thread Utilization
        3.5.6 Connection Pool Utilization
    3.6 Re-Benchmark the Site Performance
    3.7 The Performance Tuning Steps
    3.8 Summary


1 Introduction


1.1 Document Goals

This document serves the following purposes:

  • Provide a test strategy to isolate memory leaks and indicate performance issues with an ATG Dynamo application.

1.2 Related Documents

The following documentation is related to this document:

  • ATG Dynamo Programmer's Guide
  • ATG Dynamo Deployment Guide


2 Background


2.1 Abstract

This document describes a load testing strategy for isolating suspected memory leaks and identifying performance issues within any ATG application. Frequently, sites do not adequately performance tune application code before publishing it to a live environment. This practice leads to unpredictable site downtime and unacceptable site performance.


2.2 Performance Load Testing - A Common Misconception

Performance load testing has many variations and meanings depending on the context of the test. This document describes application performance load tuning, which we define as the process of increasing performance by introspecting an application while it is at peak capacity.

Most sites perform load tests before launching in order to predict the maximum server capacity. Tests like this blast the entire environment with tens of thousands of simulated user requests and record the round trip latency of one or more HTTP requests. These times indicate that a particular page takes some number of seconds from the time it is requested to the time the page gets back to the browser over the internet. These benchmarks include all of the time spent by the ATG DRP server to process the request, plus all of the processing time spent by the web server, the network hardware, all of the routing over the internet, the client's ISP network, the client machines themselves, all of the image downloads, and even third party requests for JavaScript, CSS or images. This is a round trip exercise in which multiple HTTP requests are issued, including many for images and other resources that are not even served by ATG.

These tests can be useful for finding system level configuration problems and for ensuring that load is balanced appropriately across all of the servers. In addition, this type of test can be used to measure any performance gained by making optimizations using the technique described in this document.

This document describes a procedure that is quite different from benchmarking tests. Application performance load tuning is intended to isolate application specific problems such as memory leaks and performance bottlenecks. This is achieved through several means, such as measuring the time that the server spends processing different parts of custom application code, performing a JVM thread dump to find resource bottlenecks, or profiling memory utilization. Most of these methods involve putting load on the system, then using various tools to view the application processes as they react under stress. These tools can be used to find out what code is executing poorly and where it is, and they give access to performance data on a microscopic level. It is important to note that the tools themselves often slow down the processing of the server, which makes any benchmarks taken while they are running invalid. This is why testing for the maximum throughput of a site and capacity testing must be done separately from application performance load tuning.

These tests would be extremely helpful to include in the standard QA process, but it must be understood that round trip benchmark times are not relevant while performance tuning. Time benchmarks should simply be performed before and after the tuning to quantify the benefit of a performance enhancement.


3 Common testing strategy for ATG applications


3.1 The Methodology

Most performance tuning tests follow the same pattern, although your strategy may vary slightly from this methodology depending on the specifics of your performance goals. The first step in most performance tuning tests is a benchmark of the current performance of the site. This benchmark is useful in determining the impact of any tuning that you perform, and stands as a reference point to make sure that any tuning actually increases performance. Both the first and the second step usually involve applying load to the system. The second time you apply load, however, a tool should be used to gain insight into the server's processing activities. In the case of a potential memory leak, this tool may be a memory profiler. When there are latency problems, this tool may be the Dynamo Performance Monitor, or simply JVM thread dumps. Once a potential problem is found, a fix should be applied. The last step should always be another benchmark of the site throughput without any profiling tools involved. This last step is important because it replaces speculation that your performance tuning was successful with a measurement that quantifies your success.


3.2 Benchmarking Site Performance

A benchmark is the total round trip time to build a page. Most load testing tools will provide you with statistics about the times, such as the minimum, maximum, mean and standard deviation of the requests. The results are usually summarised on a per object basis (an object being an image, HTML page, JSP page, etc.).

Benchmarking a site to determine its throughput is not a trivial task. It involves accurately simulating real user activities using a scripting tool such as URLHammer, or another, commercial, load testing tool. This user activity can be difficult to predict accurately. You may want to run tests with different sequences of URLs to determine how much throughput varies based on what the user is doing on your site. Some bottlenecks may occur only in certain page sequences.

You will get results that are easy to interpret if you start with very simple tests. Once you know that individual pages or sequences are performing adequately, you can work toward tests that exercise the full range of functionality on your site. The QA process should perform each of these types of tests, ending with a test of the full range of functionality. If that final test identifies a new performance bottleneck that was not found during the smaller, more focused testing, follow it with tests that target the specific functionality involved.
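
The summary statistics mentioned above can be reproduced from raw timings if a load testing tool only exports individual request times. The following is a minimal sketch, not part of any ATG tool: the class name LatencyStats and the sample values are hypothetical, and the standard deviation is the population form (dividing by the number of samples).

```java
import java.util.Arrays;

/** Summary statistics for round trip times, in milliseconds (illustrative helper). */
final class LatencyStats {
    final double min;
    final double max;
    final double mean;
    final double stdDev;

    private LatencyStats(double min, double max, double mean, double stdDev) {
        this.min = min;
        this.max = max;
        this.mean = mean;
        this.stdDev = stdDev;
    }

    /** Computes min, max, mean and population standard deviation of the samples. */
    static LatencyStats of(double[] millis) {
        if (millis.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double min = Arrays.stream(millis).min().getAsDouble();
        double max = Arrays.stream(millis).max().getAsDouble();
        double mean = Arrays.stream(millis).average().getAsDouble();
        double sumSquares = 0.0;
        for (double m : millis) {
            sumSquares += (m - mean) * (m - mean);
        }
        return new LatencyStats(min, max, mean, Math.sqrt(sumSquares / millis.length));
    }

    public static void main(String[] args) {
        // Hypothetical round trip times for one page, in milliseconds.
        double[] samples = {120, 135, 128, 410, 131};
        LatencyStats s = of(samples);
        System.out.printf("min=%.0f max=%.0f mean=%.1f stdDev=%.1f%n",
                s.min, s.max, s.mean, s.stdDev);
    }
}
```

A large standard deviation relative to the mean, as the single slow sample here would produce, is a hint that some requests hit a bottleneck that the average alone hides.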


3.3 Simulating Peak Load

Peak load is when the server is at capacity in terms of throughput. It may also be defined as the point at which the server's performance degrades so far that page latency exceeds acceptable levels, but for the purpose of this document, it is the former.

Achieving peak load first involves developing a script that simulates user behavior across the site. This script tells the load-testing tool to execute requests in sequence. Once this script is written, it can be applied to a server by adding virtual users until you hit a point where all of the DRP threads are utilized at all times, or until the CPU is consistently near 100% utilization.

To evaluate your throughput, set up your testing tool to evaluate performance based on a number of virtual clients. These clients do not necessarily represent the real number of concurrent users: virtual clients are significantly more active than real clients because they submit requests serially, one after another. The total throughput for the whole site should reach its highest level at some number of clients, once the content that they are requesting has been cached on the servers. It should then stay constant as you add clients. If the custom application is the limiting factor, throughput reaches a maximum level as soon as CPU utilization of the Dynamo process nears 100%. If total throughput goes down significantly as demand increases, your site has some performance problems that need to be corrected.

Several tools are available to apply load to an application server. Dynamo ships with a tool called URLHammer. It is a basic command line utility, but it should be more than adequate for the purpose of this document. Using URLHammer is simple. The first step is to turn on recording in the recording servlet. With recording on, you can then browse through the site as a normal user might. As you do so, the recording servlet records all of the requested URLs, including all posts of form data. Once this script is recorded, it can be used as input to URLHammer, and can be run as many times and for as many simultaneous sessions as it takes to reach peak capacity. For more information on the use of URLHammer, refer to the ATG Dynamo Programmer's Guide.
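
The idea of virtual clients that each submit their requests serially can be sketched in a few lines. This is not how URLHammer is implemented; it is a minimal illustration, assuming a hypothetical VirtualClientRun class in which the "request" is any Runnable, so that the same driver can wrap a real HTTP call or, as in main below, a simulated delay.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Drives a fixed number of virtual clients, each issuing requests one after another. */
final class VirtualClientRun {

    /** Runs the request on `clients` threads and returns the number of completed requests. */
    static long run(int clients, int requestsPerClient, Runnable request) {
        AtomicLong completed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        for (int c = 0; c < clients; c++) {
            pool.execute(() -> {
                // Each virtual client is serial: the next request starts when the last one ends.
                for (int i = 0; i < requestsPerClient; i++) {
                    request.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("load run timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        }
        return completed.get();
    }

    /** Throughput in requests per second for a run that took elapsedNanos. */
    static double throughputPerSecond(long requests, long elapsedNanos) {
        return requests / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // Stand-in for a real request: a 5 ms pause (an assumption for illustration only).
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int clients : new int[] {1, 2, 4, 8}) {
            long start = System.nanoTime();
            long done = run(clients, 20, fakeRequest);
            System.out.printf("%d clients: %.0f requests/s%n",
                    clients, throughputPerSecond(done, System.nanoTime() - start));
        }
    }
}
```

Against a real server, the printed throughput would be expected to rise with the client count and then level off; a drop as clients are added is the symptom described above.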


3.4 Examining Performance Characteristics Under Load

At the operating system level, you will want to monitor the following statistics:

  • CPU utilization
  • paging activity
  • disk I/O utilization
  • network I/O utilization

A well-performing site will have high CPU utilization when the site is achieving its maximum throughput and will not be doing any paging. A site with high I/O activity and low CPU utilization has some I/O bottleneck.

If CPU utilization is low on the server there is likely to be a resource bottleneck. This bottleneck may be one of four basic types:

  • Database Bottleneck - Occurs when I/O is high
    This happens when your database is consistently near 100% CPU utilization. Fixes for this situation can involve intelligent caching of database content to avoid the application's reliance on the database as a resource.
  • Disk I/O Bottleneck - Occurs when I/O is high
    This can be observed when the JVM is paging because it does not have enough memory, or if your FileCache component is under sized.
  • Network I/O Bottleneck - Occurs when I/O is high
    This situation can occur for many reasons, and may be a configuration issue, or a network bandwidth problem. The source of this problem can often be identified by performing a JVM stack dump, and looking for threads that are performing an I/O operation.
  • Synchronized System Resource Bottleneck - Occurs when I/O is low
    This problem often happens when application code uses synchronization to protect data integrity. This type of bottleneck can be viewed by performing a JVM stack dump, and looking for threads that are in the MW (monitor wait) state.

If CPU utilization is high on the server, the bottleneck may be the application code itself. In situations like this the Dynamo Performance Monitor, thread analyzers, and memory profiling tools can be employed to find the slow points in code.
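
The triage described in this section can be written down as a small decision function. The sketch below is only an aid to reading the rules above: the class name BottleneckTriage and both threshold values are assumptions, not figures from ATG documentation, and utilization is expressed as a fraction between 0 and 1.

```java
/** Maps OS level observations under load to the kind of bottleneck to investigate first. */
final class BottleneckTriage {

    enum Suspect {
        APPLICATION_CODE,      // high CPU: profile the code itself
        IO_BOTTLENECK,         // low CPU, high I/O: database, disk or network
        SYNCHRONIZED_RESOURCE  // low CPU, low I/O: threads waiting on monitors
    }

    // Hypothetical cut-offs; choose values that suit your own environment.
    static final double HIGH_CPU = 0.85;
    static final double HIGH_IO = 0.70;

    static Suspect classify(double cpuUtilization, double ioUtilization) {
        if (cpuUtilization >= HIGH_CPU) {
            return Suspect.APPLICATION_CODE;
        }
        if (ioUtilization >= HIGH_IO) {
            return Suspect.IO_BOTTLENECK;
        }
        return Suspect.SYNCHRONIZED_RESOURCE;
    }

    public static void main(String[] args) {
        System.out.println(classify(0.95, 0.20));
        System.out.println(classify(0.30, 0.90));
        System.out.println(classify(0.30, 0.10));
    }
}
```

The result only says where to look first; telling a database bottleneck from a disk or network one still requires the per-resource checks listed above.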


3.5 Types of Performance Tests

3.5.1 JVM Dumps

A JVM dump shows the current state of the JVM, including the current state of all threads in the JVM and all of the synchronization monitors. To perform this operation the JVM must synchronize itself and stop all thread processing, so that the thread dump shows the state of the JVM at a particular instant in time. Consequently, it is not advisable to perform a thread dump on a production environment, but in a QA environment it can offer much useful information. Thread dumps are performed by sending a signal to the Java process. On Solaris, this can be done by calling kill -QUIT <dynamo_pid> on the Dynamo process. For detailed information on how to analyze a JVM dump, including the meanings of the thread states and what patterns to look for to identify problems in the application, see the ATG Dynamo Deployment Guide.
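
On JVMs that provide Thread.getAllStackTraces() (Java 5 and later, which may be newer than the JVM your Dynamo installation runs on), a rough summary of thread states can also be taken from inside the process. The sketch below is an assumption-laden complement to kill -QUIT, not a replacement: the class name ThreadStateSummary is hypothetical, and these JVMs report threads waiting for a monitor lock as BLOCKED rather than with the state codes found in older dump formats.

```java
import java.util.EnumMap;
import java.util.Map;

/** Counts the live threads of the current JVM by thread state. */
final class ThreadStateSummary {

    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A large BLOCKED count under load suggests contention on a synchronized resource.
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```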

3.5.2 Garbage Collection and Memory Sizing

Memory sizing and JVM garbage collection are best tuned under load. Ideally the JVM should have enough memory allocated to hold all of the cached content and all of the users' session data. The heap should also be large enough that the JVM does not need to garbage collect frequently, spending a lot of CPU time in garbage collection, yet small enough that the time to perform a single garbage collection is not excessive because of the amount of memory that the JVM is responsible for managing. This is important because the JVM must acquire a heap lock while garbage collecting, which prevents any other memory access while the garbage collection is in progress.

Be aware that decreasing the heap size may increase the overhead of garbage collection. Each time a full garbage collection is performed, all of the memory needs to be scanned for garbage, and garbage collections occur more frequently with smaller heaps, which can waste CPU time. It is also important not to exceed the total physical memory on the machine. Exceeding this limit causes the OS to page out memory, which has a large negative impact on performance. Paging can be monitored using the Solaris "top" command.

Indications of memory or garbage collection issues are a JVM that spends more than 20% of its time performing garbage collection, individual garbage collections that take more than 30 seconds, or occasional OutOfMemory errors. The garbage collection statistics of a JVM can be viewed by adding the "-verbose:gc" option to the JAVA_ARGS. For more information on monitoring garbage collection, see the ATG Dynamo Deployment Guide.
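
The 20% guideline can be checked with simple arithmetic: time spent collecting divided by elapsed time. The sketch below assumes a JVM that provides the java.lang.management API (Java 5 and later); the class name GcOverhead is hypothetical. On older JVMs the same fraction would have to be computed by hand from the "-verbose:gc" output, and with concurrent collectors the reported collection time is only an approximation of time lost to the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Estimates the share of JVM uptime that has been spent in garbage collection. */
final class GcOverhead {

    // Guideline from the text above: more than 20% of time in GC indicates a problem.
    static final double MAX_GC_FRACTION = 0.20;

    /** Fraction of elapsed time spent in GC; 0 when no time has elapsed. */
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time reported by all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double f = fraction(totalGcMillis(), uptime);
        System.out.printf("GC overhead: %.1f%%%s%n", f * 100,
                f > MAX_GC_FRACTION ? " (above the 20% guideline)" : "");
    }
}
```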

3.5.3 Isolating Application Memory Leaks

Finding memory leaks is important to improve the performance of your application. This is typically done using a tool such as OptimizeIt or JProbe. These tools run between the JVM and the Dynamo process, and keep track of all object instantiation that happens while the system is processing requests. These tools can be used to find places where memory is allocated but is never released by the application.

Profiling is typically done by starting ATG with one of these tools, then applying load to the system by running a load test script. Once the script has been run several times, let it stop, and allow the session expiration time to pass (often this process is easier if you decrease the session expiration time temporarily). This will allow Dynamo to release all references to the session scoped components. Once these sessions are expired, manually force a garbage collection via the profiling tool, and mark the current heap size and object instantiation counts. This process essentially primes the system, creates all of the global components that are referenced by the load test script, and sets a baseline for all of the required objects that are in the system.

Next, re-run the load testing script, let it stop, and allow the session expiration time limit to pass again. Once the session expiration time has passed, force another garbage collection, and compare the marked heap size and object instantiation counts to the current heap and object counts. If the heap has grown, there may be a memory leak. Look for object instantiation counts that are even multiples of the number of times the load test script has run, and check to see what components are still holding those references. It may be the case that those (probably global) components need to listen for session expiration events, and perform some memory cleanup to remove references to session data.
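
The comparison of the baseline and the second snapshot can be automated once the per-class instance counts are exported from the profiler. The sketch below assumes the counts are available as maps from class name to instance count; the class name LeakCandidates and the example class names are hypothetical, and no particular profiler's export format is implied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Flags classes whose instance count grew by an exact multiple of the script run count. */
final class LeakCandidates {

    /** Returns class name to growth for every class matching the "even multiple" pattern. */
    static Map<String, Long> find(Map<String, Long> baseline,
                                  Map<String, Long> afterRuns,
                                  int scriptRuns) {
        if (scriptRuns <= 0) {
            throw new IllegalArgumentException("scriptRuns must be positive");
        }
        Map<String, Long> suspects = new TreeMap<>();
        for (Map.Entry<String, Long> e : afterRuns.entrySet()) {
            long growth = e.getValue() - baseline.getOrDefault(e.getKey(), 0L);
            // Growth that is a whole multiple of the run count suggests objects retained per run.
            if (growth > 0 && growth % scriptRuns == 0) {
                suspects.put(e.getKey(), growth);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        Map<String, Long> baseline = new HashMap<>();
        baseline.put("com.example.CartItem", 100L);
        baseline.put("java.lang.String", 5000L);
        Map<String, Long> after = new HashMap<>();
        after.put("com.example.CartItem", 150L);
        after.put("java.lang.String", 5003L);
        // The script ran 10 times between the two snapshots.
        System.out.println(find(baseline, after, 10));
    }
}
```

A match is only a lead, since unrelated growth can land on a multiple by chance; the profiler's reference view is still needed to see which component holds the objects.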

3.5.4 Dynamo Performance Monitor

The Dynamo Performance Monitor can be used to isolate poorly performing pages by timing the number of milliseconds a particular page render takes. These pages can be further instrumented by adding calls to the PerformanceMonitor class in the code. An additional benefit of this approach is the ability to track custom code performance under load conditions. Frequently the performance characteristics of code change when load is applied: a block of code that performs very well with only a few users can become a bottleneck under load. For more information on using the Dynamo Performance Monitor, see the Performance Monitor chapter in the Dynamo Deployment Guide.

3.5.5 DRP Thread Utilization

The DrpServer component can be viewed via the admin interface by referencing /nucleus/atg/dynamo/server/DrpServer. This component can show you the current state of all the request handling threads. If threads are hanging for long periods of time, this can indicate an I/O or system resource bottleneck. This view can also be used to determine whether your handler thread count is appropriate, and whether the system is handling requests at full capacity (all threads are checked out and handling requests).

3.5.6 Connection Pool Utilization

The utilization of your connection pool can be viewed by accessing /nucleus/atg/dynamo/service/jdbc/JTDataSource/. This component can let you know how long the connections have been checked out of the pool. The "numFreeResources" property should generally be greater than zero. If it is a low number relative to the maximum pool size, you might want to increase the maximum number of connections in the pool, provided your database can handle more concurrent connections. If it is near zero, there is contention over database connections, and the connection pool itself is a bottleneck.


3.6 Re-Benchmark the Site Performance

The last step in any performance tuning activity is to re-benchmark the site to determine if the performance tuning was effective, and to quantify the results.


3.7 The Performance Tuning Steps

  1. A stress test is done to benchmark the site performance.
  2. The performance testing tools illustrated above are then turned on.
  3. The site should then be placed under load again, and vital data is gathered. (Expect a >50% slowdown in the site performance)
  4. Once you have isolated the performance bottleneck, you then fix the problem.
  5. Turn off the performance testing tools, and repeat step 1.
  6. After fixing the problem, re-run steps 1 through 5 to find the next greatest performance problem.

3.8 Summary

Typically, the performance tuning process is iterative. Each test will usually result in uncovering the current most significant performance bottleneck. Once the bottleneck is identified and removed, it is important to start the process over again, to uncover the next most significant performance issue. Much more information on this subject can be found in the Dynamo Documentation, in the Dynamo Knowledge Base on the ATG external website, on the Javasoft website, as well as with many commercial vendor tools.