Long time no post...
I do a lot of performance tuning jobs for our customers, and there are some guiding principles that apply in every environment.
A good place to start is a set of portal properties, XML configurations, and JVM flags that you know you will apply every time. But that only goes so far, because every app is different and has different needs.
Performance tuning is a very iterative process. It involves 3 main components:
- A repeatable test script
- A load test client
- A Java profiler
The process itself goes like this:
- The first thing to do is decide what your test should cover. Do you simply want to hit localhost, sign in, and sign out? Or do you want to browse around many pages with certain portlets placed on each page before signing out? Find out what the customer wants, decide on a test or a number of test cases early on, and stay consistent throughout the performance tuning process. This will ensure meaningful results.
- The next thing you need to do is create the script for the repeatable test. For example, in Apache JMeter, it is easy to use the built-in proxy server to record a test script, based on your click actions. You can save this and run it multiple times across multiple threads.
- Decide how many concurrent users and repetitions the test should run with. For example, if the customer wants to test 200 concurrent users, you would configure JMeter to run 200 threads. You could ramp them up 3 seconds apart, so in JMeter you would configure the 200 threads with a ramp-up period of 600 seconds (i.e., a new thread starts every 3 seconds, each one executing the test script).
- Establish a baseline. This is a very important step, because it is what lets you prove later, with statistics, that there was a measurable improvement. In JMeter, you can use the Aggregate Report to gather data for this baseline. You have to show that you started somewhere.
- Use a Java profiler. Why is this important? If you are stuck on why the CPU is spiking so much, why the memory keeps growing, or why certain threads seem to be blocked, then apart from some very clever debugging in the code or a lucky guess, you will need a Java profiler to really nail down the true cause. A Java profiler can examine what is inside the JVM: what objects are in memory, how much memory is available, how much memory is allocated, CPU usage and spikes, active threads, blocked or waiting threads, and much more. Recommended profilers are JProfiler, YourKit Java Profiler, and the NetBeans Profiler.
- Identify bottlenecks. Once you have the profiler in place, it will be much easier to see the bottlenecks. Is it too little heap? Is it processor power? Is there a memory leak? Are there blocked threads? Once you identify the bottlenecks, you can address those bottlenecks via configuration or code changes. Then deploy your changes.
- Repeat the above steps. After you have addressed your bottlenecks, you must use the same test and run it the same way. You will most likely run into another, different bottleneck, but that is OK. The point is to keep addressing bottlenecks as they show up and repeating the process until you get the performance you want. Take measurements and record statistics after every iteration! This will show progress.
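To make the ramp-up and baseline steps above concrete, here is a minimal sketch in Python. It assumes JMeter was set up to save results as CSV with a header row containing the `elapsed`, `label`, and `success` columns. The function names, the `SAMPLE` rows, and the "Sign in"/"Sign out" labels are made up for illustration and are not real measurements.

```python
import csv
import io
import statistics

# Hypothetical sample rows in the shape of a JMeter CSV results file.
SAMPLE = """timeStamp,elapsed,label,responseCode,success
1700000000000,120,Sign in,200,true
1700000001000,180,Sign in,200,true
1700000002000,300,Sign in,500,false
1700000003000,90,Sign out,200,true
"""


def ramp_up_seconds(threads, seconds_between_starts):
    """Ramp-up period for a thread group: one new thread every N seconds."""
    return threads * seconds_between_starts


def summarize(results_csv):
    """Aggregate response times and error rate per sampler label."""
    by_label = {}
    for row in csv.DictReader(io.StringIO(results_csv)):
        entry = by_label.setdefault(row["label"], {"elapsed": [], "errors": 0})
        entry["elapsed"].append(int(row["elapsed"]))  # response time in ms
        if row["success"].lower() != "true":
            entry["errors"] += 1

    summary = {}
    for label, entry in by_label.items():
        times = entry["elapsed"]
        summary[label] = {
            "samples": len(times),
            "average_ms": statistics.mean(times),
            "median_ms": statistics.median(times),
            "max_ms": max(times),
            "error_pct": 100.0 * entry["errors"] / len(times),
        }
    return summary


if __name__ == "__main__":
    # 200 threads started 3 seconds apart, as in the example above.
    print(ramp_up_seconds(200, 3))
    for label, stats in summarize(SAMPLE).items():
        print(label, stats)
```

Saving a summary like this for the very first run gives you the baseline, and saving one after every iteration gives you the numbers to compare against it.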
If you do not use a Java profiler, it is almost always a wild guess as to what the problem is. Maybe an educated guess, but it's still a guess. Yes, there is some overhead when attaching the Java profiler to the app server, but most of the time this is negligible in testing scenarios. Again, this is very much an iterative process. Test, identify bottlenecks, address bottlenecks, rinse, repeat. For customers we usually estimate 5 days. It's not something that can be done well overnight. The iterations, configuration changes, code changes, redeployments, and testing all take time.
Lastly, do not run the tests in production! Make a copy and set up a test environment. The tests are designed to slow down the system in order to reveal and identify the bottlenecks.
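One way to keep the test, fix, retest loop honest is to express every iteration as a percentage change against the baseline. The sketch below is only an illustration: the metric names and all of the numbers are invented placeholders, not results from a real tuning job.

```python
def compare_to_baseline(baseline, iteration):
    """Percentage change per metric; a negative value means it went down."""
    changes = {}
    for metric, base_value in baseline.items():
        # Skip metrics that are missing or that cannot be used as a divisor.
        if metric not in iteration or base_value == 0:
            continue
        changes[metric] = 100.0 * (iteration[metric] - base_value) / base_value
    return changes


# Placeholder numbers for illustration only.
baseline = {"average_ms": 800.0, "p90_ms": 1500.0, "error_pct": 4.0}
iteration_1 = {"average_ms": 600.0, "p90_ms": 1200.0, "error_pct": 1.0}

if __name__ == "__main__":
    for metric, change in compare_to_baseline(baseline, iteration_1).items():
        print(f"{metric}: {change:+.1f}%")
```

Because the test script and load settings stay the same between runs, a change in these numbers can be attributed to the configuration or code change made in that iteration.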