(Originally posted 2013-03-16.)
It’s hard to write about test environments without feeling you’re insulting somebody. That’s certainly true when it comes to performance tests.
But I think that very fact is indicative of something: It’s incredibly difficult to get it right. Put another way, most environments are compromises.
In recent weeks I’ve seen a number of customer situations where things haven’t quite gone according to plan. In what follows, bear in mind that almost nobody has a fully dedicated performance test environment.
(More than 20 years ago it was explained to me that benchmarking is phenomenally expensive: Poughkeepsie does it, but almost nobody else does. And even they produce relatively few data points.)
Here are some of the things I’ve seen recently. I share them not to poke fun at the customers involved but because I think they illustrate some of the difficulties any installation might encounter in conducting performance tests:
- Other stuff still running, using resources the application under test would’ve found handy.
- High levels of paging and almost no free memory.
- DB2 buffer pools defined unhelpfully small.
- Shared-engine Coupling Facility LPARs with very long service times.
- CPU limited, whether through a physical shortage or artificial constraints. (In one case the test LPAR was in the same Capacity Group as other LPARs and the other LPARs caused the test LPAR and themselves to be solidly capped throughout the test run.)
- The Test LPAR roared into life in the middle of the morning Production peak and contributed to a CPU shortage on the machine when it was already heavily constrained. (You might not consider that to be a problem for the test environment. Frankly I have no idea how bad a service the tests encountered.)
One thing all the above have in common is they’re tests being run on the same machine as Production services. As I said, this is almost inevitable. And often even the LPAR isn’t as dedicated to the application under test as you’d like: If a truly dedicated test environment is rare, one dedicated to a single application is even rarer.
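Several of the symptoms listed earlier are the kind of thing you could check for before trusting a test run. Here is a minimal sketch in Python, assuming you have already summarised the test LPAR's interval data into a handful of numbers. The `Snapshot` fields, the `check` function and every threshold are my own inventions for illustration: they are not taken from any tool or report, and the limits are arbitrary, so pick your own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Hypothetical summary of a test LPAR during a run (all fields are assumptions)."""
    other_work_cpu_pct: float       # CPU used by work other than the application under test
    paging_rate_per_sec: float      # pages per second paged in
    free_memory_pct: float          # percentage of real memory that is free
    cf_service_time_us: float       # average Coupling Facility service time, microseconds
    capped_pct_of_intervals: float  # percentage of intervals in which the LPAR was capped


def check(s: Snapshot) -> List[str]:
    """Return a list of reasons to be suspicious of the test environment.

    The thresholds below are arbitrary placeholders, not recommendations.
    """
    problems = []
    if s.other_work_cpu_pct > 10:
        problems.append("other work is using CPU")
    if s.paging_rate_per_sec > 50:
        problems.append("high paging")
    if s.free_memory_pct < 5:
        problems.append("almost no free memory")
    if s.cf_service_time_us > 100:
        problems.append("long Coupling Facility service times")
    if s.capped_pct_of_intervals > 0:
        problems.append("LPAR was capped")
    return problems


if __name__ == "__main__":
    run = Snapshot(
        other_work_cpu_pct=25.0,
        paging_rate_per_sec=0.0,
        free_memory_pct=30.0,
        cf_service_time_us=15.0,
        capped_pct_of_intervals=100.0,
    )
    for problem in check(run):
        print("Suspect:", problem)
```

An empty list doesn't mean the environment is good, only that none of these particular compromises showed up in the numbers you fed in.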
An interesting question is what people are testing for, performance-wise. It could be scalability, meaning responsiveness at load. It could be resource consumption. When I’ve been asked to help out, by analysing system performance numbers from a test environment, it’s been one of the first things I ask: Enabling a test environment to support a scalability test is different from minimising resource usage. It could, of course, also be whether the application continues to be reliable and produce the intended results at high load levels.
I’m slightly worried that the measurements from the residency I intend to run this Autumn will be taken too seriously: We plan on doing things that will provide reasonable quality numbers. I’ve already said, though, that the numbers won’t be “benchmark quality”. Actually the measurements aren’t the main point: The processes we’ll develop and describe are. Perhaps an interesting sidebar would be some commentary on the quality (good or bad) of the measurements and the environment in which we run them.
And what this post has been about is Performance. I’m not a Testing specialist, so I’m only averagely aware of the wider issues that discipline has to deal with. I’ve got enough of my own, thank you so much. 🙂