When assessing how well a WLM policy protects key work it’s necessary to understand how much CPU is used at each importance level. If there is very little work at a lower importance (higher importance number) than the key work, a CPU crunch will leave that key work vulnerable to CPU queuing.
Let me give a real, recent example:
The Db2 “Engine” address spaces (DBM1, MSTR, DIST) were in a service class at Importance 1 – with CPU Critical enabled. This is good. However, these address spaces were the only users of zIIP. (DBM1 mostly for Deferred Write and Prefetch, MSTR for Log Writes.) So there was no lower-importance zIIP work for them to displace. It is fortunate the zIIPs were lightly used. If they had been heavily used Db2 might well have slowed down. And that would cause Db2 clients, e.g. CICS, to slow down – even though these clients weren’t directly using zIIP.
In that example, Db2 being the only zIIP user is what I call “a fact of life”. You can’t tune that situation away. Fortunately, adding newer kinds of work that exploit zIIP almost certainly would create displaceable work. So growth in zIIP usage might cure the problem all by itself.
So the key thing we seek is indeed displaceable work.
And I deliberately gave a zIIP example to show it’s not just GCP CPU.
Measuring Displaceable Work
So, how do we measure displaceable work?
Let’s keep this simple by only looking at GCP CPU. (My standard code graphs CPU, zIIP-on-zIIP, and zIIP-on-GCP the same way.)
With RMF Workload Activity Report data (SMF 72-3) we can see CPU By Service Class Period. We can also see Importance for that work’s goal.
So, my code sums CPU by Importance – over all the service class periods. It plots it by time of day – as the picture in the batch window is often very different from the online peak.
But you mustn’t forget SYSTEM and SYSSTC service classes. And you mustn’t forget Discretionary. On the latter you could further divide between SYSOTHER and other discretionary service class periods. Personally I don’t.
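To make the summing concrete, here is a minimal sketch in Python. It is not my actual code. The row layout, the service class names, and the "SYSTEM", "SYSSTC" and "DISC" labels are all assumptions made up for illustration; real SMF 72-3 data would need to be extracted into something like this first.

```python
from collections import defaultdict

# Hypothetical, already-extracted rows, one per service class period per hour:
# (hour, service class, period, importance, GCP CPU seconds).
# Importance is 1-5 for goal periods; SYSTEM, SYSSTC and Discretionary
# carry a label instead, so they are not forgotten in the totals.
ROWS = [
    (9, "SYSTEM", 1, "SYSTEM", 120.0),
    (9, "SYSSTC", 1, "SYSSTC", 300.0),
    (9, "DB2HI", 1, 1, 900.0),
    (9, "CICSPROD", 1, 2, 1500.0),
    (9, "IMSPROD", 1, 2, 500.0),
    (9, "BATCHLO", 1, 5, 200.0),
    (9, "BATCHLO", 2, "DISC", 50.0),
    (2, "SYSTEM", 1, "SYSTEM", 100.0),
    (2, "DB2HI", 1, 1, 400.0),
    (2, "BATCHLO", 1, 5, 1800.0),
    (2, "BATCHLO", 2, "DISC", 700.0),
]


def cpu_by_importance(rows):
    """Sum CPU seconds by importance level within each hour of the day."""
    totals = defaultdict(lambda: defaultdict(float))
    for hour, _sclass, _period, importance, cpu in rows:
        totals[hour][importance] += cpu
    # Convert to plain dicts so a missing level raises instead of appearing as 0.
    return {hour: dict(levels) for hour, levels in totals.items()}


totals = cpu_by_importance(ROWS)
```

In this made-up data the 9 AM hour is dominated by Importance 1 and 2, while the 2 AM hour is mostly Importance 5 and Discretionary, which is the sort of online versus batch contrast the time-of-day plot is meant to show.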
A Twist With CICS
(What I’m about to say is true of IMS as well.)
CICS allows you to define (single period) transaction service classes. If you do so the regions’ own goals will be ignored and CICS work will be managed to the transaction service class’ goals.
Actually the dispatching priority of the region will be in support of the transaction goals. If there are multiple transaction service classes served by the same region the most important one will drive things. And the other transactions will come along for the ride. This is because it is the address space that has a dispatching priority, not an individual transaction.
So, if the transaction goal overrides the region goal, how do you tell what Importance the CICS CPU is at?
Unless you can tell the Importance of the predominant transactions in the region and the region’s CPU it’s difficult. The latter can, of course, be obtained from SMF 30. But the correspondence between transactions and regions isn’t in RMF or in SMF 30. You could get it from SMF 110 CICS Monitor Trace – but that would be expensive. And it doesn’t speak to CICS transactions’ service classes.
Fortunately, in most CICS customers, the service class of the transactions has the same WLM Importance as the regions. But not always. For example, a transaction might flow from a TOR (perhaps at Importance 1) to an AOR (perhaps at Importance 2).
And this, in a nutshell, is why I am careful about assessing displaceable work when CICS and IMS are involved.
Conclusion
Understanding the amount of CPU at the various Importance levels is important when understanding how resilient the WLM setup is. But it’s not always straightforward.
A few other notes:
- Importance is not the same as dispatching priority – though it is the most scalable way of assessing displaceability.
- Having little displaceable work should suggest you can’t run the machine all that busy. And you have to look at GCPs and zIIPs separately.
- I’ve not really talked about what Importance work should be classified to. You can have all the displaceable work in the world but if work is misclassified it could still avail you little.
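As a rough illustration of the "how much is displaceable?" question, here is a small Python sketch. It assumes you already have CPU summed by importance level for one interval. The ordering list, the labels and the numbers are hypothetical, and treating everything below a level as displaceable is a simplification, precisely because Importance is not dispatching priority.

```python
# Assumed ordering, most protected first. The labels are illustrative only.
ORDER = ["SYSTEM", "SYSSTC", 1, 2, 3, 4, 5, "DISC"]


def displaceable_fraction(cpu_by_level, level):
    """Fraction of total CPU sitting below `level` in the assumed ordering."""
    rank = ORDER.index(level)
    total = sum(cpu_by_level.values())
    if total == 0:
        return 0.0
    below = sum(
        cpu for lvl, cpu in cpu_by_level.items() if ORDER.index(lvl) > rank
    )
    return below / total


# Hypothetical CPU seconds by importance for one online-peak interval.
online_peak = {
    "SYSTEM": 120.0,
    "SYSSTC": 300.0,
    1: 900.0,
    2: 2000.0,
    5: 200.0,
    "DISC": 50.0,
}

frac_below_imp1 = displaceable_fraction(online_peak, 1)
frac_below_imp2 = displaceable_fraction(online_peak, 2)
```

With these invented numbers Importance 1 work has plenty beneath it, but Importance 2 work has very little, so it is the Importance 2 work that would be exposed in a crunch. You would run the same calculation separately for GCP and zIIP.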
One final point: I often ask customers questions like “if you lost a machine what work could you sacrifice?” To my mind this is similar to which work is displaceable. I have in mind another blog post on this subject.
Making Of
Again another “written on a plane” post. For the first time ever, though, I had to restore the Markdown from a prior version. Somehow the text had mostly disappeared. Thankfully Drafts stores versions in iCloud. So, no harm done. Still, it freaked me out. Be careful to have automatic backups, folks.
Actually it was written on two plane journeys, some months apart. I’d had it on my task list to complete but somehow never got round to it. Well, now I have.