(Originally posted 2014-10-12.)
… and other address spaces, too. 🙂
In Once Upon A Restart I talked about how to detect IPLs and restarts of CICS regions and MQ subsystems (and other long-running address spaces) – from SMF Type 30 Interval records.
It’s easy to see starts but what about stops?
It turns out you can estimate when address spaces stop from the SMF 30 Interval records (Subtypes 2 and 3):
- When there is no longer a record for the address space (with a given Reader Start Time) the address space has terminated. So the last record for that job name with the given Reader Start Time marks when it came down.
- When there is again a record with the same job name it will have a new Reader Start Time and the address space has come up again.
This is actually a naive implementation but it gets me very close to when an address space comes down.
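The two rules above can be sketched in a few lines of Python. This is only an illustration, not the tooling the post describes: it assumes the SMF 30 interval records have already been parsed into simple tuples, and the names `IntervalRecord`, `Instance`, and `estimate_instances` are hypothetical.

```python
from datetime import datetime
from typing import Iterable, NamedTuple


class IntervalRecord(NamedTuple):
    """Hypothetical, already-parsed view of one SMF 30 interval record."""
    job_name: str
    reader_start: datetime   # Reader Start Time
    interval_end: datetime   # end timestamp of the interval


class Instance(NamedTuple):
    """One run of an address space: a job name plus one Reader Start Time."""
    job_name: str
    reader_start: datetime
    last_seen: datetime      # estimated stop: end of the last interval seen


def estimate_instances(records: Iterable[IntervalRecord]) -> list[Instance]:
    # Keep the latest interval end for each (job name, Reader Start Time).
    last_seen: dict[tuple[str, datetime], datetime] = {}
    for rec in records:
        key = (rec.job_name, rec.reader_start)
        if key not in last_seen or rec.interval_end > last_seen[key]:
            last_seen[key] = rec.interval_end
    # A new Reader Start Time for the same job name is a new instance.
    return sorted(Instance(job, start, end)
                  for (job, start), end in last_seen.items())
```

One caveat with this sketch: an instance whose last record falls in the final interval of the data may simply still be running, so its `last_seen` should not be read as a stop time.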
So what? The flippant answer is that I extend what my tooling does because it pleases me to. 🙂
But actually that’s not true. To the extent that it needs a justification, I’m more useful the closer I get to how my customers are running things, and to understanding their problems.
Specifically, in the handful of customers I’ve tested this code with, I have quite a good understanding of the relationship between CICS regions and the batch. For example:
- I see CICS regions come down and not come back up again for hours, sometimes on a timer pop and sometimes event-driven. This is usually overnight and I’m therefore seeing a Batch Window.
- I see CICS regions come down and get immediately restarted – in a way that suggests being put into read-only mode or flipping data sets. Again this can be a sign of a batch window.
- I see test regions come up for very short periods of time and then go down again.
Actually, being (supposedly) open minded, I don’t know quite what I’ll see. But these are the sorts of things I think I’ll see.
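As a rough illustration of how these three patterns might be told apart, here is a minimal sketch. It assumes the up and (estimated) down times for one job name are already available as sorted pairs. The function name `classify_instances`, the labels, and both thresholds are hypothetical, chosen only for the example.

```python
from datetime import datetime, timedelta


def classify_instances(
    instances: list[tuple[datetime, datetime]],
    bounce_gap: timedelta = timedelta(minutes=15),   # assumed threshold
    short_life: timedelta = timedelta(hours=1),      # assumed threshold
) -> list[str]:
    """Label each (up, down) pair for one job name, sorted by up time."""
    labels = []
    for i, (up, down) in enumerate(instances):
        if down - up <= short_life:
            # Came up briefly and went down again, e.g. a test region.
            labels.append("short-lived")
        elif i + 1 < len(instances):
            # Compare the estimated stop with the next instance's start.
            gap = instances[i + 1][0] - down
            labels.append("bounce" if gap <= bounce_gap else "extended outage")
        else:
            # Last instance in the data: no later start to compare with.
            labels.append("no restart seen")
    return labels
```

An "extended outage" overnight would be a candidate batch window, while a "bounce" fits the read-only or data set flip scenario. Sensible thresholds would depend on the SMF interval length and on the installation.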
Here’s a depiction of CICS coming down for Batch and restarting after:
and here’s a conflation of a number of scenarios where CICS gets bounced but is still up alongside batch. In this case it’s in “Read Only” mode:
Again, So What?
The answer to why this might be relevant to you is:
- Many of you are looking after a plethoration of systems and applications. This technique might save you time.
- If I start talking to you about up and down times this might help you understand where I got it from. The words “see my blog” escape from my lips quite frequently these days.
And I expect I’ll be updating Life And Times Of An Address Space with this.
Yes you can use SMF 30 Subtypes 4 and 5 to get step- and job-end timings but I prefer not to make customers send me these. I might change my mind, one day. ↩
But I treat this as a new instance of the region / address space. ↩
It’s really only the CICS regions that get frequently restarted. But I’d notice if others did. ↩
In one customer case this is to pick up new versions of VSAM data sets the batch has created. ↩
I probably should pick up termination code to see if they ABENDed. Unfortunately there isn’t one as the Completion Section isn’t created for SMF 30 Subtypes 2 and 3 but only Subtypes 4 and 5. ↩
It probably should be “plethora” or “proliferation” but I like combining the two into “plethoration”. I hope you do, too. 🙂 ↩