(Originally posted 2016-10-22.)
Seasoned readers will recognise the title of this post as a bad pun, rather than a mis-spelling. [1]
One emergent theme in our code for Parallel Sysplex Performance is treating individual coupling facility structures on their merits. For example, lock structures are different from cache structures.
But there is much commonality in the instrumentation. For example, Maximum Size, Size and Minimum Size are common to all.
One type of structure I haven’t paid much detailed attention to is the List structure. Two common examples are:
- XCF signalling structures [2]
- CICS Temporary Storage queue structures [3]
But an incident recently led me to think about List Structure behaviour:
Two test systems, each running CICS regions, were sharing a Temporary Storage Queue List structure. The structure was 20MB in size (with a Maximum Size of 98MB).[4]
The structure filled up.
If you approach the structure as some form of queue it helps, because it lets you muse in the following ways:
- Maybe the reader stopped reading.
- Maybe the writer suddenly wrote in a splurge.
- Maybe the writer outpaced the reader for some other reason.
The truth of it does need sorting out. All of these are feasible explanations in a testing scenario but you wouldn’t want to go into production like this.
In a queuing environment you have to think about how big a queue is required.[5]
In general, a large queue (buffer) helps with transient variations in writer and reader speed; it doesn’t help much with persistent outpacing.
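To make the transient-versus-persistent point concrete, here is a minimal sketch of a bounded queue driven by a writer and a reader. It is a toy model, not a model of how a coupling facility list structure actually behaves: the `simulate` function, the rates, the capacities and the tick granularity are all made up for illustration.

```python
def simulate(capacity, writes, reads):
    """Run a toy bounded queue; return (peak depth, first tick it was full or None)."""
    depth = 0
    peak = 0
    first_full = None
    for tick, (written, read) in enumerate(zip(writes, reads)):
        # Net change this tick; the reader cannot take the depth below zero.
        depth = max(0, depth + written - read)
        if depth >= capacity:
            depth = capacity  # queue is full; anything beyond this is not stored
            if first_full is None:
                first_full = tick
        peak = max(peak, depth)
    return peak, first_full


TICKS = 1000
reads = [12] * TICKS          # hypothetical reader: drains 12 items per tick

# Transient: writer is normally slower than the reader, with one 5-tick burst.
burst = [10] * TICKS
burst[20:25] = [50] * 5

# Persistent: writer is always slightly faster than the reader.
persistent = [14] * TICKS

for label, writes in (("burst", burst), ("persistent", persistent)):
    for capacity in (100, 500):
        peak, first_full = simulate(capacity, writes, reads)
        print(f"{label:10} capacity={capacity:3} peak={peak:3} first_full={first_full}")
```

With these invented numbers the larger queue absorbs the burst without filling, whereas the smaller one fills part-way through it. Under persistent outpacing both fill; the larger one just takes longer to get there.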
But what can put a “bung” in the pipe? Or appear to?
- A dead reader can do it – whether (in this case) a CICS region, the DB2 it connects to, the LPAR or the machine. You get the picture, I’m sure: It’s not just the actual reader that matters.
- “Market Open” – where a concerted spike in writes can remain unmatched for a while.
So we need to monitor certain list structures. In SMF 74–4 we have, among other things:
- Maximum number of elements – R744SMAE
- Current number of elements – R744SCUE
Plotting the latter as a % of the former is probably the right thing to do. Obviously an RMF interval of, say, 15 minutes might not catch sudden spikes.
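As a rough sketch of that calculation, the following assumes the two fields have already been extracted from the SMF 74–4 records into one dictionary per structure per RMF interval. The extraction itself is not shown, and the dictionary layout, the structure name, the sample values and the 80% threshold are all hypothetical.

```python
def percent_full(current_elements, max_elements):
    """Current elements as a percentage of maximum elements, or None if no maximum."""
    if max_elements <= 0:
        return None
    return 100.0 * current_elements / max_elements


def flag_intervals(samples, threshold_pct=80.0):
    """Return (interval, structure, percent) for samples at or above the threshold."""
    flagged = []
    for sample in samples:
        pct = percent_full(sample["R744SCUE"], sample["R744SMAE"])
        if pct is not None and pct >= threshold_pct:
            flagged.append((sample["interval"], sample["structure"], pct))
    return flagged


# Made-up samples: one list structure over three 15-minute intervals.
samples = [
    {"interval": "09:00", "structure": "EXAMPLE_LIST1", "R744SCUE": 1200, "R744SMAE": 4000},
    {"interval": "09:15", "structure": "EXAMPLE_LIST1", "R744SCUE": 3600, "R744SMAE": 4000},
    {"interval": "09:30", "structure": "EXAMPLE_LIST1", "R744SCUE": 4000, "R744SMAE": 4000},
]

for interval, structure, pct in flag_intervals(samples):
    print(f"{interval} {structure} {pct:.1f}% of maximum elements in use")
```

The same percentages could be fed to whatever graphing tool you prefer; the threshold is only there to pick out intervals worth a closer look.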
But in the “Market Open” type of scenario it’s worthwhile trying to understand what it does to major queues. And, as this post is about list structures, those would include XCF signalling structures, CICS Temporary Storage queues and MQ shared message queues.
In the case I mentioned, the structure was resized to 49MB. I didn’t hang around to see what the resolution was, from the CICS point of view.
One final thought: Don’t be tempted to set the Maximum Size of a structure ludicrously big, relative to the Initial Size (or even the expected day-to-day size). I have it on good authority that the structure would then be full of control blocks, rather than data.
[1] An even worse pun would be “write on queue”, of course. 🙂
[2] Detectable from SMF 74–2 XCF records’ Path Data Sections.
[3] You can detect the address spaces because their program name is DFHQXMN but not the structures directly from SMF. Generally, however, the list structure name is mnemonic.
[4] I’ve no real idea, by the way, if this is too small. I guess that’s part of the point of this post.
[5] We’ve been here before (some of us) with BatchPipes/MVS “Pipe Depth (BUFNO)”.