(Originally posted 2019-05-26.)
Maybe you’ve never thought much about this, but aren’t weights supposed to be integers?
Well, they are at the LPAR level. But what about at the engine level?
Let me take you through a recent customer example. The names are changed but the numbers are real, as the one graphic in this post will show. The customer has a z14 ZR1 with 3 general-purpose processors (GCPs). The weights add up to 1000. Nice and tidy. There are two LPARs:
- PROD – with 3 logical engines and a weight of 965.
- TEST – with 2 logical engines and a weight of 35.
Both LPARs are in HiperDispatch mode – which means the logical engines are vertically polarised.
To proceed any further we need to work out what a full engine’s worth of weight is: It’s 1000 / 3 = 333.3 recurring. Clearly not an integer. How do you assign vertical weights given that?
Let’s take the easy case first:
TEST has a weight of 35. Much less than one engine’s worth of weight. It has two logical processors so we would expect:
- A Vertical Medium (VM) with a weight of 35.
- A Vertical Low (VL) with a weight of 0.
So, in this case, both the engines have integer weights. So far so good.
Now let’s take the case of PROD. Here’s what I expect:
- Two Vertical Highs (VHs) each with a weight of 333.3 recurring. Total 666.6 recurring.
- A Vertical Medium (VM) with weight 965 – 666.6 recurring, or 298.3 recurring. (It’s the presence of the non-integer VHs that forces the VM to be non-integer.)
- No Vertical Lows (VLs).
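The expectation for both LPARs can be sketched in a few lines of Python. This is a deliberately simplified model that only reproduces the arithmetic in this post: whole engines' worth of weight become VHs, the remainder goes to one VM, and anything left over is a VL. It is not PR/SM's actual algorithm, and the function name and parameters are made up for illustration.

```python
from fractions import Fraction

def expected_vertical_weights(lpar_weight, logical_engines, total_weight, physical_engines):
    """Simplified expectation, NOT PR/SM's real algorithm (illustrative assumption).

    Whole engines' worth of weight become Vertical Highs, the remainder goes
    to a single Vertical Medium, and any further engines are Vertical Lows.
    """
    engine_weight = Fraction(total_weight, physical_engines)  # one full engine's worth
    remaining = Fraction(lpar_weight)
    weights = []
    for _ in range(logical_engines):
        if remaining >= engine_weight:
            weights.append(("VH", engine_weight))
            remaining -= engine_weight
        elif remaining > 0:
            weights.append(("VM", remaining))
            remaining = Fraction(0)
        else:
            weights.append(("VL", Fraction(0)))
    return weights

# The two LPARs from this post: weights total 1000 across 3 GCPs.
for name, weight, engines in (("PROD", 965, 3), ("TEST", 35, 2)):
    for polarity, w in expected_vertical_weights(weight, engines, 1000, 3):
        print(f"{name} {polarity} {float(w):.4f}")
```

Using `Fraction` keeps 1000 / 3 exact, so the per-engine weights for each LPAR add back up to the LPAR weight without rounding drift.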
When I say “expect” I really mean “what I’ve come to expect”. And I say that because I’ve seen it in reports produced by my code – and ended up wondering if my code was wrong. With the “Engine-ering” initiative, and in general because of HiperDispatch, it’s become more important to understand what’s going on at the logical engine level.
Non-integer weights began to worry me. So I started to investigate. Here’s the process, in strict step order:
- My REXX code correctly queries my database at a summary table level and reports what it sees.
- My database code correctly summarises the log level table.
- My log level table correctly maps the record.
Let’s take a closer look at the record, which is what I did to establish Point 3.
When I look at individual records at the bits-and-bytes level I generally use RMF’s ERBSCAN and ERBSHOW execs:
- If you type ERBSCAN against an SMF data set in ISPF 3.4 you get a list of records, each of which has a record number associated with it. Among other things the ERBSCAN list shows SMFID, timestamp and record type and subtype.
- If you type ERBSHOW nnn where nnn is the number of an RMF record you get a formatted hex display of the record.
I emphasise RMF because ERBSHOW does a good job on RMF records, but not so useful a job for most other record types. (SMF 99-14 is one where I’ve seen it do a good job, but I digress.)
Anyway, back to the point. Here’s part of an ERBSHOW for an SMF 70-1 record. It shows five Logical Processor Data Sections – the first 3 for PROD and the last 2 for TEST.
The highlighted field is SMF70POW – the engine’s vertical weight. Here’s the full description of the 4-byte binary field:
Polarisation weight for the logical CPU when HiperDispatch mode is active. See bit 2 of SMF70PFL. Multiplied by a factor of 4096 for more granularity. The value may be the same or different for all shared CPUs of type SMF70CIX. This is an accumulated value. Divide by the number of Diagnose samples (SMF70DSA) to get average weight value for the interval.
So the samples are multiplied by 4096. Now 4096 is 1000 hexadecimal. So an integer would end with three hex zeroes, wouldn’t it? The first three clearly don’t.
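The "three hex zeroes" test is just a check on the low 12 bits. Here is a minimal sketch using the five raw SMF70POW values from the record discussed in this post; the helper name is made up for illustration.

```python
# Raw SMF70POW values from the five Logical Processor Data Sections in this post:
# sections 1-3 belong to PROD, sections 4-5 to TEST.
raw_values = [0x0752FFE2, 0x0752FFE2, 0x068E203C, 0x00C4E000, 0x00000000]

def ends_in_three_hex_zeroes(value):
    """True if the value is an exact multiple of 4096 (hex 1000)."""
    return value & 0xFFF == 0

for section, value in enumerate(raw_values, start=1):
    verdict = "yes" if ends_in_three_hex_zeroes(value) else "no"
    print(f"Section {section}: {value:08X} ends in 000? {verdict}")
```

Bear in mind the field is accumulated over the samples, so this is only a quick eyeball test: an integer weight still gives a multiple of 4096, but a non-multiple tells you the weight was not an integer.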
But let’s take the simpler – TEST – case first.
- SMF70DSA is 90 decimal.
- Section 4 has hex 00C4E000. Dividing by hex 1000 and converting to decimal we get 3150. Divide that by 90 and we get 35. So this is the VM mentioned above.
- Section 5 has zero so that is a vertical weight of 0. So this is the VL mentioned above.
Now let’s look at PROD.
- Each of the first two logical engines has SMF70POW of hex 0752FFE2. Clearly dividing by 1000 hex doesn’t yield an integer – so I (and my code) divide by SMF70DSA first. I get hex 0014D555 or decimal 1365333. Divide this by 4096 and I get 333.3 recurring.
- The third engine has SMF70POW of hex 068E203C. Divide by SMF70DSA and convert to decimal and I get 1221974 decimal. (Already this is less than 1365333.) Divide by 4096 and I get 298.3 recurring.
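The arithmetic for all five sections can be reproduced with a short Python sketch. It follows the field description: divide the accumulated SMF70POW by the sample count (SMF70DSA), then by 4096. The function and variable names are illustrative, not taken from my REXX code.

```python
SCALE = 4096   # SMF70POW is multiplied by 4096 for more granularity
SMF70DSA = 90  # number of Diagnose samples in this interval

def average_vertical_weight(smf70pow, smf70dsa):
    """Average vertical weight: accumulated value / samples / 4096."""
    return smf70pow / smf70dsa / SCALE

# (LPAR, raw SMF70POW) for the five Logical Processor Data Sections in this post.
sections = [
    ("PROD", 0x0752FFE2),
    ("PROD", 0x0752FFE2),
    ("PROD", 0x068E203C),
    ("TEST", 0x00C4E000),
    ("TEST", 0x00000000),
]

totals = {}
for number, (lpar, raw) in enumerate(sections, start=1):
    weight = average_vertical_weight(raw, SMF70DSA)
    totals[lpar] = totals.get(lpar, 0.0) + weight
    print(f"Section {number} ({lpar}): {raw:08X} -> {weight:.4f}")

for lpar, total in totals.items():
    print(f"{lpar} total vertical weight: {total:.4f}")
```

With these numbers the three PROD engines sum to 965 and the two TEST engines to 35, which matches the LPAR weights.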
So my code is vindicated. Phew!
My suspicion is that vertical weights are held (not just sampled) multiplied by 4096.
But in any case the message is: if the data looks odd, dig into it. In my case I blamed my own tools first, but they were vindicated. It was my expectation that was wrong or, more charitably, blurry.
And, the more I think about it, the more the actual engine-level weights make sense. They have to add up to the LPAR weight. And the existence of Vertical Highs forces the above arithmetic on us.
But half the point of this post is to show how I debug numbers (and names) in my reporting that don’t meet my expectation. And ERBSCAN / ERBSHOW is a pair of friends you might like to get to know.
Hi Martine , Kindly please post some more blog about mainframe WLM Capacity and planning. Would be helpful for young sysprog like me ,
Martin here: Thanks for your encouragement. I intend to do just that. And please connect with me on LinkedIn and/or Twitter.
Sure .