Engineering – Part One – A Happy Medium?

(Originally posted 2019-05-25.)

In Engineering – Part Zero I talked about the presentation that Anna Shugol and I have put together. That post described the general sweep of what we’re doing.

This post, however, is more specific. It’s about Vertical Medium logical processors.

To keep it (relatively) simple I’m describing a single processor pool. For example, the zIIP Pool. Everything here can be generalized, though it’s best to treat each processor pool separately.

Also note I use the term “engine” quite a lot. It’s synonymous with processor.

What Is A Vertical Medium?

Before HiperDispatch an LPAR’s weight was distributed evenly across all its online logical processors. So, for a 2-processor LPAR with weights sufficient for 1.2 processors, each logical processor would have 0.6 engines’ worth of weight.
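To make that arithmetic concrete, here is a minimal Python sketch. The function name and its parameters are my own illustrative choices, not a real interface; it simply divides the LPAR's engines' worth of weight evenly across its online logical processors.

```python
from fractions import Fraction

def horizontal_share(engines_worth_of_weight, online_logicals):
    """Pre-HiperDispatch: spread the LPAR's weight evenly over its logical processors."""
    return Fraction(engines_worth_of_weight) / online_logicals

# The example from the text: weights sufficient for 1.2 processors, 2 logical processors.
per_logical = horizontal_share("1.2", 2)
print(float(per_logical))
```

With the values from the text, each logical processor ends up with 0.6 engines' worth of weight.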

Now let’s turn to HiperDispatch (which is all there is nowadays) [1].

The concept of A Processor’s Worth Of Weight is an important one, especially when we’re talking about HiperDispatch. Let’s take a simple example:

Suppose a machine has 10 physical processors and the LPARs’ weights add up to 1000 [2]. In this case an engine’s worth of weight is 100.

In that scenario, suppose an LPAR has weight 300 and 4 logical processors. Straightforwardly, the logical processors are:

• 3 logical engines, each with a full engine’s worth of weight. These are called Vertical Highs (VH for short). These use up all the LPAR’s weight.
• 1 logical engine, with zero weight. This is called a Vertical Low (or VL).

There are a few “corner cases” with Vertical Mediums, but let me give you a simple case. Suppose the LPAR, still with 4 logical processors, has weight 270. Now we get:

• 2 VH logical engines, each with a full engine’s worth of weight. This leaves 70 to distribute.
• 1 logical engine, with a weight of 70. This is not a full engine’s weight. So this kind of logical processor is called a Vertical Medium (or VM).
• 1 VL logical engine, with zero weight.

Note that the VM in this case has 70% of an engine’s worth of weight.
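The two examples above can be reproduced with a short Python sketch. This covers only the simple case described here: whole engines' worth of weight become VHs, any leftover goes to a single VM, and the rest are VLs. It deliberately ignores the corner cases mentioned above, and the function and parameter names are hypothetical.

```python
from fractions import Fraction

def polarize_simple(lpar_weight, logical_engines, total_weight, physical_engines):
    """Simple-case split of an LPAR's logical engines into VH / VM / VL."""
    # Weight corresponding to one physical engine (100 in the example above).
    per_engine = Fraction(total_weight, physical_engines)
    # Each whole engine's worth of weight becomes a Vertical High,
    # capped by the number of logical engines.
    vh = min(int(lpar_weight // per_engine), logical_engines)
    leftover = lpar_weight - vh * per_engine
    # Simple case only: leftover weight goes to a single Vertical Medium.
    vm = 1 if leftover > 0 and vh < logical_engines else 0
    vm_share = leftover / per_engine if vm else Fraction(0)
    # Whatever logical engines remain carry no weight: Vertical Lows.
    vl = logical_engines - vh - vm
    return {"VH": vh, "VM": vm, "VL": vl, "vm_share": vm_share}

print(polarize_simple(300, 4, 1000, 10))
print(polarize_simple(270, 4, 1000, 10))
```

For weight 300 this gives 3 VHs and 1 VL; for weight 270 it gives 2 VHs, 1 VM with 7/10 of an engine's worth of weight, and 1 VL, matching the lists above.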

How Do Vertical Mediums Behave?

There are two parts to HiperDispatch:

• Vertical CPU Management
• Dispatcher Affinity

Vertical CPU Management

Let’s take the three types of vertically polarized engines:

• With a VH the picture is clear: The logical processor is tied to a specific physical processor. It is, in effect, quasi-dedicated. The benefit of this is good cache reuse – as no other logical engine can be dispatched on the physical engine. Conversely, the logical engine won’t move to a different physical engine (leaving its cache entries behind).

• With a VM there is a fair attempt to dispatch a logical engine consistently on the same physical engine. But this is less certain to succeed than in the VH case. Remember a VM will probably be competing with other LPARs for the physical engine. So it could very well lose cache effectiveness.

• With a VL, the logical engine could be dispatched anywhere. Here the likelihood of high cache effectiveness is reduced.

The cache effects of the three cases are quite different: It would be reasonable to suppose that a VH would have better caching than a VM, which in turn would do better than a VL. I say “reasonable to suppose” as the picture is dynamic and might not always turn out that way.

But you can see that LPAR design – in terms of weights and online processors – is key to cache effectiveness.

We prefer not to run work on VLs – so the notion of parking applies to VLs. This means not directing work to a parked VL. VLs can be parked and unparked to handle varying workload and system conditions.

Dispatcher Affinity

With Dispatcher Affinity, work is dynamically subdivided into queues for affinity nodes. An affinity node comprises a few logical engines of a given type. Occasionally work is rebalanced.

You could, for queuing purposes, view an LPAR as a collection of smaller units – affinity nodes – though it’s not as simple as that. Subdividing work this way could introduce imbalance, which is a good motivation for the rebalancing of work I just mentioned.

What Dispatcher Affinity means is that work isn’t necessarily spread across all logical processors.

How Do They Really Behave?

With VMs I have three interesting cases, two of which I have data for. They got me thinking.

• Client A has an LPAR with 4 logical zIIPs. One is a VH, one is a VM with weight equivalent to 95% of an engine, and two are VLs. Here it was notable that there was reluctance to send work to the VLs – as one might expect. The surprise was that the VM was consistently loaded about 50% as much as the VH. For some reason there’s reluctance to send work there as well, though not as much as to the VLs. The net effect – and why I care – is that the VH was more heavily loaded than we would recommend, because of this skew.
• Client B has two LPARs on a 3-way GCP-only machine. One has two VHs and one VM with almost a whole engine’s worth of weight. In this case the load was pretty even across the 3 logical engines, according to RMF.
• Client C – for whom I don’t have data – are concerned because it is inevitable they’ll end up with 1 almost-VH logical engine.

So there’s some variability in behaviour. But that’s consistent with every customer environment being different.

Conclusion – Or Should We Avoid Vertical Mediums?

First, in many cases, there’s an inevitability about VMs, particularly for small LPARs or where there are more LPARs than physical engines. I’ll leave it as an exercise for the reader to figure out why every LPAR has to have at least one VH or VM in every pool in which it participates.

I don’t believe it makes any difference in logical placement terms whether a VM has 60% of an engine’s worth of weight or 95%. But I do think a 60% VM is more likely to lose the physical engine in favour of another LPAR’s logical engine than a 95% VM.

I do think it’s best to take care with the weights to ensure you don’t just miss a logical engine being a VH.

This thinking about Vertical Mediums suggests to me it’s useful to measure utilisation at the engine level – to check for skew. After all you wouldn’t want to have Delay For zIIP just because of skew – when the pool isn’t that busy.
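As a rough illustration of what checking for skew might look like, here is a small Python sketch. The per-engine busy percentages are made up for illustration (they are not the client data discussed above), and the function name and the "busiest versus pool mean" measure are my own assumptions rather than anything RMF reports directly.

```python
def engine_skew(busy_by_engine):
    """Return (pool mean busy %, busiest engine, busiest-to-mean ratio)."""
    mean = sum(busy_by_engine.values()) / len(busy_by_engine)
    busiest = max(busy_by_engine, key=busy_by_engine.get)
    ratio = busy_by_engine[busiest] / mean if mean else 0.0
    return mean, busiest, ratio

# Hypothetical per-engine busy percentages for one pool in one LPAR.
sample = {"VH 0": 80.0, "VM 1": 40.0, "VL 2": 5.0, "VL 3": 5.0}

mean, busiest, ratio = engine_skew(sample)
print(f"pool mean {mean:.1f}%, busiest {busiest} at {sample[busiest]:.1f}% "
      f"({ratio:.2f}x the mean)")
```

A pool average that looks comfortable can hide one engine running much hotter, which is exactly the situation where engine-level numbers are worth a look.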

But, of course, LPAR Design is a complex topic. So I would expect to be writing about it some more.

1. Except under z/VM with HiperDispatch enabled, where I’m told you would want to turn it off for a z/OS guest.

2. Often I see “close but no cigar” weight totals, such as 997 or 1001. I have some sympathy with this, as events such as LPAR moves and activations can lead to it. Nonetheless it’s a good idea to have the total be something sensible.