What’s The Latency, Kenneth?

(Originally posted 2015-03-22.)

OA37826 really is the gift that keeps on giving: I got really nosy about Coupling Facility links when it came out [1], though most customers didn’t get the added benefits of CFLEVEL 18 for a while.

This post is about a customer installation which pointed out another benefit of the instrumentation. [2]

Customer Example

I’ve simplified the customer situation a little – in a way that doesn’t detract from the truth. [3]

Here’s a simplified version of their Parallel Sysplex environment:

They have 3 routings between two data centres – and 6 links from each CEC to the CF image in the other data centre. Structure duplexing is not used – as the customer is using external (to this sysplex) coupling facilities.

According to the SMF 74 Subtype 4 data the signalling latency from MVSA to CFB is 161μs (x2), 172μs (x2), and 176μs (x2). You can see the three routes on the diagram.

MVSB shows the same signalling latencies to CFA – which is to be expected.
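As a minimal sketch of how the three routes show up in the numbers, the following Python counts the links at each of the MVSA-to-CFB latencies quoted above. The values are typed in by hand; the name `latencies_us` and the idea that one distinct latency corresponds to one route are assumptions for illustration, not something the 74-4 record itself states.

```python
from collections import Counter

# Signalling latencies (microseconds) for the six MVSA-to-CFB links, as quoted in the text.
latencies_us = [161, 161, 172, 172, 176, 176]

# Count how many links report each latency value.
links_per_latency = Counter(latencies_us)

for latency, count in sorted(links_per_latency.items()):
    print(f"{latency} us: {count} links")

# Assumption: each distinct latency value corresponds to one physical route.
print(f"Distinct latencies (candidate routes): {len(links_per_latency)}")
```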

You’ll notice I’ve used latency – which is what 74–4 gives you. A good rule of thumb is each 10μs of latency translates into 1 kilometer of distance. [4]
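The rule of thumb is easy to apply in code. This is only a sketch: the constant `US_PER_KM` and the function `estimated_km` are names I have made up for illustration, and the result is an estimate of route length along the fibre, not a measured distance.

```python
# Rule of thumb from the text: each 10 microseconds of latency is about 1 kilometre.
US_PER_KM = 10


def estimated_km(latency_us: int) -> float:
    """Convert a signalling latency in microseconds to an estimated route length in km."""
    return latency_us / US_PER_KM


# The latencies quoted for this customer, including the 1 microsecond "local" value.
for latency in (161, 172, 176, 1):
    print(f"{latency} us -> about {estimated_km(latency):.1f} km")
```

Because the latency is reported as a whole number of microseconds, estimates made this way move in steps of 0.1 km.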

I was supplied with the customer’s own diagram and it shows slightly different distances. The discrepancies between the two sets of estimates are not accounted for by any inaccuracy in that formula. I say that because one of the customer’s path distance estimates is substantially lower than my minimum, one is substantially higher, and the third about the same. [5]

It could be a matter of the vendor being inaccurate, though not by much (and life isn’t usually that simple). If the discrepancy was massive compared to this you might begin to suspect “fibre suitcases” left in the route. In any case for once SMF can give you a view of distance.

The “local” latency is 1μs, which is the same as I’ve seen in previous cases. The latency value is an integer number of microseconds and the minimum value is 1 for a supported link type. It means “very short link indeed”.

Both the high values (161 – 176 μs) and the low value (1μs) are consistent with the Adapter types – HCA3-O LR (1X) in the former case and HCA3-O (12X) in the latter. Talking of which, the physical adapters are reported in the Channel Path Data Section (as mentioned in System zEC12 CFLEVEL 18 RMF Instrumentation Improvements), alongside the latency. So we can see which links / CHPIDs / PCHIDs / ports etc use which routes. [6]

In this customer case there is nothing to recommend. I simply observe the three-route solution, which is patently sensible.

Impact On My Reporting

I’ve modified my reporting only slightly as a result of this customer example, I’m pleased to say.

In my tabular report that documents the paths between z/OS systems and coupling facilities I had one row per z/OS-to-CF pairing. It had the range of latencies for that pairing. In the customer example it said “161 – 176”.

That was useful as it alerted me to the possibility (I hadn’t considered before) of multiple latencies and hence multiple routes. But it told me I could do better:

Now, for each link I list the latency separately – if there is any variation. So, “tourist information” perhaps but I can discuss with a customer their use of alternate routes between sites. [7]
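Here is a rough sketch of that reporting rule, not my actual reporting code. The tuple layout, the CHPID values and the function name `latency_rows` are all invented for illustration, and I am assuming the 1μs “local” links run from MVSA to CFA. A pairing whose links all share one latency gets a single value; a pairing with variation gets each link listed separately.

```python
from collections import defaultdict

# Hypothetical input rows: (z/OS system, CF, CHPID, latency in microseconds).
# CHPIDs are made up; the latencies are the ones quoted in this post.
links = [
    ("MVSA", "CFA", "80", 1), ("MVSA", "CFA", "81", 1),
    ("MVSA", "CFB", "A0", 161), ("MVSA", "CFB", "A1", 161),
    ("MVSA", "CFB", "A2", 172), ("MVSA", "CFB", "A3", 172),
    ("MVSA", "CFB", "A4", 176), ("MVSA", "CFB", "A5", 176),
]


def latency_rows(links):
    """Return one (system, cf, latency text) row per z/OS-to-CF pairing."""
    by_pair = defaultdict(list)
    for system, cf, chpid, latency in links:
        by_pair[(system, cf)].append((chpid, latency))

    rows = []
    for (system, cf), entries in sorted(by_pair.items()):
        distinct = {latency for _, latency in entries}
        if len(distinct) == 1:
            # No variation: a single value is enough.
            text = str(distinct.pop())
        else:
            # Variation: list the latency of each link separately.
            text = ", ".join(f"{chpid}={latency}" for chpid, latency in sorted(entries))
        rows.append((system, cf, text))
    return rows


for row in latency_rows(links):
    print(row)
```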

I consider this a nice little piece of (easy to code) extra information.

Final Thoughts

This example shows how you can verify the distance of routes between data centres – or at any rate between z/OS images and distant coupling facilities. You can verify it to within 100 metres, which I think is plenty good enough.

Note that the Coupling Facility does not select paths based on distance/latency. And that latency values are static in all the sets of data I’ve seen. These two facts are mutually consistent.

Also signalling latency is not the same as request service time. It might be interesting to compare latency to service time to try to understand the non-CPU component of service time. But expect Async requests to weaken the correlation – as requests can be expected to be delayed sometimes for reasons unrelated to signalling links.

And finally all this applies equally to links between coupling facilities for structure duplexing.

Anyhow look up the data in your favourite performance reporting tool and try it. You’ll like it!

And wasn’t I naïve when I wrote “Call me nosey [8] but I really want this – as I like to figure out whether machines are close together or in different data centres.” 🙂

  1. Described in The Missing Link? and Coupling Facility Topology Information – A Continuing Journey and System zEC12 CFLEVEL 18 RMF Instrumentation Improvements  ↩

  2. By the way this customer is using CMF. But, apart from how the “OA37826” function is enabled, I don’t expect this to affect the validity of my message.  ↩

  3. And I’ve anonymised it, too. Not that the customer has anything to be embarrassed about.  ↩

  4. Not “as the crow flies” but “as the Infinibird flies”. 🙂 Infinibirds fly rather more like ducks, I mean through ducts. 🙂  ↩

  5. So this is not a case of systematic error.  ↩

  6. Actually that should read “have which latency” but the effect is similar.  ↩

  7. Another example of why I think I can claim to sometimes be doing Infrastructure Architecture.  ↩

  8. In my defence I’d say Shakespeare couldn’t spell his own name consistently and here I am writing “nosey” and “nosy” alternately. 🙂  ↩

Published by Martin Packer

