Mainframe Performance Topics Podcast Episode 26 “Sounding Board”

In Episode 25 I said it had been a long time since we had recorded anything. That was true for Episode 25, but it certainly wasn’t true for Episode 26. What is true is that it’s taken us a long time from start to finish on this episode, and ever so much has happened along the way.

But we ploughed on and our reward is an Episode 26 whose contents I really like.

On to Episode 27!

Here are the unexpurgated show notes. (The ones in the podcast itself have a length limitation; I’m not sure Marna and I do, though.) 🙂

Episode 26 “Sounding Board”

Here are the show notes for Episode 26 “Sounding Board”. The show is called this because it relates to our Topics topic, and because we recorded the episode partly in the Poughkeepsie recording studio where Martin sounded zen, and partly at home.

Where we have been

  • Marna has been in Fort Worth for SHARE back in February

  • Martin has been to Las Vegas for “Fast Start”, for technical sales training, and he got out into the desert to Valley Of Fire State Park

  • Then, in April he “visited” Nordics customers to talk about
    • zIIP Capacity and Performance
    • So You Don’t Think You’re An Architect?
  • But he didn’t get to go there for real. Because, of course, the world was upended by both Covid and Black Lives Matter.

Follow up

  • Chapter markers, discussed in Episode 16. Marna finally found an Android app that shows them – Podcast Addict. Martin investigated that app, and noted it is available on iOS too.

What’s New – a couple of interesting APARs

  • APAR PH21919: NEW FUNCTION – WORKFLOW SUPPORT SAVE JOB OUTPUT

    • When you run a workflow step that invokes a job you can automatically save the job output in a location of your choosing (a z/OS UNIX file directory).

    • The output is in the same format as you’d see in SDSF. This means users can have an automatic, permanent record of the work that was done in a workflow.

    • PTF Numbers are UI68359 for 2.3 and UI68360 for 2.4

  • APAR OA56774 (since 2.2) Provides new function to prevent a runaway sysplex application from monopolizing a disproportionate share of CF resources

    • This APAR has a dependency on CFLEVEL 24.

    • This case is pretty rare, but is important when you have it.

    • It’s not based on CF CPU consumption; it’s based on deteriorating service times to other structures – which you could measure with SMF 74–4 Coupling Facility Activity data.

Mainframe – z15 FIXCATs

  • Important to cover as there are many questions about them.

  • IBM.Device.Server.z15–8561.RequiredService

    • Absolute minimum needed to run on a z15

    • Unfortunately some of the PTFs in that list have been involved in messy PE chains

    • If that happens, involve IBM Service (Bypass PE or ++APAR)

    • Usually the intent is to keep these PTFs to a minimum – and to keep the number of PTFs relatively constant.

      • CORRECTION: System Recovery Boost for z15 GA1 is in the Required category, not the Exploitation category as the recording states!
  • IBM.Device.Server.z15–8561.Exploitation

    • Needed for optional functions, and you can decide when you want to use them.

    • This PTF list could grow – if we add new functions

  • IBM.Device.Server.z15–8561.RecommendedService

    • This is more confusing. Usually these are fixes for defects that have been found but haven’t risen to the level of Required. We might’ve detected them in testing, or a customer might have.

    • Over time this category probably will grow, as field experience increases

    • You might want to run an SMP/E REPORT MISSINGFIX to see what’s in this FIXCAT (see the sketch after this list). You might install some, all, or none of the fixes, or be more selective – based on how much change you want to encounter versus what problems are fixed.

  • By the way there are other FIXCATs you might want to be interested in for z15, e.g. IBM.Function.SYSPLEXDataSharing
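
  As a sketch of what that REPORT MISSINGFIX run might look like (the zone name here is made up – substitute your own target zone):

  SET BOUNDARY(GLOBAL).
  REPORT MISSINGFIX ZONES(MVST100)
         FIXCAT(IBM.Device.Server.z15-8561.RecommendedService).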

Performance – DFSORT And Large Memory

  • A very special guest joins us, Dave Betten, former DFSORT performance lead.

  • Follows on from Elpida’s item in Episode 10 “234U” in 2017, and continues the “Managing Large Memory” theme.

  • Number of things to track these days:
    • Often track Average Free
    • Also need to track Minimum Free
    • Fixed frames – Especially Db2, and now with z/OS 2.4 zCX
    • Large frames – Again Db2 but also Java Heap
  • In z/OS 2.2
    • OPT controls simplified
      • Thresholds set to Auto
      • Default values changed
      • 64GB versus %
  • In z/OS 2.3
    • LFAREA
      • Not reserved anymore but is a maximum
      • BTW the LFAREA value is in SMF 71
      • Dave reminded us of what’s in SMF 71
  • Dave talked about DFSORT memory controls
    • DFSORT has historically been an aggressive user of memory
    • Installation defaults can be used to control that
    • But the EXPOLD parameter needs special care – because of what constitutes “old pages”, which aren’t actually unused.
    • DFSORT Tuning Guide, especially Chapter 3
  • Dave talked about how handy rustling up RMF Overview Reports can be, with several Overview conditions related to memory.

  • Most of the information in this topic is relevant to LPARs of all sizes

Topics – Update on recording techniques

  • Last talked about this in Episode 12, December 2017

  • Planning for podcast – still using iThoughts for outlining the episode (though its prime purpose is mind mapping, and Martin (ab)uses it for depicting various topologies).

  • Recording of podcast – still using Skype to collaborate
    • Record locally on each side, but now Marna’s side is in the new Poughkeepsie recording studio!
    • Martin has moved to Piezo and then Audio Hijack
      • Recording in stereo, with a microphone stand to minimise bumps
      • Has to slow the computer’s fan speed, and has an external cooling pad
      • Also he hides behind pillows to minimise the noise and improve audio quality.
    • For a guest, it’s different. We can’t record in stereo, and guests might not have recording software. But we still use Skype (unless in Poughkeepsie).
  • Production

    • Martin’s editing

      • Moved from Audacity on Mac to Ferrite on iPad OS
      • Moved to iPad so he can edit anywhere, except where there is noise. Apple Pencil helps with precision.
      • Then, throw away the remote side – in stereo terms.
      • Then, perform noise reduction, still not perfect.
  • Publishing

    • Marna’s publishing: Uploading the audio, publishing show notes, still the same as before.

Customer requirements

  • – Insert Usual Disclaimer Here – these are only our own thoughts.

  • RFE 139477 “Please include the CPU Time Limit for a Job/Step in SMF Type 30”

    • The CPU Time Limit in effect for a JobStep is not currently written to SMF Type 30 at the end of the step.

      While a job is running this information is available in the Address Space Control Block (ASCBJSTL) and can be displayed or even modified by tools such as OMEGAMON.

      However the information is not retained after the JobStep completes. This information would be very useful after the fact to see the CPU time limit in effect for a JobStep.

      This enhancement request is to include the information in ASCBJSTL in the SMF Type 30 Subtype 4 record written at the end of the JobStep.

      An additional consideration would be how best to deal with the Job CPU Time Limit (as specified on the JOB statement) and whether this can also be catered for in the RFE.

    • Business justification: Our site got caught out by a Test job being submitted overnight with TIME=1440 and consuming over 6 hours CPU before it was cancelled. We would like to be able to prevent similar issues in future by having the CPU Time Limit data available in SMF.

    • Our comments:

      • After the fact
        • The RFE was calling for “after the fact”, i.e. when the step has ended. We might also like the source of the limit.

        • End of step looks useful. You could run a query comparing the limit to actual CPU time, then track to see if an ABEND is on the horizon.

      • “As it happens”

        • Would like on the SMF Interval as well as Step End records, maybe with tools to dynamically change the parameters.

        • May not need the SMF information if vendor and IBM tools already do it today, making it perhaps not a high enough priority for SMF

        • And the source of the parameters might not be readily available in control blocks so this might not even be feasible.

On the blog

Contacting Us

You can reach Marna on Twitter as mwalle and by email.

You can reach Martin on Twitter as martinpacker and by email.

Or you can leave a comment below. So it goes…

Engineering – Part Four – Activating And Deactivating LPARs Causes HiperDispatch Adjustments

(This post follows on from Engineering – Part Two – Non-Integer Weights Are A Thing, rather than Engineering – Part Three – Whack A zIIP).

I was wondering why my HiperDispatch calculations weren’t working. As usual, I started with the assumption my code was broken. My code consists of two main parts:

  • Code to build a database from the raw SMF.
  • Code to report against that database.

(When I say “my code” I usually say “I stand on the shoulders of giants” but after all these years I should probably take responsibility for it.) 🙂

Given that split the process of debugging is the following:

  1. Check the reporting code is doing the right thing with what it finds in the database.
  2. Check the database accurately captures what was in the SMF records.

Only when those two checks have passed should I suspect the data.

Building the database itself consists of two sub stages:

  1. Building log tables from the raw records.
  2. Summarising those log tables into summary tables. For example, averaging over an hour.

If there is an error in the database build it is often incorrect summarisation.

In this case the database accurately reports what’s in the SMF data. So it’s the reality that’s wrong. 🙂

A Very Brief Summary Of HiperDispatch

Actually this is a small subset of what HiperDispatch is doing, sufficient for the point of this post.

With HiperDispatch the PR/SM weights for an LPAR are distributed unevenly (and I’m going to simplify to a single pool):

  1. If the LPAR’s overall weight allows it, some number of logical processors receive “full engine” weights. These are called Vertical Highs (or VH’s for short). For small LPARs there could well be none of these.
  2. The remainder of the LPAR’s weight is distributed over one or two Vertical Mediums (or VM’s for short).
  3. Any remaining online logical processors receive no weight and are called Vertical Lows (or VL’s for short).

Enigma And Variations

It’s easy to calculate what a full engine’s weight for a pool is: Divide the sum of the LPARs’ weights for the pool by the number of shared physical processors. You would expect a VH logical processor to have precisely this weight.
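
As a concrete illustration, here’s a minimal Python sketch of that arithmetic. The function names and the simplification of the Vertical Medium rules are mine; the numbers are taken from the worked example later in this post (“Head Scratching Time”).

# Minimal sketch of the HiperDispatch weight arithmetic described above.
def full_engine_weight(total_pool_weight, shared_physicals):
    # A "full engine" weight is the pool's total weight divided by
    # the number of shared physical processors in the pool.
    return total_pool_weight / shared_physicals

def vertical_split(lpar_weight, engine_weight):
    # Whole multiples of the engine weight become Vertical Highs; the
    # remainder is spread over one or two Vertical Mediums. (The real
    # PR/SM rules for the VMs are more involved than this sketch.)
    vh_count = int(lpar_weight // engine_weight)
    vm_weight = lpar_weight - vh_count * engine_weight
    return vh_count, vm_weight

print(vertical_split(lpar_weight=410, engine_weight=68.9))  # roughly (5, 65.5)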

But what could cause the result of this calculation to vary? Here the maths is simple but the real-world behaviours are interesting:

  • The number of physical processors could vary. For example, On-Off Capacity On Demand could add processors and later take them away.
  • The total of the weights for the LPARs in the pool could vary.

The latter is what happened in this case: the customer deactivated two LPARs on a machine – to free up capacity for other LPARs to handle a workload surge. Later on they re-enabled the LPARs, IPLing them. I’m not 100% certain, but it seems pretty clear to me that IPLing itself doesn’t affect the weights or take them out of the equation; it’s the deactivation and (re)activation that does.

These were two very small LPARs with 2–3% of the overall pool’s weights each. But they caused the above calculation to yield varying results:

  • The “full engine” weight varied – decreasing when the LPARs were down and increasing when they were up.
  • There was some movement of logical processors between VH and VM categories.

The effects were small. Sometimes a larger effect is easier to debug than a smaller one. For one, it’s less likely to be a subtle rounding or accuracy error.

The conversion of VH’s to VM’s (and back) has a “real world” effect: A VH logical processor is always dispatched on the same physical processor. The same is not true for a VM: while there is a strong preference for redispatch on the same physical, it’s not guaranteed. And this matters because cache effectiveness is reduced when a logical processor moves to a different physical processor.

So, one recommendation ought to be: If you are going to deactivate an LPAR recalculate the weights for the remaining ones. Likewise, when activating, recalculate the weights. In reality this is more a “playbook” thing where activation and deactivation is automated, with weight adjustments built in to the automation. Having said that, this is a “counsel of perfection” as not all scenarios can be predicted in advance.

What I Learnt And What I Need To Do

As for my code, it contains a mixture of static reports and dynamic ones. The latter are essentially graphs, or the makings of graphs – such as CSV files.

Assuming I’ve done my job right – and I do take great care over this – the dynamic reports can handle changes through time. So no problem there.

What’s more difficult is the static reporting. For example, one of my key reports is a shift-level view of the LPAR layout of a machine. In the example I’ve given, it had a hard time getting things right: the weights for individual LPARs’ VH processors went wrong. (The weight of a full processor worked in this case – but only because the total pool weight and number of physical engines didn’t change. That isn’t always the case.)

To improve the static reporting I could report ranges of values – but that gets hard to consume and, besides, just tells you things vary but not when and how. The answer lies somewhere in the region of knowing when the static report is wrong and then turning to a dynamic view.

In particular, I need to augment my pool-level time-of-day graphs with a stack of the LPARs’ weights. This would help in at least two ways:

  • It would show when weights were adjusted – perhaps shifting from one LPAR to another.
  • It would show when LPARs were activated and de-activated.

A design consideration is whether the weights should stack up to 100%. I’ve come to the conclusion they shouldn’t – so I can see when the overall pool’s weight changes. That reveals more structure – and I’m all for not throwing away structure.

Here’s what such a graph might look like:

In this spreadsheet-driven mockup I’ve ensured the “now you see them now you don’t” LPARs are at the top of the stack.
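
To give a flavour of the mockup, here’s a minimal matplotlib sketch of the same idea – with made-up LPAR names and weights, and deliberately not normalised to 100% so the overall pool weight remains visible. (This isn’t my Production code; it’s just the shape of the graph.)

import matplotlib.pyplot as plt

# Made-up hourly weights for four LPARs sharing one pool. LPARC and LPARD
# are the "now you see them, now you don't" LPARs: their weights vanish
# while they are deactivated. Listing them last puts them at the top of the stack.
hours  = list(range(24))
lpar_a = [200] * 24
lpar_b = [150] * 24
lpar_c = [10 if h < 8 or h > 16 else 0 for h in hours]
lpar_d = [10 if h < 8 or h > 16 else 0 for h in hours]

plt.stackplot(hours, lpar_a, lpar_b, lpar_c, lpar_d,
              labels=["LPARA", "LPARB", "LPARC", "LPARD"])
plt.xlabel("Hour of day")
plt.ylabel("Pool weight")   # absolute weights, not percentages
plt.legend(loc="upper left")
plt.show()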

I don’t know when I will get to this in Production code. As now is a particularly busy time with customer studies I probably should add it to my to-do list. But I’ll probably do it now anyway… 🙂

Head Scratching Time

In this set of data there was another phenomenon that confused me.

One LPAR had twelve GCPs online. In some intervals something slightly odd was happening. Here’s an example, from a single interval:

  • Logical Processors 0–4 had polar weights (from SMF70POW as calculated in Engineering – Part Two – Non-Integer Weights Are A Thing) of 68.9. (In fact there was a very slight variation between them.)
  • Logical Processor 5 had a polar weight of 52.9.
  • Logical Processor 6 had a polar weight of 12.6.
  • Logical Processors 7 to 11 had polar weights of 0.

If you tot up the polar weights you get 410 – which checks out as it’s the LPAR’s weight in the GCP pool (obtained from other fields in the SMF 70 record).

Obviously Logical Processors 0, 1, 2, 3, and 4 are Vertical High (VH) processors – and bits 0,1 of SMF70POF are indeed “11”.

That leaves two logical processors – 5 and 6 – with non-zero, non-VH weights. But they don’t have the same weight. This is not supposed to be the case.

Examining their SMF70POF fields I see:

  • Logical Processor 5 has bits 0,1 set to “10” – which means Vertical Medium (VM).
  • Logical Processor 6 has bits 0,1 set to “01” – which means Vertical Low (VL).

But if Logical Processor 6 is a VL it should have no vertical weight at all.

Well, there is another bit in SMF70POF – Bit 2. The description for that is “Polarization indication changed during interval”. (I would’ve stuck a “the” in there but never mind.)

This bit was set on for LP 6. So the LP became a Vertical Low at some point in the interval, having been something else (indeterminable) at some other point(s). I would surmise VL was its state at the end of the interval.

So, how does this explain it having a small but non-zero weight? It turns out SMF70POW is an accumulation of sampled polar weight values, which is why (as I explained in Part Two) you divide by the number of samples (SMF70DSA) to get the average polar weight. So, some of the interval it was a VM, accumulating. And some of the interval it was a VL, not accumulating.
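
For what it’s worth, here’s that calculation as a tiny Python sketch. The field names are the SMF 70–1 fields mentioned above; the sample count and weights are invented.

# Average polar weight for one logical processor over an RMF interval.
# SMF70POW accumulates sampled polar weights; SMF70DSA is the sample count.
def average_polar_weight(smf70pow, smf70dsa):
    return smf70pow / smf70dsa if smf70dsa else 0.0

# Invented numbers: a logical processor that spent part of the interval as
# a VM (accumulating weight) and part as a VL (accumulating nothing) ends
# up with a small non-zero average - the LP 6 effect described above.
samples = 1800
print(average_polar_weight(smf70pow=12.6 * samples, smf70dsa=samples))  # 12.6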

Mystery solved. And Bit 2 of SMF70POF is something I’ll pay more attention to in the future. (Bits 0 and 1 already feature heavily in our analysis.)

This shifting between a VM and a VL could well be caused by the total pool weight changing – as described near the beginning of this post.

Conclusion

The moral of the tale is that if something looks strange in your reporting you might – if you dig deep enough – see some finer structure (than if you just ignore it or rely on someone else to sort it out).

The other, more technical, point is that if almost anything changes in PR/SM terms it can affect how HiperDispatch behaves – and that could cause RMF SMF 70–1 data to behave oddly.

The words “rely on someone else to sort it out” don’t really work for me: The code’s mine now, I am my own escalation route, and the giants whose shoulders I stand on are long since retired. And, above all, this is still fun.

zIIP Capacity And Performance Presentation

A few years ago I built a presentation on zIIP Capacity Planning. It highlighted the need for better capacity planning for zIIPs and outlined precisely why zIIPs couldn’t be run as busy as general purpose processors (GCPs).

Since then a lot has changed. And I, in common with most people, have a lot more experience of how zIIPs perform in customer installations. So, earlier this year I updated the presentation and broadened the title to include Performance.

I was due to “beta” the presentation at a user group meeting in London in March. Further, I was due to present it to a group of customers in Stockholm in May. The former, understandably, was cancelled. The latter happened as a Webex.

The essential thesis of the presentation is that zIIP Capacity and Performance needs a lot more care than most customers give it, particularly for CPU-stringent consumers such as the Db2 engine (MSTR and DBM1). (Actually I’ve talked about Db2 and its relationship with zIIP in Is Db2 Greedy?.)

What’s really new about this presentation is a shift in emphasis towards Performance, though there is plenty on Capacity. And one key aspect is LPAR Design. For example, to aid the “Needs Help” mechanism where a General Purpose Processor (GCP) aids a zIIP, some LPARs might need to forego access to zIIP. This might be controversial – as you want as much zIIP exploitation as possible. But for some LPARs giving them access to zIIP makes little or no sense. Meanwhile other LPARs might need better access to zIIP.

The presentation is also updated in a few key areas:

  • More comprehensive and up to date treatment of Db2 – and if you are a Db2 customer you really should pay attention to this one. (I’m grateful to John Campbell and Adrian Burke for their help with this topic.)
  • zCX Container Extensions in z/OS 2.4. This can be a major consumer of zIIP. Obviously this needs to be planned for and managed.
  • z15 System Recovery Boost (SRB). I’m looking forward to seeing how much this speeds up IPLs – and I think I’m going to have to refurbish my IPL/Restart detection code to do it justice. I also think you will want to consider how an event affects the other LPARs sharing the zIIP pool.

As with So You Don’t Think You’re An Architect?, I’m planning on evolving the presentation over time – and the above list shows how I’ve already done it. I’m also interested in giving it to any audience that wants it. Let me know if it would be of interest and I’ll see what I can do.

In the mean time, here’s the presentation: “zIIP Capacity And Performance” presentation

So You Don’t Think You’re An Architect?

Every year I try to write one new presentation. Long ago, it feels like, I started on my “new for 2020” presentation. It’s the culmination-so-far 🙂 of my “architecture thing”.

“What Architecture thing?” some of you might be asking.

It’s quite a simple idea, really: It’s the notion that SMF records can be used for far more than just Performance, even the ones (such as RMF) that were notionally designed for Performance. A few years ago I wrote a presentation called “How To Be A Better Performance Specialist” where I pushed the germ of this notion in two directions:

  • Repurposing SMF for non-Performance uses.
  • Thinking more widely about how to visually depict things.

The first of these is what I expanded into this “Architecture” idea. (The second actually helps quite a bit.) But I needed some clear examples to back up this “who says?” notion.

My day job – advising customers on Performance matters – yields a lot of examples. While the plural of “anecdote” isn’t “data”, the accumulation of examples might be experience. And boy do I have a lot of that now. So I set to writing.

The presentation is called “So You Don’t Think You’re An Architect?” A good friend of mine – who I finally got to meet when I did a customer engagement with him – thought the title a little negative. But it’s supposed to be a provocative statement. Even if the conclusion is “… and you might be right”. So I’ve persisted with it (and haven’t lost my friend over it). 🙂

I start at the top – machines and LPARs – and work my way down to the limits of what SMF 30 can do. I stop there, not really getting much into the middleware instrumentation for two reasons:

  • I’ve done it to death in “Even More Fun With DDF”.
  • This presentation is already quite long and intensive.

On the second point, I could go for 2 hours, easily, but I doubt any forum would let me do a double session on this topic. Maybe this is the book I have in me – as supposedly everybody does. (Funnily enough I thought that was “SG24–2557 Parallel Sysplex Batch Performance”. Oh well, maybe I have two.) 🙂

One hour has to be enough to get the point across and to show some actual (reproducible) examples. “Reproducible” is important as it is not (just) about putting on a show; I want people to be able to do this stuff and to get real value out of it.

One criticism I’ve faced is that I’m using proprietary tools. That’s for the most part true. Though sd2html, An Open Source WLM Service Definition Formatter is a good counter-example. I intend to do more open sourcing, time permitting. And SMF 30 would be a good target.

So, I’ve been on a long journey with this Architecture thing. And some of you have been on bits of the journey with me, for which I’m grateful. I think the notion we can glean architectural insight from SMF has merit. The journey continues as recently I’ve explored:

I’ll continue to explore – hence my “culmination-so-far” quip. I really don’t think this idea is anything like exhausted. And – in the spirit of “I’ll keep revising it” I’ve decided to put the presentation in GitHub. (But not the raw materials – yet.) You can find it here.

You might argue that I risk losing speaking engagements if I share my presentation. I have to say this hasn’t happened to me in the past, so I doubt it makes much difference now. And this presentation has already had one outing. I expect there will be more. And anyway the point is to get the material out. Having said that, I’m open to webcasting this presentation, in lieu of being able to travel.

IMS Address Space Taxonomy

(I’m grateful to Dougie Lawson for correcting a few errors in the original version of this.)

I don’t often write about IMS and there’s a good reason for it: Only a small proportion of the customers I deal with use it. I regard IMS as being one of those products where the customers that have it are fanatical – in a good way. 🙂

So when I do get data from such a customer I consider it a golden opportunity to enhance my tooling. And so it has been recently. I have a customer that is a merger of three mainframe estates – and I have data from two of the three heritages. Both of these have IMS.

This merger happened long ago but, as so often happens, the distinct heritages are evident. In particular, the way they set up the IMS systems and regions differs.

You can, to a first approximation, separate IMS-related address spaces into two categories:

  • IMS System Address Spaces
  • IMS Application Regions

In what follows I’ll talk about both, referencing what you can do with SMF 30, specifically. Why SMF 30? Because processing SMF 30 is a scalable method for classifying address spaces, as I’ve written about many times before.

IMS System Address Spaces

IMS system address spaces run with program name “DFSMVRC0” and there are several different address spaces. For example, over 30 years ago the “DL/I SAS” address space became an option – to provide virtual storage constraint relief. It’s been mandatory for a long time. Also there is a DBRC address space. All have the same program name.

The system address spaces have Usage Data Sections which say “IMS”. The Product Version gives the IMS version. In this customer’s case one part of the estate says “V15” and the other part “V14”.

The IMS Control Region is the only system address space that can attach to Db2 or MQ. So, if the program name is “DFSMVRC0” and there are Usage Data Sections for either Db2 or MQ we know this is the Control Region. But this isn’t always going to be the case – as some IMS environments connect to neither Db2 nor MQ. So here the Product Qualifier field can be helpful:

  • Both DBRC and Control Region address spaces have a Product Qualifier of “TM”. But you can’t necessarily tell them apart from things like I/O rates. However, you might expect a DBRC address space to have a name with something like “DBR” in. (I’m not wowed by that level of fuzziness.)
  • A DL/I SAS has Product Qualifier “DBCTL”.

I’m going to treat IRLM as an IMS System Address Space, when really it isn’t. This is the lock manager – and it’s the same code whether you’re running IMS or Db2. The program name is DXRRLM00 and there is little in SMF to distinguish an IRLM for IMS from one for a Db2 subsystem. (In fact which Db2 an IRLM address space is associated with isn’t in SMF either.) The best my code can do is parse job names, Service Class names, Report Class names, and so on, for “IMS” or, still worse, “I” but no “D”.

IMS Application Regions

IMS application address spaces – whether MPRs or BMPs – run with program name “DFSRRC00”. They also have Usage Data Sections that say “IMS” but don’t – in the Product Qualifier field – say anything about the subsystem they’re using. Similarly, when CICS attaches to IMS its Product Qualifier isn’t helpful.

To my mind the distinction between an MPR (Message Processing Region) and a BMP (Batch Message Processor) is subtle. For example I’ve seen BMPs that sit there all day, fed work by MQ. You probably would glean something from Service Classes and Report Classes. Relying on the address space name is particularly fraught.

Two Diverse IMS Estates

This latest customer has two contrasting styles of IMS environment, mainly in their testing environments:

  • One has lots of very small IMS environments.
  • The other has fewer, larger testing environments.

Also, as I noted above, one estate is IMS V14 and the other is V15. This does not appear to be a case of V15 in Test/Development and V14 in Production.

So I guess their testing and deployment practices differ – else this would’ve been homogenised.

I’m going to enjoy talking to the customer about how these two different configurations came to be.

Conclusion

IMS taxonomy can be done – but it’s much messier than Db2 and MQ. It relies a lot on naming conventions and spotting numerical dynamics in the data.

Note: For brevity, I haven’t talked about IMS Datasharing. That would require me to talk at length about XCF SMF 74–2 and Coupling Facility SMF 74–4. Something else I haven’t discussed is “Batch DL/I” – where a batch job is its own IMS environment. This is rather less common and I haven’t seen one of these in ages.

I would also say, not touched on here, that SMF 42–6 would yield more clues – as it documents data sets.

And, of course serious IMS work requires its own product-specific instrumentation. Plus, as Dougie pointed out to me, the Procedure JCL.

sd2html, An Open Source WLM Service Definition Formatter

There have been a number of WLM Service Definition formatters over the years. So why do we need another one?

Well, maybe we don’t but this one is an open source one, covered by the MIT licence. That means you can change it:

  • You could contribute to the project.
  • You could modify it for your own local needs.

While IBM has other WLM Service Definition Formatters, it was easy to get permission to open source this one.

It’s the one I started on years ago and have evolved over the many engagements where I’ve advised customers on WLM.

If it has an unusual feature it’s that I’ve stuck cross links in wherever I can – which has made it easier for me to use. For example, everywhere a Service Class name appears I have a link to its definition. Similarly, a Classification Rule points to the Report Class definition it references.


Installing sd2html


sd2html is a single PHP script, originally run on a Linux laptop and then on a MacBook Pro. Both platforms come with web servers and PHP built in. In the Mac’s case it’s Apache.

So, to use it you need to provide yourself with a tame (perhaps localhost) web server. It needs to run PHP 7.

Place sd2html.php somewhere that it can be run by the web server.


Extracting A WLM Service Definition


In my experience, most customers are still using the ISPF WLM Application. There is a pull-down menu to print the Service Definition. Choose the XML option and it will write to an FB 80 sequential file. This you need to place on the web server, as previously mentioned.

Customers send me their WLM Service Definitions in this format, downloaded with EBCDIC to ASCII translation. It’s easy to email this way.

When I receive the file it looks broken. I keep reassuring customers it isn’t, because I can one-line it, throwing away the new line characters. This used to be a fiddle in my editor of choice – then Sublime Text, now BBEdit. That works well.

But I’ve eliminated the edit step: sd2html now does the edit for me, before passing the repaired text onto the XML parser. (Originally the XML parser read the file on disk directly. Now the code reads the file in, removes the new lines, and then feeds the result to the XML parser.)
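
If you wanted to do that repair yourself before feeding the file to something else, the idea is only a few lines – here expressed as a Python sketch rather than the actual PHP inside sd2html (the file name matches the example below):

import xml.etree.ElementTree as ET

# The downloaded Service Definition arrives split across fixed-length
# records, so join the lines back together before parsing the XML.
with open("wlm.xml") as f:
    one_line = "".join(line.rstrip("\r\n") for line in f)

root = ET.fromstring(one_line)
print(root.tag)   # the Service Definition's root element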


Using sd2html


So you’ve got the Service Definition accessible by your PHP web server. Now what?

From a browser invoke sd2html on your web server with something like

http://localhost/sd2html.php?sds=wlm.xml

You obviously need to adjust the URL to point to sd2html. Also the sds query string parameter needs to point to your WLM Service Definition file.

Then browse to your heart’s content, following links which you’ll find in two places:

  • The table of contents at the beginning.
  • Within the document.

Open Sourcing sd2html


I said in filterCSV, An Open Source Preprocessor For Mind Mapping Software I had another open source project in the works. sd2html is it. I have one more piece of code that will need a lot of work to open source – but I think mainframers will like it. And two more potential ones – that aren’t specific to mainframes.

So, I welcome contributions to sd2html, or even just comments / ideas / requirements. Specifically, right now, I’d value:

  • Documentation writing. (This post is all there is right now.)
  • Early testers.
  • Creative ideas.
  • People who know PHP better than I do.
  • People who can think of how to handle the national language characters that show up from time to time.

Anyhow, try it if you can and let me know what you think.

filterCSV, An Open Source Preprocessor For Mind Mapping Software

I have a number of ideas for things I want to open source, some directly related to the day job and some not. This post is about one piece of software that I use in my day job but which you probably wouldn’t recognise as relevant to mainframe performance.

To me the rule of thumb for candidates for open sourcing is clear: Something of use outside of IBM but with little-to-no prospect of being commercialised.

filterCSV is just such a piece of software.


What’s The Original Point Of filterCSV?


filterCSV started out as a very simple idea: Our processing often leads to CSV (Comma Separated Value) files with a tree structure encoded in them.

This tree structure enables me to create tree diagrams in iThoughts. iThoughts is mind mapping software I’m (ab)using to draw tree diagrams. Whereas most people create mind maps by hand, I’m bulk loading them from a CSV file. Strictly speaking, I’m not creating a mind map – but I am creating a tree.

iThoughts has a variant of CSV for importing mind maps / trees. It’s documented here. It’s a very simple format that could be confected by any competent programmer, or from a spreadsheet.

So, to filterCSV: I’ve got in the habit of colouring the nodes in the tree I create in iThoughts. Originally I did it by hand but that doesn’t scale well as a method. If I discern a bunch of nodes (perhaps CICS regions) are part of a group I want to colour them all at once.

The very first piece of filterCSV, which is a Python 3 script, compared nodes to a regular expression. If they matched they’d be coloured with a specified RGB value – by altering the CSV file. I would import this altered CSV file into iThoughts.
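
The heart of that first version really is just a regular expression and a CSV rewrite. Here’s a minimal sketch of the idea – not the actual filterCSV source, and with an invented, much simpler column layout than the real iThoughts CSV format:

import csv
import re
import sys

# Colour any node whose text matches the pattern given on the command line.
# Assumes node text is in the first column and the colour in the second.
pattern, colour = sys.argv[1], sys.argv[2]      # e.g. 'RC: CICSAORS' FFBDAA
rows = list(csv.reader(sys.stdin))
for row in rows:
    if len(row) > 1 and re.search(pattern, row[0]):
        row[1] = colour
csv.writer(sys.stdout).writerows(rows)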

In a real customer engagement this saves a lot of time: For CICS regions the nodes have the string “RC: CICSAORS” in, for example. “RC” is short for “Report Class”, of course. So the following works quite well as a command line invocation:


filterCSV < input.csv > output.csv 'RC: CICSAORS' FFBDAA


So every node with “RC: CICSAORS” in its text gets coloured with RGB value FFBDAA.

If I keep going with this I can find rules that colour all the CICS regions. Then I understand them much better.


Open Sourcing filterCSV


Let’s generalise the idea: You might be creating a mind map and want to colour some nodes, based on a readily-codifiable criterion. Here’s what you do:

  1. You export the mind map from iThoughts in CSV format.
  2. You throw the CSV file through filterCSV, specifying the regular expression and the colour on the command line.
  3. You import the resulting CSV file into iThoughts.

I don’t know how many users of mind mapping software want to do this, but I bet I’m not alone in wanting it. If the effort to open source it is minimal it makes sense to do it, rather than accepting I’m going to be the only user.

So, I put it on IBM’s internal GitHub site – and I was pleased when Christian Clauss of IBM Zurich joined me in the effort. He’s brought a lot of experience and, in particular, testing knowledge to bear.

Then I got permission to open source filterCSV. This turned out to be very straightforward for a number of reasons:

  • IBM is keen on open sourcing stuff.
  • There is no prospect of this becoming product code.
  • The process for open sourcing when there are no dependencies is streamlined.

I’ll also say this is a good practice run for open sourcing things that are of wider interest – and particularly for the mainframe community. Which is something I really want to do.

So it’s now a project on GitHub. I subsequently went through the process with another one – which I’ll talk about in another blog post.


filterCSV Has Morphed Somewhat


I realised a couple of things while developing filterCSV:

  1. It’s not just iThoughts that the method could be applied to.
  2. What I really have is a tree manipulation tool. In fact that’s essentially what mind mapping software is.

It’s the combination of those two points that made me think the tool could be more generally useful. So here are some things I’ve added to make it so:

  • It can import flat text – creating a tree using indentation. That can include Markdown using asterisks for bullets.
  • It can import XML.
  • You can delete nodes that match a regular expression.
  • You can change the shape of a matching node, or its colour.
  • You can write HTML in tabular or nested list form.
  • You can write XML – either OPML or Freemind.
  • You can promote nodes up the hierarchy, replacing their parents.
  • You can spread Level 0 nodes vertically or horizontally. This helps when you have multiple trees.

Craig Scott, the developer of iThoughts, kindly gave me the RGB values for iThoughts’ colour palette. So now you can specify a colour number in the palette. (You can actually go “next colour” (or “nc” for short), which is quite a boon when you have multiple regular expression rules.)

Some of these things came to me while using filterCSV in real life; the experience of actually using something you built is useful.


Conclusion


So this has been a fun project, where I’ve learnt a lot of Python. I continue to have things I want to do to filterCSV, including something that should be out in the very next few days. The general “tree manipulation” and “adjunct to iThoughts” ideas seem to have merit. And I’m enjoying using my own tooling.

If you fancied contributing to this open source project I’m open to that. In any case you can find it here on GitHub. The latest release is 1.2 and 1.3 should, as I say, be out soon.

And I have plenty of ideas for things to enhance filterCSV.

Is Db2 Greedy?

If you want really good Db2 performance you follow the guidelines from Db2 experts.

These guidelines contain such things as “ensure Db2 has excellent access to zIIP” and “put the Db2 address spaces up high in the WLM hierarchy”. Some of these rules come as a bit of a shock to some people, apparently.

If you take them all together it sounds like Db2 is greedy. But is it really? This blog post seeks to answer that question.


Why Is Db2 Performance So Important?


It’s a good idea to understand how important Db2’s performance is – or isn’t. Let’s assume at least some of the work connecting to Db2 is important to the business. Then that work’s performance depends on Db2’s own performance.

The principle is the server needs to have better access to resources than the work it serves.

Let’s take two examples:

  • If writing to the Db2 logs slows down commits will slow down – and the work that wants to commit will slow down.
  • If Prefetch slows down the work for which the prefetch is done will slow down. Ultimately – if we run out of Prefetch Engines – the work will revert to synchronous I/O.

So, in both these cases we want the appropriate Db2 components to run as fast as possible.

“Ah”, you might say, “but a lot of this work is asynchronous”. Yes, but here’s something you need to bear in mind: Asynchronous work is not necessarily completely overlapped. There’s a reason we have time buckets in Accounting Trace (SMF 101) for asynchronous activities: A sufficiently slowed down async activity can indeed become only partially overlapped. So it does matter.


What Resources Are We Talking About?


In both the above examples zIIP performance comes into play:

  • Since Db2 Version 10, Prefetch Engines (which are really Dependent Enclaves) are 100% eligible for zIIP. Similarly Deferred Write Engines. These run in the DBM1 address space.
  • Since Db2 Version 11, Log Writes are similarly eligible. These are issued from the MSTR address space.

Note: The number of Prefetch Engines (or rather their limit) is greatly increased in Db2 Version 12 – from 600 to 900. I think this provides some headroom for surges in requests – but it doesn’t obviate the need for DBM1 to have very good access to zIIP.

By the way Robert Catterall has a very good discussion on this in Db2 for z/OS Buffer Pools: Clearing the Air Regarding PREFETCH DISABLED – NO READ ENGINE.

Note Also: Everything I’ve said about Prefetch Engines is true of Deferred Write Engines.

The above has been about zIIP but almost all the same things apply to General Purpose CPU. The two go together in the WLM Policy: Delay for CPU and Delay For zIIP are part of the denominator in the calculation of Velocity.

By the way, just because a Delay sample count is low doesn’t necessarily mean there was no delay. Furthermore, don’t be overly reassured if the zIIP-on-GCP number is low: In both cases there can be zIIP Delay in between samples. Further, work not crossing over to a GCP can still hide delay in getting dispatched on a zIIP.

Further, there are scenarios where a zIIP doesn’t get help from a GCP:

  • The logical zIIP has to be dispatched on a physical to ask for help. This is less timely for a Vertical Medium (VM) and still less so for a Vertical Low (VL) – compared to a Vertical High (VH).
  • The GCP pool might itself be very busy and so might not offer to help the zIIPs.

(By the way zIIPs ask for help; GCPs don’t go round trying to help.)

Of course, CPU is not the only category of resource Db2 needs good access to. Memory is another. Db2 is a great exploiter of memory, getting better with each release. By Version 10 virtually all of it was 64-bit, and long-term page-fixed 1MB pages are now the norm for buffer pools.

It remains important to provision Db2 with the memory it needs. For example, it’s best to prevent buffer pools from paging.

Usually Db2 is the best place to consider memory exploitation, too.

Then there’s I/O. Obviously fast disk response helps give fast transaction response. And technologies like zHyperWrite and zHyperLink can make a significant difference; Db2 is usually an early exploiter of I/O technology.

For brevity, I won’t go into Db2 Datasharing requirements.


Db2 And WLM


The standard advice for Db2 and WLM is threefold:

  • Place the IRLM (Lock Manager) address space in SYSSTC – so above all the application workload and the rest of the Db2 subsystem.
  • Place the DBM1, MSTR, and DIST address spaces above all the applications, with the possible exception of genuine CICS Terminal Owning Regions (TORs), using CPU Critical to further protect their access to CPU.
  • Understand that DDF work should be separately classified from the DIST address space and should be below the DIST address space in the WLM hierarchy.

“Below” means of a lower Importance, not necessarily a lower Velocity. Importance trumps the tightness of the goal.

While we’re talking about Velocity, it’s important to keep the goal realistic. In my experience I/O Priority Queuing keeps attained Velocity higher than it would be without it enabled, because the samples are dominated by Using I/O. This also means CPU and zIIP play less of a role in the Velocity calculation.
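
To make the Velocity arithmetic concrete, here’s a tiny Python sketch with invented sample counts. It shows how a large Using I/O term can dominate the calculation, leaving CPU and zIIP delay with little influence on the attained Velocity:

# Execution Velocity = 100 * Using / (Using + Delay), over the sampled states.
def velocity(using_samples, delay_samples):
    return 100 * sum(using_samples) / (sum(using_samples) + sum(delay_samples))

# Invented counts, in the order CPU, zIIP, I/O. With I/O samples included the
# attained Velocity looks comfortable...
print(velocity(using_samples=[50, 10, 900], delay_samples=[40, 30, 20]))  # about 91

# ...but the same CPU and zIIP samples on their own tell a different story.
print(velocity(using_samples=[50, 10], delay_samples=[40, 30]))           # about 46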

Follow these guidelines and you give Db2 the best chance in competing for resources – but still no guarantee a heavily constrained system won’t cause it problems.


What Of Test Db2 Subsystems?


Perhaps a Test* Db2 isn’t important. Well it is to the applications it serves, in the same way that a Production Db2 is to a Production workload. So two things flow from this:

  • The test Db2 should have better access to resources than the work it supports.
  • If you want to run a Production-like test you’d better give the Db2 that’s part of it good access to resources.

That’s not to say you have to treat a Test Db2 quite as well as you’d treat a Production Db2.


What Of Other Middleware?


Db2 isn’t necessarily exceptional in this regard. MQ and IMS are certainly similar. For MQ we’re talking about the MSTR and CHIN address spaces. For IMS the Control Region and SAS, at least.

You’d want all of these to perform well – to serve the applications they do.


Conclusion


Despite apparently strident demands, I wouldn’t say Db2 is greedy. I entirely support the notion that it needs exceptionally good access to resources – because of how its clients’ performance is so dependent on this. But there are indeed other things that play an even more central role – such as the address spaces in SYSTEM and those that are in SYSSTC.

One final point: I said I wouldn’t talk about Datasharing but I will just say that an ill-performing member can drag down the whole group’s performance. So we have to take a sysplex-wide view of resource allocation.

Let’s Play Master And Servant

I’ve been experimenting with invoking services on one machine from another. None of these machines have been mainframes. They’ve been iPads, iPhones, Macs and Raspberry Pis.

I’ve used the plural but the only device I actually have two of is the iPhone. And that plays a part in the story.

For the purpose of this post you can view iPads and iPhones the same.

Actually mainframers needn’t switch off – as some of this might be a useful parallel.

Why Would You Invoke Services On Another Machine?

It boils down to two things, in my view:

  • Something the other machine has.
  • Something the other machine can do.

I have examples of both.

At this point I’m going to deploy two terms:

  • Heterogeneous – where the architecture of the two machines is different.
  • Homogeneous – where the architecture of the two machines is the same.

You can regard Pi calling iOS as heterogeneous. You can regard iOS calling iOS as homogeneous. There is quite a close relationship between these two terms and the “has” versus “can do” thing. But it’s not 100% true.

On “has” I would probably only invoke a service on another device of the same type if it had something that wasn’t on the invoking machine. Such as a calendar. So, for example, I could copy stuff from my personal calendar into my work calendar.

On “can do” I would invoke a service on another machine of a different type if it can do something the invoking machine can’t do. For example, a Raspberry Pi can’t create a Drafts draft – but iOS (and Mac) can.

But Pi can do a lot of things iOS can do, and vice versa. So sometimes “has” comes into play in a heterogeneous way. (I don’t have a good example of this – and it’s not going to feature any further in this post.)

Now let’s talk about some experiments – that I consider “Production-ready”.

Automating Raspberry Pi With Apache Webserver

There are certain things Raspberry Pi – which runs Linux – can do that iOS can’t. One of them is use certain C-based Python libraries. One I use extensively is python-pptx – which creates PowerPoint presentations. My Python code uses this library to make presentations from Python. Normally it would run on my Mac.

At 35,000 feet I can run my Pi on a battery and access it on a Bluetooth network. I can SSH in from my iPad and also have the Pi serve web pages to the iPad.

So I don’t need to pull out my nice-but-cumbersome Mac on a plane.

SSH is my method of choice for administering the Pi. But what of invoking services? Here’s where Apache (Webserver) comes in handy. It’s easy to get Apache with PHP up and running. In my case I use URLs of the form http://pi4.local/services/phpinfo.php to access it from iOS:

  • I can use the Safari web browser.
  • I can use a Shortcuts action to GET or POST – in fact all HTTP verbs.
  • I can write JavaScript in Scriptable to access the webserver.
  • and so on.

Here’s a very simple PHP example:

<?php
phpinfo();
?>

I’ve saved this as phpinfo.php in /var/www/html/services on the Pi. So the URL above is correct to access it.

I don’t want to get into PHP programming but the first and the last line are brackets around PHP code. Outside of these brackets could be HTML, including CSS and javascript. phpinfo() itself is a call to a built in routine that gives information about the PHP setup in the web server. (I use it to validate PHP is even installed properly and accessible from the web server.)

The point of this example is that it’s very simple to set up a web server that can execute real code. PHP can access, for example, the file system.

Here’s another example:

<?php
// Tabulate every query string parameter passed to this script.
print("<table>\n");
print("<tr><th>Parameter</th><th>Value</th></tr>\n");
foreach ($_GET as $key => $value) {
  print("<tr><td>$key</td><td>$value</td></tr>\n");
}
print("</table>\n");
?>

This is queryString.php and it illustrates one more important point: You can pass parameters into a script on the Pi, using the query string portion of the URL. This example tabulates any passed in parameters.
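
For example, invoking it with a couple of made-up parameters

http://pi4.local/services/queryString.php?jobname=MYJOB&status=done

produces a small table echoing jobname and status back.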

So the net of this is you can automate stuff on the Raspberry Pi (or any Linux server or even a Mac) if you set it up as a web server.

Automating iOS With Shortcuts And Pushcut Automation Server

Automating an iOS device from outside is rather more difficult. There are several components you need:

  1. Shortcuts – which is built in to both iPad OS and iOS.
  2. A way to automatically kick off a shortcut from another device.
  3. Some actual shortcuts to do the work.

Most people reading this are familiar with items 1 and 3. Most people won’t, however, know how to do item 2.

This is where Pushcut comes to the rescue. It has an experimental Automation Server. When I say “experimental”, Release 1.11 – three weeks before the time of writing – introduced it as “experimental”, but there have been enhancements released since then. So I guess and hope it’s here to stay, with a few more enhancements perhaps coming.

Before I go on I should say some features of Pushcut require a subscription – but it’s not onerous. I don’t know if Automation Server is one of them – as I happily subscribed to Pushcut long before Automation Server was a thing. There’s a lot more to Pushcut than what I’m about to describe.

When you start up Automation Server it has access to all your Shortcuts, ahem, shortcuts. You can invoke one using a URL of the form

https://api.pushcut.io/<my-secret>/execute?shortcut=Hello%20World&input=Bananas%20Are%20Nice

The input parameter is the shortcut’s input parameter. You don’t have to have an input parameter, of course, so just leave off the Bananas%20Are%20Nice part. (Notice it is percent encoded, which is easy to do, as is the shortcut name.)

The above invokes a (now obsolete version of my) Hello World shortcut. Obsolete because it now turns the input into a dictionary.

JSON is passed in thus:

https://api.pushcut.io/<my-secret>/execute?shortcut=Hello%20World&input={"a":"123%20456,789"}

Again it’s percent encoded.

On Pi (via SSH from Mac) you can use the Lynx textual browser, wget, or curl. Actually, on Mac, you can do it natively – if you want to.

When you use the URL, Pushcut – via the web – invokes the Automation Server on your device. “Via the web” is important as you can’t use the Automation Server technique without Internet access. You can, however, use another function of Pushcut – Notifications-based running of a shortcut.

Curl Invocation

Curl invocation is a little more complex than invoking using a web browser. Curl is a command line tool to access URLs. Here is how I invoke the same shortcut, complete with a JSON payload:

curl -G -i 'https://api.pushcut.io/<my-secret>/execute' --data-urlencode 'shortcut=Hello World' --data-urlencode 'input={"a":"123 456,789"}'

Note the separation of the query string parameters and their encoding. I found this fiddly until I got it right.
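
If you’d rather let a library do the encoding for you, here’s the same invocation as a minimal Python sketch. The secret and the shortcut name are placeholders, as before:

import json
import urllib.parse
import urllib.request

# urlencode() takes care of the percent-encoding that is fiddly by hand.
params = urllib.parse.urlencode({
    "shortcut": "Hello World",
    "input": json.dumps({"a": "123 456,789"}),
})
url = "https://api.pushcut.io/<my-secret>/execute?" + params

with urllib.request.urlopen(url) as response:
    print(response.status, response.read().decode())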

An Automation Server

It’s worth pointing out that Automation Server should run on a “dedicated” device. In practice few people will have these. So I’ve experimented with what that actually means.

First, buying a secondhand device that is good enough to run iOS 12 (the minimum supported release, and not recommended) is prohibitively expensive. In the UK I priced them above £100, so I’d need a much better use case than I have.

Second, I don’t have a spare device lying around.

But I do have a work phone (that I donated, actually) that rarely gets calls. It does get notifications but those don’t seem to stop Automation Server from functioning just fine.

So this approach might not be available to many people.

Automating Drafts From Raspberry Pi

One good use case is being able to access my Drafts drafts from my Raspberry Pi. Drafts is an excellent environment for creating, storing and manipulating text. I highly recommend it for iPhone, iPad and Mac users. Again a well-worth-it subscription could apply.

So, my use case is text access on the Pi. I might well create a draft on iOS from the Pi, perhaps storing some code created on the Pi.

So I’ve written shortcuts that:

  • Create a Drafts draft.
  • Retrieve the UUIDs (identifiers) of drafts that match a search criterion. (In the course of writing this post I upgraded this to return a JSON array of UUID / draft title pairs.)
  • Move a draft to the trash (as the way of deleting it).

Here’s what the first one of these looks like:

It’s a simple two action shortcut:

  • Create the draft.
  • Return the UUID (unique ID) of the newly created draft.

The second action isn’t strictly necessary but I think there will be cases where the UUID will be handy. For example, deleting the draft requires the UUID. So this action returns the UUID as a text string.

The first action is the interesting one. It takes the shortcut’s input as the text of the draft and creates the draft. Deselecting “Show When Run” is important as otherwise the Drafts user interface will show up and you don’t want that with an Automation Server. (It’s likely to temporarily stall automation.)

So this is a nice example where iOS can do something Raspberry Pi (or Linux) can’t. This is not a Mac problem as Drafts has a Mac client.

Outro

A couple of other things:

I did a lot of my testing on a Mac – both directly as a client and also by SSH’ing into the Pi. (Sometimes I SSH’ed into the Pi on both iPad and iPhone using the Prompt app.)

By the way, the cultural allusion in the title is to Depeche Mode’s Master And Servant. I’ve been listening to a lot of Depeche Mode recently.

I’ve titled this section “Outro” rather than “Conclusion” because I’m not sure I have one – other than it’s been a lot of fun playing with one machine calling another one. Or playing Master and Servant. 🙂

My Generation?

Ten years ago in Channel Performance Reporting I talked about how our channel-related reporting had evolved.

That was back when the data model was simple: You could tell what type each channel was and what attached to it:

  • Field SMF73ACR in SMF Type 73 (RMF Channel Path Activity) told you what the channel was – for example, FICON Switched (“FC_S”). This was an example I gave in that blog post.
  • SMF Type 78 Subtype 3 (RMF I/O Queuing Activity) gave you the control units attached to each channel. (You had to aggregate them through SMF 74 Subtype 1 (RMF Device Activity) to relate them to real physical controllers.)

SMF 74 Subtype 7 FICON Switch Statistics

In that post I talked about SMF 74 Subtype 7 (FICON Director Statistics) as a possible extension to our analysis. I mapped this record type pretty comprehensively. Unfortunately so few customers have 74–7 enabled that I haven’t got round to writing analysis code.

I’m grateful to Steve Guendert for information on how to turn on these records:

RMF 74–7 records are collected for each RMF interval if

  • The FICON Management Server (FMS) license is installed on the switching device. (This is the license for FICON CUP)
  • The FMS indicator has been enabled on the switch
  • The IOCP has been updated to include a CNTLUNIT macro for the FICON switch (2032 is the CU type)
  • The IECIOSnn parmlib member is updated to include “FICON STATS=YES”
  • “FCD” is specified in the ERBRMFnn parmlib member

With these instructions I’m hoping more customers will turn on 74–7 records.

Talking About My Generation

What I don’t think we had 10 years ago was a significant piece of information we have today: Channel Generation. This is field SMF73GEN in SMF Type 73.

This gives further detail on how the channel is operating.

There isn’t a complete table anywhere for the various codes but here’s my attempt at one:

Value   Meaning
0       Unknown
1       Express2 at 1 Gbps
2       Express2 at 2 Gbps
3       Express4 at 1 Gbps
4       Express4 at 2 Gbps
5       Express4 at 4 Gbps
7       Express8 at 2 Gbps
8       Express8 at 4 Gbps
9       Express8 at 8 Gbps
17      Express8S at 2 Gbps
18      Express8S at 4 Gbps
19      Express8S at 8 Gbps
21      Express16S at 4 Gbps
22      Express16S at 8 Gbps
23      Express16S at 16 Gbps
24      Express16S at 16 Gbps
31      Express16S+ at 4 Gbps
32      Express16S+ at 8 Gbps
33      Express16S+ at 16 Gbps
34      Express16S+FEC at 16 Gbps

Actually that “24” entry worries me.

By the way I’m grateful to Patty Driever for fishing many of these out for me.
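
If you wanted to decode SMF73GEN yourself, the table above drops straight into a small Python dictionary. (Extracting the field from the raw record is left out; this is just the lookup.)

# Channel generation / speed lookup for SMF73GEN, from the table above.
SMF73GEN_MEANING = {
    0: "Unknown",
    1: "Express2 at 1 Gbps", 2: "Express2 at 2 Gbps",
    3: "Express4 at 1 Gbps", 4: "Express4 at 2 Gbps", 5: "Express4 at 4 Gbps",
    7: "Express8 at 2 Gbps", 8: "Express8 at 4 Gbps", 9: "Express8 at 8 Gbps",
    17: "Express8S at 2 Gbps", 18: "Express8S at 4 Gbps", 19: "Express8S at 8 Gbps",
    21: "Express16S at 4 Gbps", 22: "Express16S at 8 Gbps",
    23: "Express16S at 16 Gbps", 24: "Express16S at 16 Gbps",
    31: "Express16S+ at 4 Gbps", 32: "Express16S+ at 8 Gbps",
    33: "Express16S+ at 16 Gbps", 34: "Express16S+FEC at 16 Gbps",
}

def describe_channel_generation(smf73gen):
    return SMF73GEN_MEANING.get(smf73gen, f"Unrecognised value {smf73gen}")

print(describe_channel_generation(34))   # Express16S+FEC at 16 Gbps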

A Case In Point

We’ve been decoding these channel “generations” for a number of years but it came into sharper focus with a recent customer situation.

This customer has a pair of z14 processors, each with quite a few LPARs. A few weeks ago I pointed out to them that quite a few of their channels were showing up at 4 Gbps or 8 Gbps, not the 16 Gbps they might have expected.

Where the channels were at 16 Gbps they were indeed using Forward Error Correction (FEC).

Now, why would a 16 Gbps channel run at 8 or even 4 Gbps? It’s a matter of negotiation. If the device or the switch can’t cope with 16 the result might well be negotiation down to a slower speed. So older disks or older switches might well lead to this situation. I guess it’s also possible that unreliable fabric might lead to the same situation.

(Quite a few years ago I had a customer situation where their cross-town fabric was being run above its rated speed, with quite a lot of errors. It was a critsit, of course. 🙂 Later I had another critsit where the customer had ignored their disk vendor’s (not IBM’s) advice on how to configure cross-town fabric. So it does happen.)

Apart from SMF73ACR the above all relates exclusively to FICON. So what about Coupling Facility links?

I’ve written extensively on these. The most recent post is Sysplexes Sharing Links. In this post you’ll find links (if you’ll pardon the pun) to all my other posts on the topic. I don’t need to repeat myself here.

But SMF73ACR does indeed identify broad categories of Coupling Facility links:

  • “ICP” means “IC Peer link”.
  • “CS5” means “ICA-SR link”.
  • “CL5” means “Coupling Express LR link”.

Conclusion

Whether we’re talking about Coupling Facility links or FICON links it’s worth checking what you’re actually getting. And RMF has – both in its reports (which I haven’t touched on) or in its SMF records – the information to tell you how the links panned out.

It’s also worthwhile knowing that raw link speed doesn’t translate directly into I/O response time savings. Modern channels, especially when exploiting zHPF, can yield impressive gains – but this is not the place to go into that.

One final thought: Ideally you’d allow the latest and greatest links to run at their full speed. But economics – such as the need to keep older disks in Production – might defeat this. Over time, though, it’s worthwhile to think about whether to upgrade – to get the full channel speed the processor is capable of.