Engineering – Part Five – z14 IOPs

I previously wrote about SMT in Born With A Measuring Spoon In Its Mouth in 2016 – before z14 was announced. I wrote about it again in 2016 in SMT – Some Actual Graphs. It’s been a year since z15 was announced, so enough time has passed for me to want to write about SMT once more.

But actually there isn’t any real SMT news.

But there’s something I thought I’d written about before, but I hadn’t: with z14, IOPs are always enabled for SMT. Actually one of them isn’t, but the rest are. So, in SMF 78–3 you get an odd number of IOP Initiative Queue and Utilization Data Sections – two for each SMT-enabled IOP and one for the single IOP that isn’t.

So, if you have 10 IOP cores you have 19 IOP sections.

It would be interesting to see how they behave. So I took data from a two-drawer z14. (It’s an M02 hardware model with a software designation of 507, 7 GCPs, 4 zIIPs, and 5 ICFs. It has lots of LPARs.)

So, I used the 78–3 data to plot two metrics:

  • Processed I/O Interrupts per second
  • IOP Busy %

Here is the graph, with IOP Busy on the right-hand axis and I/O Interrupts on the left.

The numbers are interesting but there is no clear pattern:

  • The I/O Interrupt rate varies wildly – and I suspect it has something to do with the devices and channels the IOP is handling.
  • The IOP Busy % doesn’t necessarily correspond to the I/O Interrupt rate.

Probably the more important and useful metric is the IOP Busy number.

When I say “no clear pattern” I mean it would be difficult to say something like “IOP 4 is busier because of its position in the machine”.

I do think it’s worth keeping an eye on IOP Busy %. This particular set of data shows very low IOP utilisations – which is a good thing.

For a 2-drawer z14, 10 IOPs is the standard number but you can buy more. For z13 it was 12 and for z15 it’s 8. There’s a clear trend here. I do think having SMT as standard on IOPs has contributed to reducing the number of standard IOPs. Obviously their getting a little bit faster with each generation helps, but you have to balance that against other processor types also getting faster. Another factor might be the historical trend towards more memory in a machine and fewer I/Os, relatively speaking.

My code knows that it’s standard for a 2-drawer z14 to have 10 IOPs. It has to calculate – especially from z14 onwards – the number of IOP cores, as this isn’t recorded. SMT is part of that calculation. So I report standard IOPs and additional IOPs – though I haven’t seen a case of the latter yet.
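
Here’s a minimal sketch of that sort of calculation – my own illustration in Python, not the actual production code – assuming the z14 pattern of one non-SMT IOP plus two 78–3 sections for each SMT-enabled IOP:

def iop_cores_from_sections(num_sections, smt=True):
    # One IOP contributes a single section; each SMT-enabled IOP
    # contributes two, hence the odd section count (e.g. 19 for 10 cores).
    return (num_sections + 1) // 2 if smt else num_sections

def additional_iops(num_cores, standard=10):
    # 10 is the standard for a 2-drawer z14; other machines differ.
    return max(0, num_cores - standard)

So 19 sections gives 10 cores and, with 10 as the standard, no additional IOPs.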

And this is in the “Engineering” series of blog posts as we’re dealing with individual processors, even if they are IOPs.

filterCSV – 4 Months On

Back in May (2020) I published filterCSV, An Open Source Preprocessor For Mind Mapping Software.

To recap a little, the premise was very simple: I wanted to create a tool that could automate colouring nodes in a mind map, based on simple filtering rules. The format I chose was iThoughts’ CSV file format. (It could both import and export in this format.) Hence the name “filterCSV”.

I chose that format for three reasons:

  • I use iThoughts a lot – and colouring nodes that match patterns is a common thing for me to do.
  • The format is a rather nice text format, with lots of semantics available.
  • Python has good tools for ingesting and emitting CSV files. Likewise for processing arrays – which is, obviously, what CSV can be converted to.

So I built filterCSV and last time I wrote about it I had extended the CSV -> filterCSV -> CSV cycle to

  • Ingest flat text, Markdown and XML
  • Emit HTML, OPML, XML, Freemind

So, it had become a slightly more general tree manipulator and converter.

What Happened Next?

I’ve done a lot of work on filterCSV. I’ll attempt to break it down into categories.

Import

You can now import OPML.

Export

You can now export as tab- or space-indented text.

You can now export in GraphViz Directed Graph format, which means you can get a tree as a picture, outside of a mind-mapping application.

Tree Manipulation Functions

You can sort a node’s children ascending, and you can reverse their order. The latter means you can sort them descending. Imagine a tree with a Db2 subsystem and the CICS regions that attach to it as its children. You’d want the CICS regions sorted, I think. (Possibly by name, possibly by Service Class or Report Class.)

Sometimes it makes sense for the children of a node to be merged into the node. Now they can be and they are each preceded by an asterisk – to form a Markdown bulleted list item. (iThoughts can handle some Markdown in its nodes.) I think we might use this in our podcast show notes.

You can now select nodes by their level in the tree. You can also use none as a selector – to deselect all nodes in the tree. (Before you had all as a selector – to allow you to set attributes of all nodes.) You might use none with nc (next colour) to skip a colour in the iThoughts palette.

Here’s an example:

'^A1$' nc
none nc
'A1A' nc

The first command says ‘for nodes whose text is “A1” colour with the first colour in the standard iThoughts colour palette’. The second says ‘do not use the second colour in the palette’. The third says ‘for nodes with “A1A” in their text use the third colour in the palette’.

New Node Attributes

As well as colour and shape, iThoughts has three node attributes that filterCSV now supports:

  • Icon – where you can prefix a node with one of about 100 icons. For example, ticks and single-digit number icons.
  • Progress – which is the percent complete that a task is. Some people use iThoughts for task management.
  • Priority – which can range from 1 to 5.

As with colour and shape, you can set these attributes for selected nodes, with the selection following a rule. And, again, you can combine them. For example a tick node and 100% completion. You can also reset them, for example with noprogress.

Usability

Invoking filterCSV with no commands produces some help. This help points to the GitHub repository and the readme.

You can now (through Stream 3) read commands from a file. If you do, you can introduce comments with //; they continue until the end of the line. You can also use blank lines.

To use Stream 3 for input you might invoke filterCSV with something like

filterCSV < input.csv > output.opml 3< command-file
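
For example, the colouring commands from earlier in this post could live in such a command file, with comments and blank lines for readability:

// Use the first palette colour for nodes whose text is exactly "A1"
'^A1$' nc

// Skip the second palette colour, then use the third for "A1A" nodes
none nc
'A1A' nc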

So, you can see filterCSV (now at 1.10) has come on in leaps and bounds over the past few months. Most of the improvements were because I personally needed them, but one of them – indented output – was in response to a question from someone in a newsgroup.

And I’ve plenty more ideas of things I want to do with filterCSV. To reiterate, it’s an open source project so you could contribute. filterCSV is available from here.

And it’s interesting to me how the original concrete idea – colouring iThoughts nodes – has turned into the rather more abstract – ingesting trees and emitting them in various formats with lots of manipulations. I like this and probably should deploy the maxim “abstraction takes time and experience”.

Mainframe Performance Topics Podcast Episode 26 “Sounding Board”

In Episode 25 I said it had been a long time since we had recorded anything. That was true for Episode 25, but it certainly wasn’t true for Episode 26. What is true is that it’s taken us a long time from start to finish on this episode, and ever so much has happened along the way.

But we ploughed on and our reward is an Episode 26 whose contents I really like.

On to Episode 27!

Here are the unexpurgated show notes. (The ones in the podcast itself have a length limitation; I’m not sure Marna and I do, though.) 🙂

Episode 26 “Sounding Board”

Here are the show notes for Episode 26 “Sounding Board”. The show is called this because it relates to our Topics topic, and because we recorded the episode partly in the Poughkeepsie recording studio where Martin sounded zen, and partly at home.

Where we have been

  • Marna has been in Fort Worth for SHARE back in February

  • Martin has been to Las Vegas for “Fast Start”, for technical sales training, and he got out into the desert to Valley Of Fire State Park

  • Then, in April he “visited” Nordics customers to talk about
    • zIIP Capacity and Performance
    • So You Don’t Think You’re An Architect?
  • But he didn’t get to go there for real. Because, of course, the world was upended by both Covid and Black Lives Matter.

Follow up

  • Chapter markers, discussed in Episode 16. Marna finally found an Android app that shows them – Podcast Addict. Martin investigated that app, and noted it is available on iOS too.

What’s New – a couple of interesting APARs

  • APAR PH21919: NEW FUNCTION – WORKFLOW SUPPORT SAVE JOB OUTPUT

    • When you run a workflow step that invokes a job you can automatically save the job output in a location of your choosing (a z/OS UNIX file directory).

    • The output is in the same format as you’d see in SDSF. This means users can have an automatic permanent record of the work that was done in a workflow.

    • PTF Numbers are UI68359 for 2.3 and UI68360 for 2.4

  • APAR OA56774 (since 2.2) provides new function to prevent a runaway sysplex application from monopolizing a disproportionate share of CF resources

    • This APAR has a dependency on CFLEVEL 24.

    • This case is pretty rare, but is important when you have it.

    • Not based on CF CPU consumption. Is based on deteriorating service times to other structures – which you could measure with SMF 74–4 Coupling Facility Activity data.

Mainframe – z15 FIXCATs

  • Important to cover as there are many questions about them.

  • IBM.Device.Server.z15-8561.RequiredService

    • Absolute minimum needed to run on a z15

    • Unfortunately some of the PTFs in that list have been involved in messy PE chains

    • If that happens, involve IBM Service (Bypass PE or ++APAR)

    • Usually intent is to keep these PTFs to a minimum – and keep the number of PTFs relatively constant.

      • CORRECTION: System Recovery Boost for z15 GA1 is in Required, not Exploitation category, as the recording states!
  • IBM.Device.Server.z15-8561.Exploitation

    • Needed for optional functions, and you can decide when you want to use them.

    • This PTF list could grow – if we add new functions

  • IBM.Device.Server.z15-8561.RecommendedService

    • This is more confusing. Usually these fix defects that have been found but haven’t risen to the Required category. We might’ve detected them in testing, or a customer might have.

    • Over time this category probably will grow, as field experience increases

    • Might want to run an SMP/E REPORT MISSINGFIX to see what’s in this FIXCAT. Might install some, all, or none of the fixes. Might want to be more selective. Based on how much change you want to encounter, versus what problems are fixed

  • By the way there are other FIXCATs you might be interested in for z15, e.g. IBM.Function.SYSPLEXDataSharing

Performance – DFSORT And Large Memory

  • A very special guest joins us, Dave Betten, former DFSORT performance lead.

  • Follows on from Elpida’s item in Episode 10 “234U” in 2017, and continues the “Managing Large Memory” theme.

  • Number of things to track these days:
    • Often track Average Free
    • Also need to track Minimum Free
    • Fixed frames – Especially Db2, and now with z/OS 2.4 zCX
    • Large frames – Again Db2 but also Java Heap
  • In z/OS 2.2
    • OPT controls simplified
      • Thresholds set to Auto
      • Default values changed
      • 64GB versus %
  • In z/OS 2.3
    • LFAREA
      • Not reserved anymore but is a maximum
      • BTW the LFAREA value is in SMF 71
      • Dave reminded us of what’s in SMF 71
  • Dave talked about DFSORT memory controls
    • DFSORT has historically been an aggressive user of memory
    • Installation defaults can be used to control that
    • But the EXPOLD parameter needs special care – because of what constitutes “old pages”, which aren’t actually unused.
    • DFSORT Tuning Guide, especially Chapter 3
  • Dave talked about how handy rustling up RMF Overview Reports can be, with several Overview conditions related to memory.

  • Most of the information in this topic is relevant to LPARs of all sizes

Topics – Update on recording techniques

  • Last talked about this in Episode 12, December 2017

  • Planning for podcast – still using iThoughts for outlining the episode (though its prime purpose is mind mapping and Martin (ab)uses it for depicting various topologies).

  • Recording of podcast – still using Skype to collaborate
    • Record locally on each side, but now Marna’s side is in the new Poughkeepsie recording studio!
    • Martin has moved to Piezo and then Audio Hijack
      • Recording in stereo, with a microphone stand to minimise bumps
      • Has to slow the computer’s fan speed, and has an external cooling pad
      • Also he hides behind pillows to minimise the noise and improve audio quality.
    • For a guest, it’s different. We can’t record in stereo. Guests might not have recording software. But still use Skype (unless in Poughkeepsie).
  • Production

    • Martin’s editing

      • Moved from Audacity on Mac to Ferrite on iPad OS
      • Moved to iPad so he can edit anywhere, except where there is noise. Apple Pencil helps with precision.
      • Then, throw away remote side – in stereo terms.
      • Then, perform noise reduction, still not perfect.
  • Publishing

    • Marna’s publishing: Uploading the audio, publishing show notes, still the same as before.

Customer requirements

  • – Insert Usual Disclaimer Here – these are only our own thoughts.

  • RFE 139477 “Please include the CPU Time Limit for a Job/Step in SMF Type 30”

    • The CPU Time Limit in effect for a JobStep is not currently written to SMF Type30 at the end of the step.

      While a job is running this information is available in the Address Space Control Block (ASCBJSTL) and can be displayed or even modified by tools such as OMEGAMON.

      However the information is not retained after the JobStep completes. This information would be very useful after the fact to see the CPU time limit in effect for a JobStep.

      This enhancement request is to include the information in ASCBJSTL in the SMF Type30 Subtype 4 record written at the end of the JobStep.

      An additional consideration would be how to best deal with the Job CPU time Limit (as specified on the JOB statement) and whether this can also be catered for in the RFE

    • Business justification: Our site got caught out by a Test job being submitted overnight with TIME=1440 and consuming over 6 hours CPU before it was cancelled. We would like to be able to prevent similar issues in future by having the CPU Time Limit data available in SMF.

    • Our comments:

      • After the fact
        • The RFE was calling for “after the fact”, i.e. when the step has ended. Might also like the source of the limit.

        • End of step looks useful. Could run query comparing to actual CPU time, then track to see if ABEND is on the horizon

      • “As it happens”

        • Would like on the SMF Interval as well as Step End records, maybe with tools to dynamically change the parameters.

        • May not need the SMF information if vendor and IBM tools already do it today, making it perhaps not a high enough priority for SMF

        • And the source of the parameters might not be readily available in control blocks so this might not even be feasible.

On the blog

Contacting Us

You can reach Marna on Twitter as mwalle and by email.

You can reach Martin on Twitter as martinpacker and by email.

Or you can leave a comment below. So it goes…

Engineering – Part Four – Activating And Deactivating LPARs Causes HiperDispatch Adjustments

(This post follows on from Engineering – Part Two – Non-Integer Weights Are A Thing, rather than Engineering – Part Three – Whack A zIIP).

I was wondering why my HiperDispatch calculations weren’t working. As usual, I started with the assumption my code was broken. My code consists of two main parts:

  • Code to build a database from the raw SMF.
  • Code to report against that database.

(When I say “my code” I usually say “I stand on the shoulders of giants” but after all these years I should probably take responsibility for it.) 🙂

Given that split the process of debugging is the following:

  1. Check the reporting code is doing the right thing with what it finds in the database.
  2. Check the database accurately captures what was in the SMF records.

Only when those two checks have passed should I suspect the data.

Building the database itself consists of two sub-stages:

  1. Building log tables from the raw records.
  2. Summarising those log tables into summary tables. For example, averaging over an hour.

If there is an error in database build it is often incorrect summarisation.

In this case the database accurately reports what’s in the SMF data. So it’s the reality that’s wrong. 🙂

A Very Brief Summary Of HiperDispatch

Actually this is a small subset of what HiperDispatch is doing, sufficient for the point of this post.

With HiperDispatch the PR/SM weights for an LPAR are distributed unevenly (and I’m going to simplify to a single pool):

  1. If the LPAR’s overall weight allows it, some number of logical processors receive “full engine” weights. These are called Vertical Highs (or VH’s for short). For small LPARs there could well be none of these.
  2. The remainder of the LPAR’s weight is distributed over one or two Vertical Mediums (or VM’s for short).
  3. Any remaining online logical processors receive no weight and are called Vertical Lows (or VL’s for short).

Enigma And Variations

It’s easy to calculate what a full engine’s weight for a pool is: Divide the sum of the LPARs’ weights for the pool by the number of shared physical processors. You would expect a VH logical processor to have precisely this weight.
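
To make that concrete, here’s a simplified sketch in Python of the weight distribution described above – my own illustration, glossing over PR/SM’s finer rules (such as exactly when you get one VM versus two):

def full_engine_weight(total_pool_weight, shared_physicals):
    # What one whole shared physical processor is "worth" in weight terms
    return total_pool_weight / shared_physicals

def polarise(lpar_weight, online_logicals, engine_weight):
    # Simplified: as many full-engine VHs as the weight allows,
    # the remainder on a VM, anything left over is a VL with no weight.
    vh = min(int(lpar_weight // engine_weight), online_logicals)
    remainder = lpar_weight - vh * engine_weight
    vm = 1 if remainder > 0 and online_logicals > vh else 0
    vl = online_logicals - vh - vm
    return vh, vm, vl

Deactivating LPARs shrinks the total pool weight, so the full engine weight shrinks too – and that’s enough to move logical processors between the VH and VM categories, as described below.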

But what could cause the result of this calculation to vary? Here the maths is simple but the real-world behaviours are interesting:

  • The number of physical processors could vary. For example, On-Off Capacity On Demand could add processors and later take them away.
  • The total of the weights for the LPARs in the pool could vary.

The latter is what happened in this case: the customer deactivated two LPARs on a machine – to free up capacity for other LPARs to handle a workload surge. Later on they re-activated the LPARs, IPLing them. I’m not 100% certain, but it seems pretty clear to me that it’s activation and deactivation that take an LPAR’s weight in and out of the equation; IPLing itself doesn’t affect the weights.

These were two very small LPARs with 2–3% of the overall pool’s weights each. But they caused the above calculation to yield varying results:

  • The “full engine” weight varied – decreasing when the LPARs were down and increasing when they were up.
  • There was some movement of logical processors between VH and VM categories.

The effects were small. Sometimes a larger effect is easier to debug than a smaller one. For one, it’s less likely to be a subtle rounding or accuracy error.

The conversion of VH’s to VM’s (and back) has a “real world” effect: A VH logical processor is always dispatched on the same physical processor. The same is not quite so for a VM: while there is a strong preference for redispatch on the same physical processor, it’s not guaranteed. And this matters because cache effectiveness is reduced when a logical processor moves to a different physical processor.

So, one recommendation ought to be: If you are going to deactivate an LPAR recalculate the weights for the remaining ones. Likewise, when activating, recalculate the weights. In reality this is more a “playbook” thing where activation and deactivation is automated, with weight adjustments built in to the automation. Having said that, this is a “counsel of perfection” as not all scenarios can be predicted in advance.

What I Learnt And What I Need To Do

As for my code, it contains a mixture of static reports and dynamic ones. The latter are essentially graphs, or the makings of them – such as CSV files.

Assuming I’ve done my job right – and I do take great care over this – the dynamic reports can handle changes through time. So no problem there.

What’s more difficult is the static reporting. For example, one of my key reports is a shift-level view of the LPAR layout of a machine. In the example I’ve given, it had a hard time getting things right: the weights for individual LPARs’ VH processors went wrong. (The weight of a full processor worked in this case – but only because the total pool weight and number of physical engines didn’t change. Which isn’t always the case.)

To improve the static reporting I could report ranges of values – but that gets hard to consume and, besides, just tells you things vary but not when and how. The answer lies somewhere in the region of knowing when the static report is wrong and then turning to a dynamic view.

In particular, I need to augment my pool-level time-of-day graphs with a stack of the LPARs’ weights. This would help in at least two ways:

  • It would show when weights were adjusted – perhaps shifting from one LPAR to another.
  • It would show when LPARs were activated and de-activated.

A design consideration is whether the weights should stack up to 100%. I’ve come to the conclusion they shouldn’t – so I can see when the overall pool’s weight changes. That reveals more structure – and I’m all for not throwing away structure.

Here’s what such a graph might look like:

In this spreadsheet-driven mockup I’ve ensured the “now you see them now you don’t” LPARs are at the top of the stack.
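
If you wanted to rustle up something similar yourself, a stacked area chart does the job. Here’s a rough sketch with made-up numbers – the on-again-off-again LPARs plotted last so they sit at the top of the stack, and nothing normalised to 100%:

import matplotlib.pyplot as plt

hours = list(range(24))
big_lpar   = [300] * 24
other_lpar = [100] * 24
# The two small LPARs are deactivated outside of 08:00 - 16:00
small_1 = [10 if 8 <= h <= 16 else 0 for h in hours]
small_2 = [10 if 8 <= h <= 16 else 0 for h in hours]

plt.stackplot(hours, big_lpar, other_lpar, small_1, small_2,
              labels=["Big LPAR", "Other LPAR", "Small 1", "Small 2"])
plt.xlabel("Hour of day")
plt.ylabel("Pool weight")   # deliberately not normalised, so pool-level changes show up
plt.legend(loc="lower right")
plt.show()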

I don’t know when I will get to this in Production code. As now is a particularly busy time with customer studies I probably should add it to my to-do list. But I’ll probably do it now anyway… 🙂

Head Scratching Time

In this set of data there was another phenomenon that confused me.

One LPAR had twelve GCPs online. In some intervals something slightly odd was happening. Here’s an example, from a single interval:

  • Logical Processors 0–4 had polar weights (from SMF70POW as calculated in Engineering – Part Two – Non-Integer Weights Are A Thing) of 68.9. (In fact there was a very slight variation between them.)
  • Logical Processor 5 had a polar weight of 52.9.
  • Logical Processor 6 had a polar weight of 12.6.
  • Logical Processors 7 to 11 had polar weights of 0.

If you tot up the polar weights you get 410 (5 × 68.9 + 52.9 + 12.6 = 410.0) – which checks out as it’s the LPAR’s weight in the GCP pool (obtained from other fields in the SMF 70 record).

Obviously Logical Processors 0, 1, 2, 3, and 4 are Vertical High (VH) processors – and bits 0,1 of SMF70POF are indeed “11”.

But that leaves two logical processors – 5 and 6 – with non-zero, non-VH weights. And they don’t have the same weight, which is not supposed to be the case.

Examining their SMF70POF fields I see:

  • Logical Processor 5 has bits 0,1 set to “10” – which means Vertical Medium (VM).
  • Logical Processor 6 has bits 0,1 set to “01” – which means Vertical Low (VL).

But if Logical Processor 6 is a VL it should have no vertical weight at all.

Well, there is another bit in SMF70POF – Bit 2. The description for that is “Polarization indication changed during interval”. (I would’ve stuck a “the” in there, but never mind.)

This bit was set on for LP 6. So the LP became a Vertical Low at some point in the interval, having been something else (indeterminable) at some other point(s). I would surmise VL was its state at the end of the interval.

So, how does this explain it having a small but non-zero weight? It turns out SMF70POW is an accumulation of sampled polar weight values, which is why (as I explained in Part Two) you divide by the number of samples (SMF70DSA) to get the average polar weight. So, some of the interval it was a VM, accumulating. And some of the interval it was a VL, not accumulating.
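
A little illustrative arithmetic shows how that produces a small but non-zero average. The actual intra-interval weights aren’t recorded, so the sample split here is purely hypothetical:

samples = 1000                   # SMF70DSA - hypothetical count
vm_weight = 52.9                 # the VM weight we saw on Logical Processor 5
vm_fraction = 0.24               # hypothetical fraction of samples spent as a VM
smf70pow = samples * vm_fraction * vm_weight   # accumulates only while a VM
average = smf70pow / samples                    # about 12.7 - close to LP 6's 12.6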

Mystery solved. And Bit 2 of SMF70POF is something I’ll pay more attention to in the future. (Bits 0 and 1 already feature heavily in our analysis.)

This shifting between a VM and a VL could well be caused by the total pool weight changing – as described near the beginning of this post.

Conclusion

The moral of the tale is that if something looks strange in your reporting you might – if you dig deep enough – see some finer structure (than if you just ignore it or rely on someone else to sort it out).

The other, more technical, point is that if almost anything changes in PR/SM terms it can affect how HiperDispatch behaves – and that could cause RMF SMF 70–1 data to behave oddly.

The words “rely on someone else to sort it out” don’t really work for me: The code’s mine now, I am my own escalation route, and the giants whose shoulders I stand on are long since retired. And, above all, this is still fun.

zIIP Capacity And Performance Presentation

A few years ago I built a presentation on zIIP Capacity Planning. It highlighted the need for better capacity planning for zIIPs and outlined precisely why zIIPs couldn’t be run as busy as general purpose processors (GCPs).

Since then a lot has changed. And I, in common with most people, have a lot more experience of how zIIPs perform in customer installations. So, earlier this year I updated the presentation and broadened the title to include Performance.

I was due to “beta” the presentation at a user group meeting in London in March. Further, I was due to present it to a group of customers in Stockholm in May. The former, understandably, was cancelled. The latter happened as a Webex.

The essential thesis of the presentation is that zIIP Capacity and Performance needs a lot more care than most customers give it, particularly for CPU-stringent consumers such as the Db2 engine (MSTR and DBM1). (Actually I’ve talked about Db2 and its relationship with zIIP in Is Db2 Greedy?.)

What’s really new about this presentation is a shift in emphasis towards Performance, though there is plenty on Capacity. And one key aspect is LPAR Design. For example, to aid the “Needs Help” mechanism where a General Purpose Processor (GCP) aids a zIIP, some LPARs might need to forego access to zIIP. This might be controversial – as you want as much zIIP exploitation as possible. But for some LPARs giving them access to zIIP makes little or no sense. Meanwhile other LPARs might need better access to zIIP.

The presentation is also updated in a few key areas:

  • More comprehensive and up to date treatment of Db2 – and if you are a Db2 customer you really should pay attention to this one. (I’m grateful to John Campbell and Adrian Burke for their help with this topic.)
  • zCX Container Extensions in z/OS 2.4. This can be a major consumer of zIIP. Obviously this needs to be planned for and managed.
  • z15 System Recovery Boost (SRB). I’m looking forward to seeing how much this speeds up IPLs – and I think I’m going to have to refurbish my IPL/Restart detection code to do it justice. I also think you will want to consider how an event affects the other LPARs sharing the zIIP pool.

As with So You Don’t Think You’re An Architect?, I’m planning on evolving the presentation over time – and the above list shows how I’ve already done it. I’m also interested in giving it to any audience that wants it. Let me know if it would be of interest and I’ll see what I can do.

In the meantime, here’s the presentation: “zIIP Capacity And Performance” presentation

So You Don’t Think You’re An Architect?

Every year I try to write one new presentation. Long ago, it feels like, I started on my “new for 2020” presentation. It’s the culmination-so-far 🙂 of my “architecture thing”.

“What Architecture thing?” some of you might be asking.

It’s quite a simple idea, really: It’s the notion that SMF records can be used for far more than just Performance, even the ones (such as RMF) that were notionally designed for Performance. A few years ago I wrote a presentation called “How To Be A Better Performance Specialist” where I pushed the germ of this notion in two directions:

  • Repurposing SMF for non-Performance uses.
  • Thinking more widely about how to visually depict things.

The first of these is what I expanded into this “Architecture” idea. (The second actually helps quite a bit.) But I needed some clear examples to back up this “who says?” notion.

My day job – advising customers on Performance matters – yields a lot of examples. While the plural of “anecdote” isn’t “data”, the accumulation of examples might be experience. And boy do I have a lot of that now. So I set to writing.

The presentation is called “So You Don’t Think You’re An Architect?” A good friend of mine – who I finally got to meet when I did a customer engagement with him – thought the title a little negative. But it’s supposed to be a provocative statement. Even if the conclusion is “… and you might be right”. So I’ve persisted with it (and haven’t lost my friend over it). 🙂

I start at the top – machines and LPARs – and work my way down to the limits of what SMF 30 can do. I stop there, not really getting much into the middleware instrumentation for two reasons:

  • I’ve done it to death in “Even More Fun With DDF”.
  • This presentation is already quite long and intensive.

On the second point, I could go for 2 hours, easily, but I doubt any forum would let me do a double session on this topic. Maybe this is the book I have in me – as supposedly everybody does. (Funnily enough I thought that was “SG24–2557 Parallel Sysplex Batch Performance”. Oh well, maybe I have two.) 🙂

One hour has to be enough to get the point across and to show some actual (reproducible) examples. “Reproducible” is important as it is not (just) about putting on a show; I want people to be able to do this stuff and to get real value out of it.

One criticism I’ve faced is that I’m using proprietary tools. That’s for the most part true. Though sd2html, An Open Source WLM Service Definition Formatter – Mainframe, Performance, Topics is a good counter-example. I intend to do more open sourcing, time permitting. And SMF 30 would be a good target.

So, I’ve been on a long journey with this Architecture thing. And some of you have been on bits of the journey with me, for which I’m grateful. I think the notion we can glean architectural insight from SMF has merit. The journey continues as recently I’ve explored:

I’ll continue to explore – hence my “culmination-so-far” quip. I really don’t think this idea is anything like exhausted. And – in the spirit of “I’ll keep revising it” I’ve decided to put the presentation in GitHub. (But not the raw materials – yet.) You can find it here.

You might argue that I risk losing speaking engagements if I share my presentation. I have to say this hasn’t happened to me in the past, so I doubt it makes much difference now. And this presentation has already had one outing. I expect there will be more. And anyway the point is to get the material out. Having said that, I’m open to webcasting this presentation, in lieu of being able to travel.

IMS Address Space Taxonomy

(I’m grateful to Dougie Lawson for correcting a few errors in the original version of this.)

I don’t often write about IMS and there’s a good reason for it: Only a small proportion of the customers I deal with use it. I regard IMS as being one of those products where the customers that have it are fanatical – in a good way. 🙂

So when I do get data from such a customer I consider it a golden opportunity to enhance my tooling. And so it has been recently. I have a customer that is a merger of three mainframe estates – and I have data from two of the three heritages. Both of these have IMS.

This merger happened long ago but, as so often happens, the distinct heritages are evident. In particular, the way they set up the IMS systems and regions differs.

You can, to a first approximation, separate IMS-related address spaces into two categories:

  • IMS System Address Spaces
  • IMS Application Regions

In what follows I’ll talk about both, referencing what you can do with SMF 30, specifically. Why SMF 30? Because processing SMF 30 is a scalable method for classifying address spaces, as I’ve written about many times before.

IMS System Address Spaces

IMS system address spaces run with program name “DFSMVRC0” and there are several different ones. For example, over 30 years ago the “DL/I SAS” address space became an option – to provide virtual storage constraint relief. It’s been mandatory for a long time. There is also a DBRC address space. All have the same program name.

The system address spaces have Usage Data Sections which say “IMS”. The Product Version gives the IMS version. In this customer’s case one part of the estate says “V15” and the other part “V14”.

The IMS Control Region is the only system address space that can attach to Db2 or MQ. So, if the program name is “DFSMVRC0” and there are Usage Data Sections for either Db2 or MQ we know this is the Control Region. But this isn’t always going to be the case – as some IMS environments connect to neither Db2 nor MQ. So here the Product Qualifier field can be helpful:

  • Both DBRC and Control Region address spaces have a Product Qualifier of “TM”. But you can’t necessarily tell them apart from things like I/O rates. However, you might expect a DBRC address space to have a name with something like “DBR” in it. (I’m not wowed by that level of fuzziness.)
  • A DL/I SAS has Product Qualifier “DBCTL”.

I’m going to treat IRLM as an IMS System Address Space, when really it isn’t. This is the lock manager – and it’s the same code whether you’re running IMS or Db2. The program name is DXRRLM00 and there is little in SMF to distinguish an IRLM for IMS from one for a Db2 subsystem. (In fact which Db2 an IRLM address space is associated with isn’t in SMF either.) The best my code can do is parse job names, Service Class and Report Class names, etc. for “IMS” or, still worse, “I” but no “D”.

IMS Application Regions

IMS application address spaces – whether MPRs or BMPs – run with program name “DFSRRC00”. They also have Usage Data Sections that say “IMS” but don’t – in the Product Qualifier field – say anything about the subsystem being used. Similarly, when CICS attaches to IMS its Product Qualifier isn’t helpful.

To my mind the distinction between an MPR (Message Processing Region) and a BMP (Batch Message Processor) is subtle. For example I’ve seen BMPs that sit there all day, fed work by MQ. You probably would glean something from Service Classes and Report Classes. Relying on the address space name is particularly fraught.
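
Pulling the above together, here’s a rough sketch of the sort of classification logic – my own simplified illustration, with the Usage Data Section product strings shortened to keep it readable:

def classify(program, qualifiers, usage_products):
    # program is the SMF 30 program name; qualifiers and usage_products
    # are simplified stand-ins for what the Usage Data Sections contain.
    if program == "DXRRLM00":
        return "IRLM - IMS or Db2, guess from job / class names"
    if program == "DFSRRC00":
        return "IMS application region - MPR or BMP"
    if program == "DFSMVRC0":
        if "DBCTL" in qualifiers:
            return "DL/I SAS"
        if "DB2" in usage_products or "MQ" in usage_products:
            return "IMS Control Region"
        if "TM" in qualifiers:
            return "Control Region or DBRC - fall back to naming conventions"
    return "Not obviously IMS"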

Two Diverse IMS Estates

This latest customer has two contrasting styles of IMS environment, mainly in their testing environments:

  • One has lots of very small IMS environments.
  • The other has a few, larger testing environments.

Also, as I noted above, one estate is IMS V14 and the other is V15. This does not appear to be a case of V15 in Test/Development and V14 in Production.

So I guess their testing and deployment practices differ – else this would’ve been homogenised.

I’m going to enjoy talking to the customer about how these two different configurations came to be.

Conclusion

IMS taxonomy can be done – but it’s much messier than Db2 and MQ. It relies a lot on naming conventions and spotting numerical dynamics in the data.

Note: For brevity, I haven’t talked about IMS Datasharing. That would require me to talk at length about XCF SMF 74–2 and Coupling Facility SMF 74–4. Something else I haven’t discussed is “Batch DL/I” – where a batch job is its own IMS environment. This is rather less common and I haven’t seen one of these in ages.

I would also say, not touched on here, that SMF 42–6 would yield more clues – as it documents data sets.

And, of course serious IMS work requires its own product-specific instrumentation. Plus, as Dougie pointed out to me, the Procedure JCL.

sd2html, An Open Source WLM Service Definition Formatter

There have been a number of WLM Service Definition formatters over the years. So why do we need another one?

Well, maybe we don’t but this one is an open source one, covered by the MIT licence. That means you can change it:

  • You could contribute to the project.
  • You could modify it for your own local needs.

While IBM has other WLM Service Definition Formatters, it was easy to get permission to open source this one.

It’s the one I started on years ago and have evolved over the many engagements where I’ve advised customers on WLM.

If it has an unusual feature it’s that I’ve stuck cross links in wherever I can – which has made it easier for me to use. For example, everywhere a Service Class name appears I have a link to its definition. So, a Classification Rule definition points to the Report Class definition.


Installing sd2html


sd2html is a single PHP script, originally run on a Linux laptop and then on a MacBook Pro. Both platforms come with web servers and PHP built in. In the Mac’s case it’s Apache.

So, to use it you need to provide yourself with a tame (perhaps localhost) web server. It needs to run PHP 7.

Place sd2html.php somewhere that it can be run by the web server.
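
If you don’t fancy configuring Apache, the PHP command line interpreter’s built-in web server is probably the quickest way to get a tame localhost server – something like this, with the directory path obviously being whatever you chose:

php -S localhost:8000 -t /path/to/directory/containing/sd2html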


Extracting A WLM Service Definition


In my experience, most customers are still using the ISPF WLM Application. There is a pull-down menu to print the Service Definition. Choose the XML option and it will write to an FB 80 sequential file. This you need to place on the web server, as previously mentioned.

Customers send me their WLM Service Definitions in this format, downloaded with EBCDIC to ASCII translation. It’s easy to email this way.

When I receive the file it looks broken. I keep reassuring customers it isn’t because I can one-line it, throwing away the new line characters. This used to be a fiddle in my editor of choice – then Sublime Text, now BBEdit. That works well.

But I’ve eliminated the edit step: sd2html now does the edit for me, before passing the repaired text onto the XML parser. (Originally the XML parser read the file on disk directly. Now the code reads the file in, removes the new lines, and then feeds the result to the XML parser.)
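
For what it’s worth, the repair idea is tiny. Here it is sketched in Python (sd2html itself does the equivalent in PHP):

import xml.etree.ElementTree as ET

with open("wlm.xml") as f:                              # the downloaded FB 80 file
    joined = f.read().replace("\r", "").replace("\n", "")  # throw away the line breaks
root = ET.fromstring(joined)                            # the parser now sees one long line of XML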


Using sd2html


So you’ve got the Service Definition accessible by your PHP web server. Now what?

From a browser invoke sd2html on your web server with something like

http://localhost/sd2html.php?sds=wlm.xml

You obviously need to adjust the URL to point to sd2html. Also the sds query string parameter needs to point to your WLM Service Definition file.

Then browse to your heart’s content, following links which you’ll find in two places:

  • The table of contents at the beginning.
  • Within the document.

Open Sourcing sd2html


I said in filterCSV, An Open Source Preprocessor For Mind Mapping Software I had another open source project in the works. sd2html is it. I have one more piece of code that will need a lot of work to open source – but I think mainframers will like it. And two more potential ones – that aren’t specific to mainframes.

So, I welcome contributions to sd2html, or even just comments / ideas / requirements. Specifically, right now, I’d value:

  • Documentation writing. (This post is all there is right now.)
  • Early testers.
  • Creative ideas.
  • People who know PHP better than I do.
  • People who can think of how to handle the national language characters that show up from time to time.

Anyhow, try it if you can and let me know what you think.

filterCSV, An Open Source Preprocessor For Mind Mapping Software

I have a number of ideas for things I want to open source, some directly related to the day job and some not. This post is about one piece of software that I use in my day job but which you probably wouldn’t recognise as relevant to mainframe performance.

To me the rule of thumb for candidates for open sourcing is clear: Something of use outside of IBM but with little-to-no prospect of being commercialised.

filterCSV is just such a piece of software.


What’s The Original Point Of filterCSV?


filterCSV started out as a very simple idea: Our processing often leads to CSV (Comma Separated Value) files with a tree structure encoded in them.

This tree structure enables me to create tree diagrams in iThoughts. iThoughts is mind mapping software I’m (ab)using to draw tree diagrams. Whereas most people create mind maps by hand, I’m bulk loading them from a CSV file. Strictly speaking, I’m not creating a mind map – but I am creating a tree.

iThoughts has a variant of CSV for importing mind maps / trees. It’s documented here. It’s a very simple format that could be confected by any competent programmer, or from a spreadsheet.

So, to filterCSV: I’ve got in the habit of colouring the nodes in the tree I create in iThoughts. Originally I did it by hand but that doesn’t scale well as a method. If I discern a bunch of nodes (perhaps CICS regions) are part of a group I want to colour them all at once.

The very first piece of filterCSV, which is a Python 3 script, compared nodes to a regular expression. If they matched they’d be coloured with a specified RGB value – by altering the CSV file. I would import this altered CSV file into iThoughts.

In a real customer engagement this saves a lot of time: For CICS regions the nodes have the string “RC: CICSAORS” in them, for example. “RC” is short for “Report Class”, of course. So the following works quite well as a command line invocation:


filterCSV < input.csv > output.csv ‘RC:CICSAORS’ FFBDAA


So every node with “RC: CICSAORS” in its text gets coloured with RGB value FFBDAA.
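
For the curious, the heart of that original version amounts to something like this – a sketch, not the actual filterCSV code, and with made-up column names standing in for the real iThoughts CSV layout:

import csv, re, sys

pattern, rgb = sys.argv[1], sys.argv[2]         # the regular expression and hex colour from the command line
matcher = re.compile(pattern)

reader = csv.DictReader(sys.stdin)
writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
    if matcher.search(row["title"]):            # node text matches the rule
        row["colour"] = rgb                     # recolour by rewriting the CSV cell
    writer.writerow(row)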

If I keep going with this I can find rules that colour all the CICS regions. Then I understand them much better.


Open Sourcing filterCSV


Let’s generalise the idea: You might be creating a mind map and want to colour some nodes, based on a readily-codifiable criterion. Here’s what you do:

  1. You export the mind map from iThoughts in CSV format.
  2. You throw the CSV file through filterCSV, specifying the regular expression and the colour on the command line.
  3. You import the resulting CSV file into iThoughts.

I don’t know how many users of mind mapping software want to do this, but I bet I’m not alone in wanting it. If the effort to open source it were minimal it makes sense to do it, rather than accepting I’m going to be the only user.

So, I put it on IBM’s internal GitHub site – and I was pleased when Christian Clauss of IBM Zurich joined me in the effort. He’s brought a lot of experience and, in particular, testing knowledge to bear.

Then I got permission to open source filterCSV. This turned out to be very straightforward for a number of reasons:

  • IBM is keen on open sourcing stuff.
  • There is no prospect of this becoming product code.
  • The process for open sourcing when there are no dependencies is streamlined.

I’ll also say this is a good practice run for open sourcing things that are of wider interest – and particularly for the mainframe community. Which is something I really want to do.

So it’s now a project on GitHub. I subsequently went through the process with another one – which I’ll talk about in another blog post.


filterCSV Has Morphed Somewhat


I realised a couple of things while developing filterCSV:

  1. It’s not just iThoughts that the method could be applied to.
  2. What I really have is a tree manipulation tool. In fact that’s essentially what mind mapping software is.

It’s the combination of those two points that made me think the tool could be more generally useful. So here are some things I’ve added to make it so:

  • It can import flat text – creating a tree using indentation. That can include Markdown using asterisks for bullets.
  • It can import XML.
  • You can delete nodes that match a regular expression.
  • You can change the shape of a matching node, or its colour.
  • You can write HTML in tabular or nested list form.
  • You can write XML – either OPML or Freemind.
  • You can promote nodes up the hierarchy, replacing their parents.
  • You can spread Level 0 nodes vertically or horizontally. This helps when you have multiple trees.

Craig Scott, the developer of iThoughts, kindly gave me the RGB values for iThoughts’ colour palette. So now you can specify a colour number in the palette. (You can actually go “next colour” (or “nc” for short), which is quite a boon when you have multiple regular expression rules.)

Some of these things came to me while using filterCSV in real life; the experience of actually using something you built is useful.


Conclusion


So this has been a fun project, where I’ve learnt a lot of Python. I continue to have things I want to do to filterCSV, including something that should be out in the very next few days. The general “tree manipulation” and “adjunct to iThoughts” ideas seem to have merit. And I’m enjoying using my own tooling.

If you fancied contributing to this open source project I’m open to that. In any case you can find it here on GitHub. The latest release is 1.2 and 1.3 should, as I say, be out soon.

And I have plenty of ideas for things to enhance filterCSV.