Mainframe Performance Topics Podcast Episode 27 “And Another Thing”


So this is one of our longest episodes yet, but jam-packed with content – which is very us. As usual, finding times when we could both be available to record was tough. What wasn’t difficult was finding material. Speaking for myself, what we talked about is a set of things I’ve wanted to discuss for a long time.

Anyhow, enjoy! And do keep feedback and “Ask MPT” questions coming.


Episode 27 “And Another Thing” Show Notes – The Full, Unexpurgated, Version


Follow up


  • Additional System Recovery Boost enhancements:

    • Sysplex partitioning recovery: Boosts all surviving systems in the sysplex as they recover and take on additional workload following the planned or unplanned removal of a system from the sysplex.

    • CF structure recovery: Boosts all systems participating in CF structure recovery processing, including rebuild, duplexing failover, and reduplexing.

    • CF data-sharing member recovery: Boosts all systems participating in recovery following the termination of a CF locking data-sharing member

    • HyperSwap recovery: Boosts all systems participating in a HyperSwap recovery process to recover from the failure of a storage subsystem.

    • Existing functions: image-related boosts for IPL (60 min) and shutdown (30 min)

    • These are different in that they boost multiple systems, rather than the single one that the originally announced SRB would boost. These should be fairly rare occurrences – but really helpful when needed.

  • Two more portable software instances: Db2 and IMS added to Shopz on Aug 20, 2020, in addition to CICS (Dec 2019).


ASK MPT


Martin was asked which systems in a System Recovery Boost Speed Boost situation get their GCPs sped up to full speed. The answer is that only the LPARs participating in a boost period get their GCPs sped up. For example, on a 6xx model the other LPARs don’t get sped up to 7xx speed.


Mainframe – Trying out Ansible for driving z/OSMF Workflows


  • Marna’s been learning about Ansible and how it can drive z/OSMF work. So far that’s been with z/OSMF Workflows, so her terminology might not be exactly right yet.

  • Found a driving system for Ansible (an Ubuntu Linux distribution running on x86).

  • A lot of installs on this Linux distribution were necessary: Python 3.7, Ansible, and the Ansible Galaxy collection ibm.ibm_zos_zosmf.

  • Ansible Galaxy is a public collection of modules. You run those modules from a playbook, whose steps are called roles – like jobs with steps. Ansible has some sophistication where those roles can be run by different people. (A sketch of a playbook appears after this list.)

  • Had to learn a little YAML, and understand the sample playbook which came with the collection, changing the playbook for her specific system.

    • This is where help was necessary from two gracious Poughkeepsie system testers (massive thanks to Marcos Barbieri and Daniel Gisolfi!)

    • Learning about Ansible’s staging inventory (a configuration file), and learning which playbook messages were OK and which weren’t.

  • Encountered two kinds of problems:

    • Security issues connecting the Linux environment to a large Poughkeepsie System Test z/OSMF environment

      • Required changes to the playbook, and to environment files.
    • Workflow-related: duplicate instances when part of the playbook ran ok, and selecting an automated workflow.

  • Why learn Ansible when you are a z/OS person? Ansible gives us a very nice, simple programming interface to z/OS resources via z/OSMF capabilities.

    • Also, if we want to get more people to use z/OS, and they are familiar with Linux, they probably will want to drive work to z/OS with some sort of familiar automation

    • Installing products on Linux from a command line, and having to keep looking up command syntax, isn’t that fun – although the syntax is pretty easy to find with Google.

    • The products all installed quickly and cleanly; however, knowing that the dependencies were right was not obvious – especially the Python and Ansible levels.

  • Ansible Tower as the GUI is helpful, but Marna chose to learn Ansible in pieces from the ground up.

  • As it always seems to be: getting your job done comes down to what your search returns, no matter what you are doing – z/OS or Linux work. Or even programming.
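
To make this concrete, here is a minimal sketch of the sort of playbook involved. The role and variable names are illustrative – in the style of the samples shipped with the ibm.ibm_zos_zosmf collection – rather than definitive, so check the collection’s sample playbook for the real ones.

# Minimal sketch of a playbook that drives a z/OSMF Workflow.
# Role and variable names are assumptions - compare with the
# sample playbook shipped with the ibm.ibm_zos_zosmf collection.
- name: Drive a z/OSMF Workflow from Ansible
  hosts: zos_systems            # defined in the (staging) inventory
  gather_facts: false
  collections:
    - ibm.ibm_zos_zosmf
  tasks:
    - name: Create and complete the workflow
      include_role:
        name: zmf_workflow_complete       # assumed role name
      vars:
        workflow_name: "sample_workflow"
        workflow_file: "/path/to/workflow.xml"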


Performance – So You Don’t Think You’re An Architect?


  • Brand new presentation for 2020, builds on the “who says what SMF is for?” slide in “How To Be A Better Performance Specialist”

    • Usually kick off an engagement with an Architecture discussion, which might be hours long!

    • Builds common understanding and trust

    • Techniques in this presentation are well honed

  • Presentation talks about 3 layers

    • The Top Layer – Machines, Sysplexes And LPARs

      • For example a diagram showing all the LPARs and their definitions – by pool – for a machine. Together with their sysplexes.

        • Important to answer “what is the role of this LPAR?”

        • Important to understand how far apart machines are

        • Inactive LPARs often give availability clues

        • Driven by RMF CPU and Sysplex SMF

    • The Middle Layer – Workloads And Middleware

      • Spikes: e.g. HSM Daily spike, and actually patterns of utilisation in general

      • Restarts: IPLs, Db2, IMS, CICS Regions

      • Cloning: Things like identical CICS regions

      • Topology: e.g. for CICS what Db2 or MQ a region connects to, from SMF 30 Usage Data Section. Invert that and you have the Db2’s role and what connects to this Db2. Enclave statistics give DDF traffic, Db2 Datasharing group structure

      • Software Levels: From SMF 30 for CICS, Db2, MQ in Usage Data Section; often see IMS in transition, and limitations with Db2 12 onwards – function levels, e.g. M507 vs M500.

        • All this from SMF 30 and 74-2 XCF, run through Martin’s own tooling. The point of mentioning it is to get the ideas out.

        • MXG discussion ongoing – especially about SMF 30 Usage Data Section. Believe the issues have been ironed out.

    • The Bottom Layer – Application Components

      • Things like Db2 and MQ Accounting Trace: Give names in great detail, likewise CICS Monitor Trace (these are very heavy but very valuable)

      • DDF topology can be handy

      • Data sets – from SMF 42-6, 14, 15, 16, 62, 64

    • Fulfilling the Definition Of Architecture?

      • Understand what’s really going on

      • In a way that’s especially valuable to real systems architects

  • Customers should consider this approach because you can keep an up-to-date view of the estate and inform your own capacity and performance role.

  • Martin will keep exploring this rich avenue because dabbling in architecture keeps him interested, and lets him find patterns among customers.


Topics – Notifications


  • They are unsolicited messages appearing on the device, generally interruptions whether wanted or not.

  • Can contain text and images, so really just messages. A platform-neutral view follows.

  • Where do they come from?

    • From all sorts of services, many on the web

    • Need polling applications on the device

    • Can drain the battery

  • What do you receive and where?

    • Some on Watch: For example from my bank immediately after a contactless payment, and from a washing machine.

    • Most on phone: Slack notifications, and ones from when Google Home wifi is out. Sometimes inappropriate, like at Carnegie Hall.

    • A few on iPad: For example WordPress notifications

    • Very few on Mac

    • Very few on web browser: Only when someone replies on favourite newsgroups

    • A key way of managing them is trying not to duplicate across multiple devices

    • IFTTT still a favorite of Marna’s, mentioned in Episode 23 “The Preview That We Do” Topics topic. Martin just subscribed to Premium, but at a low-ball price.

    • Constant interruptions not always welcome, especially those with sounds.

  • How do you manage them?

    • Try to reduce their number. Often have a “wood for the trees” problem.

    • On iOS at least apps can stack/group them

    • Many are just plain spammy

      • Many are defaulted to on when you install an app
    • Figure out which devices should get what and watch for “must always receive immediately”

    • Send them only to the place they make sense, such as exercise chivvies, letting each app have its own settings

  • An advanced topic is how you can generate your own

    • IFTTT: from the service (emails, messages), or from the app (weather tomorrow).

    • On iOS the Pushcut app: a webhook from eg IFTTT can cause a notification

      • Pushcut webhook support includes the ability to trigger actions on the iOS device, using cURL to invoke webhooks (see the example after this list)
    • On Android we have MacroDroid.

    • If you have an Amazon Echo or related device you can use web hooks with the Notify Me skill to send notifications to the Echo

  • Lots of scope for things to go wrong with automation based on notifications
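
As promised above, here is roughly what triggering a Pushcut notification from cURL looks like. The URL shape follows Pushcut’s documented webhook API as best I recall it; the secret and notification name are placeholders, and the JSON body is optional.

# Trigger a Pushcut notification named "Demo" via its webhook.
# <your-secret> comes from the Pushcut app; title/text are optional.
curl -X POST "https://api.pushcut.io/<your-secret>/notifications/Demo" \
     -H "Content-Type: application/json" \
     -d '{"title": "Heads up", "text": "Something interesting happened"}'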


Customer requirements


  • Customer Requirement 138729: z/OS Console message help to support JES2 messages

    • Use case: Hover over a JES2 message, or other prefixed message, to get the message help.

    • IBM response: This RFE has been delivered via APAR PH24072 at the end of June 2020. It’s available on both V2R3 and V2R4. A JES2 message prefix such as $ can be set up in the “Configure Message Help” dialog in the z/OSMF Console UI.

    • In “Configure Message Help” there is a location for your KC4Z hostname, so you can retrieve message help from the Knowledge Center when you hover over a message ID.

    • Note this might in principle be usable by other subsystems, but it was raised as a z/OSMF requirement for JES2.


On the blog


SAP And Db2 Correlation IDs

Every so often I get to work with a SAP customer. I’m pleased to say I’ve worked with four in recent years. I work with them sufficiently infrequently that my DDF SMF 101 Analysis code has evolved somewhat in the meantime.

The customer situation I’m working with now is a good case in point. And so I want to share a few things from it. There is no “throwing anyone under the bus” but I think what I’ve learnt is interesting. I’m sure it’s not everything there is to learn about SAP, so I won’t pretend it is.

The Structure Of SAP Db2 Correlation IDs

In Db2 a correlation ID (or corrid) is a 12-character name. Decoding it takes some care. For example:

  • For a batch job up to the first 8 characters are the job name.
  • For a CICS transaction characters 5 to 8 are the transaction ID.

In this set of data the correlation ID is interesting and useful:

  • The first three characters are the Db2 Datasharing group name (or SAP application name).
  • The next three are “DIA” or “BTC” – usually. Occasionally we get something else in these 3 positions.
  • Characters 7 to 9 are a number – but encoded in EBCDIC so you can read them.

I wouldn’t say that all SAP implementations are like this, but there will be something similar about them – and that’s good enough. We can do useful work with this.
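
As a concrete illustration, here is a small Python sketch of decoding such a correlation ID. It encodes the structure described above for this set of data only – not a general SAP rule.

def decode_sap_corrid(corrid):
    # Structure seen in this set of data:
    #   characters 1-3: Db2 Datasharing group / SAP application name
    #   characters 4-6: usually "DIA" or "BTC"
    #   characters 7-9: a numeric suffix, readable as EBCDIC digits
    corrid = corrid.ljust(12)
    return {
        "application": corrid[0:3],
        "type": corrid[3:6],
        "suffix": corrid[6:9],
    }

# decode_sap_corrid("XYZBTC083")
# -> {'application': 'XYZ', 'type': 'BTC', 'suffix': '083'}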

Exploring – Using Correlation IDs

Gaul might indeed be divided into three parts. (“Gallia est omnis divisa in partes tres”). So let’s take the three parts of the SAP Correlation ID:

Db2 Datasharing Group Name / Application Name

To be honest, this one isn’t exciting – unless the Datasharing Group Name is different from the SAP Application Name. This is because:

  • Each SAP application has one and only one (or zero) Datasharing Groups.
  • Accounting Trace already contains the Datasharing Group Name.

In my DDF SMF 101 Analysis code I’m largely ignoring this part of the Correlation ID, therefore.

BTC Versus DIA

The vast majority of the records have “BTC” or “DIA” in them, and this post will ignore the very few others. Consider the words “have “BTC” or “DIA” in them”. I chose my words carefully: these strings might not be at offsets 3 to 5. Here’s a technique that makes that not matter.

I could use exact equality in DFSORT, meaning the match has to happen at a specific position. However, DFSORT also supports substring search.

Here is the syntax for an exact match condition:


INCLUDE COND=(CORRID46,EQ,C'BTC')


Here I’ve had to remap the ID field to map positions 4 to 6 (offsets 3 to 5). That’s a symbol I don’t really want and it isn’t flexible enough.

Here’s how it would look using a substring search condition:


INCLUDE COND=(CORRID,SS,EQ,C'BTC')


This is much better as I don’t need an extra symbol definition and the string could be anywhere in the 12 bytes of the CORRID field.

If we can distinguish between Batch (“BTC”) and Dialog (“DIA”) we can do useful things. We can show commits and CPU by time of day – by Batch versus Dialog. We could do Time Of Day anyway, without this distinction. (My DDF SMF 101 Analysis code can go down to 100th of a second granularity – because that’s the SMF time stamp granularity – so I regularly summarise by time of day.) But this distinction allows us to see a Batch Window, or times when Batch is prevalent. If we are trying to understand the operating regime, such distinctions can be handy.
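
Here’s a minimal Python sketch of that kind of summarisation – assuming the timestamp, correlation ID, commit count and CPU time have already been extracted from the SMF 101 records. The substring test mirrors DFSORT’s SS search.

from collections import defaultdict

def summarise_by_hour(records):
    # records: iterable of (timestamp, corrid, commits, cpu_seconds)
    totals = defaultdict(lambda: {"commits": 0, "cpu": 0.0})
    for ts, corrid, commits, cpu in records:
        if "BTC" in corrid:
            kind = "Batch"
        elif "DIA" in corrid:
            kind = "Dialog"
        else:
            continue  # ignore the very few others
        totals[(ts.hour, kind)]["commits"] += commits
        totals[(ts.hour, kind)]["cpu"] += cpu
    return totals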

Numeric Suffix

This is the tricky one. Let’s take an example: “XYZBTC083”

We’re talking about the “083” part. It looks like a batch job identifier within a suite. But it isn’t. For a start, such a naming convention would not survive in a busy shop. So what is it?

There are a few clues:

  • “XYZBTC083” occurs throughout the set of data, day and night. So it’s not a finite-runtime batch job.
  • In the (QWHS) Standard Header the Logical Unit Of Work ID fields for “XYZBTC083” change.
  • The “083” is one value in a contiguous range of suffixes.

What we really have here are SAP Application Server processes, each with their own threads. These threads appear to get renewed every so often. Work (somehow) runs in these processes and, when it goes to Db2, it uses these threads. It’s probably controllable when these threads get terminated and replaced – but I don’t see compelling evidence in the data for that control.

This “083” suffix is interesting: In one SAP application I see a range of “XYZDIA00” – “XYZDIA49”. Then I see “XYZBTC50” – “XYZBTC89”. So, in this example, that’s 50 Dialog processes and 40 Batch processes. So that’s some architectural information right there. What I don’t know is whether lowering the number of processes is an effective workload throttle, nor whether there are other controls in the SAP Application Server layer on threads into Db2. I do know – in other DDF applications – it’s better to queue in the middle tier (or client application) than queue too much in Db2.

IP Addresses And Client Software

Every SMF 101 record has an IP Address (or LU Name). In this case I see a consistent set of a small number of IP addresses. These I consider to be the Application Servers. I also see Linux on X86 64-Bit (“Linux/X8664”) as the client software. I also see its level.

So we’re building up a sense of the application landscape, albeit rudimentary. In this case client machines. (Often middle tier machines, if we’re taking the more general DDF case rather than SAP.)

Towards Applications With QMDAAPPL

When a client connects to Db2 via DDF it can pass certain identifying strings. One of these shows up in SMF 101 in a 20-byte field – QMDAAPPL.

SAP sets this string, so it’s possible to see quite a high degree of fine detail in what’s coming over the wire. It’s early days in my exploration of this – with my DDF SMF 101 Analysis code – but here are two things I’ve noticed, looking at two SAP applications (a sketch of this kind of summarisation follows the list):

  • Each application has a very few QMDAAPPL values that burn the bulk of the CPU.
  • Each application has a distinctly different (though probably not totally disjoint) set of QMDAAPPL values.
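
As promised, a Python sketch – assuming (QMDAAPPL, CPU seconds) pairs have already been pulled from the SMF 101 records:

from collections import Counter

def top_qmdaappl(pairs, n=10):
    # Rank QMDAAPPL values by the CPU they burn - a quick way to find
    # the very few values that account for the bulk of the CPU.
    cpu_by_appl = Counter()
    for appl, cpu in pairs:
        cpu_by_appl[appl.strip()] += cpu
    return cpu_by_appl.most_common(n)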

I’ve looked up a few of the names on the web. I’ve seen enough to convince me I could tell what the purpose of a given SAP application is, just from these names. Expect that as a future “stunt”. 🙂

Conclusion

I think I’ve shown you can do useful work – with Db2 Accounting Trace (SMF 101) – in understanding SAP accessing Db2 via DDF.

SAP is different from many other types of DDF work – and you’ve seen evidence of that in this long post.

One final point: SAP work comes in short commits / transactions – which makes it especially difficult for WLM to manage. In this set of data, for instance, there is relatively little scope for period aging. We have to consider other mechanisms – such as

  • Using the Correlation ID structure to separate Batch from Dialog.
  • Using DDF Profiles to manage inbound work.
  • (Shudder) using WLM resource groups.

And, as I mentioned above,

  • Using SAP’s own mechanisms for managing work.

I’ve learnt a fair bit from this customer situation, building as it does on previous ones. Yes, I’m still learning at pace. One day I might even feel competent. 🙂

And it inspires me even more to consider releasing my DDF SMF 101 Analysis code. Stay tuned!

Automating Microsoft Excel

(This post is about automating Excel on Mac OS. If you’re not a Mac user this probably won’t interest you.)

My Process

From the creation of a CSV file onwards, automation is key:

Creating a CSV file is done in one of two ways:

  • A program on z/OS.
  • A piece of JavaScript code I have that turns HTML tables into CSV.

So that’s not a problem. Where the problem starts is automating Excel, which is what this post is about.

Here is the sequence I generally go through with Excel:

  1. Excel ingests the CSV file.
  2. Then it moves on to the business of creating a chart.
  3. Then resizing the chart.
  4. Then changing some of the attributes of the chart.
  5. Finally exporting the chart in the graphical format that I can use in a presentation.

I orchestrate a lot of things with the Keyboard Maestro automation platform. It’s well worth the money, even if it is your own money. I might kick off a Keyboard Maestro macro in one of several ways, including a hot key.

(As an aside on the “hot key” combination, I’ve repurposed the Caps Lock key to pop up a menu of automations – using Karabiner Elements and a Keyboard Maestro conflict palette.)

AppleScript And Excel

AppleScript is the prevalent automation language on Mac OS. I have to say the AppleScript support in Excel is very obtuse. So most of the value in this post is a few snippets of AppleScript that people trying to use the data model might find useful.

One tip: If you look at the VBA data model for Excel the AppleScript support is very similar to it, language differences apart.

So here are some snippets of code you might find useful.

Changing A Chart’s Type

To change the currently selected (active) chart’s type to a line chart you would code

tell application "Microsoft Excel"
    set newChart to active chart
    tell newChart
        set chart type to line chart
    end tell
end tell

Note the name “line chart” is not a string. It is literally line chart. I think this is confusing. Other chart types I’ve automated have been

  • xyscatter
  • column stacked
  • column stacked 100

This last is where the y axis stops at 100%.

Editing The Title Of A Chart

The following example does a number of things:

  1. Prompts you for a title for the current (active) chart.
  2. Sets the chart title to the text you returned.
  3. Sets the font size of the title to 24 points.

Here is the code:

tell application "Microsoft Excel"
    set newChart to active chart
    tell chart title of newChart
        set theResponse to display dialog "Edit Chart Title" default answer chart title text as string buttons {"Cancel", "Continue"} default button "Continue"
        set chart title text to text returned of theResponse
        tell its font object
            set font size to 24
        end tell
    end tell
end tell

Setting The Dimensions Of A Chart

I always want the dimensions of a chart to be the same – and suitable for including in a PowerPoint presentation. I have two scripts for setting the dimensions of the active chart:

  • Single Width
  • Double Width

Only the single width one is right for putting in a presentation:

tell application "Microsoft Excel"
    set ch to chart area object of active chart
    set height of ch to 567
    set width of ch to 850.5
end tell

The double width one came in very handy recently: It was a great way to zoom in on a time line:

tell application "Microsoft Excel"
    set ch to chart area object of active chart
    set height of ch to 567
    set width of ch to 1701
end tell

I’ve set these two up on a pair of buttons in Metagrid – so I can readily swap between the two sizes.

In Conclusion

You might wonder why I’ve created a blog post that is essentially code snippets. Here’s why: It took a lot of fiddling, experimentation and web searching to come up with these snippets of code. That they were hard to come up with or find says something.

This post will be findable from Duck Duck Go etc. I hope it saves people some time and frustration.

If you’re not into automation, I hope I’m beginning to warm you to it.

I have a lot of Keyboard Maestro macros for Excel that I have yet to convert to pure AppleScript – and these are mainly fiddling with menus. (Keyboard Maestro is very good at that but it is slower and less reliable to automate that way.) As I convert some more I might well post additional code snippets.

mdpre Markdown Preprocessor


A few days ago I released md2pptx, a Markdown to PowerPoint converter, on GitHub as an open source project. You can find it here.

Now I’m releasing a companion program – mdpre. This is a preprocessor for Markdown. You can find it here.

I would’ve released them together and why I didn’t is a small story at my own expense: We in IBM have a system for requesting permission to open source software. I put in the cases for both mdpre and md2pptx at the same time. I got the permission email for md2pptx but didn’t spot the one for mdpre. I should have checked the status of the mdpre case but, life being too short, I didn’t get there for another week. Meanwhile I open sourced md2pptx.

All this by way of saying the pairing of mdpre and md2pptx is a strong one: Almost all my presentations are run through mdpre and the resulting Markdown through md2pptx. And one of the features of mdpre is a key reason why I do that.

My Workflow

When I begin a customer study I use a Keyboard Maestro macro that:

  • Creates a folder for the study within the customer’s folder in Documents.
  • Creates subfolders of this one for each presentation within the study.
  • Creates files with the .mdp extension in these folders for each presentation.
  • Pulls in some stock graphics and include files into the relevant subfolders.
  • Creates a make file in each subfolder.

That saves me a lot of time and it’s very consistent. I’m showing you this as it might be a helpful model to follow if you use md2pptx and mdpre.

The Killer Feature For Me

But what is the key feature I mentioned just now?

It’s the ability to convert CSV data into a Markdown table.

For example, you might code

=csv
"A",1
"B",2
=endcsv

And you get something like this:

A 1
B 2

The default for alignment in Markdown is left. But I don’t always want that, so I invented =colalign:

=colalign l r
=csv
"A",1
"B",2
=endcsv

This aligns the first column left and the second column right – using the Markdown :- and -: syntax.
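
For illustration only – the exact output is mdpre’s business – the generated Markdown might look something like this, with the :- and -: markers in the delimiter row:

|A|1|
|:-|-:|
|B|2|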

There is one further nuance – which would be ignored by a normal Markdown processor: =colwidth:

=colalign l r
=colwidth 2 1
=csv
"A",1
"B",2
=endcsv

Here the first column is twice the width of the second – but only when rendered on a slide in md2pptx.

Other Notable Features

I less frequently need a few other things:

  • Inclusion of files. The time I use this is when I want to include a CSV file but don’t want to paste it into the main document.
  • Conditional processing. I used this in a couple of my conference presentations.
  • Variables. Again, useful in conference presentations – to create consistency.

As with most things I do, if I get fed up enough I automate it. So I expect there will be additional features added to mdpre over time.

Talking of which: testing, bug reports and even contributions are welcome.

md2pptx Markdown To PowerPoint Converter

It’s been a long time since I started writing md2pptx. But finally I’ve open sourced it.

A Problem Statement Of Sorts

I wrote md2pptx because I got tired of four things:

  • The process of embedding graphics in PowerPoint.
  • The fact that pasting a picture into PowerPoint made the resulting presentation file huge.
  • The location of a manually added picture on a page is likely to be inconsistent.
  • Presentations become a hodge podge of inconsistent text styles.

The last one is really the result of success. Take a presentation of mine like “Parallel Sysplex Performance Topics”. It has evolved over at least 10 years. It probably started out in Open Office and ended up in PowerPoint. I know for a fact that each time I “refurbished” it I introduced styling inconsistencies, particularly if I swiped slides from someone else. “Much Ado About CPU” saw that problem in spades.

It also occurred to me that what today is a presentation might tomorrow need to be a document. Or a presentation guide, which is really both a document and slides. If only I could – from the same source – produce both a presentation and a text document.

How Do I Write Documents?

I don’t often write documents but when I do I abjure Word in favour of Markdown. And this is significant.

I rather like a text-based format and, if you forced me to, I’d write in HTML. Thankfully nobody has forced me to write HTML, though I’m very familiar with it. So Markdown (or rather MultiMarkdown) is the text-based format I have chosen.

I write in a wide variety of applications – across Mac, iPad and iPhone. Precisely which needn’t concern us here. What matters is that a text-based format allows me to write anywhere – and I wanted to extend that to slides.

md2pptx and python-pptx

A while back I discovered a Python package called python-pptx. You can write programs to drive python-pptx to make PowerPoint presentations. Specifically the XML flavour – with file extension “pptx”. Hence the name. It’s actually quite easy – if you’re reasonably adept at Python – to use python-pptx in a program.

So I wrote such a program called md2pptx. I started writing it a few years ago – and I’ve used it in every engagement since. I’ve also refurbished a number of presentations by converting them to Markdown, extracting the graphics, and rebuilding with md2pptx. As I’ve used md2pptx I’ve enhanced it.

md2pptx takes a subset of MultiMarkdown (a superset of “vanilla” Markdown) and builds the presentation, embedding the graphics as appropriate.

If you know Markdown you know how to write in a way that md2pptx can convert.

Everything you would code for md2pptx is valid Markdown so you can turn it into a document – if you need to.
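
For flavour, a minimal md2pptx source file might look something like the sketch below. My recollection is that heading levels map to slide types (presentation title, section, then bullet slides) – but treat that mapping as an assumption and check the repository’s documentation.

# My Presentation
## A Section
### A Content Slide
* A bullet point
* Another bullet point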

There are some nice wrinkles, such as the automatic creation of a Table Of Contents slide, turning Taskpaper-formatted tasks into task list slides at the end. Likewise a glossary slide. It even supports Critic Markup format – for reviewing presentations.

How This Fits Into My Life

I built a load of automation around preparing for an engagement. I have a Keyboard Maestro macro that creates a new folder for the study (as a subfolder of the one I use for the customer in question). It then creates subfolders – one for each presentation I’d use in a workshop. Inside these folders it writes a stub Markdown file and a make file.

As I develop the presentation I edit it in BBEdit and – on the command line – type make to build the presentation. md2pptx runs very fast, always well under a second to build any presentation I’ve ever used it with. This includes embedding graphics.

I also wrote some automation to cause PowerPoint to exit any slideshow, close the presentation, and reload it. This is a piece of AppleScript I suppose I could share.
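
In that spirit, here’s a rough sketch of the shape that AppleScript might take. I’m hedging: the PowerPoint dictionary terms here – particularly around slide shows – are assumptions to verify against PowerPoint’s AppleScript dictionary.

tell application "Microsoft PowerPoint"
    -- Leave any running slideshow ("exit slide show" is an assumed term)
    if (count of slide show windows) > 0 then
        exit slide show slide show view of slide show window 1
    end if
    -- Remember where the presentation lives, close it, reload it
    set presentationPath to full name of active presentation
    close active presentation saving no
    open presentationPath
end tell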

Open Sourcing md2pptx

Without giving away any commercial secrets, I can say the process of open sourcing in IBM can be quite straightforward: You submit a proposal, covering points such as licensing, expected IBM effort, commercial considerations. In my case I’ve found getting approval quick – because there’s nothing contentious about it.

So md2pptx is on GitHub. You can find it here. As with my other projects, I invite contributions, issues, testing. The only things I would say are:

  • Don’t expect it to be a highly robust, full function Markdown parser.
  • If you think there is some feature you’d like md2pptx to have consider these two questions:
    • How would you express the semantic in Markdown?
    • How would you express the feature in PowerPoint?

Anyhow, I hope you enjoy using it – if you do. I know I do – and I’m pleased to share it.

What About The Others?

I just produced a new chart, which I think is worth sharing with you.

I produced it for one specific use case, but I think it has slightly wider applicability.

The Trouble With Outsourcers

Quite often we get performance data from outsourcers, whether IBM or some other organisation. Generally they’re running LPARs from more than one of their customers on the one machine.

We have a nice chart for a machine. It shows all the LPARs’ CPU stacked up – with each LPAR a different series. This is fine in a one-company context. But sometimes we are working with the outsourcer and one of their customers. We wouldn’t want to show them the outsourcer’s other customers’ LPARs. But we would want to show them how busy the machine is.

It’s reasonable to show them how busy the machine is because, of course, it affects the performance they’re seeing. And we might well get into LPAR design issues. (A tricky one is the weights because adjusting them is a “robbing Peter to pay Paul” situation – and with a multi customer machine that’s obviously political.)

So here is a new chart, that neatly solves the problem. It’s a real one, though there has had to be a little obfuscation of the names.

In this case CPU2 is a Production LPAR and CPU3 is a Development LPAR. The grey is all the other LPARs’ use of the GCP pool. It’s clearly substantial.

The pool itself isn’t hugely busy – but then this was not said to be a problem day.

But There’s More

Even in the one-company case this chart is useful. Suppose a customer sends us data from what they consider their biggest LPARs. It would be good to show:

  • The LPARs they sent us data for are indeed the bulk of the CPU.

Or

  • We’re missing a big trick as the LPARs they sent data for don’t use the bulk of the CPU.

One Final Plea

I’ve said this many times, but probably not written it in a blog post: Always report processor pools separately. Everything in this post has been for a single machine’s GCP pool. To mix GCPs with, say, zIIPs makes no sense at all.

Engineering – Part Five – z14 IOPs

I previously wrote about SMT in Born With A Measuring Spoon In It’s Mouth in 2016 – before z14 was announced. I also wrote about it again in 2016 in SMT – Some Actual Graphs. It’s been a year since z15 was announced so enough time has passed for me to want to write about SMT once more.

But actually there isn’t any real SMT news.

But there’s something I thought I’d written about before, but I hadn’t: With z14, IOPs are always enabled for SMT. Actually one of them isn’t, but the rest are. So, in SMF 78-3 you get an odd number of IOPs – and therefore an odd number of IOP Initiative Queue and Utilization Data Sections. One is not SMT-enabled and the rest are.

So, if you have 10 IOP cores you have 19 IOP sections: the 9 SMT-enabled cores contribute two sections each, plus one for the non-SMT core.

It would be interesting to see how they behave. So I took data from a two-drawer z14. (It’s an M02 hardware model, with a software designation of 507, with 7 GCPs, 4 zIIPs, and 5 ICFs. It has lots of LPARs.)

So, I used the 78-3 data to plot two metrics:

  • Processed I/O Interrupts per second
  • IOP Busy %

Here is the graph, with IOP Busy on the right-hand axis and I/O Interrupts on the left.

The numbers are interesting but there is no clear pattern:

  • The I/O Interrupt rate varies wildly – and I suspect it has something to do with the devices and channels the IOP is handling.
  • The IOP Busy % doesn’t necessarily correspond to the I/O Interrupt rate.

Probably the more important and useful metric is the IOP Busy number.

When I say “no clear pattern” I mean it would be difficult to say something like “IOP 4 is busier because of its position in the machine”.

I do think it’s worth keeping an eye on IOP Busy %. This particular set of data shows very low IOP utilisations – which is a good thing.

For a 2-drawer z14, 10 IOPs is the standard number but you can buy more. For z13 it was 12 and for z15 it’s 8. There’s a clear trend here. I do think that having SMT as standard on IOPs will have contributed to the possibility of reducing the number of standard IOPs. Obviously their getting a little bit faster with each generation helps, but you have to balance that against other processor types also getting faster. Another factor might be the historical trend towards more memory in a machine and fewer I/Os, relatively speaking.

My code knows that it’s standard for a 2-drawer z14 to have 10 IOPs. It has to calculate – especially from z14 onwards – the number of IOPs as this isn’t recorded. SMT is part of that calculation. So I report standard IOPs and additional IOPs – though I haven’t seen a case of the latter yet.
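
As a sketch, assuming the only wrinkle is the single non-SMT IOP, the cores-from-sections arithmetic looks like this:

def iop_cores_from_sections(sections):
    # All IOP cores but one are SMT-enabled, so each of those yields
    # two SMF 78-3 sections and the remaining core yields one:
    #   sections = 2 * (cores - 1) + 1
    return (sections + 1) // 2

# iop_cores_from_sections(19) -> 10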

And this is in the “Engineering” series of blog posts as we’re dealing with individual processors, even if they are IOPs.

filterCSV – 4 Months On

Back in May (2020) I published filterCSV, An Open Source Preprocessor For Mind Mapping Software.

To recap a little, the premise was very simple: I wanted to create a tool that could automate colouring nodes in a mind map, based on simple filtering rules. The format I chose was iThoughts’ CSV file format. (It could both import and export in this format.) Hence the name “filterCSV”.

I chose that format for three reasons:

  • I use iThoughts a lot – and colouring nodes that match patterns is a common thing for me to do.
  • The format is a rather nice text format, with lots of semantics available.
  • Python has good tools for ingesting and emitting CSV files. Likewise for processing arrays – which is, obviously, what CSV can be converted to.

So I built filterCSV and last time I wrote about it I had extended the CSV -> filterCSV -> CSV cycle to

  • Ingest flat text, Markdown and XML
  • Emit HTML, OPML, XML, Freemind

So, it had become a slightly more general tree manipulator and converter.

What Happened Next?

I’ve done a lot of work on filterCSV. I’ll attempt to break it down into categories.

Import

You can now import OPML.

Export

You can now export as tab- or space-indented text.

You can now export in GraphViz Directed Graph format, which means you can get a tree as a picture, outside of a mind-mapping application.

Tree Manipulation Functions

You can sort a node’s children ascending, and you can reverse their order. The latter means you can sort them descending. Imagine a tree with a Db2 subsystem and the CICS regions that attach to it as its children. You’d want the CICS regions sorted, I think. (Possibly by name, possibly by Service Class or Report Class.)

Sometimes it makes sense for the children of a node to be merged into the node. Now they can be and they are each preceded by an asterisk – to form a Markdown bulleted list item. (iThoughts can handle some Markdown in its nodes.) I think we might use this in our podcast show notes.

You can now select nodes by their level in the tree. You can also use none as a selector – to deselect all nodes in the tree. (Before you had all as a selector – to allow you to set attributes of all nodes.) You might use none with nc (next colour) to skip a colour in the iThoughts palette.

Here’s an example:

'^A1$' nc
none nc
'A1A' nc

Where the first command says ‘for nodes whose text is “A1” colour with the first colour in the standard iThoughts colour palette’. The second says ‘do not use the second colour in the palette’. The third command says ‘for nodes with “A1A” in their text use the third colour in the palette’.

New Node Attributes

iThoughts, as well as colour and shape, has three attributes of a node that filterCSV now supports:

  • Icon – where you can prefix a node with one of about 100 icons. For example, ticks and single-digit number icons.
  • Progress – which is the percent complete that a task is. Some people use iThoughts for task management.
  • Priority – which can range from 1 to 5.

As with colour and shape, you can set these attributes for selected nodes, with the selection following a rule. And, again, you can combine them. For example a tick node and 100% completion. You can also reset them, for example with noprogress.

Usability

Invoking filterCSV with no commands produces some help. This help points to the GitHub repository and the readme.

You can now (through Stream 3) read commands from a file. If you do you can introduce comments with //. Those continue until the end of the line. You can also use blank lines.
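
So a small command file might look like this – the commands are the ones from the earlier example, with comments added:

// Colour nodes whose text is exactly "A1" with the first palette colour
'^A1$' nc

// Skip the second colour in the palette
none nc

// Use the third colour for nodes with "A1A" in their text
'A1A' nc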

To read such a command file on Stream 3 you might invoke filterCSV with something like

filterCSV < input.csv > output.opml 3< command-file

So, you can see filterCSV (now at 1.10) has come on in leaps and bounds over the past few months. Most of the improvements were because I personally needed them, but one of them – indented output – was in response to a question from someone in a newsgroup.

And I’ve plenty more ideas of things I want to do with filterCSV. To reiterate, it’s an open source project so you could contribute. filterCSV is available from here.

And it’s interesting to me how the original concrete idea – colouring iThoughts nodes – has turned into the rather more abstract – ingesting trees and emitting them in various formats with lots of manipulations. I like this and probably should deploy the maxim “abstraction takes time and experience”.

Mainframe Performance Topics Podcast Episode 26 “Sounding Board”

In Episode 25 I said it had been a long time since we had recorded anything. That was true for Episode 25, but it certainly wasn’t true for Episode 26. What is true is that it’s taken us a long time from start to finish on this episode, and ever so much has happened along the way.

But we ploughed on and our reward is an Episode 26 whose contents I really like.

On to Episode 27!

Here are the unexpurgated show notes. (The ones in the podcast itself have a length limitation; I’m not sure Marna and I do, though.) 🙂

Episode 26 “Sounding Board”

Here are the show notes for Episode 26 “Sounding Board”. The show is called this because it relates to our Topics topic, and because we recorded the episode partly in the Poughkeepsie recording studio – where Martin sounded zen – and partly at home.

Where we have been

  • Marna was in Fort Worth for SHARE back in February

  • Martin has been to Las Vegas for “Fast Start”, for technical sales training, and he got out into the desert to Valley Of Fire State Park

  • Then, in April he “visited” Nordics customers to talk about
    • zIIP Capacity and Performance
    • So You Don’t Think You’re An Architect?
  • But he didn’t get to go there for real. Because, of course, the world was upended by both Covid and Black Lives Matter.

Follow up

  • Chapter markers, discussed in Episode 16. Marna finally found an Android app that shows them – Podcast Addict. Martin investigated that app, and noted it is available on iOS too.

What’s New – a couple of interesting APARs

  • APAR PH21919: NEW FUNCTION – WORKFLOW SUPPORT SAVE JOB OUTPUT

    • When you run a workflow step that invokes a job, you can automatically save the job output in a location of your choosing (a z/OS UNIX file directory).

    • In the same format as you’d see in SDSF. This means users can have an automatic permanent record of the work that was done in a workflow

    • PTF Numbers are UI68359 for 2.3 and UI68360 for 2.4

  • APAR OA56774 (since 2.2) Provides new function to prevent a runaway sysplex application from monopolizing a disproportionate share of CF resources

    • This APAR has a dependency on CFLEVEL 24.

    • This case is pretty rare, but is important when you have it.

    • Not based on CF CPU consumption. It is based on deteriorating service times to other structures – which you could measure with SMF 74-4 Coupling Facility Activity data.

Mainframe – z15 FIXCATs

  • Important to cover as there are many questions about them.

  • IBM.Device.Server.z15-8561.RequiredService

    • Absolute minimum needed to run on a z15

    • Unfortunately some of the PTFs in that list have been involved in messy PE chains

    • If that happens, involve IBM Service (Bypass PE or ++APAR)

    • Usually the intent is to keep these PTFs to a minimum – and keep the number of PTFs relatively constant.

      • CORRECTION: System Recovery Boost for z15 GA1 is in Required, not Exploitation category, as the recording states!
  • IBM.Device.Server.z15-8561.Exploitation

    • Needed for optional functions, and you can decide when you want to use them.

    • This PTF list could grow – if we add new functions

  • IBM.Device.Server.z15-8561.RecommendedService

    • This is more confusing. These usually fix a defect that has been found but hasn’t risen to the level of required. We might’ve detected it in testing, or a customer might have.

    • Over time this category probably will grow, as field experience increases

    • Might want to run an SMP/E REPORT MISSINGFIX to see what’s in this FIXCAT. You might install some, all, or none of the fixes – or be more selective – based on how much change you want to encounter, versus what problems are fixed

  • By the way there are other FIXCATs you might want to be interested in for z15, e.g. IBM.Function.SYSPLEXDataSharing

Performance – DFSORT And Large Memory

  • A very special guest joins us, Dave Betten, former DFSORT performance lead.

  • Follows on from Elpida’s item in Episode 10 “234U” in 2017, and continues the “Managing Large Memory” theme.

  • Number of things to track these days:
    • Often track Average Free
    • Also need to track Minimum Free
    • Fixed frames – Especially Db2, and now with z/OS 2.4 zCX
    • Large frames – Again Db2 but also Java Heap
  • In z/OS 2.2
    • OPT controls simplified
      • Thresholds set to Auto
      • Default values changed
      • 64GB versus %
  • In z/OS 2.3
    • LFAREA
      • Not reserved anymore but is a maximum
      • BTW the LFAREA value is in SMF 71
      • Dave reminded us of what’s in SMF 71
  • Dave talked about DFSORT memory controls
    • DFSORT has historically been an aggressive user of memory
    • Installation defaults can be used to control that
    • But the EXPOLD parameter needs special care – because of what constitutes “old pages”, which aren’t actually unused.
    • DFSORT Tuning Guide, especially Chapter 3
  • Dave talked about how handy rustling up RMF Overview Reports can be, with several Overview conditions related to memory.

  • Most of the information in this topic is relevant to LPARs of all sizes

Topics – Update on recording techniques

  • Last talked about this in Episode 12, December 2017

  • Planning for podcast – still using iThoughts for outlining the episode (though its prime purpose is mind mapping, and Martin (ab)uses it for depicting various topologies).

  • Recording of podcast – still using Skype to collaborate
    • Record locally on each side, but now Marna’s side is in the new Poughkeepsie recording studio!
    • Martin has moved to Piezo and then Audio Hijack
      • Recording in stereo, with a microphone stand to minimise bumps
      • Has to slow the computer’s fan speed, and has an external cooling pad
      • Also he hides behind pillows to minimise the noise and improve audio quality.
    • For a guest, it’s different. We can’t record in stereo. Guests might not have recording software. But still use Skype (unless in Poughkeepsie).
  • Production

    • Martin’s editing

      • Moved from Audacity on Mac to Ferrite on iPad OS
      • Moved to iPad so he can edit anywhere, except where there is noise. Apple Pencil helps with precision.
      • Then, throw away remote side – in stereo terms.
      • Then, perform noise reduction, still not perfect.
  • Publishing

    • Marna’s publishing: Uploading the audio, publishing show notes, still the same as before.

Customer requirements

  • – Insert Usual Disclaimer Here – these are only our thoughts.

  • RFE 139477 “Please include the CPU Time Limit for a Job/Step in SMF Type 30”

    • The CPU Time Limit in effect for a JobStep is not currently written to SMF Type30 at the end of the step.

      While a job is running this information is available in the Address Space Control Block (ASCBJSTL) and can be displayed or even modified by tools such as OMEGAMON.

      However the information is not retained after the JobStep completes. This information would be very useful after the fact to see the CPU time limit in effect for a JobStep.

      This enhancement request is to include the information in ASCBJSTL in the SMF Type30 Subtype 4 record written at the end of the JobStep.

      An additional consideration would be how to best deal with the Job CPU time Limit (as specified on the JOB statement) and whether this can also be catered for in the RFE

    • Business justification: Our site got caught out by a Test job being submitted overnight with TIME=1440 and consuming over 6 hours CPU before it was cancelled. We would like to be able to prevent similar issues in future by having the CPU Time Limit data available in SMF.

    • Our comments:

      • After the fact
        • The RFE was calling for “after the fact”, i.e. when the step has ended. Might also like the source of the limit.

        • End of step looks useful. Could run query comparing to actual CPU time, then track to see if ABEND is on the horizon

      • “As it happens”

        • Would like on the SMF Interval as well as Step End records, maybe with tools to dynamically change the parameters.

        • May not need the SMF information if vendor and IBM tools already do it today, making it perhaps not a high enough priority for SMF

        • And the source of the parameters might not be readily available in control blocks so this might not even be feasible.

On the blog

Contacting Us

You can reach Marna on Twitter as mwalle and by email.

You can reach Martin on Twitter as martinpacker and by email.

Or you can leave a comment below. So it goes…