Coupling Facility Structure Performance – A Multi-System View

It’s been quite a while since I last wrote about Coupling Facility performance. Indeed it’s a long time since I presented on it – so I might have to update my Parallel Sysplex Performance presentation soon.

(For reference, that last post on CF Performance was Maskerade in early 2018.)

In the past I’ve talked about how a single system’s service time to a single structure behaves with increasing load. This graphing has been pretty useful. Here’s an example.

This is from a system we’ll call SYS1. It is ICA-SR connected. This means a real cable, over less than 150m distance. It’s to a single structure in Coupling Facility CF – DFHXQLS_POOLM02, which is a list structure. Actually a CICS Temporary Storage sharing pool – “POOLM02”.

From this graph we can see that the service time for a request stays pretty constant at around 7.5μs. Also that the Coupling Facility CPU time per request is almost all of it.

I have another stock graph, actually a pair of them, which show a shift average view of all the systems’ performance with a single structure. This is pretty nice, too.

Here’s the Rate Graph across the entire sysplex.

Here we see SYS1 and it’s counterparts in the Sysplex – SYS2, SYS3, and SYS4.

(Note to self: They really are numbered that way.)

We can see that in general the traffic is mostly from SYS1 and SYS2, and almost none from SYS3. I would call that architecturally significant.

We can also see that there is no asynchronous traffic to this structure from any LPAR.

And here’s the Service Time graph.

You can see that the two IC-Peer-connected LPARs have better service times than the two ICA-SR-connected LPARs. This is reasonable given that IC Peer links are simulated by PR/SM and so unaffected by the speed of light or distance. Again, the statement has to be qualified by in general.

But the graphs you’ve seen so far leave a lot of questions unanswered.

So, for a long time I’ve wanted to do something that combined the two approaches: Performance With Increasing Load, and Differences Between Systems.

I wanted to get beyond the single-system view of scalability. I usually put a number of systems’ scalability graphs on a single slide but

  • The graphs end up smaller than I would like.
  • This doesn’t scale beyond four systems.

The static multi-system graphing is fine but it really doesn’t tell the full story.

Well, now I have it in my kitbag. I’m sharing a new approach with you – because I think you’ll find it interesting and useful.

The New Approach

How about plotting all the systems’ service times versus rates on one graph? It sounds obvious – now I mention it.

Well, let’s see how it works out. Here’s a nice example:

Again we have the same four systems and the same CF structure. Here’s what I conclude when I look at this:

  • SYS2 and SYS4 have consistently better service times – across the entire operating range – than SYS1 and SYS3. This shows the same IC Peer vs ICA-SR dynamic as we saw before.
  • SYS3 service times are worse than those of the other 3 – and again we see its rate top out considerably lower than those of the other 3.
  • SYS2 service times are always worse than SYS4’s. They happen to share the same machine and SYS2 is a much bigger LPAR than SYS4, actually spanning more than 1 drawer. That might have something to do with it.


Coupling Facility service times and traffic remain key aspects of tuning Parallel Sysplex implementations. The approach of “understand what happens with load” also remains valid.

The new piece – combining the service times for all LPARs sharing a structure none graph – looks like the best way of summarising such behaviours so far.

Of course this graph will evolve. I can already think of two things to do to it:

  • Add the link types into the series legend.
  • Avoid showing systems that don’t have any traffic to the structure (and maybe indicating that in the title).

But, for now, I want to get more experience with using this graph. For example, an even more recent customer has all systems connected to each coupling facility by ICA-SR links. The graphs for that one show similar curves for each system – which is unsurprising. But maybe in that case I would see a difference if the links were of different lengths.

And, as always, if I learn something interesting I’ll let you know.

More On Samples

This post follows on from A Note On Velocity from 2015. Follows on at a respectful distance, I’d say – since it’s been 5 years.

In that post I wrote “But those ideas are for another day or, more likely, another year (it being December now).” This is that other day / year – as this post reports on some of those “left on the table” aspects. For one, I do now project what happens if we include (or exclude) I/O samples.

In a recent customer engagement I did some work on WLM samples for a Batch service. This service class has 2 periods, the first period having an incredibly short 75 service units duration.

  • Period 1 is Importance 4, with a reasonable velocity.
  • Period 2 is Discretionary.

Almost everything ends in Period 2 – so almost all batch work in this shop is running Discretionary i.e. bottom dog without a goal.

As I said in A Note On Velocity, RMF reports attained velocity from Using and Delay samples and these come direct from WLM. Importantly this means you can calculate Velocity without having to sum all the buckets of Using and Delay samples. You won’t, for example, add in I/O Using and I/O Delay samples when you shouldn’t – if you’re calculating velocity from the raw RMF SMF fields (as our code does). I’ll call this calculation using the overall Using and Delay buckets the Headline Velocity Calculation.

I thought this would be useful for figuring out if I/O Priority Management is enabled. In fact there’s a flag for that – at the system level – but if you do the calculation by totting up the buckets you get sensible numbers for both cases: Enabled and Disabled.

I/O Priority Management can be enabled or disabled at the service class level. I don’t definitively see a flag in RMF for this at the service class level but presumably if the headline calculation doesn’t work versus totting up the individual buckets with I/O samples then the Service Class is not subject to I/O Priority Management. And the converse would be true.

Batch Samples

For Batch, the headline calculation is matched by totting up the buckets for Using and Delay, if you include QMPL in the Delay samples tally – because this represents Initiator Delay. This is sensible to include in the velocity calculation as WLM-managed initiators are, as the name suggests, managed according to goal attainment and a delay in being initiated really ought to be part of the calculation.

Equally, though, with JES-managed initiators you could get a delay waiting for an initiator. And WLM isn’t going to do anything about that.

(By the way, SMF 30 – at the address space / job level – has explicit times fields for a job getting started. The most relevant one is SMF30SQT.)

I was reminded in this study that samples where the work is eligible to run on a zIIP but where it actually runs on a GCP are included in Using GCP samples. If you do the maths it works. It’s not really surprising.

This is also a good time to remind you samples aren’t time, except for CPU – which is measured and converted to samples.

An example of where this is relevant is when zIIP speed is different from GCP speed. there are two cases for this:

  • WIth subcapacity GCPs – where the zIIPs are faster than GCPs.
  • With zIIPs running SMT-2 – where zIIP speed is slower than when SMT is not enabled. (It might still be faster than a GCP but it might not be.)

Here, it becomes interesting to think about how you get all the sample types approximately equivalent. I would expect – in the “zIIPs are different speed from GCPs” case there might need to be some use of the (R723NFFI) conversion factor. I wouldn’t, though expect the effective speed of SMT-2 zIIPs to be part of the conversion.

But perhaps I’m overthinking this and perhaps a raw zIIP second is treated the same as a raw GCP second. And both are, of course, different to Using I/O.

Sample Frequency And Sampleable Units

WLM samples Performance Blocks (PBs). These might be 1 per address space or there might be many. CICS regions would be an example of where there are many.

I’m told PBs in a CICS region are not the same as MXT (maximum number of tasks) but could approach it if the workload in the region built up enough. This is different from what I thought.

I tried to calculate MXT from sample counts divided by the sampling interval and didn’t get a sensible estimate. Which is why I asked a few friends. You can imagine that a method of calculating MXT not requiring CICS-specific instrumentation would’ve been valuable.


One thing I should note in this post is that – in my experience – sampling is exact. That is to say, if you add up the samples in the buckets right you get exactly the headline number. Exactness is valuable in that it gives you confidence in your inferences. Inexactness could still leave you wondering.

Most people don’t get into the raw SMF fields but if you do:

  • You can go beyond what eg RMF reports give you.
  • You get a much better feel for how the data (and the reality it describes) actually works.

But, as with the CICS MXT case, you can get unexpected results. I hope you (and I) learn from those.

Automating Microsoft Excel Some More

As I said in Automating Microsoft Excel, I thought I might write some more about automating Excel.

Recall I wrote about it because finding snippets of code to do what you want is difficult. So if I can add to that meagre stockpile on the web, I’m going to.

That other post was about automating graph manipulation. This post is about another aspect of automating Excel.

The Problem I Wanted To Solve

Recently I’ve had several instances where I’ve created a CSV (Comma-Separated Value) file I wanted to import into Excel. That bit’s easy. What made these instances different (and harder) was that I wanted to import them into a bunch of sheets. Think “15 sheets”.

This is a difficult problem because you have to:

  1. Figure out where the break points are. I’m thinking a row with only a single cell as a good start. (I can make my CSV file look like that.)
  2. Load each chunk into a separate new sheet.
  3. Name that sheet according the the value in that single cell.
  4. (Probably) delete any blank rows, or any that are just a cell with (underlining) “=” or “-” values.

I haven’t solved that problem. When I do I’ll be really happy. I expect to in 2021.

The Problem I Actually Solved

Suppose you have 15 sheets. There are two things I want to do, given that:

  • Rapidly move to the first or last sheet.
  • Move the current sheet left or right or to the start or end.

The first is about navigation when the the sheets are in good shape. The second is about getting them that way. (When I manually split a large CSV file the resulting sheets tend not to be in the sequence I want them in.)

As noted in the previous post I’m using the Metagrid app on a USB-attached iPad. Here is what my Metagrid page for Excel currently looks like:

In the blue box are the buttons that kick off the AppleScript scripts in this post. As an aside, note how much space there is around the buttons. One thing I like about Metagrid is you can spread out and not cram everything into a small number of spots.

The Scripts

I’m not going to claim my AppleScript is necessarily the best in the world – but it gets the job done. Unfortunately that’s what AppleScript is like – but if you are able to improve on these I’m all ears eyes.

Move To First Sheet

tell application "Microsoft Excel"
	select worksheet 1 of active workbook
end tell

Move To Last Sheet

tell application "Microsoft Excel"
	select worksheet (entry index of last sheet) of active workbook
end tell

Move Sheet To Start

tell application "Microsoft Excel"
	set mySheet to active sheet
	move mySheet to before sheet 1
end tell

Move Sheet To End

tell application "Microsoft Excel"
	set mySheet to active sheet
	set lastSheet to (entry index of last sheet)
	move mySheet to after sheet lastSheet
end tell

Move Sheet Left

tell application "Microsoft Excel"
	set mySheet to active sheet
	set previousSheet to (entry index of active sheet) - 1
	move mySheet to before sheet previousSheet
end tell

Move Sheet Right

tell application "Microsoft Excel"
	set mySheet to active sheet
	set nextSheet to (entry index of active sheet) + 1
	move mySheet to after sheet nextSheet
end tell


Those snippets of AppleScript look pretty simple. However, each took quite a while to get right. But now they save me time on a frequent basis. And they might save you time.

They are all Mac-based but the model is similar to that in VBA. If you’re a Windows person you can probably replicate them quite readily with VBA.

And perhaps I will get that “all singing, all dancing” Import-A-CSV-Into-Multiple-Sheets automation working. If I do you’ll hear read about it here.

Mainframe Performance Topics Podcast Episode 27 “And Another Thing”

So this is one of our longest episodes yet, but jam full of content which is very us. As usual, finding times when we can both be available to record was tough. What wasn’t difficult was finding material. I can say for myself what we talked about is a set of things I’ve wanted to talk about for a long time.

Anyhow, enjoy! And do keep feedback and “Ask MPT” questions coming.

Episode 27 “And Another Thing” Show Notes – The Full, Unexpurgated, Version

Follow up

  • Additional System Recovery Boost enhancements:

    • Sysplex partitioning recovery: Boosts all surviving systems in the sysplex as they recover and takes on additional workload following the planned or unplanned removal of a system from the sysplex.

    • CF structure recovery: Boosts all systems participating in CF structure recovery processing, including rebuild, duplexing failover, and reduplexing.

    • CF data-sharing member recovery: Boosts all systems participating in recovery following the termination of a CF locking data-sharing member

    • HyperSwap recovery: Boosts all systems participating in a HyperSwap recovery process to recover from the failure of a storage subsystem.

    • Existing functions: image-related boosts for IPL (60 min) and shutdown (30 min)

    • These are different in that they boost multiple systems, rather than the single one that the originally announced SRB would boost. These should be fairly rare occurrences – but really helpful when needed.

  • Two more portable software instances: Db2 and IMS added to Shopz on Aug 20, 2020, in addition to CICS (Dec 2019).


Martin was asked about which systems in a System Recovery Boost Speed Boost situation get their GCPs sped up to full speed. The answer is it’s only the LPARs participating in a boost period that get their GCPs sped up. For example, on a 6xx model the other LPARs don’t get sped up to 7xx speed.

Mainframe – Trying out Ansible for driving z/OSMF Workflows

  • Marna’s been learning about Ansible and how it can drive z/OSMF work. So far with z/OSMF Workflows, so terminology might not be exactly right from inexperience.

  • Found a driving system for Ansible system (ubuntu distribution Linux running on x86).

  • A lot of installs on this Linux distribution was necessary: python 3.7, ansible, ansible-galaxy collection for ibm.ibm_zos_zosmf .

  • Ansible galaxy is a public collection of modules. Run those modules from a playbook . While steps are called roles, like jobs with steps. Ansible has some sophistication where those roles can be run by different people.

  • Had to learn a little yaml, and understand the sample playbook which came with the collection, changing the playbook for my specific system.

    • This is where helped was necessary from two helpful gracious Poughkeepsie system testers (massive thanks to Marcos Barbieri and Daniel Gisolfi !!)

    • Learning about an Ansible with a staging inventory (a configuration file), andn learning which playbook messages were ok or not ok.

  • Encountered two kinds of problems:

    • Security issues connecting the Linux environment to a large Poughkeepsie System Test z/OSMF environment

      • Required changes to the playbook, and to environment files.
    • Workflow-related: duplicate instances when part of the playbook ran ok, and selecting an automated workflow.

  • Why learn Ansible when you are a z/OS person? Ansible gives us a very nice interface to get to z/OS resources via z/OSMF capabilities with a simple programming interface.

    • Also, if we want to get more people to use z/OS, and they are familiar with Linux, they probably will want to drive work to z/OS with some sort of familiar automation

    • Linux installation of products from a command line, and having to keep looking up command syntax isn’t that fun. Although pretty easy find with google.

    • The products all installed quickly and cleanly, however knowing that the dependencies were right was not obvious. Especially the python and Ansible levels.

  • Ansible Tower as the GUI is helpful, but Marna chose to learn Ansible in pieces from the ground up.

  • As it seems to always be: getting your job done comes down to what your search returns, no matter what you are doing – z/OS or Linux work. Or even programming.

Performance – So You Don’t Think You’re An Architect?

  • Brand new presentation for 2020, builds on the “who says what SMF is for?” slide in “How To Be A Better Performance Specialist”

    • Usually kick off an engagement with an Architecture discussion, which might be hours long!

    • Builds common understanding and trust

    • Techniques in this presentation well honed

  • Presentation talks about 3 layers

    • The Top Layer – Machines, Sysplexes And LPARs

      • For example a diagram showing all the LPARs and their definitions – by pool – for a machine. Together with their sysplexes.

        • Important to answer “what is the role of this LPAR?”

        • Important to understand how far apart machines are

        • Inactive LPARs often give availability clues

        • Driven by RMF CPU and Sysplex SMF

    • The Middle Layer – Workloads And Middleware

      • Spikes: e.g. HSM Daily spike, and actually patterns of utilisation in general

      • Restarts: IPLs, Db2, IMS, CICS Regions

      • Cloning: Things like identical CICS regions

      • Topology: e.g. for CICS what Db2 or MQ a region connects to, from SMF 30 Usage Data Section. Invert that and you have the Db2’s role and what connects to this Db2. Enclave statistics give DDF traffic, Db2 Datasharing group structure

      • Software Levels: From SMF 30 for CICS, Db2, MQ in Usage Data Section, often see IMS in transition, and limitations with Db2 12 onwards – function Levels eg M507 vs M500.

        • All this from SMF 30 and 74-2 XCF, run through Martin’s own tooling. The point of mentioning it is to get the ideas out.

        • MXG discussion ongoing – especially about SMF 30 Usage Data Section. Believe the issues have been ironed out.

    • The Bottom Layer – Application Components

      • Things like Db2 and MQ Accounting Trace: Give names in great detail, likewise CICS Monitor Trace (but these are very heavy but very valuable)

      • DDF topology can be handy

      • Data sets – from SMF 42-6, 14, 15, 16, 62, 64

    • Fulfilling the definiton Of Architecture?

      • Understand what’s really going on

      • In a way that’s especially valuable to real systems architects

  • Customers should consider this approach because you can keep an up to date view of the estate and inform your own capacity and performance role.

  • Martin will keep exploring this rich avenue because dabbling in architecture keeps him interested, and lets him find patterns among customers.

Topics – Notifications

  • They are unsolicited messages appearing on the device, generally interruptions whether wanted or not.

  • Can contain text and images, so really just messages. Taking a platform-neutral view follows.

  • Where do they come from?

    • Form all sorts of services, many on the web

    • Need polling applications on the device

    • Can drain the battery

  • What do you receive and where?

    • Some on Watch: For example from my bank immediately from a contactless payment, and from a washing machine.

    • Most on phone: Slack notifications, from when Google home wifi is out. Sometimes inappropriate, like at Carnegie Hall.

    • A few on iPad: For example WordPress notifications

    • Very few on Mac

    • Very few on web browser: Only when someone replies on favourite newsgroups

    • A key way of managing them is trying not to duplicate across multiple devices

    • IFTTT still a favorite of Marna’s, mentioned in Episode 23 “The Preview That We Do” Topics topic. Martin just subscribed to Premium, but at a low-ball price.

    • Constant interruptions not always welcome, especially those with sounds.

  • How do you manage them?

    • Try to reduce their number. Often have “Wood for the trees” problem.

    • On iOS at least apps can stack/group them

    • Many are just plain spammy

      • Many are defaulted to on when you install an app
    • Figure out which devices should get what and watch for “must always receive immediately”

    • Only the place they make sense, such as exercise chivvies, letting each app has its own settings

  • An advanced topic is how you can generate your own

    • IFTTT from them service (emails, messages), from the app (weather tomorrow).

    • On iOS the Pushcut app: a webhook from eg IFTTT can cause a notification

      • Pushcut webhook support includes the ability to trigger actions on the iOS device, using cURL to invoke webhooks
    • On Android we have MacroDroid.

    • If you have an Amazon Echo or related device you can use web hooks with the Notify Me skill to send notifications to the Echo

  • Lots of scope for things to go wrong with automation based on notifications

Customer requirements

  • Customer Requirement 138729 : z/OS Console message help support JES2 messages

    • Use case: Hover over a JES2 messages or other prefixed message to get the messages help.

    • IBM response: This RFE has been delivered via APAR PH24072 at end of June 2020. It’s available on both V2R3 and V2R4. The JES2 message prefix such as $ can be setup in “Configure Message Help” dialog in z/OSMF Console UI.

    • “Configure Message Help”, there is a location for your KC4Z hostname so you can can retrieve message help from the Knowledge Center when you hover over message ID.

    • Note might be usable in principle be used by other subsystems, but it was raised as a z/OSMF requirement for JES2.

On the blog

SAP And Db2 Correlation IDs

Every so often I get to work with a SAP customer. I’m pleased to say I’ve worked with four in recent years. I work with them sufficiently infrequently that my DDF SMF 101 Analysis code has evolved somewhat in the meantime.

The customer situation I’m working with now is a good case in point. And so I want to share a few things from it. There is no “throwing anyone under the bus” but I think what I’ve learnt is interesting. I’m sure it’s not everything there is to learn about SAP, so I won’t pretend it is.

The Structure Of SAP Db2 Correlation IDs

In Db2 a correlation ID (or corrid) is a 12-character name. Decoding it takes some care. For example:

  • For a batch job up to the first 8 characters are the job name.
  • For a CICS transaction characters 5 to 8 are the transaction ID.

In this set of data the correlation ID is interesting and useful:

  • The first three characters are the Db2 Datasharing group name (or SAP application name).
  • The next three are “DIA” or “BTC” – usually. Occasionally we get something else in these 3 positions.
  • Characters 7 to 9 are a number – but encoded in EBCDIC so you can read them.

I wouldn’t say that all SAP implementations are like this, but there will be something similar about them – and that’s good enough. We can do useful work with this.

Exploring – Using Correlation IDs

Gaul might indeed be divided into three parts. (“Gallia est omnis divisa in partes tres”). So let’s take the three parts of the SAP Correlation ID:

Db2 Datasharing Group Name / Application Name

To be honest, this one isn’t exciting – unless the Datasharing Group Name is different from the SAP Application Name. This is because:

  • Each SAP application has one and only one (or zero) Datasharing Groups.
  • Accounting Trace already contains the Datasharing Group Name.

In my DDF SMF 101 Analysis code I’m largely ignoring this part of the Correlation ID, therefore.

BTC Versus DIA

The vast majority of the records have “BTC” or “DIA” in them, and this post will ignore the very few others. Consider the words “have “BTC” or “DIA” in them”. I chose my words carefully: these strings might not be at offsets 3 to 5. Here’s a technique that makes that not matter.

I could use exact equality in DFSORT. Meaning a specific position is where the match has to happen. However DFSORT also supports substring search.

Here is the syntax for an exact match condition:


Here I’ve had to remap the ID field to map positions 4 to 6 (offsets 3 to 5). That’s a symbol I don’t really want and it isn’t flexible enough.

Here’s how it would look using a substring search condition:


This is much better as I don’t need an extra symbol definition and the string could be anywhere in the 12 bytes of the CORRID field.

If we can distinguish between Batch (“BTC”) and Dialog (“DIA”) we can do useful things. We can show commits and CPU by time of day – by Batch versus Dialog. We could do Time Of Day anyway, without this distinction. (My DDF SMF 101 Analysis code can go down to 100th of a second granularity – because that’s the SMF time stamp granularity – so I regularly summarise by time of day.) But this distinction allows us to see a Batch Window, or times when Batch is prevalent. If we are trying to understand the operating regime, such distinctions can be handy.

Numeric Suffix

This is the tricky one. Let’s take an example: “XYZBTC083”

We’re talking about the “083” part. It looks like a batch job identifier within a suite. But it isn’t. For a start, such a naming convention would not survive in a busy shop. So what onis it?

There are a few clues:

  • “XYZBTC083” occurs throughout the set of data, day and night. So it’s not a finite-runtime batch job.
  • In the (QWHS) Standard Header the Logical Unit Of Work ID fields for “XYZBTC083” change.
  • the “083” is one value in a contiguous range of suffixes.

What we really have here are SAP Application Server processes, each with their own threads. These threads appear to get renewed every so often. Work (somehow) runs in these processes and, when it goes to Db2, it uses these threads. It’s probably controllable when these threads get terminated and replaced – but I don’t see compelling evidence in the data for that control.

This “083” suffix is interesting: In one SAP application I see a range of “XYZDIA00” – “XYZDIA49”. Then I see “XYZBTC50” – “XYZBTC89”. So, in this example, that’s 50 Dialog processes and 40 Batch processes. So that’s some architectural information right there. What I don’t know is whether lowering the number of processes is an effective workload throttle, nor whether there are other controls in the SAP Application Server layer on threads into Db2. I do know – in other DDF applications – it’s better to queue in the middle tier (or client application) than queue too much in Db2.

IP Addresses And Client Software

Every SMF 101 record has an IP Address (or LU Name). In this case I see a consistent set of a small number of IP addresses. These I consider to be the Application Servers. I also see Linux on X86 64-Bit (“Linux/X8664”) as the client software. I also see it’s level.

So we’re building up a sense of the application landscape, albeit rudimentary. In this case client machines. (Middle tier machines, often – if we’re taking the more general DDF case than SAP.)

Towards Applications With QMDAAPPL

When a client connects to Db2 via DDF it can pass certain identifying strings. One of these shows up in SMF 101 in a 20-byte field – QMDAAPPL.

SAP sets this string, so it’s possible to see quite a high degree of fine detail in what’s coming over the wire. It’s early days in my exploration if this – with my DDF SMF 101 Analysis code – but here are two things I’ve noticed, looking at two SAP applications:

  • Each application has a very few QMDAAPPL values that burn the bulk of the CPU.
  • Each application has a distinctly different (though probably not totally disjoint) set of QMDAAPPL values.

I’ve looked up a few of the names on the web. I’ve seen enough to convince me I could tell what the purpose of a given SAP application is, just from these names. Expect that as a future “stunt”. 🙂


I think I’ve shown you can do useful work – with Db2 Accounting Trace (SMF 101) – in understanding SAP accessing Db2 via DDF.

SAP is different from many other types of DDF work – and you’ve seen evidence of that in this long post.

One final point: SAP work comes in short commits / transactions – which makes it especially difficult for WLM to manage. In this set of data, for instance, there is relatively little scope for period aging. We have to consider other mechanisms – such as

  • Using the Correlation ID structure to separate Batch from Dialog.
  • Using DDF Profiles to manage inbound work.
  • (Shudder) using WLM resource groups.

And, as I mentioned above,

  • Using SAP’s own mechanisms for managing work.

I’ve learnt a fair bit from this customer situation, building as it does on previous ones. Yes, I’m still learning at pace. One day I might even feel competent. 🙂

And it inspires me even more to consider releasing my DDF SMF 101 Analysis code. Stay tuned!

Automating Microsoft Excel

(This post is about automating Excel on Mac OS. If you’re not a Mac user this probably won’t interest you.)

My Process

From the creation of a CSV file onwards, automation is key:

Creating a CSV file is one of two things:

  • A program on z/OS.
  • A piece of javascript code I have that turns HTML tables into CSV.

So that’s not a problem. Where the problem starts is automating Excel, which is what this post is about.

Here is the sequence I generally go through with Excel:

  1. Excel ingests the CSV file.
  2. Then it moves onto the business of creating a chart.
  3. Then resizing the chart.
  4. Then changing some of the attributes of the chart.
  5. Finally exporting the chart in the graphical format that I can use in a presentation.

I orchestrate a lot of things with the Keyboard Maestro automation platform. It’s well worth the money, even if it is your own money. I might kick off a Keyboard Maestro macro in one of several ways:

(As an aside on the “hot key” combination, I’ve repurposed the Caps Lock key to pop up a menu of automations – using Karabiner Elements and a Keyboard Maestro conflict palette.)

ApplesScript And Excel

AppleScript it the prevalent automation language on Mac OS. I have to say the AppleScript support in Excel is very obtuse. So most of the value in this post is a few snippets of AppleScript that people trying to use the data model might find useful.

One tip: If you look at the VBA data model for Excel the AppleScript support is very similar to it, language differences apart.

So here are some snippets of code you might find useful.

Changing A Chart’s Type

To change the currently selected (active) chart’s type to a line chart you would code

tell application "Microsoft Excel"
    set newChart to active chart
    tell newChart
        set chart type to line chart
    end tell
end tell

Note the name “line chart” is not a string. It is literally line chart. I think this is confusing. Other chart types I’ve automated have been

  • xyscatter
  • column stacked
  • column stacked 100

This last is where the y axis stops at 100%.

Editing The Title Of A Chart

The following example does a number of things:

  1. Prompt you for a title for the current (active) chart.
  2. Sets the chart title to the text you returned.
  3. Sets the font size of the title to 24 points.

Here is the code:

tell application "Microsoft Excel"
    set newChart to active chart
    tell chart title of newChart
        set theResponse to display dialog "Edit Chart Title" default answer chart title text as string buttons {"Cancel", "Continue"} default button "Continue"
        set chart title text to text returned of theResponse
        tell its font object
            set font size to 24
        end tell
    end tell
end tell

Setting The Dimensions Of A Chart

I always want the dimensions of a chart to be the same – and suitable for including in a PowerPoint presentation. I have two scripts for setting the dimensions of the active chart:

  • Single Width
  • Double Width

Only the single width one is right for putting in a presentation:

tell application "Microsoft Excel"
    set ch to chart area object of active chart
    set height of ch to 567
    set width of ch to 850.5
end tell

The double width one came in very handy recently: It was a great way to zoom in on a time line:

tell application "Microsoft Excel"
    set ch to chart area object of active chart
    set height of ch to 567
    set width of ch to 1701
end tell

I’ve set these two up on a pair of buttons in Metagrid – so I can readily swap between the two sizes.

In Conclusion

You might wonder why I’ve created a blog post that is essentially code snippets. Here’s why: It took a lot of fiddling, experimentation and web searching to come up with these snippets of code. That they were hard to come up with or find says something.

This post will be findable from Duck Duck Go etc. I hope it saves people some time and frustration.

If you’re not into automation I might be beginning to warm you to it.

I have a lot of Keyboard Maestro macros for Excel that I have yet to convert to pure AppleScript – and these are mainly fiddling with menus. (Keyboard Maestro is very good at that but it is slower and less reliable to automate that way.) As I convert some more I might well post additional code snippets.

mdpre Markdown Preprocessor

mdpre Markdown Preprocessor

A few days ago I released md2pptx, a Markdown to Powerpoint converter on GitHub as an open source project. You can find it here.

Now I’m releasing a companion program – mdpre. This is a preprocessor for Markdown. You can find it here.

I would’ve released them together and why I didn’t is a small story at my own expense: We in IBM have a system for requesting permission to open source software. I put in the cases for both mdpre and md2pptx at the same time. I got the permission email for md2pptx but didn’t spot the one for mdpre. I should have checked the status of the mdpre case but, life being too short, I didn’t get there for another week. Meanwhile I open sourced md2pptx.

All this by way of saying the pairing of mdpre and md2pptx is a strong one: Almost all my presentations are run through mdpre and the resulting Markdown through md2pptx. And one of the features of mdpre is a key reason why I do that.

My Workflow

When I begin a customer study I use a Keyboard Maestro macro that:

  • Creates a folder for the study within the customer’s folder in Documents.
  • Creates subfolders of this one for each presentation within the study.
  • Creates files with the .mdp extension in these folders for each presentation.
  • Pulls in some stock graphics and include files into the relevant subfolders.
  • Creates a make file in each subfolder.

That saves me a lot of time and it’s very consistent. I’m showing you this as it might be a helpful model to follow if you use md2pptx and mdpre.

The Killer Feature For Me

But what is the key feature I mentioned just now?

It’s the ability to convert CSV data into a Markdown table.

For example, you might code


And you get something like this:

A 1
B 2

The default for alignment in Markdown is left. But I don’t always want that, so I invented =colalign:

=colalign l r

This aligns the first column left and the second column right -using the Markdown :- and -: syntax.

There is one further nuance – which would be ignored by a normal Markdown processor: =colwidth:

=colalign l r
=colwidth 2 1

Here the first column is twice the width of the second – but only when rendered on a slide in md2pptx.

Other Notable Features

I less frequently need a few other things:

  • Inclusion of files. The time I use this is when I want to include a CSV file but don’t want to paste it into the main document.
  • Conditional processing. I used this in a couple of my conference presentations.
  • Variables. Again, useful in conference presentations – to create consistency.

As with most things I do, if I get fed up enough I automate it. So I expect there will be additional features added to mdpre over time.

Talking of which, testing, bug reports and even contributions welcome.

md2pptx Markdown To PowerPoint Converter

It’s been a long time since I started writing md2pptx. But finally I’ve open sourced it.

A Problem Statement Of Sorts

I wrote md2pptx because I got tired of four things:

  • The process of embedding graphics in PowerPoint.
  • The fact that pasting a picture into PowerPoint made the resulting presentation file huge.
  • The location of a manually added picture on a page is likely to be inconsistent.
  • Presentations become a hodge podge of inconsistent text styles.

The last one is really the result of success. Take a presentation of mine like “Parallel Sysplex Performance Topics”. It has evolved over at least 10 years. It probably started out in Open Office and ended up in PowerPoint. I know for a fact that each time I “refurbished” it I introduced styling inconsistencies, particularly if I swiped slides from someone else. “Much Ado About CPU” saw that problem in spades.

It also occurred to me that what today is a presentation might tomorrow need to be a document. Or a presentation guide, which is really both a document and slides. If only I could – from the same source – produce both a presentation and a text document.

How Do I Write Documents?

I don’t often write documents but when I do I abjure Word in favour of Markdown. And this is significant.

I rather like a text-based format and, if you forced me to, I’d write in HTML. Thankfully nobody has forced me to write HTML, though I’m very familiar with it. So Markdown (or rather MultiMarkdown) is the text-based format I have chosen.

I write in a wide variety of applications – across Mac, iPad and iPhone. Precisely which needn’t concern us here. What matters is that a text-based format allows to me to write anywhere – and I wanted to extend that to slides.

md2pptx and python-pptx

A while back I discovered a Python package called python-pptx. You can write programs to drive python-pptx to make PowerPoint presentations. Specifically the XML flavour – with file extension “pptx”. Hence the name. It’s actually quite easy – if you’re reasonably adept at Python – to use python-pptx in a program.

So I wrote such a program called md2pptx. I started writing it a few years ago – and I’ve used it in every engagement since. I’ve also refurbished a number of presentations by converting them to Markdown, extracting the graphics, and rebuilding with md2pptx. As I’ve used md2pptx I’ve enhanced it.

md2pptx takes a subset of MultiMarkdown (a superset of “vanilla” Markdown) and builds the presentation, embedding the graphics as appropriate.

If you know Markdown you know how to write in a way that md2pptx can convert.

Everything you would code for md2pptx is valid Markdown so you can turn it into a document – if you need to.

There are some nice wrinkles, such as the automatic creation of a Table Of Contents slide, turning Taskpaper-formatted tasks into task list slides at the end. Likewise a glossary slide. It even supports Critic Markup format – for reviewing presentations.

How This Fits Into My Life

I built a load of automation around preparing for an engagement. I have a Keyboard Maestro macro that creates a new folder for the study (as a subfolder of the one I use for the customer in question). It then creates subfolders – one for each presentation I’d use in a workshop. Inside these folders it writes a stub Markdown file and a make file.

As I develop the presentation I edit it in BBEdit and – on the command line – type make to build the presentation. md2pptx runs very fast, always well under a second to build any presentation I’ve ever used it with. This includes embedding graphics.

I also wrote some automation to cause PowerPoint to exit any slideshow, close the presentation, and reload it. This is a piece of AppleScript I suppose I could share.

Open Sourcing md2pptx

Without giving away any commercial secrets, I can say the process of open sourcing in IBM can be quite straightforward: You submit a proposal, covering points such as licensing, expected IBM effort, commercial considerations. In my case I’ve found getting approval quick – because there’s nothing contentious about it.

So md2pptx is on GitHub. You can find it here. As with my other projects, I invite contributions, issues, testing. The only things I would say are:

  • Don’t expect it to be a highly robust, full function Markdown parser.
  • If you think there is some feature you’d like md2pptx to have consider these two questions:
    • How would you express the semantic in Markdown?
    • How would you express the feature in PowerPoint?

Anyhow, I hope you enjoy using it – if you do. I know I do – and I’m pleased to share it.

What About The Others?

I just produced a new chart, which I think is worth sharing with you.

I produced it for one specific use case, but I think it has slightly wider applicability.

The Trouble With Outsourcers

Quite often we get performance data from outsourcers, whether IBM or some other organisation. Generally they’re running LPARs from more than one of their customers on the one machine.

We have a nice chart for a machine. It shows all the LPARs’ CPU stacked up – with each LPAR a different series. This is fine in a one-company context. But sometimes we are working with the outsourcer and one of their customers. We wouldn’t want to show them the outsourcer’s other customers’ LPARs. But we would want to show them how busy the machine is.

It’s reasonable to show them how busy the machine is because, of course, it affects the performance they’re seeing. And we might well get into LPAR design issues. (A tricky one is the weights because adjusting them is a “robbing Peter to pay Paul” situation – and with a multi customer machine that’s obviously political.)

So here is a new chart, that neatly solves the problem. It’s a real one, though there has had to be a little obfuscation of the names.

In this case CPU2 is a Production LPAR and CPU3 is a Development LPAR. The grey is all the other LPARs’ use of the GCP pool. It’s clearly substantial.

The pool itself isn’t hugely busy – but then this was not said to be a problem day.

But There’s More

Even in the one-company case this chart is useful. Suppose a customer sends us data from what they consider their biggest LPARs. It would be good to show:

  • The LPARs they sent us data for are indeed the bulk of the CPU.


  • We’re missing a big trick as the LPARs they sent data for don’t use the bulk of the CPU.

One Final Plea

I’ve said this many times, but probably not written it in a blog post. Always report processor pools separately. Everything in this post has been for a single machine’s GCP pool. To mix GCP’s with, say, zIIPs makes no sense at all.