Let's Play Master And Servant

I’ve been experimenting with invoking services on one machine from another. None of these machines have been mainframes. They’ve been iPads, iPhones, Macs and Raspberry Pis.

I’ve used the plural, but the iPhone is the only one I actually have two of. And that plays a part in the story.

For the purpose of this post you can view iPads and iPhones the same.

Actually mainframers needn’t switch off – as some of this might be a useful parallel.

Why Would You Invoke Services On Another Machine?

It boils down to two things, in my view:

  • Something the other machine has.
  • Something the other machine can do.

I have examples of both.

At this point I’m going to deploy two terms:

  • Heterogeneous – where the architecture of the two machines is different.
  • Homogeneous – where the architecture of the two machines is the same.

You can regard Pi calling iOS as heterogeneous. You can regard iOS calling iOS as homogeneous. There is quite a close relationship between these two terms and the “has” versus “can do” thing. But it’s not 100% true.

On “has” I would probably only invoke a service on another device of the same type if it had something that wasn’t on the invoking machine. Such as a calendar. So, for example, I could copy stuff from my personal calendar into my work calendar.

On “can do” I would invoke a service on another machine of a different type if it can do something the invoking machine can’t do. For example, a Raspberry Pi can’t create a Drafts draft – but iOS (and Mac) can.

But Pi can do a lot of things iOS can do, and vice versa. So sometimes “has” comes into play in a heterogeneous way. (I don’t have a good example of this – and it’s not going to feature any further in this post.)

Now let’s talk about some experiments – that I consider “Production-ready”.

Automating Raspberry Pi With Apache Webserver

There are certain things Raspberry Pi – which runs Linux – can do that iOS can’t. One of them is use certain C-based Python libraries. One I use extensively is python-pptx – which creates PowerPoint presentations. My Python code uses this library to make presentations from Python. Normally it would run on my Mac.

At 35,000 feet I can run my Pi on a battery and access it on a Bluetooth network. I can SSH in from my iPad and also have the Pi serve web pages to the iPad.

So I don’t need to pull out my nice-but-cumbersome Mac on a plane.

SSH is my method of choice for administering the Pi. But what of invoking services? Here’s where Apache (Webserver) comes in handy. It’s easy to get Apache with PHP up and running. In my case I use URLs of the form http://pi4.local/services/phpinfo.php to access it from iOS:

  • I can use the Safari web browser.
  • I can use a Shortcuts action to GET or POST – in fact, all HTTP verbs.
  • I can write JavaScript in Scriptable to access the webserver.
  • and so on.

Here’s a very simple PHP example:

<?php
phpinfo();
?>

I’ve saved this as phpinfo.php in /var/www/html/services on the Pi. So the URL above is correct to access it.

I don’t want to get into PHP programming but the first and the last line are brackets around PHP code. Outside of these brackets could be HTML, including CSS and JavaScript. phpinfo() itself is a call to a built-in routine that gives information about the PHP setup in the web server. (I use it to validate PHP is even installed properly and accessible from the web server.)

The point of this example is that it’s very simple to set up a web server that can execute real code. PHP can access, for example, the file system.

Here’s another example:

<?php
print("<table>\n");
print("<tr><th>Parameter</th><th>Value</th></tr>\n"); // header row: <th> cells inside a <tr>
foreach ($_GET as $key => $value) {
  print("<tr><td>$key</td><td>$value</td></tr>\n");
}
print("</table>\n");
?>

This is queryString.php and it illustrates one more important point: You can pass parameters into a script on the Pi, using the query string portion of the URL. This example tabulates any passed in parameters.
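
By way of illustration, here’s a minimal Python sketch of calling it from another machine – the hostname is the pi4.local used elsewhere in this post, the parameters are made up, and urlencode handles the percent-encoding for you:

# A minimal sketch: call queryString.php from another machine, letting
# urllib build and percent-encode the query string. The parameter names
# here are invented purely for illustration.
from urllib.parse import urlencode
from urllib.request import urlopen

params = {"machine": "pi4", "temperature": "45.2 C"}
url = "http://pi4.local/services/queryString.php?" + urlencode(params)
with urlopen(url) as response:
    print(response.read().decode("utf-8"))  # the HTML table built by PHP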

So the net of this is you can automate stuff on the Raspberry Pi (or any Linux server or even a Mac) if you set it up as a web server.

Automating iOS With Shortcuts And Pushcut Automation Server

Automating an iOS device from outside is rather more difficult. There are several components you need:

  1. Shortcuts – which is built into both iPadOS and iOS.
  2. A way to automatically kick off a shortcut from another device.
  3. Some actual shortcuts to do the work.

Most people reading this are familiar with items 1 and 3. Most people won’t, however, know how to do item 2.

This is where Pushcut comes to the rescue. It has an experimental Automation Server. When I say “experimental”: Release 1.11 – three weeks before the time of writing – introduced it as “experimental”, but there have been enhancements released since then. So I guess and hope it’s here to stay, with a few more enhancements perhaps to come.

Before I go on I should say some features of Pushcut require a subscription – but it’s not onerous. I don’t know if Automation Server is one of them – as I happily subscribed to Pushcut long before Automation Server was a thing. There’s a lot more to Pushcut than what I’m about to describe.

When you start up Automation Server it has access to all your Shortcuts, ahem, shortcuts. You can invoke one using a URL of the form

https://api.pushcut.io/<my-secret>/execute?shortcut=Hello%20World&input=Bananas%20Are%20Nice

The input parameter is the shortcut’s input parameter. You don’t have to have an input parameter, of course, so just leave off the Bananas%20Are%20Nice part. (Notice it is percent-encoded, which is easy to do, as is the shortcut name.)

The above invokes a (now obsolete version of my) Hello World shortcut. Obsolete because it now turns the input into a dictionary.

JSON is passed in thus:

https://api.pushcut.io/<my-secret>/execute?shortcut=Hello%20World&input={"a":"123%20456,789"}

Again it’s percent encoded.

On Pi (via SSH from Mac) you can use the Lynx textual browser, wget, or curl. Actually, on Mac, you can do it natively – if you want to.
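
If you’d rather script the invocation than type the URL into a browser, here’s a hedged Python sketch of doing it from the Pi – <my-secret> stays a placeholder for your own Pushcut secret, and urlencode does the percent-encoding for you:

# A sketch, not Pushcut documentation: drive the Automation Server from
# Python. Substitute your own secret for the placeholder.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SECRET = "<my-secret>"
params = {
    "shortcut": "Hello World",
    "input": json.dumps({"a": "123 456,789"}),
}
url = f"https://api.pushcut.io/{SECRET}/execute?" + urlencode(params)
with urlopen(url) as response:
    print(response.status, response.read().decode("utf-8"))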

When you use the URL Pushcut – via the web – invokes the Automation Server on your device. “Via the web” is important as you can’t use the Automation Server technique without Internet access. You can, however, use another function of Pushcut – Notifications-based running of a shortcut.

Curl Invocation

Curl invocation is a little more complex than invoking using a web browser. Curl is a command-line tool to access URLs. Here is how I invoke the same shortcut, complete with a JSON payload:

curl -G -i 'https://api.pushcut.io/<my-secret>/execute' --data-urlencode 'shortcut=Hello World' --data-urlencode 'input={"a":"123 456,789"}'

Note the separation of the query string parameters and their encoding. I found this fiddly until I got it right.

An Automation Server

It’s worth pointing out that Automation Server should run on a “dedicated” device. In practice few people will have these. So I’ve experimented with what that actually means.

First, buying a secondhand device that is good enough to run iOS 12 (the minimum supported release, and not recommended) is prohibitively expensive. In the UK I priced them above £100, so I’d have to have a much better use case than I do.

Second, I don’t have a spare device lying around.

But I do have a work phone (that I donated, actually) that rarely gets calls. It does get notifications but those don’t seem to stop Automation Server from functioning just fine.

So this approach might not be available to many people.

Automating Drafts From Raspberry Pi

One good use case is being able to access my Drafts drafts from my Raspberry Pi. Drafts is an excellent environment for creating, storing and manipulating text. I highly recommend it for iPhone, iPad and Mac users. Again a well-worth-it subscription could apply.

So, my use case is text access on the Pi. I might well create a draft on iOS from the Pi, perhaps storing some code created on the Pi.

So I’ve written shortcuts that:

  • Create a Drafts draft.
  • Retrieve the UUIDs (identifiers) of drafts that match a search criterion. (In the course of writing this post I upgraded this to return a JSON array of UUID / draft title pairs.)
  • Move a draft to the trash (as the way of deleting it).

Here’s what the first one of these looks like:

It’s a simple two action shortcut:

  • Create the draft.
  • Return the UUID (unique ID) of the newly created draft.

The second action isn’t strictly necessary but I think there will be cases where the UUID will be handy. For example, deleting the draft requires the UUID. So this action returns the UUID as a text string.

The first action is the interesting one. It takes the shortcut’s input as the text of the draft and creates the draft. Deselecting “Show When Run” is important as otherwise the Drafts user interface will show up and you don’t want that with an Automation Server. (It’s likely to temporarily stall automation.)

So this is a nice example where iOS can do something Raspberry Pi (or Linux) can’t. This is not a Mac problem as Drafts has a Mac client.
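
To make that concrete, here’s a sketch of the round trip from the Pi. The “Create Draft” shortcut name is hypothetical, <my-secret> remains a placeholder, and I’m assuming – as the shortcut’s returned UUID suggests – that the Automation Server passes the shortcut’s output back in the HTTP response:

# A hedged sketch: push a Pi-generated file into a new draft via the
# Pushcut Automation Server and keep the UUID the shortcut returns.
# The shortcut name "Create Draft" is invented for illustration.
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

SECRET = "<my-secret>"
text = Path("example.py").read_text()  # whatever was created on the Pi
url = (f"https://api.pushcut.io/{SECRET}/execute?"
       + urlencode({"shortcut": "Create Draft", "input": text}))
with urlopen(url) as response:
    uuid = response.read().decode("utf-8").strip()
print(f"New draft UUID: {uuid}")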

Outro

A couple of other things:

I did a lot of my testing on a Mac – both directly as a client and also by SSH’ing into the Pi. (Sometimes I SSH’ed into the Pi from both iPad and iPhone using the Prompt app.)

By the way, the cultural allusion in the title is to Depeche Mode’s Master And Servant. I’ve been listening to a lot of Depeche Mode recently.

I’ve titled this section “Outro” rather than “Conclusion” because I’m not sure I have one – other than it’s been a lot of fun playing with one machine calling another one. Or playing Master and Servant. 🙂

My Generation?

Ten years ago in Channel Performance Reporting I talked about how our channel-related reporting had evolved.

That was back when the data model was simple: You could tell what type each channel was and what was attached to it:

  • Field SMF73ACR in SMF Type 73 (RMF Channel Path Activity) told you the channel was, for example, FICON Switched (“FC_S”). This was an example I gave in that blog post.
  • SMF Type 78 Subtype 3 (RMF I/O Queuing Activity) gave you the control units attached to each channel. (You had to aggregate them through SMF 74 Subtype 1 (RMF Device Activity) to relate them to real physical controllers.)

SMF 74 Subtype 7 FICON Switch Statistics

In that post I talked about SMF 74 Subtype 7 (FICON Director Statistics) as a possible extension to our analysis. I mapped this record type pretty comprehensively. Unfortunately so few customers have 74–7 enabled that I haven’t got round to writing analysis code.

I’m grateful to Steve Guendert for information on how to turn on these records:

RMF 74–7 records are collected for each RMF interval if

  • The FICON Management Server (FMS) license is installed on the switching device. (This is the license for FICON CUP)
  • The FMS indicator has been enabled on the switch
  • The IOCP has been updated to include a CNTLUNIT macro for the FICON switch (2032 is the CU type)
  • The IECIOSnn parmlib member is updated to include “FICON STATS=YES”
  • “FCD” is specified in the ERBRMFnn parmlib member

With these instructions I’m hoping more customers will turn on 74–7 records.

Talking About My Generation

What I don’t think we had 10 years ago was a significant piece of information we have today: Channel Generation. This is field SMF73GEN in SMF Type 73.

This gives further detail on how the channel is operating.

There isn’t a complete table anywhere for the various codes but here’s my attempt at one:

Value  Meaning
0      Unknown
1      Express2 at 1 Gbps
2      Express2 at 2 Gbps
3      Express4 at 1 Gbps
4      Express4 at 2 Gbps
5      Express4 at 4 Gbps
7      Express8 at 2 Gbps
8      Express8 at 4 Gbps
9      Express8 at 8 Gbps
17     Express8S at 2 Gbps
18     Express8S at 4 Gbps
19     Express8S at 8 Gbps
21     Express16S at 4 Gbps
22     Express16S at 8 Gbps
23     Express16S at 16 Gbps
24     Express16S at 16 Gbps
31     Express16S+ at 4 Gbps
32     Express16S+ at 8 Gbps
33     Express16S+ at 16 Gbps
34     Express16S+FEC at 16 Gbps
Actually that “24” entry worries me.

By the way I’m grateful to Patty Driever for fishing many of these out for me.
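
If you want to decode SMF73GEN in your own reporting, the table above translates directly into a lookup. Here’s a minimal Python sketch that simply restates the values listed – treat it as unofficial, “24” oddity and all:

# A sketch of the SMF73GEN decode table above, for home-grown reporting.
SMF73GEN_DECODE = {
    0: "Unknown",
    1: "Express2 at 1 Gbps", 2: "Express2 at 2 Gbps",
    3: "Express4 at 1 Gbps", 4: "Express4 at 2 Gbps", 5: "Express4 at 4 Gbps",
    7: "Express8 at 2 Gbps", 8: "Express8 at 4 Gbps", 9: "Express8 at 8 Gbps",
    17: "Express8S at 2 Gbps", 18: "Express8S at 4 Gbps", 19: "Express8S at 8 Gbps",
    21: "Express16S at 4 Gbps", 22: "Express16S at 8 Gbps",
    23: "Express16S at 16 Gbps", 24: "Express16S at 16 Gbps",
    31: "Express16S+ at 4 Gbps", 32: "Express16S+ at 8 Gbps",
    33: "Express16S+ at 16 Gbps", 34: "Express16S+FEC at 16 Gbps",
}

def describe_channel_generation(value):
    return SMF73GEN_DECODE.get(value, f"Unrecognised SMF73GEN value {value}")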

A Case In Point

We’ve been decoding these channel “generations” for a number of years but it came into sharper focus with a recent customer situation.

This customer has a pair of z14 processors, each with quite a few LPARs. A few weeks ago I pointed out to them that quite a few of their channels were showing up at 4 Gbps or 8 Gbps, not the 16 Gbps they might have expected.

Where the channels were at 16 Gbps they were indeed using Forward Error Correction (FEC).

Now, why would a 16 Gbps channel run at 8 or even 4 Gbps? It’s a matter of negotiation. If the device or the switch can’t cope with 16 the result might well be negotiation down to a slower speed. So older disks or older switches might well lead to this situation. I guess it’s also possible that unreliable fabric might lead to the same situation.

(Quite a few years ago I had a customer situation where their cross-town fabric was being run above its rated speed, with quite a lot of errors. It was a critsit, of course. 🙂 Later I had another critsit where the customer had ignored their disk vendor’s (not IBM’s) advice on how to configure cross-town fabric. So it does happen.)

Apart from SMF73ACR the above all relates exclusively to FICON. So what about Coupling Facility links?

I’ve written extensively on these. The most recent post is Sysplexes Sharing Links. In this post you’ll find links (if you’ll pardon the pun) to all my other posts on the topic. I don’t need to repeat myself here.

But SMF73ACR does indeed identify broad categories of Coupling Facility links:

  • “ICP” means “IC Peer link”.
  • “CS5” means “ICA-SR link”.
  • “CL5” means “Coupling Express LR link”.

Conclusion

Whether we’re talking about Coupling Facility links or FICON links it’s worth checking what you’re actually getting. And RMF has – both in its reports (which I haven’t touched on) and in its SMF records – the information to tell you how the links panned out.

It’s also worth knowing that raw link speed doesn’t translate directly into I/O response time savings. Modern channels, though, especially when exploiting zHPF, can yield impressive gains. But this is not the place to go into that.

One final thought: Ideally you’d allow the latest and greatest links to run at their full speed. But economics – such as the need to keep older disks in Production – might defeat this. Over time, though, it’s worth thinking about whether to upgrade – to get the full channel speed the processor is capable of.

Dock It To Me

With z/OS 2.4 you can now run Docker containers in zCX address spaces. I won’t get into the whys and wherefores of Docker and zCX, except to say that it allows you to run many packages that are available on a wide range of platforms.

This post, funnily enough, is about running Docker on a couple of reasonably widely available platforms. And why you might want to do so – before running it in zCX address spaces on z/OS.

Why Run Docker Elsewhere Before zCX?

This wouldn’t be a permanent state of affairs but I would install Docker and run it on a handy local machine first.

This is for two reasons:

  1. To get familiar with operating Docker and installing packages.
  2. To get familiar with how these packages behave.

As I’ll show you – with two separate environments – it’s pretty easy to do.

The two environments I’ve done it on are

  • Raspberry Pi
  • Mac OS

I have no access to Windows – and haven’t for over 10 years. I’d hope (all snark aside) it was as simple as the two I have tried.

One other point: As the Redbook exemplifies, not all the operational aspects of zCX are going to be the same, but understanding the Docker side is a very good start.

Raspberry Pi / Linux

Raspberry Pi runs a derivative of Debian. I can’t speak for other Linuxes. But on Raspberry Pi it’s extremely easy to install Docker, so it will be for other Debian-based distributions. (I just have no experience of those.)

You simply issue the command:

apt-get install docker

If you do that you get a fully-fledged Docker set up. It might well pull in a few other packages.

My Raspberry Pi 4B has a 16GB microSD card and it hasn’t run out of space. Some Docker images (such as Jupyter Notebooks) pull in a few gigabytes so you probably want to be a little careful.

After you’ve installed Docker you can start installing and running other things. A simple one is Node.js or “node” for short.

With node you can run “server side” JavaScript. Most of the time I prefer to think of it as command-line JavaScript.

A Simple Node Implementation

I created a small node test file with the nano editor:

console.log("Hello")

And saved it as test.js.

I can run this with the following command:

docker run  -v "$PWD":/usr/src/app -w /usr/src/app node test.js

This mounts the current working directory as the /usr/src/app directory in Docker (-v option of the docker run command), sets the docker working directory to this directory (-w option), and then invokes node to run test.js. The result is a write to the console.

(This combination of -v and -w is a very common idiom, so it’s worth learning it.)

Accessing A Raspberry Pi From iOS

Though I SSH into my Pi from my Mac the most fun is doing it from iOS. (Sorry if you’re an Android user but you’ll have to find your own SSH client as I have little experience of Android. Likewise Windows users. I’m sure you’ll cope.)

My Raspberry Pi is set up so that I use networking over Bluetooth or WiFi. This means I can play with it on a flight or at home. In both cases I address it as Pi4.local, through the magic of Bonjour.

Specifically I can SSH into the Pi from an iOS device in one of two ways:

  • Using the “Run Script over SSH” Shortcuts action.
  • Using the Prompt iOS app.

I’ve done both. I’ve also used a third way to access files on the Pi: Secure ShellFish.

All these ways work nicely – whether you’re on WiFi or networking over Bluetooth.

Mac OS

For the Mac it’s a lot simpler. For a start you don’t need to SSH in.

I installed Docker by downloading from here. (I note there is a Windows image here but I haven’t tried it.)

Before it lets you download the .dmg file you’ll need to create a userid and sign in. Upon installation a cute Docker icon will appear in the menu bar. You can use this to control some aspects of Docker. You can sign in there.

From Terminal (or, in my case, iTerm2) you can install nginx with:

docker pull nginx

This is a lightweight web server. You start it in Docker with:

docker run -p 8080:80 -d nginx

If you point your browser at localhost:8080 you’ll get a nice welcome message. This writeup will take you from there.

Conclusion

There’s an excellent Redbook on zCX specifically: Getting started with z/OS Container Extensions and Docker

It’s also pretty good on other Docker environments, though it doesn’t mention them.

Which, I think, tells you something. Probably about portability and interoperability.

So, as I said, it’s easy and rewarding to play with Docker just before doing it on z/OS with zCX. And, as I said on Episode 25 of our podcast, I’d love to see SMF data from zCX.

If I get to see SMF data from a zCX situation – whether a benchmark or a real customer environment – I’ll share what I’m seeing. I already have thoughts.

And maybe I’ll write more about my Raspberry Pi setup some day.

Mainframe Performance Topics Podcast Episode 25 "Flit For Purpose"

It’s been a long time since Marna and I recorded anything. So long in fact that I’d forgotten how to use my favoured audio editor Ferrite on iOS.

But it soon came back to me. Indeed the app has moved on somewhat since I last used it – in a number of good ways.

So, we’re back – in case you thought we’d gone away for good.

And here are the show notes. I hope you’ll find some interesting things here.

Episode 25 “Flit for Purpose”

Here are the show notes for Episode 25 “Flit for Purpose”. The show is called this because it relates to our Topic, and also can be related to our Mainframe topic (as a pun for “Fit for Purpose”).

It’s been a long time between episodes, for good reason

  • We’ve been all over the place – too many places to mention – thanks to a very busy end of 2019, with two major GAs (z/OS V2.4 on September 30, 2019 and z15 on September 23, 2019)

Feedback

  • Twitter user @jaytay had a humorous poke at the SMF field “SMF_USERKEYCSAUSAGE” and how it looks like it contains “sausage”. We agree and are glad to find humor everywhere.

Follow up

  • We mentioned in Episode 24 CICS ServerPac in a z/OSMF Portable Software Instance Format. It GA’ed December 6 on ShopZ. If you will be ordering CICS or CICS program products, please order it in z/OSMF format!

Mainframe Topic: Highest highlights of z/OS V2.4 and z/OS on z15

  1. Highlight 1:  zCX

    • zCX is a new z/OS address space that runs Linux on Z with the Docker infrastructure. Docker containers have become popular in the industry. Examples include nginx, MongoDB, and WordPress.

      • (The use cases depicted reflect the types of software that could be deployed in IBM zCX in the future. They are not a commitment or statement of software availability for IBM zCX.)
    • Take a Docker Hub image and run it “as is”. There are about 3k images to choose from that can immediately run. Just look for the “IBM Z” types.

    • The images are not necessarily from IBM, which brings about a “community” and “commonality” with Linux on Z.

    • zCX is packaged with SMP/E, and serviced with SMP/E.  However, configuration (getting it up and running) and service updates must be done with z/OSMF Workflow.

    • Application viewpoint: Docker images themselves are accessed through the TCP/IP stack, with the standard Docker CLI using SSH. Application people might not even know it’s running under z/OS. The Docker Command Line Interface is where you implement which containers run in which zCX address spaces.

    • For cost: there is no priced SW feature (IFAPRDxx). However, it does require a priced HW feature (0104, Container Hosting Foundation) on either z14 GA2 or z15. This is verified at zCX initialization. Lastly, zCX cycles are zIIP-eligible.

    • It’s an architectural decision whether to run Docker applications on Linux on Z or z/OS, and that’s for another episode.

      • Martin wants to see some SMF data, naturally. He’s installed Docker on two different platforms: Mac and his Raspberry Pi. In the latter case he installed nginx and also gcc.
  2. Highlight 2: z/OSMF

    • Lots of z/OSMF enhancements have arrived in z/OS V2.4, and the good news is that most of them are rolled back to V2.3 in PTFs that have been arriving quarterly.

    • Security Configuration Assistant: A way within z/OSMF to validate your security configuration with graphic views, on the user and user group level. Designed to work with all three External Security Managers!

      • Security is, and continues to be, one of the hardest parts of getting z/OSMF working. This new application itself has more security profiles that users of Security Configuration Assistant will need access to, but the good side is that once those users are allowed, they can greatly help the rest.

      • Use case: if a user is having problems accessing an application and you don’t know why, you could easily see if this user has authority to access the application, to eliminate that as a problem.

      • Available back to V2.3 with APAR PH15504 and additional group id enhancements in APAR PH17871

    • Diagnostic Assistant for z/OSMF: A much simpler way to gather the information a Service person needs to debug your z/OSMF problem.

      • It was not that easy before. It could have been streamlined, and it took this application to give us that. It is now so easy. Marna no longer has to grudgingly gather problem doc from the many different locations that contain the necessary diagnostic files.

      • It could not be easier to use (although Marna has requested one additional tiny enhancement from the z/OSMF team: making it even easier by not requiring the z/OSMF server jobname and jobid).

      • You open up the Diagnostic Assistant application and select the pieces of information you want to gather. This includes the configuration data, the job log, the server-side log, and some other files. Having z/OSMF collect it for you is really nice.

        • It is then zipped up and stored on your hard drive (not the z/OS system).
      • Available back to V2.3 with Diagnostic Assistant APAR PH11606

  3. Highlight 3: z/OS on z15: System Recovery Boost: speeds up your shutdown for up to 30 minutes and your re-IPL for 60 minutes, with no increase to your rolling four hour average.

    • Some bits are free; some will cost should you choose to use them. Note that System Recovery Boost is available on z/OS V2.3 and V2.4. There is no priced z/OS SW feature (IFAPRDxx) at all. It is made up of individual solutions, not all of which you may choose to use (or which may apply to you).

      • No charge for this one: Sub-capacity CPs will be boosted to full capacity. If you are at full capacity already, this will not help.

      • No charge for this one: Use your entitled zIIPs for running workload that usually would not be allowed to run there (General CP workload).

        • Martin will have updates to his zIIP Capacity Planning presentation!
      • If you are a GDPS user, there are GDPS 4.2 scripting and firmware enhancements. These will allow parallelization of GDPS reconfiguration actions that may be part of your restart, reconfiguration, and recovery process.

        • Martin notes that if you parallelise more than you otherwise would it might affect the resource picture.
      • Lastly – and this one is priced – if you want to, you can purchase HW features (9930 and 6802). These features will allow you to have extra temporary zIIP capacity which you can then make use of for even more boost processing.

      • Additional reference: System Recovery Boost Redpaper

      • Martin again, looks forward to seeing data from this as RMF could show some challenging things for his reporting.

Performance Topic: z15 from chip design on upwards

  • Disclaimer: personal view, not from Development or Marketing. Marna and Martin were talking about the z15 Chip design – and we thought those observations might be useful to include in the Performance topic. Mostly focusing on the CP chip, which is where the processor cores are.

  • Two traditional levers were raising clock speed or shrinking the feature size.

    • z15 clock speed still 5.2 GHz. And we’ve been as high as 5.5 GHz with zEC12.

    • Feature size still 14 nanometers (nm). Some other fabs have 10nm and even 7nm processes.

  • GHz and nm aren’t the be all and end all. Looking at chip design now.

    • Start with a similar-sized CP chip and put more on it. It helped to get rid of the InfiniBand-related circuits, and there were some layout enhancements.

      • Very sophisticated software is used for laying out all modern chips. Once you have more chip real estate, good stuff can happen.

        • The same-size CP chip has 3 billion more transistors – that’s 9.1 billion transistors.
      • This can give us two more cores, taking us to twelve.

        • As an aside, Martin has seen the two more cores on a z14 PU chip allow better Cycles-Per-Instruction (CPI) than on z13.
      • More L2 Instruction Cache, at the core level. Double L3 Cache size, at the chip level, shared between cores. So almost double per core. All of this has got to lead to better CPI.

      • Nest Acceleration Unit (NXU): Compression on this chip is such a fascinating topic, but for another episode.

      • A drawer can go down from 5 or 6 PU chips to 4 and – for a 1-drawer machine – still have one more purchasable core than z14: 34 vs 33.

        • 33 vs 34 is for 1 drawer. Similar things apply for two or more drawers.

        • As a result they also were able to remove one set of X-Bus circuitry.

          • The X-Bus is used to talk to other CP chips and the System Controller (SC) Chip.

          • Now down from 3 to 2: one for the SC chip and one for the remaining other CP chip in the cluster.

        • The contents of the drawer now fit in an industry-standard 19-inch (narrower) rack, following what was done with the z14 ZR1.

      • At the top end there are up to 190 characterisable cores, coming up from 170. This can give us a fifth drawer – which is quite important.

        • Speculation that reducing the number of PU chips enabled the System Controller (SC) chip to talk to four other SC chips, up from 3 in z14, getting us from 170 to 190.
      • Many other things too: like 40TB Max Memory at the high end vs 32TB, improved branch prediction (and so a deeper processor design), and enhanced Vector processing with new instructions.

Topics Topic: How To Do A Moonlight Flit

  • This topic is about moving one’s social output, in particular blogs and podcast series. Martin’s blog had to move, because the IBM developerWorks blog site is being shut down.

    • Many blogs moved to ibm.com, or elsewhere like Martin’s Mainframe Performance Topics. Marna’s blog remains in the same spot (Marna’s Musings).

    • This podcast probably also has to move, for similar reasons, and we are looking for another location now.

      • When we move it, the feed will have to be replaced in your podcast client, sadly.
  • Immediately people might worry about Request For Enhancements being affected, and it is not.

  • Following are the criteria we thought important when selecting the right site to move social media to:

    • Cost: ideally it’d be free, but it may be worth paying a few dollars a month for better service and facilities.

    • Good ecosystem. Easy to use, especially for publishing a podcast.

    • Good blog publishing tools that integrate well with writing systems, and good preview capabilities.

    • Longevity is important. You do not want to migrate again soon. Martin has 15 years of blogging!

    • Security. Our material is public, however tampering is the concern.

  • Moving the media:

    • Retrieval from old location. Martin wrote some Python to webscrape the blog. He built a list of posts and retrieved their HTML.

      • Graphics: The same Python retrieved the graphics referenced in the blog posts.

      • Podcasting needed show notes and keywords (for Search Engine Optimisation). Also audio and graphics.

      • Cross references: Martin’s blogs have references from one post to another, both absolute and relative. And our podcast shownotes have links too that will break.

    • Re-posting. A lot of HTML editing is required. Different posts, written with different authoring tools, have different generated HTML, and had cross-references between posts that needed refactoring.

      • Martin’s graphics needed uploading afresh, and how they are positioned on the page had to change.
  • Redirecting Audience:

    • We really don’t want to lose our listeners and readers!

    • Martin posted progress reports on Twitter so there would be a trace. His blog’s root URL had to change. Fortunately the old blogging site redirected to the new one, but not to individual posts.

    • Our podcast subscribers will need to specify a new feed URL, as there is no possibility of not affecting subscriptions. Watch out for an announcement.

      • New feed will likely cause all episodes to be re-downloaded.

      • If you’re subscribing, once you re-subscribe to the new location, you should be fine for a long time. However, we don’t know how many we’ll lose. We don’t actually know how many people listen (e.g. on the web).

  • We try to turn such experiences into something useful.

Customer requirements

  • RFE 133491: “Write IEFC001I and IEFC002I information to SMF”

  • Abstract: At the point where IEFC001I and IEFC002I messages are produced, also write this information to SMF. The record should indicate if it was a cataloged procedure (the IEFC001I information) or an INCLUDE group, the member used, whether it came from a private or system library, and the dsname of that library, in addition to job identification information. Possibly the stepname should be included (names internal to a cataloged procedure are not needed).

  • Use Case: Organizations of long standing often have thousands of cataloged procedures. Often a large percentage of these are obsolete or never used but the mechanisms for discovering which ones should be archived or deleted do not exist as far as I can find. Being able to summarize SMF records to compare against member lists would allow us to clean up old cataloged procedures and/or INCLUDE members. This could also have security use if one suspects a malicious person had temporarily substituted a JCLLIB version of a proc.

  • Currently it is an Uncommitted Candidate, moved from the JES to the BCP component for response.

  • The messages referenced were:

    • IEFC001I (PROCEDURE procname WAS EXPANDED USING text): The system found an EXEC statement for a procedure.

    • IEFC002I (INCLUDE GROUP group-name WAS EXPANDED USING text): The system found an INCLUDE statement to include a group of JCL statements.

  • Our thoughts:

    • Martin thought it could be useful in SMF 30. Already have some stuff about PROCs and PROC STEPs in SMF 30. Some of the information, particularly data set, is quite lengthy. So probably an optional section in the record, maybe repeating. Might need some SMFPRMxx parameter. It looks useful, probably not a new record, needs care in designing.

    • Marna likes it for two reasons: 1. It’s helpful in cleanup, and 2. It has security benefits.

Future conferences where we’ll be

On the blog

Contacting Us

You can reach Marna on Twitter as mwalle and by email.

You can reach Martin on Twitter as martinpacker and by email.

Or you can leave a comment below. So it goes…

Normal Service Resumed

I’m sat in a nice coffee shop in Wallingford, relaxing after an interesting few weeks.

Over the past few weeks I’ve migrated my blog over to WordPress. If you’re reading this you’ve followed me over so a big thank you.

I migrated 526 blog posts. This number frankly astonishes me, though it’s been almost 15 years so that’s only 30 something a year.

The topics have varied, which is why I stuck the commas in the title some time ago. In the migration I made it official.

Meanwhile, a few years ago, my dear friend Marna Walle joined me in using the “Mainframe, Performance, Topics” name – for our podcast series.

The focus of the blog has not changed at all. It’s just the “social conditions” that have changed: I’m funding the blog – on WordPress. This gives me very little more latitude but does mean I can take it with me – into (eventual) retirement.

The process of writing still needs sorting out – but my experience of migration tells me that WordPress is going to be a good place to be. I certainly can confect HTML and upload pictures and even PDFs. But WordPress offers a wider ecosystem. For example, I’m writing this across Mac, iPad and iPhone using the very excellent Drafts app and I know Drafts can publish direct to WordPress. (Drafts’ automation might be handy in helping me write, actually.)

So expect even more

  • Mainframe whatever I want to talk about
  • Performance whatever I want to talk about
  • Topics whatever I want to talk about

You get the picture: It’s whatever I want to talk about. 🙂

So normal service resumed, then. 🙂

More On Native Stored Procedures

(Originally posted 2019-08-25.)

This follows up on Going Native With Stored Procedures?, and it contains a nice illustrative graph.

I could excuse a follow-up so soon with the words “imagine how unreadably long the post would be if I had written all this as one piece”.

However, the grubby truth is I got to write some code I didn’t expect to just yet. And this post became possible as a result.

Where Were We?

The gist of the previous post is:

  • Non-native stored procedures aren’t zIIP eligible, regardless of their caller’s eligibility.
  • You can see the level of zIIP eligibility for DDF and note if it looks less than you might expect or desire.

And that’s where we (and my code) left it. A useful place to be but not the best that could be done.

The New Code

So, for the very same study as in Going Native With Stored Procedures?, I began to dig, and to teach my code new tricks.

I wanted to know more about the stored procedures that weren’t zIIP-eligible.

To do that would take processing of package-level Db2 Accounting Trace (SMF 101).

Package-Level SMF Records

There are two types of SMF 101:

  • IFCID 3 – Plan-Level Accounting
  • IFCID 239 – Package-Level Accounting

Each IFCID 239 record contains up to 10 QPAC Package Sections[1], each describing one package name. I emphasise “name” because you don’t get one each time you call a stored procedure with that name. Instead you get one section for all calls that invoke that package.

My Old Package-Level Code

To be honest, I stopped developing my package-level code earlier than I wanted to. The code takes an IFCID 239 record and “flattens” it, with each Package Section placed in a fixed position in the output record.

So, I have records still with up to 10 package names in them.

If you’d sent me Package-Level Accounting for a study involving DDF my code flattened the records and left it at that. There was no reporting.

My New Package-Level Code

Now I have code that takes a 10-package record and turns it into 10 1-package records[2]. I also have some reporting.
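
For the curious, here’s a minimal Python sketch – not the Assembler/DFSORT route described in the footnote, and with invented field names – of what that split amounts to:

# A sketch only: split a flattened record carrying up to 10 package
# sections into one record per package. Field names are illustrative.
def explode_packages(flattened_record):
    common = {k: v for k, v in flattened_record.items() if k != "packages"}
    return [
        {**common, **package}
        for package in flattened_record["packages"]
        if package.get("name")  # skip unused package slots
    ]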

The interesting bit is the reporting.

I create a CSV file, sorted by subsystem and, within that, GCP CPU. Each line is a separate Correlation ID and Package combination. There is a nice field in the QPAC section (QPACAAFG) which says which type of package it is. Here are the values QPACAAFG can take.

Code      Package Type
X’0000’   Regular package
C’01’     Non-Native Stored Procedure
C’02’     User-Defined Function (UDF)
C’03’     Trigger
C’04’     Native Stored Procedure
C’05’     Inline UDF

Notice anything strange about the above table?

I think the X’0000’ is odd but it’s probably indicating “this field not filled”.

In any case you can readily tell what kind of package we have.

When I import the CSV file into Excel (and hold my nose) 🙂 I get a useful spreadsheet[3]. I define a field which is the concatenation of the package type and its name. That enables me to produce a nice graph. Here’s an example.

It shows CPU seconds – both GCP and zIIP – for the 20 packages with the biggest GCP CPU.

I’ve obfuscated all but one package name. We’ll get to that. But here are some observations:

  1. This package is a non-native stored procedure. There is very little zIIP eligibility. (Why there is any zIIP eligibility is unclear.)
  2. This package is a native stored procedure. It has about 60% zIIP eligibility.
  3. This package is well known – SYSLH200 – so I’ve not obfuscated it. It also shows about 60% zIIP.
  4. This is a second native stored procedure – again showing good zIIP eligibility.

Most of the rest of the packages are non-native stored procedures, with a couple of native ones. The zIIP usage is much as you would expect.

Let’s return to SYSLH200. It’s a Db2-supplied default package. You see it a lot. In a different subsystem it’s – by miles – the biggest package. This would be entirely normal for a subsystem that wasn’t a heavy user of explicit DDF packages, such as stored procedures.

Here’s a table of such package names:

Package Name   Description
SYSSHxyy       dynamic placeholders – small package WITH HOLD
SYSSNxyy       dynamic placeholders – small package NOT WITH HOLD
SYSLHxyy       dynamic placeholders – large package WITH HOLD
SYSLNxyy       dynamic placeholders – large package NOT WITH HOLD

Now that I can spot them, I’ve added this table to my bank of standard slides to use in customer engagements. I’ll add other well-known package names over time.

For this subsystem it’s clear the customer has some familiarity with Native Stored Procedures, whether accidentally or intentionally. (It’s possible the native ones were supplied by a vendor and the install script set up the Db2 Catalog to enable them.) But this customer has some way to go – assuming the remaining major stored procedures can and should be converted to native.

Conclusion

This code is going into Production – just as soon as I can find the time to convert the JCL to an ISPF File Tailoring skeleton.

But I’m already using it so it is in a sense in Production.

So if I ask you for package-level Db2 Accounting Trace this will be one of the good uses I’ll put it to.

The Journey Continues

I’m sure there’ll be “another thrilling instalment” 🙂 in the DDF saga. I just have no idea right now what it would be.

Stay tuned. 🙂


  1. Mapped by mapping macro DSNDQPAC in the SDSNMACS library shipped with Db2.  ↩

  2. I’ll admit it’s not the most efficient code, involving two passes over the data; one day I’ll probably rework my Assembler DFSORT E15 exit to emit 10 records and eliminate passes over the data.  ↩

  3. Well, useful when I’ve taken each subsystem and plonked it in its own sheet manually 😦 . If anyone knows how to automate this sort of thing I’d love to know.  ↩

Making The Connection?

(Originally posted 2019-08-11.)

I wrote the last blog post on my way to my first SHARE conference. I’m writing this one on the way back.

So a big thank you to all who made me feel so welcome in Pittsburgh. In many cases it was good to meet up with long-standing friends; in some cases people I’ve known a long time but never actually met. I hope I’ll be back[1].

Right now the team is working on a customer study, one component of which is looking at big batch CPU consumers. And I think I’ve just solved a long-standing issue of interpretation. So I’m sharing the thought with you.

It concerns batch job steps with a TSO step program name – IKJEFT01, IKJEFT1B, or IKJEFT1A. There are two main cases here:

  1. Those that call Db2 – through the TSO connection.
  2. Those that don’t.

The latter are usually something like REXX, or they could be compiled / assembled programs called from the Terminal Monitor Program (TMP) in Batch.

It would be good to know which is which. There’s no use tuning the SQL of a job step that doesn’t go to Db2; you’re better off tuning the REXX or program code. As it happens, I am perfectly capable of tuning REXX programs; I’m not so hot at tuning SQL. In any case I’m looking to advise the customer as to which is indicated for any given heavy CPU step.

A Previous Method

Previously we would’ve used our code which matches SMF 101 (DB2 Accounting Trace) with SMF 30 Step-End records (Subtype 4) to detect whether a job step accesses a Db2 subsystem or not. This is the most comprehensive way but it’s quite likely Accounting Trace isn’t turned on for the subsystem the job step attaches to. Further, we don’t even know which subsystem it is.

So this is OK – and in practice we’ve used this method for a long time.

By the way, for TSO Attach Batch the Db2 Correlation ID is the job name[2].

A More Scalable Method

There is a more scalable way of distinguishing between TSO job steps that access Db2 and those that don’t. And it’s more scalable because it doesn’t require Db2 Accounting Trace. Further, it yields the subsystem name.

It uses only SMF 30 records.

I’ve mentioned the SMF 30 Usage Data Section before: If an address space accesses Db2 the record will have at least one Usage Data Section pointing to Db2. The Product Name field will say “DB2” and the Product Qualifier field will contain the name of the subsystem. (The Product Version field will tell you which version of Db2 it is, which is also handy.)

Today I tested the following query:

“Show me all the SMF 30–4 records with a Program Name field beginning ‘IKJEFT’ and with a Usage Data Section that points to Db2.”

This query only uses SMF 30. It ran quickly and produced an interesting list. It would be a small matter to turn this into the following query:

“Show me all the SMF 30–4 records with a Program Name field beginning ‘IKJEFT’ without a Usage Data Section that points to Db2.”

The first query yields the first case above: TMP job steps that attach to Db2. The second query yields the other case: Non-Db2 TMP steps.
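
To make the two cases concrete, here’s a hedged Python sketch of the same split, written over SMF 30–4 records already parsed into dictionaries – the field names are invented for illustration, not real SMF mnemonics:

# A sketch, not the real tooling: split IKJEFT* steps into Db2-attached
# and non-Db2 steps using the Usage Data Sections. Field names invented.
def classify_tmp_steps(records):
    db2_steps, non_db2_steps = [], []
    for rec in records:
        if not rec["program"].startswith("IKJEFT"):
            continue
        subsystems = [u["qualifier"] for u in rec.get("usage_sections", [])
                      if u["product"].strip() == "DB2"]
        if subsystems:
            db2_steps.append((rec["jobname"], rec["stepname"], subsystems))
        else:
            non_db2_steps.append((rec["jobname"], rec["stepname"]))
    return db2_steps, non_db2_steps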

So now we can more usefully discuss any step whose Program Name begins “IKJEFT” – by working out which case it is.

I consider this a nice result – which is why I’m sharing it with you.

And, yes, if the step goes to Db2 we can dive into Accounting Trace to figure out:

  • If there is after all any non-Db2 CPU to tune out.
  • Any other buckets of time worth tuning.

  1. Both to Pittsburgh and SHARE, in case you wondered.

  2. The observant will note that Correlation ID is up to 12 characters and job name is at most 8. The Correlation ID pads to the right with blanks up to 12.

Going Native With Stored Procedures?

(Originally posted 2019-08-04.)

I hope the term “Going Native” isn’t considered offensive. If it is, my defence is that I didn’t coin the term “Native Stored Procedures” and that’s what I’m alluding to.

This post is about realising the benefit of Native Stored Procedures for DDF callers. Most particularly, assessing the potential benefit of converting Stored Procedures to Native.

It’s not been long since I wrote about DDF. (My last DDF post was in 2018: DDF TCB Revisited.) It’s been rather longer since I last wrote about Db2 Stored Procedures: 2007’s WLM-Managed DB2 Stored Procedure Address Spaces is the latest I can find.[1]

You might ask “what on earth is a z/OS Performance guy doing talking about Db2?” In “Low Self Esteem Mode” 🙂 I might agree, but I’ve always thought I could pull off “one foot in each camp” when it comes to Db2 and z/OS. “I consider it a challenge…” 🙂

Seriously, z/OS folks have quite a bit to say that can help Db2 folks, if they try. And so we come to a recent example.

We’ll get to Stored Procedures presently.2 But let me build up to it.

And, yes, this post is inspired by a recent customer case. And, no, it didn’t alert me to the topic; It just was a good excuse to talk about it.

zIIP Eligibility And DDF

In principle, a DDF workload should see roughly 60% zIIP Eligibility. As I said in DDF TCB Revisited:

“The line in the graph – or rather the round blobs – shows zIIP eligibility hovering just under 60% – across all the DB2 subsystems across both customers. (One of the things I like to do – in my SMF 101 DDF Analysis Code – is to count the records with no zIIP eligibility.”

That last sentence is interesting, and I don’t think I spelt it out sufficiently: For quite a while now a DDF thread is either entirely zIIP-eligible or not at all.

I glossed over something in that post – which was the right thing to do as that set of data didn’t display an important structural feature: If your DDF thread calls a Non-Native Stored Procedure (SP) or User-Defined Function (UDF) it loses zIIP eligibility for the duration of that call.

Native Stored Procedures

You can write SPs and UDFs in pretty much any language you like, and “shell out” to anywhere you like[3]. For example, you can write them in REXX, Java, COBOL, PL/I. And SPs and UDFs can call each other. Lovely for reuse.

These capabilities have been around for at least 20 years. More recently – though not that recently, and progressively enhanced over recent Db2 versions – a different set of capabilities has emerged: Native Stored Procedures. You write these in an extension to SQL called SQL PL. The term “Native” here refers to using the built-in language, rather than an external programming language.

Native SPs run in the Db2 DBM1 address space – but with the caller’s WLM attributes. Non-Native SPs run in WLM Stored Procedures Server address spaces (as described by WLM-Managed DB2 Stored Procedure Address Spaces and the Redbook it references).

For the rest of this post I shall talk of SPs – for brevity. UDFs and Triggers are generally similar.

Why Do Native Stored Procedures Matter?

As a Performance person, Native Stored Procedures matter to me because there’s a fundamental difference in zIIP eligibility:

  • When a DDF thread calls a Native SP the zIIP eligibility is preserved.
  • When a DDF thread calls a Non-Native SP the CPU to run the SP is not zIIP eligible.

So, there’s an economic advantage to SPs being Native. Actually it’s twofold:

  • The obvious one of zIIP CPU having lower software (etc) costs than GCP CPU.
  • When work switches between zIIP eligibility and non-eligibility the work doesn’t get undispatched on one processor and redispatched on another immediately. However, if it does – for any reason – get redispatched, its eligibility is checked. That could well cause redispatch on a different type of processor. Of course that’s a different processor – with the potential for caching loss.

In any case, we’d want to maximise zIIP eligibility, wouldn’t we?

A Case In Point

I said early on in this post that I have a real customer case to discuss.

I did my standard “let’s see how much GCP and how much zIIP this Db2 subsystem uses” analysis. The results were emphatically not 60% zIIP eligible, 40% ineligible.

I examined this at three levels:

  • DDF Service Class, using SMF 72-3 Workload Activity data.
  • DIST Address Space, using SMF 30 Interval Independent Enclave data.
  • Db2 DDF Application, using SMF 101 Db2 Accounting Trace data.

They all agreed on transaction rates, GCP CPU, and zIIP CPU.

The customer has 9 subsystems, across 3 LPARs. These are not cloned subsystems, though the same application “style” – JDBC[4] – dominated the DDF traffic in all cases.

Of these 9 subsystems only 1 approached a 60%/40% split. The rest varied up to 90% or so GCP.

Db2 Accounting Trace is helpful here – if you examine Class 2 CPU times:

  • You get overall TCB time on a GCP [1]
  • You get overall zIIP CPU time [2]
  • You get Stored Procedures CPU time [3]
  • You get UDF CPU time [4]

So I took away [3] and [4] from [1]. I compared the result to [2]. With this adjusted measure of GCP CPU Time I got extremely close to 60% being zIIP eligible – for all 9 subsystems.
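
Numerically, one way to read that comparison is the sketch below – with made-up figures, not the customer’s:

# Illustrative numbers only: remove Stored Procedure and UDF CPU from the
# GCP TCB time before comparing it with the zIIP time.
gcp_tcb = 460.0   # [1] Class 2 TCB time on GCPs, CPU seconds
ziip    = 540.0   # [2] Class 2 zIIP CPU time
sp_cpu  = 80.0    # [3] Stored Procedure CPU time
udf_cpu = 20.0    # [4] UDF CPU time

adjusted_gcp = gcp_tcb - sp_cpu - udf_cpu      # [1] - [3] - [4]
eligible = ziip / (ziip + adjusted_gcp)        # compare with [2]
print(f"Adjusted zIIP-eligible share: {eligible:.0%}")  # 60% with these numbers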

As an aside, Stored Procedures Schedule Wait and UDF Schedule Wait time buckets can tell you if there are delays in starting the relevant server address spaces. In this data it was notable that the Test Db2 subsystems were more prone to this than Production. As the machines ran pretty busy it’s good to see the delays being directed to less important work[5].

So, Shall We Go Native?

“Not so fast” applies – both as an exclamation and as an expectation:

  • Because these are generally comprehensive application changes it’s going to take a while to convert.
  • It might not even be worth it – at least not for 100% of Non-Native SPs.

Here’s how I think you should proceed:

  1. Figure out how much of a problem this is (and you’ve seen that above).
  2. Use Db2 Accounting Trace at 2 levels to establish which moving parts might be worth reworking:
    • IFCID 3 would show you which applications, including client computer.
    • IFCID 239 would show you which package burns the most CPU, and whether it’s a SP, UDF, or Trigger.
  3. Assess – for each of the “high potential” SPs – which applications call them. Of course, only some of them will benefit – as SPs can be called from a wide variety of non-DDF applications as well as DDF ones.

If you carry out the above steps you should see whether it’s worth it to convert to Native.

One final word of caution: many Non-Native SPs can’t be converted to SQL PL, so any discussion with your Db2 and Applications colleagues needs to be sensitive to that.


  1. I need a Content Management System (CMS) as you, dear reader, might find a more recent one. 🙂 

  2. To “break the fourth wall” a moment, this post is more or less writing itself (at what I’m told is 10972m) so it might or might not let me get to the point. 🙂 

  3. Including to other services, even ones out on the web. 

  4. In Db2 Accounting Trace you can sum up by e.g. Correlation ID. JDBC shows up as “db2jcc_appli”. 

  5. And Workload Activity Report and WLM bear out it is less important work that is getting hit. 

Engineering – Part Three – Whack A zIIP

(Originally posted 2019-08-02.)

(I’m indebted to Howard Hess for the title. You’ll see why it’s appropriate in a bit.)

Since I wrote Engineering – Part Two – Non-Integer Weights Are A Thing I’ve been on holiday and, refreshed, I’ve set to work on supporting the SMF 99 Subtype 14 record.

Working with this data is part of the original long-term plan for the “Engine-ering” project.

Recall (or, perhaps, note) the idea was to take CPU analysis down to the individual processor level. And this, it was hoped, would provide additional insights.[1]

Everything I’ve talked about so far – in any medium – has been based on one of two record types:

  • SMF 70 Subtype 1 – for engine-level CPU busy numbers, as well as Polarisation and individual weights.
  • SMF 113 Subtype 1 – for cache performance.

SMF 99 Subtype 14 provides information on home locations for an LPAR’s logical processors. It’s important to note that a logical processor can be dispatched on a different physical processor from its home processor, probably especially in the case of a Vertical Low (VL)[2]. I will refer to information such as where a logical processor’s home address is as “processor topology”.

It should be noted that SMF 99–14 is a cheap-to-collect, physically small record. One is cut every 5 minutes for each LPAR it’s enabled on.

Immediate Focus

Over the past two years a number of the “hot” situations my team has worked on have involved customers reconfiguring their machines or LPARs in some way. For example:

  • Adding physical capacity
  • Shifting weights between LPARs
  • Varying logical processors on and off
  • Having different LPARs have the dominant load at different times of day

All of these are entirely legitimate things to do but they stand to cause changes in the home chips (or cores) of logical processors.

The first step with 99–14 has been to explore ways of depicting the changing (or, for that matter, unchanging) processor topology.

I’ll admit I’m very much in “the babble phase”[3] with this, experimenting with the data and depictions.

So, here’s the first case where I’ve been able to detect changing topology.

Consider the following graph, which is very much zoomed in from 2 days of data – to just over an hour’s worth.

Each data point is from a separate record. From the timestamps you can see the interval is indeed 5 minutes. This is not the only set of intervals where change happens. But it’s the most active one.

There are 13 logical processors defined for this LPAR. All logical processors are in Drawer 2 (so I’ve elided “Drawer” for simplicity.)

Let me talk you through what I see.[4]

  1. Initially two logical processors are offline. (These are in dark blue.) Cluster 2 Chip 1 has 7 logical processors and Cluster 2 Chip 2 has 2.
  2. A change happens in the fifth interval. One logical processor is brought online. Now Cluster 2 Chip 1 has 6 and Cluster 2 Chip 2 has 4. Bringing one online is not enough to explain why Cluster 2 Chip 2 gained two, so one must’ve moved from Cluster 2 Chip 1.
  3. In interval 7 another logical processor is brought online. The changes this time are more complex:
    • A logical processor is taken away from Cluster 1 Chip 1.
    • A logical processor appears on Cluster 1 Chip 2.
    • Two appear on Cluster 1 Chip 3.
    • One is taken away from Cluster 2 Chip 2.
  4. Towards the end, two logical processors are offlined; each is taken away from Cluster 2 Chip 2.

The graph is quite a nice way of summarising the changes that have occurred, but it is insufficient.

It doesn’t tell me which logical processors moved.
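
With the per-processor home locations from 99–14, though, working out which ones moved is just a diff between consecutive snapshots. Here’s a minimal Python sketch – the data at the bottom is entirely invented, purely to show the shape of the output:

def topology_moves(before, after):
    # Compare two home maps (logical processor number -> (cluster, chip))
    # from consecutive 99-14 intervals and describe what changed.
    changes = []
    for lp, home in sorted(after.items()):
        if lp not in before:
            changes.append(f"LP {lp} came online on Cluster {home[0]} Chip {home[1]}")
        elif before[lp] != home:
            old = before[lp]
            changes.append(
                f"LP {lp} moved from Cluster {old[0]} Chip {old[1]} "
                f"to Cluster {home[0]} Chip {home[1]}"
            )
    for lp in sorted(before):
        if lp not in after:
            changes.append(f"LP {lp} went offline")
    return changes

# Invented example: one logical processor comes online and another moves chips.
interval_a = {0: (2, 1), 1: (2, 1), 9: (2, 2)}
interval_b = {0: (2, 2), 1: (2, 1), 5: (2, 2), 9: (2, 2)}
print("\n".join(topology_moves(interval_a, interval_b)))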

What we know – not least from SMF 70–1 – is that the LPAR’s processors are defined as:

  • 0 – 8 General Purpose Processors (GCPs)
  • 9 – 12 zIIPs

The Initial State

With 99–14, diagrams such as the following become possible:

This is the “original” chip assignment – for the first 4 intervals.

This is very similar to what the original WLM Topology Report Tool[5] would give you. (I claim no originality.)

I drew this diagram by hand; I can’t believe it would be that difficult for me to automate – so I probably will.[6]
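
In that spirit, here’s a minimal Python sketch of the nested-HTML-tables idea from footnote [6]. The structure and labels are illustrative, not the finished tool:

# Sketch: turn {(cluster, chip): [logical processor labels]} into nested HTML tables.
# The outer table has one row per chip; each row contains an inner table with
# one cell per logical processor homed on that chip.
def topology_html(chips):
    rows = []
    for (cluster, chip), lprocs in sorted(chips.items()):
        cells = "".join(f"<td>{lp}</td>" for lp in lprocs) or "<td>&nbsp;</td>"
        inner = f"<table border='1'><tr>{cells}</tr></table>"
        rows.append(f"<tr><td>Cluster {cluster} Chip {chip}</td><td>{inner}</td></tr>")
    return "<table border='1'>" + "".join(rows) + "</table>"

# Invented example data, just to show the shape of the input.
print(topology_html({
    (2, 1): ["LP 0", "LP 1", "LP 2", "LP 3"],
    (2, 2): ["LP 9", "LP 10"],
}))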

After 1 Set Of Moves

Now let’s see what it looks like in Interval 5 – when a GCP was brought online:

GCP 5 has been brought online in Cluster 2 Chip 2, alongside the 2 (non-VL) zIIPs. But also GCP 0 has moved from Cluster 2 Chip 2 to Cluster 2 Chip 1.

What’s The Damage?[7]

Now, what is the interest here? I see two things worth noting:

  • A processor brought online has empty Level 1 and Level 2 caches.
  • A processor that has actually moved also has empty Level 1 and Level 2 caches.

A move within the same node/cluster or drawer probably isn’t too bad. (And a move within a chip – which we can’t see – is even less bad, as it’s the same Level 3 cache.) Further afield is worse.

Of course the effects are transitory – except in the case of VLs being dispatched all over the place all the time. Hence the desire to keep them parked – with no work running on them.

After The Second Set Of Moves

Finally, let’s look at what happened when the second offline GCP was brought online – in Interval 7:

GCP 8 has been brought online in Cluster 1 Chip 2. But also zIIP 10 has moved from Cluster 2 Chip 2 to Cluster 1 Chip 1. Also zIIPs 11 and 12 have moved from Cluster 1 Chip 1 to Cluster 1 Chip 3.

This information alone (99–14) isn’t enough to tell you if there was any impact from these moves. However, you can see that in neither case was a “simple” varying of a GCP online quite so simple: both led to other logical cores moving. This might be news to you; it certainly is to me – though the possibility was always in the back of my mind.

Note: This isn’t a “war story” but rather using old customer data for testing and research. So there is no “oh dear” here.

Later On

To really understand how a machine is laid out and operating you need to consolidate the view across all the LPARs. This requires collecting SMF 99–14 from them all. This, in fact, is a motivator for collecting data from even the least interesting LPAR. (If its CPU usage is small you might not generally bother.)

But there’s a snag: unlike SMF 70–1, the SMF 99–14 record doesn’t contain the machine’s plant and serial number. So to form a machine-level view I have two choices:

  • Input into the REXX exec a list of machines and their SMFIDs.
  • Have code generate a list of machines and their SMFIDs.

I’ll probably do the former first, then the latter.
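
As a sketch of the former – in Python here, though the real thing would live in the REXX exec – the mapping is just a small, hand-maintained table (the SMFIDs and machine names below are invented):

# Option one: a hand-maintained mapping from SMFID to machine.
SMFID_TO_MACHINE = {
    "SYSA": "Machine 1",
    "SYSB": "Machine 1",
    "SYSC": "Machine 2",
}

def machine_for(smfid):
    # Fall back to the SMFID itself so systems missing from the table are still visible.
    return SMFID_TO_MACHINE.get(smfid, f"UNKNOWN ({smfid})")

The latter would generate that table automatically – most obviously from SMF 70–1, which does carry the machine’s serial number.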

What also needs doing is figuring out how to display multiple LPARs in a sensible way. There is already a tool doing that. My point in replicating it would be to add animation – so when logical processors’ home chips change we can see that.

Maybe Never

SMF 99–14 records aren’t cut for non-z/OS LPARs, which is a significant limitation. So I can’t see a complete description of a machine. For that you probably need an LPAR dump which isn’t going to happen on a 5-minute interval.

However, for many customer machines, IFL- and ICF-using LPARs are on separate drawers. Keeping them separate is a design aim for recent machines but isn’t always possible. For example, a single-drawer machine with IFLs, GCPs and zIIPs will see non-z/OS LPARs sharing the drawer – most notably the z14 ZR1.

One other ambition I have is to drive down to the physical core level. On z14, for instance, the chip has 10 physical cores, though not all are customer-characterisable. But this won’t be possible unless the 99–14 is extended to include this information. This would be useful for the understanding of Level 1 and Level 2 cache behaviour.

Finally, there is no memory information in 99–14. I would dearly love some, of course.

Conclusion

While 99–14 doesn’t completely describe a machine, it does extend our understanding of its behaviour by relating z/OS logical processors to their home chips. Taken with 70–1 and 113–1, this is a rather nice set of information.

Which prompts lots of unanswerable questions. But isn’t that always the way? 🙂

A question you might have asked yourself is “do I need to know this much about my machine?” Generally the answer is probably “no”. But if you are troubleshooting performance or going deep on LPAR design you might well need to. Which is why people like myself (and the various other performance experts) might well be involved anyway. So – for us – the answer is “yes”.

The other time you might want to see this data “in action” is if you are wondering about the impact of reconfigurations – as the customer whose data I’ve shown ought to be. 99–14 won’t tell you about the impact but it might illuminate the other data (70–1 and 113). And together they enhance the story no end.


  1. Actually it’s a matter of “legitimised curiosity”. 🙂  ↩

  2. I’ll follow the usual convention of “VL” for “Vertical Low”, “VM” for “Vertical Medium”, and “VH” for “Vertical High”.  ↩

  3. See, perhaps, Exiting The Babble Phase?.  ↩

  4. You might well see something different. And that’s where the fun begins.  ↩

  5. Which I now can’t find 😦 – and couldn’t run even if I could find it.  ↩

  6. Actually, I’ve experimented with creating such diagrams using nested HTML tables. It works fine. The idea would be to write some code (probably Python) to generate such (fiddly) HTML from the data.  ↩

  7. In UK English, at least, “What’s the damage” means “what’s the bill”.  ↩

Buttoned Up

(Originally posted 2019-06-18.)

In Automation On Tap I talked about NFC tags and QR codes as ways of initiating automation.

This post is about a different initiator of automation.

I recently acquired the cheapest (and least functional) Elgato Stream Deck. This is a programmable array of buttons. My version has 6 buttons but you can buy one with 15 and one with 32 buttons.

A Previous Attempt Didn’t Push My Buttons

A while back I bought an external numeric keypad for my MacBook Pro. When I was editing our podcast with Audacity on the Mac I used this keypad to drive the edits. It’s just numeric, with a few subsidiary keys.

What makes this programmable is that the Mac sees external keypad keys as distinct from their main-keyboard counterparts. So the “1” key on the external keypad has a different hardware code from the “1” key on the main keyboard. Keyboard Maestro macros can be triggered when keys are pressed. So I wrote macros to control Audacity, triggered by pressing keys on the external keypad.

But there was a problem: How do you know what each button does? It’s quite a pretty keypad – as you can see:

So I really didn’t want to stick labels on the key tops. I also tried to make my own template but there was an obvious difficulty: It’s hard to design one that tells you what the middle buttons in a cluster do.

There’s another problem: Suppose I set up automation for multiple apps – which Keyboard Maestro lets you do. Then I need a template for each app – as each will probably have its own key pad requirements. These templates are fragile and in any case this is a fiddly solution.

So this was good – up to a point. It certainly made podcast editing easier. But that was all.

And then I moved podcast editing to Ferrite on iOS, discussed here. And this keypad is now in a drawer. 😦

The One That I Want

Cue the Elgato Stream Deck Mini. First I should say there are other programmable keypads but this is the one that people talk about. I’d say with good reason.

As you can see, it’s corded and plugs into a USB A port.[1] Power is supplied over the same cable.

It’s easy to set up, literally just plugging it in. To make it useful, though, you need their app – which is available for Mac or Windows. It’s a free app so no problem.

With their app you drag icons onto a grid that represents the Stream Deck’s buttons. Here’s what mine generally looks like when powered up and the Mac unlocked.

The key thing[2] to notice is the icons on the key tops. This is already a big advance on paper templates and ruined shiny key pads. I didn’t have to do anything to get these icons to show up[3]. What you can see is:

  • 4 applications – Sublime Text, Drafts, Airmail and Firefox.
  • A folder called “More”. More 🙂 on this in a moment.
  • OmniFocus Today

The applications are straightforward, but the OmniFocus Today one is more complex. Pushing the button simulates a hot key – in this case Ctrl+Alt+Shift+Cmd+T. I’ve set up a Keyboard Maestro macro triggered by this – extremely difficult to type – hot key combination. The macro brings OmniFocus (my task manager) to the front and – via a couple of key presses – switches to my Today perspective.

So it’s not just launching applications. It can kick off more complex automation.

And someone built a bridge to IFTTT’s Maker Channel – which could open up quite a lot of automation possibilities. But right now I only have a test one – which emails me when I press the button.
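
I don’t know how that bridge is built, but the Maker (now “Webhooks”) service itself is driven by a plain HTTP request. Something like this Python sketch – with a made-up event name and placeholder key – is the sort of thing that would fire when the button is pressed:

import requests

# Placeholders: the event name is whatever the IFTTT applet is defined to trigger on,
# and the key comes from the Webhooks service settings.
EVENT = "stream_deck_test"
KEY = "YOUR-IFTTT-WEBHOOKS-KEY"

# value1/value2/value3 are optional and get passed through to the applet.
requests.post(
    f"https://maker.ifttt.com/trigger/{EVENT}/with/key/{KEY}",
    json={"value1": "Button pressed"},
    timeout=10,
)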

One thing I haven’t tried yet is setting up buttons to open URLs in a browser. I don’t think I’ve got the real estate for that, with only six buttons.

Unreal Estate

In reality I probably should’ve plumped for the 15-button one[4], rather than the 6-button, but this has been an interesting exercise in dealing with limited button “real estate”.

So, if I press the “More” button I get this:

Because the button tops are “soft” they can change dynamically – as this demonstrates. This is much better than sticking labels on physical keys.

Two features of note:

  • The back button – top left
  • The “Still More” button – bottom right

Being able to create the latter means I can nest folders within folders. This greatly increases my ability to define buttons.

The Right Profile

There’s one final trick that’s worth sharing.

Everything I’ve shown you so far has been from the Default Profile. You can set up different profiles for different apps.

I happen to have a few Keyboard Maestro macros for Excel – to make it a little less tedious to use. So I created a profile for Excel.

So when Excel is in the foreground the Stream Deck looks like this:

I’m still setting it up – you can see two undefined (blank) buttons – but you get the general idea. The two Excel-specific buttons – using an image I got off the web[5] – kick off some really quite complex AppleScript:

  • Size Graph – which resizes the current Excel graph to a standard-for-me 20cm high and 30cm wide. This is fiddly so having a single button press to do it is rather nice.
  • Save Graph – which fiddles with menus to get to the ability to save a graph as a .PNG file with a single button press.

(In retrospect I should probably have changed the text to black – which is easy to do.)

So in real estate terms, application-specific profiles are really handy.

One pro tip: Close the Stream Deck editor if you want application-specific profiles to work. I spent ages wondering why they didn’t – until I closed the editor.

Conclusion

This programmable key pad with dynamically-changing keytops is a really nice way to kick off automation from a single key press (or a few if you use cascading menus).

I would suggest going for the most expensive one you can afford – but I’ve shown you ways to get round the 6-button constraint of the Stream Deck Mini.

Application-specific profiles are a really nice touch.

The whole thing is rather fun for an inveterate tinkerer like me. 🙂

In short, I think I’ll be throwing this in my backpack and taking it with me wherever my MacBook Pro goes.

And one day I’ll upgrade to a model with more buttons.


  1. As they say not to use a USB hub, I don’t know whether it would work plugged into USB C via an adapter.  ↩

  2. Pun intended.  ↩

  3. Except the OmniFocus one – where I had to find an icon – but even that was easy.  ↩

  4. At much greater cost, of course. Still more so with the 32-button variant.  ↩

  5. You can create your own icons using a nice editor they have, which contains quite a wide variety of icons. But Excel wasn’t one of them – so I found an image and used that. The preferred dimensions are a 72-pixel square.  ↩