(Originally posted 2009-11-22.)
For as long as I can remember, our channel reporting has consisted of a single chart. Before I tell you what the chart looked like I’ll hazard that your channel reporting was about as bad. 🙂
See, it’s not something people tend to put much effort into.
Our one-chart report basically listed the top channels, from the perspective of the z/OS system under study, ranked by total channel utilisation descending – as a bar chart. The raw data for this is SMF Type 73. Actually there were two refinements people had made over the decades:
- Someone acknowledged the existence of the (then-called) EMIF capability to share channels between LPARs in the same machine. So stacked on top of this partition’s busy they added other partitions’ busy.
- Someone supported FICON by using the new FICON instrumentation to derive channel utilisation. (Of course if the channel’s not FICON we still use the old calculation: with some smart copying involved.)
And that’s where we left it until I got my hands on the code…
- The first thing I did, some months ago, was to add the channel path acronym (for example “FC_S” for “FICON Switched”). This is also in SMF 73.
- The second thing was much more significant: working out which LPARs share each channel. Until then the “other partitions’ busy” number was all other partitions’ use of the channel, without breaking down which other partitions these are.
- The third thing was a nice “fit and finish” item: Listing which controllers were attached to which channel.
Which LPARs Share This Channel
Each z/OS image can create its own SMF 73 records, but I’m hostage to which systems my clients send in data for. I also have to narrow the LPARs in the data down to those that could really be sharing the channel. I do this using the following rules:
- The channel number (in Type 73) has to match.
- For multiple Logical Channel Subsystem (LCSS) machines (System z9 and System z10) the LCSS number must match. (This can be gleaned from Type 73. Actually Type 70 as well – as each LPAR has only one LCSS.)
- The machine serial number has to match. (Machine serial number isn’t in Type 73. You have to go to the Type 70 for it.)
- (I do a “belt and braces” check that the Channel Path Acronym (in Type 73) matches.)
So that set of checks tells you which LPARs really share the channel. And so you can then stack up their utilisations to gain a better picture of the channel. It’s quite nice when you do.
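The matching rules above can be sketched as a grouping key. This is only an illustration, not my actual code: it assumes the Type 70 and Type 73 records have already been parsed into dictionaries, and the keys used here (`system`, `lcss`, `chpid`, `acronym`, `machine_serial`) are made-up names rather than real SMF field names.

```python
from collections import defaultdict


def find_channel_sharers(recs73, rec70_by_system):
    """Group systems by the physical channel they appear to share.

    recs73: one dict per (system, channel) taken from Type 73 data.
    rec70_by_system: system name -> dict holding the machine serial
    number, which has to come from Type 70.
    All dictionary keys are hypothetical names chosen for this sketch.
    """
    groups = defaultdict(list)
    for rec in recs73:
        serial = rec70_by_system[rec["system"]]["machine_serial"]
        # Same machine, same LCSS, same channel number.
        groups[(serial, rec["lcss"], rec["chpid"])].append(rec)

    sharers = {}
    for key, recs in groups.items():
        # Belt and braces: every sharer should report the same acronym.
        acronyms = {rec["acronym"] for rec in recs}
        if len(acronyms) > 1:
            raise ValueError(f"Acronym mismatch for {key}: {sorted(acronyms)}")
        sharers[key] = sorted(rec["system"] for rec in recs)
    return sharers
```

Raising an error on an acronym mismatch is a choice made for the sketch. Logging it and carrying on would be just as reasonable.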
One other thing: Because I don’t necessarily see all the LPARs sharing a channel I compute an “Other Busy” number and add that to the stacked bar. In fact my test data showed all the major channels were missing LPARs’ contributions.
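Here is a rough sketch of how an “Other Busy” segment might be derived. It assumes “Other Busy” is simply the channel’s total utilisation minus the sum of the utilisations of the LPARs I have data for. The function name and its inputs are invented for the illustration.

```python
def stacked_segments(total_busy_pct, lpar_busy_pct):
    """Return (label, busy %) segments for one channel's stacked bar.

    total_busy_pct: total channel utilisation across all sharers.
    lpar_busy_pct: LPAR name -> utilisation, for the LPARs in the data.
    Both inputs are assumed to have been derived already from Type 73.
    """
    seen = sum(lpar_busy_pct.values())
    # Whatever the known LPARs don't account for goes into "Other Busy".
    # Clamp at zero in case rounding pushes the sum just past the total.
    other = max(total_busy_pct - seen, 0.0)
    segments = sorted(lpar_busy_pct.items())
    segments.append(("Other Busy", other))
    return segments
```

For a channel that is 60% busy in total, with SYSA at 25% and SYSB at 15%, this leaves 20% as “Other Busy”.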
Which Controllers Are Accessed Using This Channel
To me a channel isn’t really interesting until you know what’s attached to it. (In my current set of data my test LPAR’s data shows one group of four channels attached to five controllers and another group of eight attached to two controllers.)
Working out which controllers are attached is quite fiddly:
- Use SMF 78 Subtype 3 (I/O Queuing) records to list the Logical Control Units (LCUs) attached to this channel.
- Use some magic code we have to relate LCUs to Cache Controller IDs. Basically it does clever stuff with SMF 74-5 (Cache) and 74-1 (Device) records to tie the two together.
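The two steps amount to a join: channel to LCU, then LCU to Cache Controller. The sketch below assumes both mappings have already been extracted (the first from 78-3, the second from the 74-5 and 74-1 matching). It does not attempt the matching itself, and the names and data shapes are hypothetical.

```python
from collections import defaultdict


def controllers_by_channel(lcu_channels, lcu_to_controller):
    """Map each channel to the Cache Controller IDs reachable through it.

    lcu_channels: LCU -> list of channels attached to it (from 78-3).
    lcu_to_controller: LCU -> Cache Controller ID (from the 74-5 / 74-1
    matching, which is assumed to have been done elsewhere).
    """
    result = defaultdict(set)
    for lcu, chpids in lcu_channels.items():
        controller = lcu_to_controller.get(lcu)
        if controller is None:
            # LCU with no known cache controller: skip it in this sketch.
            continue
        for chpid in chpids:
            result[chpid].add(controller)
    # Many LCUs collapse into few controllers, which keeps the chart readable.
    return {chpid: sorted(ctls) for chpid, ctls in result.items()}
```

Collapsing LCUs into controllers in this way is what keeps the annotation short enough to fit on the chart.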
I made a design decision not to annotate the graph with LCU names as there are usually many in a Cache Controller. It would be very cluttered if I had. (I do have another report that lists them and the channels attached to them.) Instead I list the Cache Controller IDs. You can probably relate to Controller IDs. If we’ve done our homework (and as we use your cache controller serial numbers we generally have) you’ll recognise the IDs.
So, if you’re one of my customers and I throw up a chart that shows channels and systems sharing them and the controllers attached it may look serene and slick. But believe me, there’s a lot of furious paddling that’s gone on under the surface. 🙂
But I tell you all this in case you’re wondering about how to improve your channel reporting. And I still think there’s more I can do in this area – particularly with the (more exotic) SMF 74-7 record, which brings FICON Director topology into play. And I’m quite sure everything I’ve said above applies equally to whichever tools you use to crunch RMF SMF data.