SRB And SMF

I’ve just had my first brush with SMF from z15’s System Recovery Boost (SRB).

(Don’t blame me for the reuse of “SRB”.) 🙂

The point of this post is to share what I’ve discerned when processing this SMF data.

System Recovery Boost

To keep the explanation short, I’ll just say SRB has two components:

  • Speed Boost – which enables general-purpose processors on sub-capacity machine models to run at full-capacity speed on LPARs being boosted
  • zIIP Boost – which enables general-purpose workloads to run on zIIP processors that are available to the boosted LPARs

Several different triggers can start a boost period, during which these are available. They include:

  • Shutdown
  • IPL
  • Recovery Processes:
    • Sysplex partitioning
    • CF structure recovery
    • CF datasharing member recovery
    • HyperSwap

The above are termed Boost Classes.

If you want more details, a good place to start is IBM System Recovery Boost Frequently Asked Questions.

I’ve bolded the terms this post is going to use.

The above-mentioned boosts stand to speed up the boosted events – so they are good for both performance and availability.

RMF SMF Instrumentation For SRB

If you wanted to detect when an SRB boost had happened, and the nature of it, you would turn to SMF 70 Subtype 1 (CPU Activity).

There are two fields of interest here:

  • In the Product Section, field SMF70FLA has extra bits giving you information about this system’s boosts.
  • In the Partition Data Section, field SMF70_BoostInfo has bits giving you a little information about the boosts of other LPARs on the machine.

It should also be noted that when a boost period starts the current RMF interval stops and a new one is started. Likewise, when the boost period ends that interval stops and a new one is started. So you will get “short interval” SMF records around the boost period. (In this set of data there was another short interval before RMF resynchronised to 15-minute intervals.) Short intervals shouldn’t affect calculations – so long as you are taking into account the measured interval length.
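To illustrate the point about interval length, here is a minimal Python sketch. The `Interval` class, its field names and the sample numbers are all made up for illustration; they are not real SMF 70-1 field names or data from this study. It shows why averaging per-interval percentages goes wrong when some intervals are short, and how weighting by the measured interval length avoids that.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    # Hypothetical, already-decoded values; not real SMF field names.
    seconds: float           # measured interval length
    cpu_busy_seconds: float  # CPU time consumed during the interval


def busy_percent(iv):
    """CPU busy for one interval, using its own measured length."""
    return 100.0 * iv.cpu_busy_seconds / iv.seconds


def combined_busy_percent(intervals):
    """CPU busy across several intervals, weighted by interval length."""
    total_seconds = sum(iv.seconds for iv in intervals)
    total_busy = sum(iv.cpu_busy_seconds for iv in intervals)
    return 100.0 * total_busy / total_seconds


def naive_mean_percent(intervals):
    """Unweighted mean of the percentages; over-weights short intervals."""
    return sum(busy_percent(iv) for iv in intervals) / len(intervals)


# Invented sample: intervals cut short around a roughly 2 minute boost.
intervals = [
    Interval(seconds=420.0, cpu_busy_seconds=210.0),  # cut short at boost start
    Interval(seconds=120.0, cpu_busy_seconds=108.0),  # the boost period itself
    Interval(seconds=360.0, cpu_busy_seconds=180.0),  # short, before resync
    Interval(seconds=900.0, cpu_busy_seconds=450.0),  # back to 15 minutes
]

print(f"weighted: {combined_busy_percent(intervals):.1f}%")
print(f"naive:    {naive_mean_percent(intervals):.1f}%")
```

The naive figure is pulled up by the busy two-minute interval, even though that interval is only a small part of the elapsed time.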

After a false start – in which I decoded the right byte in the wrong section 🙂 – I gleaned the following information from SMF70FLA:

  • Both Speed Boost and zIIP Boost occurred.
  • The Boost Class was “Recovery Process” (Class 011 binary).
    • There is no further information in the SMF 70-1 record as to which recovery process happened.
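The decoding described above can be sketched in Python along the following lines. To be clear, the bit positions here are placeholders I have invented for illustration; they are not the real SMF70FLA layout, which you should take from the RMF record mapping. The only value taken from this post is that Boost Class 011 (binary) means “Recovery Process”.

```python
# HYPOTHETICAL bit layout: placeholders only, NOT the real SMF70FLA mapping.
ZIIP_BOOST_MASK = 0x0400    # assumed position of the zIIP Boost bit
SPEED_BOOST_MASK = 0x0200   # assumed position of the Speed Boost bit
BOOST_CLASS_SHIFT = 6       # assumed position of the 3-bit Boost Class field
BOOST_CLASS_MASK = 0b111

# Only the value quoted in this post; add the others from the documentation.
BOOST_CLASS_NAMES = {0b011: "Recovery Process"}


def decode_boost_flags(flags):
    """Split a flag value into boost indicators and Boost Class."""
    boost_class = (flags >> BOOST_CLASS_SHIFT) & BOOST_CLASS_MASK
    return {
        "ziip_boost": bool(flags & ZIIP_BOOST_MASK),
        "speed_boost": bool(flags & SPEED_BOOST_MASK),
        "boost_class": boost_class,
        "boost_class_name": BOOST_CLASS_NAMES.get(
            boost_class, f"class {boost_class:03b}"
        ),
    }


# Invented flag value: both boosts on, Boost Class 011 binary.
sample_flags = ZIIP_BOOST_MASK | SPEED_BOOST_MASK | (0b011 << BOOST_CLASS_SHIFT)
print(decode_boost_flags(sample_flags))
```

Keeping the masks as named constants means a mistake like mine (right byte, wrong section) only has to be corrected in one place.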

From SMF70_BoostInfo I get the following information:

  • Both Speed Boost and zIIP Boost occurred – for this LPAR.
  • No other LPAR on the machine received a boost – not just in this record but in any of the SMF data for this situation.

The boost period was about 2 minutes long – judging by the interval length.
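Judging the boost period by interval length can be sketched as follows, in Python. The tuple layout and the sample timestamps are invented for illustration; in practice the start time, interval length and boost indicator would come from your own decoding of the SMF 70-1 records. The idea is simply to add up the lengths of consecutive intervals flagged as boosted.

```python
from datetime import datetime, timedelta
from itertools import groupby


def boost_periods(intervals):
    """Return (start, duration) for each run of consecutive boosted intervals.

    intervals: list of (start, seconds, boosted) tuples, sorted by start time.
    This layout is hypothetical, not an SMF record mapping.
    """
    periods = []
    for boosted, run in groupby(intervals, key=lambda iv: iv[2]):
        if not boosted:
            continue
        run = list(run)
        start = run[0][0]
        duration = timedelta(seconds=sum(iv[1] for iv in run))
        periods.append((start, duration))
    return periods


# Invented sample data: one short boosted interval between unboosted ones.
sample = [
    (datetime(2020, 1, 1, 10, 0, 0), 420.0, False),
    (datetime(2020, 1, 1, 10, 7, 0), 120.0, True),
    (datetime(2020, 1, 1, 10, 9, 0), 360.0, False),
    (datetime(2020, 1, 1, 10, 15, 0), 900.0, False),
]

for start, duration in boost_periods(sample):
    print(f"Boost from {start:%H:%M:%S} lasting {duration}")
```

This only gives an approximation to the boost period, as good as the interval boundaries RMF cuts around it.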

Further Investigations

I felt further investigation was necessary as the type of recovery process wasn’t yielded by SMF 70-1.

I checked SMF 30 Interval records for the timeframe. I drew a blank here because:

  • No step started in the appropriate timeframe.
  • None of the steps running in this timeframe looked like a cause for a boost.
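The two checks above amount to a time-window filter. Here is a rough Python sketch, assuming the SMF 30 data has already been reduced to one dictionary per step; the keys `job`, `start` and `end`, and the sample steps, are hypothetical.

```python
from datetime import datetime


def steps_in_window(steps, window_start, window_end):
    """Return (started, running) step lists for a time window.

    steps: list of dicts with hypothetical keys "job", "start" and "end";
    "end" is None for a step that has not finished.
    """
    # Steps whose start time falls inside the window.
    started = [s for s in steps if window_start <= s["start"] < window_end]
    # Steps whose lifetime overlaps the window at all (includes the above).
    running = [
        s for s in steps
        if s["start"] < window_end
        and (s["end"] is None or s["end"] > window_start)
    ]
    return started, running


# Invented sample: a long-running step and one that ended before the boost.
steps = [
    {"job": "LONGRUN", "start": datetime(2020, 1, 1, 8, 0), "end": None},
    {"job": "EARLY", "start": datetime(2020, 1, 1, 9, 0),
     "end": datetime(2020, 1, 1, 9, 30)},
]

started, running = steps_in_window(
    steps, datetime(2020, 1, 1, 10, 7), datetime(2020, 1, 1, 10, 9)
)
print([s["job"] for s in started], [s["job"] for s in running])
```

An empty `started` list corresponds to the first bullet; the `running` list is what you would then eyeball for the second.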

I’m not surprised, as the types of boost represented by Recovery Process really shouldn’t be expected to show up strongly in SMF 30.

One other piece of evidence came to light: Another LPAR in the Sysplex (but not on the same machine) was deactivated. It seems reasonable to me that one or other of the Recovery Boost activities would take place in that event.

Conclusion

While I think z15 SRB is a very nice set of performance enhancements, you’re going to need to cater for it in your SMF processing, for a number of reasons:

  • It’s going to affect intervals and their durations.
  • It’s going to cause things like speed changes on sub-capacity GCPs and also changes in zIIP behaviour.
  • A boosted LPAR might compete strongly with other (possibly Production) LPARs.
  • It’s going to happen, in all probability, every time you IPL an LPAR.

That last point says it’s not “exotica”. SRB is the default behaviour – at least for the IPL and Shutdown boost classes.

As I’ve indicated, SMF 70-1 tells most of the story but it’s not all of it.

There is one piece of advice I can give on that: Particularly for Recovery Process Boost, check the system log for messages. There are some you’re going to have to automate for anyway.

One final point: A while back I enhanced my “zIIP Performance And Capacity” presentation to cover SRB. I’ll probably refine the SRB piece as I gain more experience. Actually, even this blog post could turn into a slide or two.

Published by Martin Packer

I'm a mainframe performance guy and have been for the past 35 years. But I play with lots of other technologies as well.
