Mainframe, Performance, Topics

Memory Metrics Now That z/OS Release 8 Is Upon Us

(Originally posted 2007-09-27.)

I was asked an interesting question today by a customer – one with dozens of LPARs and therefore not much time to study each one in excruciating detail…

“Given that System High UIC (hereafter referred to as ‘UIC’) behaves differently in z/OS R.8 how should I treat it?”

I’ve posted this question on the MXG-L listserver, soliciting experiences and opinions.

With z/OS R.8 we replaced the “page-UIC” method of managing memory with one very much like the old expanded storage algorithm. Pages are divided into two categories:

Those without the Reference Bit set (regarded as “old” pages).
Those with the Reference Bit set (regarded as “new” pages).

When we need to acquire a new page frame to back a new page request we search through pages (using a cursor) to find old pages…

If the Reference Bit is set we reset it.
If the Reference Bit was not set we reuse the page frame. If the Changed Bit is set we page out the contents. If not we discard them.

I hope you’ll recognise this is very much like the old expanded storage algorithm.
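For anyone who finds code easier to follow than prose, here is a minimal sketch of the cursor sweep described above. It is an illustration only, not the actual Real Storage Manager code: the `Frame` class, the `steal_frame` function and the idea of returning a count of frames examined are all my own hypothetical names and choices.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical model of a page frame; not a real z/OS control block."""
    referenced: bool = False  # Reference Bit
    changed: bool = False     # Changed Bit


def steal_frame(frames, cursor):
    """Sweep from `cursor` until an "old" frame (Reference Bit off) is found.

    Returns (stolen_index, new_cursor, paged_out, frames_examined).
    """
    examined = 0
    while True:
        index = cursor
        frame = frames[index]
        cursor = (cursor + 1) % len(frames)  # the cursor wraps around memory
        examined += 1
        if frame.referenced:
            # "New" page: reset the Reference Bit and keep searching.
            frame.referenced = False
            continue
        # "Old" page: reuse the frame. Changed contents must be paged out,
        # unchanged contents can simply be discarded.
        paged_out = frame.changed
        frame.changed = False
        return index, cursor, paged_out, examined


frames = [Frame(True, False), Frame(True, True), Frame(False, True)]
print(steal_frame(frames, 0))
```

The sweep always terminates because every frame it passes has its Reference Bit reset, so by the second pass at the latest a frame qualifies as old. The count of frames examined is there only to make the point that scarce old pages mean a longer search per steal.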

The consequence of this algorithm is that the UIC is now the time taken to search through the whole of memory. If old pages are scarce we need to search through more pages until we find one – and hence the time to traverse all of memory is lower. So a high UIC means relatively little constraint. A low one means relatively more constraint. In that sense UIC behaves similarly to how it did before.

The difference is that UIC ticks on up in unconstrained times – so its absolute value is less useful for Performance Analysis. So what DO I recommend?

I recognise that the “state of the art” is evolving – so this is subject to revision / abandonment etc…
I still recommend understanding how many pages are free.
The UIC still has value in that a low value is still meaningful. Therefore I would want to keep track of the MINIMUM value during your focus window (perhaps prime shift) as that’s the worst memory constraint gets.
Paging rates are still important.

So track all three metrics and develop a view of how your own systems behave.
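As a rough sketch of what tracking the three might look like, assume you have already extracted interval samples (from RMF data or similar) into some simple form. The `Sample` layout, its field names and the 09:00 to 17:00 default for prime shift are all hypothetical; substitute whatever your own reporting produces.

```python
from datetime import datetime, time
from typing import NamedTuple


class Sample(NamedTuple):
    """One interval's worth of memory metrics (hypothetical layout)."""
    when: datetime
    uic: int             # System High UIC for the interval
    free_frames: int     # free page frames
    paging_rate: float   # pages per second


def summarise(samples, start=time(9, 0), end=time(17, 0)):
    """Worst-case memory view across the focus window, or None if no data."""
    window = [s for s in samples if start <= s.when.time() < end]
    if not window:
        return None
    return {
        "min_uic": min(s.uic for s in window),              # lowest = most constrained
        "min_free_frames": min(s.free_frames for s in window),
        "max_paging_rate": max(s.paging_rate for s in window),
    }


samples = [
    Sample(datetime(2007, 9, 27, 8, 45), 65535, 120000, 0.0),
    Sample(datetime(2007, 9, 27, 10, 0), 2100, 40000, 3.5),
    Sample(datetime(2007, 9, 27, 14, 15), 900, 15000, 12.0),
]
print(summarise(samples))
```

The point of taking the minimum UIC rather than the average is that the quiet intervals, where UIC just ticks on up, would otherwise swamp the constrained ones.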

And do contribute to the folklore – perhaps by replying to my post on MXG-L. There haven’t been many visible R.8 customers with memory stories. So I’m hoping that means that going to R.8 was a non-event, memorywise. It has been a year since R.8 GA’ed so I’d expect to see some customers using it.

And, finally, I think this topic has got to be worth another couple of foils in my “Memory Matters in 20xy” presentation. I think it’s already getting to the point where I should split it into two…

System Memory Performance Management
Subsystem and Application Memory Performance Management

System z Technical Conference, San Antonio TX

(Originally posted 2007-09-26.)

Last week I was at the System z Tech Conference (again getting my zeds mixed up with my zees). 🙂

I presented 4 times, one of which was a repeat. What’s especially nice is that I got 87 attendees for the “Memory Matters in 2008” presentation – spread across 2 sessions. This was despite the second time being the very last session of the conference… and 22 people still showed up to that! So thanks to everyone who listened to me, made comments and asked questions. I had a great time!

It was also nice to hear people (one high profile in this industry) mention that they followed this blog. But that gives me a greater sense of responsibility to contribute more to it.

Now that z/OS R.9 is right around the corner I feel an entry on SMF using System Logger coming on. And I’d better add that to the “SMF Management” wiki. I think THIS might be the way to go with some of our SMF management requirements.

The overwhelming theme of the conference was how much confusion there STILL is surrounding zIIPs and zAAPs. And I’m not sure I’m not adding to it myself. 🙂 Certainly there were many sessions on specialty engines. And the shows of hands in sessions showed that both zIIPs and zAAPs are gaining momentum, with lots of customers using them. Add to that the recent statements of direction regarding XML and zIIPs and zAAPs and DB2 V9’s DDF SQL Stored Procedures support of zIIP. (And I can’t speak to how the Linux / z/VM crowd feel about IFLs – as I didn’t attend ANY of their sessions – but I get the impression IFLs are doing well.)

And the week before last I spent two days talking with hardware and software developers in Poughkeepsie. So I’m pretty confident there’ll be stuff to talk about in this blog for many years to come. 🙂

Again, thanks to everyone who in one way or another contributed to my trip to Poughkeepsie and San Antonio being a success. Next stop Ehningen and Boeblingen. 🙂

SCRT Version 14.1.0 is Announced

(Originally posted 2007-07-13.)

I previously mentioned the change in zNALC LPAR setup… You can (with APAR OA20314) now specify LICENSE=ZNALC. The SCRT co-requisite is Version 14.1.0.

By the way, from when you submit your SCRT report at the beginning of August you have to use this new version (until a new one is mandated) in order to be eligible for Sub-Capacity Pricing.

There are a number of other changes, some of which are bug fixes and some are additions to the product list. The details are here. And there’s a link there to the newly-updated User Guide.

z/OS Performance Instrumentation Management Techniques wiki

(Originally posted 2007-07-12.)

I’ve just created a wiki to discuss primarily SMF. Mainly from the management perspective, rather than the contents of each individual record.

This follows on from things I’ve mentioned in this blog before.

If you’d like to contribute to it (and it is DESPERATELY in need of contributions right now) get a developerWorks screenname and send it to me here. Then I’ll enable you to edit the wiki.

You don’t need a screenname to be able to view the wiki.

The wiki is here.

Why Mainframe Folk Should Care About Web 2.0

(Originally posted 2007-07-05.)

I presented a set of (someone else’s) foils on Web 2.0 to my team meeting last week. (Interestingly, being 6 months old they were already way out of date – what with Twitter and all.) Remember I’m in a mainframe crowd of effectively “gurus”. 🙂 So why should they be interested in new-fangled webby stuff? So I got to thinking… Dear reader, why should you care about Web 2.0?

The minimal answer is “because it’s going to happen anyway, whether you like it or not, and you and your organisation are going to be left in the dust if you don’t embrace it”. I think that’s a fair answer but really there is positive stuff in there for us.

But rather than quote from The Long Tail or The Wisdom of Crowds (Read it and reading it, respectively) I think it better to point out some examples of Web 2.0 you may well already be using…

developerWorks blogs (like this one).
Wikipedia wiki (where those two book references above were from)
Flickr photo sharing site
Twitter microblogging

What these sites have in common is that they get better the more people use them. Both in adding content and also in rating and ranking material (or editing it in the case of wikis). As such they’re marking a shift away from static websites to ones where users have more control and the sites themselves just become enablers. And that leads onto changes in the web that we need to take notice of.

The other element of note is the idea of a “mashup”. This is where content from one site is mashed together with that of another to create (usually) a third site. Good examples of this would be the whole host of mashups built around Google Maps. Now they’ve been smart as they publish an API that web developers and mashup creators can use. The lesson here is that if you build your website so that it can be mashed up with others then your website will be used in such mashups and it will attract many more visitors.

A good analogy might be an insurance company that makes it hard for an insurance quoting website to garner quotes… That insurance company isn’t going to get so many quote requests as one that does.

Now, how does that affect the mainframe? It doesn’t directly but it does lead to a driving up of traffic and an ever higher reliance on good response times. So our old friends scalability and performance come into play. And we play well in those terms. And it does keep the focus on availability as well.

And how does it affect mainframers? My answer would be that we can really use a lot of these new technologies in our day jobs. And if we don’t we risk letting other platforms have all the fun. 🙂

So I’d encourage people to dive into Web 2.0. And that’s what I told my team last week.

Feedback from my UKCMG Mainframe Performance Instrumentation Birds Of A Feather

(Originally posted 2007-07-05.)

It was a very good session, even if it was attended by just a “hard core” of mainframe sites. I think everyone said at least something and several said rather more than that. Here are some things I’d like to note from it…

There was a general feeling that it’d be useful to have a name for a machine that could be entered on the HMC and flow through to SMF records, particularly Type 70 (CPU). So for example a site might like to name its machines “North Mainframe” and “South Mainframe” rather than just being identifiable by the hardware serial numbers 5112345 and 8356789. I think this is a really good idea – at least from MY perspective as someone who wanders onsite and would rather use the names YOU use for your machines, even if I do remember the serial numbers for at least one customer. 🙂 This idea, though, would require changes in at least three components. So I’m not overly optimistic. But I’ll make some enquiries and see what we can do.

We also think that the serial number should appear in OTHER SMF records (than 70-1 and 74-4) such as the other RMF ones and also Type 30. That would allow much easier matching up.

On the memory front I didn’t meet with that much interest – except that we think it important that the memory numbers become more accurate in Type 30 and Type 72. (Both of these are currently reported based on the notion of Service – which means that swappable workloads are under-reported in Type 72 and those that endure CPU queuing are under-reported in SMF 30.) We also would like to see SMF 70 report how big the machine’s HSA is and how much memory is purchased but not assigned to a partition.

We discussed instrumentation for VLF / LLA and for Catalog – which we think could be improved.

When talking about techniques for managing SMF data the SMFUTIL tool was mentioned. I’ll have to do some research into this, but I still think it worthwhile to post some examples of DFSORT being used to “slice and dice” SMF. One day. The meeting also felt that some kind of “best practices” guidance would be useful. Maybe I should start a wiki on the subject – so that the collective wisdom of the mainframe performance community can be tapped.

All in all I think the session was a success – and I’d like to do it again next year. I’ll work on these items but, as I originally said, there are no guarantees.

So thanks to the participants. Plenty of food for thought.

Abstracts for System z Technical Conference, San Antonio, September 17-21

(Originally posted 2007-06-23.)

Here are my abstracts for the conference:

Session B11: DB2 Data Sharing Performance for Beginners

This presentation provides an introductory-level view of how to look at the DB2 Data Sharing performance numbers from both a z/OS / RMF and a DB2 perspective.

Performance topics include: XCF, Coupling Facility, Data Sharing Structures, The application’s perspective, and Structure Duplexing.

Performance topics don’t include: Other forms of Data Sharing eg VSAM RLS, and overly detailed descriptions.

Session P22: Memory Matters in 2008

For z/OS LPARs memory management has changed radically over the years – from both the operating system perspective and that of applications. And the pendulum has swung back and forth between focusing on Real Memory and on Virtual Memory.

This presentation discusses managing both Real and Virtual Memory – from the perspectives of both the operating system and the exploiting products. The products include DB2, DFSORT, CICS, IMS, MQ and WebSphere. One topic of particular importance to installations upgrading z/OS is the Release 8 Real Storage Manager rewrite.

Session P23: Much Ado About CPU

zSeries and System z9 processors have in recent years introduced a number of capabilities of real value to mainframe customers. These capabilities have, however, required changes in the way we think about CPU management.

This presentation describes these capabilities and how to evolve your CPU management to take them into account. It is based on the author’s experience of evolving his reporting to support these changes.

And Something Else That Confuses Me

(Originally posted 2007-06-21.)

This customer is using DB2 Hiperpools, despite being on a z9 BC processor and running DB2 Version 7.

What I notice from DB2 Statistics trace is that most of the Hiperpool pages are not backed by memory. In other words they’ve ceased to exist. This despite there being plenty of memory in the LPAR. One thing that’s also significant is that there are very few reads back from the Hiperpools. It’s mostly writes.

Somehow I think the two facts are linked. In any case I would recommend the customer moves to Version 8 and, in the short term, uses Dataspace pools instead of the combination of Hiperpools and Virtual Pools.

Meanwhile I’m still scratching my head. 🙂

It Doesn’t Take Much To Confuse Me

(Originally posted 2007-06-20.)

… with a Performance Consultant? 🙂

Seriously, here’s something that made me go “hmm”…

I started looking at the latest set of data from a customer…

Their biggest-CPU WLM workload is called “BATCH” when they told me they were a CICS shop – and that CICS was their main CPU consumer during the day.

But it turns out that the biggest service class within that workload contains exclusively CICS regions. (And it is called “PRDONL”.)

So that’s all right then. But you can see why I’d be confused. 🙂

Moral: A name is just a label, not a description. At least that’s true with WLM. But funny how we read things into names that generally are there. 🙂

IMS Version 10 Memory Enhancements

(Originally posted 2007-06-13.)

Regular readers would know I’m working on my “Memory Matters in 2008” presentation, which is a re-spin of the ’07 Version. One of the things I talk about is IMS.

Thanks to my team mate Andy Wilkinson for this list of IMS Version 10 enhancements, all of which are virtual storage usage improvements.

Here are the more important items:

OTMA message flood protection (prevents LSQA filling). This is retrofitted to IMS V9 by PTF.
New IMS log records (X'4511' and X'4512') to report on control region memory use.
IMS V10 acquires CSA for Fast Path resources in discrete pieces, rather than insisting it all be contiguous, which helps large Fast Path customers.

And here are some other, less important, IMS V10 items:

ACBGEN supports 31-bit addressing for the first time, which helps programs which access thousands of databases.
PCBs are passed from CICS to IMS using 31-bit addresses (instead of 24-bit addresses), which also helps programs which access thousands of databases.
Some IMS modules move above the 16MB line, saving about 100KB of virtual storage.
Several IMS modules that used to reside in 24-bit common storage now reside in 31-bit common storage.

The details of these are in the IMS V10 Release Planning Guide GC18-9717-00 available from here.