If The Cap Doesn’t Fit…

(Originally posted 2010-01-16.)

… swear at it. 🙂

No, I KNOW that’s not right – but it’s (for me) an irresistibly bad pun. And it’s a natural reaction, too. 🙂

In a recent customer situation I looked at the RMF Workload Activity Report data for a number of service classes. One WLM sample count was particularly high: "Capped". In fact I look at the raw SMF data with my own tooling, and the actual field is R723CCCA. (An IBM Development lab HAD looked at the data through the RMF Postprocessor "prism" and come to the same conclusion.)

It turns out, however, that the service classes in question aren’t part of any WLM Resource Groups. (There IS a service class that is subject to Resource Group capping but it’s not involved here.)

So, how can this be?

A piece of background will help:

The reason I had been asked to look at the SMF data was because a large dump episode had taken rather longer than it should have. It’s the usual lesson of "don’t dump into already busy page packs". The best way to ensure this doesn’t happen is, of course, to dump into memory. (Which might not be affordable, but it IS the best way.)

What had in fact happened was that the system had come under extreme Auxiliary Storage stress. And this had been my suspicion all along.

I’m indebted to Robert Vaupel of WLM Development for confirming this:

Capping delays occur when an address space in the service class is marked non-dispatchable. This can occur when Resource Group capping takes place (switching between non-dispatchable and dispatchable in defined intervals) or when a paging or auxiliary storage shortage occurs and the address space is detected as being the reason for it.

In the above the address spaces are related to dumping, of course.

And the reason I asked Robert was because R723CCCA is populated by a WLM-maintained field (RCAECCAP from IWMRCOLL) – it always pays to understand the source of RMF numbers.

So, if you see values in R723CCCA when Resource Group capping is not in play this might be the cause. I’ve not seen this documented anywhere.

(One thing I’d NOT been crisp about – but Robert firmed up in my mind – is that "Capped" samples have NOTHING to do with Softcapping or LPAR Capping in general. That’s a whole ‘nother story.)

So, there may be a moral tale here: If you THINK the cap doesn’t fit – it might well be the case it doesn’t. 🙂

What I Did On My Vacation

(Originally posted 2010-01-04.)

First of all, a happy and prosperous 2010 to one and all.

As with most vacations it’s been a time partially filled with playing with technology and learning stuff there isn’t (legitimate) time to learn about during the rest of the year.

So, lest the rest of this post make you think I ONLY play with web stuff 🙂 I present to you a short list of REALLY good other things from the past few weeks:

  • Avatar in 3D (as the great Doctor Brian May recommended).
  • Uncharted 2 (on the PS3).
  • Beatles Rock Band (also on the PS3).
  • Neil Gaiman’s "American Gods".
  • The company of friends and family.

Now onto the "geek stuff": 🙂

My Performance Management tooling (standing on the shoulders of giants, as it happens) produces reports and charts as Bookmaster and GIFs, respectively. (Actually the GIF bit I built mid-year 2009.)

Some time in late 2009 I installed Apache on my Thinkpad – with PHP support. That enabled me to treat my laptop as an automation platform. I also installed Dojo and B2H. (B2H is a NICE but old piece of REXX that takes Bookmaster output and converts it into HTML.)

So this PHP code allows me to download all the GIFs and Bookmaster source and display it on my laptop.

In November I wrote some PHP code to selectively bundle the GIFs into a zip file – to make it easier to share them with colleagues and customers. (If YOU get one from me I hope you can readily unpack and view its contents.)

In mid-December I took this zip code and modified it to create OpenOffice ODP files from selected GIFs. Although they were legitimate ODP files, OpenOffice couldn’t read them – but KOffice on Linux COULD. And when they were written out again by KOffice, OpenOffice was able to read them. (I’ve not got to the bottom of this but it’s something to do with some assumptions OpenOffice makes about XML.)

Vacation Learning and Developing

I think it’s fair to say I’ve been using "interstitial" time to play with stuff and get things built.

Learning How To Hack The DOM with jQuery and Dojo

(For those that don’t know, jQuery and Dojo are JavaScript frameworks – free and Open Source.)

The first thing I did was to install jQuery and buy the excellent O’Reilly "jQuery Cookbook". This introduced me to a better way of parsing HTML / XML. It uses CSS selectors as a query mechanism – which is REALLY nice.

The second thing I did was to see if Dojo could do something similar. It turns out that dojo.query is pretty similar and converging on jQuery’s capabilities. (1.4 adds some more.) If you’re wedded to Dojo (as I am) I recommend you look at dojo.query and (related) NodeList support. It’ll make "hacking the DOM" much easier. (And later developments built on this.)

(If you’re looking for a good introduction to Dojo try Matthew Russell’s “Dojo: The Definitive Guide”, also published by O’Reilly. It could do with updating for the next release but it’s perfectly fine for 1.4.)

Using PHP To Simplify Dojo Development

I now have a small set of PHP functions I’ve built up over the months that make it very easy for me to create a web page that takes advantage of Dojo. So, for instance, it’s very easy to write the stuff in the "head" and "body" tags to make Dojo create widgets (Dijits) and pull in the necessary CSS and javascript.

One problem I wanted to solve was to prettify the HTML that B2H generates. It’s at the 3.2 level and is really not at all "structural" so CSS styling would prove to be a bear. (It has no class or id attributes, for example.)

Dojo can automate (with xhrGet) the asynchronous loading of files from the server. So the first thing I taught my PHP code how to do was to load some HTML and then to insert it (via innerHTML) below a specified element in the web page. (At first I used "id" as the anchor but then used dojo.query (see above) to allow the HTML to be injected ANYWHERE in the page.)

(Because not all the data I want to display in a page is HTML I added a "preprocess the loaded file" capability. So, for example I can now take a newline-separated list of names and wrap each name in an "option" tag.)
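
The "preprocess the loaded file" step above can be sketched as follows – done here in Python rather than the original client-side JavaScript, and with the option-wrapping example from the text. This is an illustration of the idea, not my actual code:

```python
# Turn a newline-separated list of names (as loaded from a side file)
# into a run of <option> tags, ready to inject into a "select" element.
def names_to_options(text):
    return "\n".join(
        f"<option>{name}</option>"
        for name in text.splitlines()
        if name.strip()                 # skip blank lines in the source file
    )

print(names_to_options("CHART1\nCHART2\n"))
```

The real thing runs as a callback on the asynchronously loaded file before injection, but the transformation itself is this simple.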

So, I can now pull in HTML from a side file. The point is to be able to work on it…

Injecting a CSS link was easy. It’s just a static “link” tag.

But some parts of the dragged-in HTML aren’t really distinguishable from other parts, so I can’t style them differently. To fix that I wrote some more code to post-process the injected HTML (once it’s part of the page). So, for example, a table description acquired a “tdesc” class name – and CSS selectors can work with that. To do the post-processing I leaned heavily on Dojo’s NodeList capability – it made the coding MUCH easier.

So now, if I show you an HTML report based on your data it should look MUCH prettier. (I’ve been showing customers their machines and LPARs as pretty ugly HTML.)

Dojo TabContainer Enhancements in 1.4

Some time over the vacation I installed Dojo 1.4 and converted from using 1.3.2.

I hadn’t expected this but the dijit.TabContainer widget that I was already using to display GIFs got enhanced in 1.4…

  • Instead of multiple rows of tabs you (by default) now have one – with a drop-down list to display all the tab titles. (Amongst other things this means a PREDICTABLE amount of screen real-estate taken up by the tabs.)
  • Scroll forwards and backwards buttons to allow you to page amongst the tabs. (Actually left and right arrow keys allow scrolling as well.)

Altogether it’s a much slicker design. I’ve opened a couple of enhancement tickets against it.

These really are “fit and finish” items but they would help with a11y (Accessibility) as well. (I’ve made contact (via Twitter) with IBM’s Dojo a11y advocate and she’s aware of these two tickets.)

Conclusion

This has been a long and winding blog post. But I think it illustrates one thing: Through small incremental enhancements (done in “interstitial time”) you can make quite large improvements in code. But then, this IS hobbyist code.

I’d also like to think I learnt a lot along the way.

Now to go explain to my manager why I’d like (as a mainframe performance guy) to become a contributor to the Dojo code base. 🙂

zAAP CPU Time Bug in Type 72 Record – OA29974 Is The Cure

(Originally posted 2009-12-07.)

If you use RMF Postprocessor you won’t see this one. If you use Service Units rather than CPU seconds fields in the SMF 72-3 record you also won’t see it. It’s only if (like me) you use the CPU time for zAAPs in your CPU Utilisation calculation that you’ll run into this problem.

If you examine fields R723IFAT (zAAP CPU Time) and R723IFCT (zAAP on GCP CPU Time) you might find them zero when you don’t expect them to be, i.e. when their Service Units analogues (R723CIFA and R723CIFC) are non-zero. IFAT and IFCT are indeed the fields that are out of line; CIFA and CIFC (and the Workload Activity Report) are correct.

A (sensible) suggested workaround is to use CIFA and CIFC, converting from service units to CPU time.
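
That conversion can be sketched in Python. I’m assuming the usual RMF conversion here – service units per CPU second being 16,000,000 divided by the CPU rate adjustment factor (R723MADJ), scaled by the CPU service definition coefficient – and the sample numbers are invented, so check both against your own service definition:

```python
def su_to_cpu_seconds(service_units, r723madj, cpu_coefficient=1.0):
    """Convert CPU service units back to CPU seconds.

    Assumes the usual RMF relationship: service units per second of CPU
    time is 16,000,000 / R723MADJ (the CPU rate adjustment factor),
    multiplied by the CPU service definition coefficient.
    """
    su_per_second = 16_000_000 / r723madj
    return service_units / (su_per_second * cpu_coefficient)

# Hypothetical values: an adjustment factor of 5000 gives 3200 SU/sec,
# so 640,000 service units correspond to 200 CPU seconds.
print(su_to_cpu_seconds(640_000, 5000))
```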

The answer is to apply the fix for APAR OA29974. I think I’d apply it anyway. It seems like a fairly harmless PTF.

I ran into this because a colleague showed me a Workload Activity postprocessor report with substantial zAAP on GCP time in it, whereas MY code showed zero. (My code was correct but misdirected by the data.) 🙂

Because I can’t control the SMF that customers send me I think I’m going to have to code around this one – if I start seeing this regularly.

Plan Your ESQA Carefully For z/OS Release 11

(Originally posted 2009-11-30.)

Thanks to Marna Walle for pointing out this change:

In z/OS Release 11 there is a requirement for an additional 1608 bytes of ESQA per address space. To put that in context, I’ll do some obvious maths: That’s about 1.6MB per 1000 address spaces. It just might be of interest to certain customers I know with thousands of CICS regions in a system, or very large TSO or Batch systems. It’s probably not enough to trouble most people. But it reminds me of the importance of having a quick virtual storage check when migrating from one major product release to another.
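
The obvious maths above looks like this (using decimal megabytes, as in the text):

```python
# Back-of-envelope ESQA growth for z/OS Release 11:
# 1608 extra bytes of ESQA per address space.
def extra_esqa_mb(address_spaces, bytes_per_asid=1608):
    return address_spaces * bytes_per_asid / 1_000_000

print(extra_esqa_mb(1000))   # about 1.6MB per 1000 address spaces
```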

There are several ways of checking for this particular one:

  • You can use the Health Checker VSM_SQA_THRESHOLD check.
  • You can process the SMF 78-2 Virtual Storage record.

The latter would be my favourite as using the SMF 78-2 data to look at usage by time of day can show some useful patterns. You might want to review, for example, whether (E)SQA threatens to overflow into (E)CSA. It’s not a big tragedy if that happens but your installation might have views on such things.

(In case you’re unfamiliar with such things the “E” in “(E)SQA” and “(E)CSA” refers to 31-bit areas whereas the names without the “E” refer to 24-bit areas, there being analogues above and below the line for both SQA and CSA.)

One other thing – in case you think ESQA and ECSA are unimportant: having very large such areas can impact the 31-bit Private Area virtual storage picture.

DFSORT Does JOIN

(Originally posted 2009-11-27.)

A new set of function was recently made available for DFSORT via PTFs UK51706 and UK51707.

In this post I want to talk about the new JOINKEYS function, and try to add a little value by discussing some performance considerations. I’ve had the code for a couple of months and have played with it but not extensively. So much of what follows is based on thinking about the function (described in this document) and bringing some of my DB2 experience to bear.

With this enhancement DFSORT allows you to do all the kinds of two-way joins DB2 folks would expect to be able to do – in a single simple operation. "Two way" refers to joining two files together. You can perform e.g. a three-way join by joining two files together and then joining the resulting file with a third. With "raw" DFSORT that would be two job steps. With ICETOOL you can make this a single job step. In any case I think I’d recommend using ICETOOL because converting to ICETOOL later when you find you want to add a third file to the join would be additional work.

How JOINKEYS Works

Before talking about performance let me describe how JOINKEYS works. In JOINKEYS parlance we talk about files "F1" and "F2". Indeed the syntax uses those terms…

  • The join itself is performed by the main DFSORT program task. It receives its data through a special E15 exit and processes it like any other DFSORT invocation, with the exception that it knows it’s doing a join. So things like E35 exits and OUTFIL all work as normal.
  • Both F1 and F2 files are read by separate tasks. Each of these writes its data using an E35 exit. Normal processing capabilities such as E15 exits (potentially different for F1 and for F2) and INCLUDE / OMIT and INREC processing apply.
  • The F1 and F2 tasks and the main tasks communicate by "pipes" constructed between the F1 and F2 E35 exits and the main task E15 exit. These pipes have no depth and don’t occupy significant working memory or any intermediate disk space.
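
The data flow above can be modelled as a toy sketch – two "reader tasks" feeding sorted records through "pipes" (here, Python generators) to a main task doing a sort-merge join. This illustrates the architecture described, NOT DFSORT’s actual implementation, and for simplicity it only handles unique keys:

```python
# Stands in for an F1/F2 task: sorting (and, in real life, filtering
# and reformatting via INCLUDE/OMIT, INREC, E15) happens here.
def reader(records, key):
    return iter(sorted(records, key=key))

# Stands in for the main task: a sort-merge inner join over two sorted
# streams, consuming each "pipe" record by record with no depth.
def merge_join(f1, f2, key=lambda r: r[0]):
    r1, r2 = next(f1, None), next(f2, None)
    while r1 is not None and r2 is not None:
        if key(r1) < key(r2):
            r1 = next(f1, None)
        elif key(r1) > key(r2):
            r2 = next(f2, None)
        else:
            yield r1 + r2[1:]           # paired record on a key match
            r1, r2 = next(f1, None), next(f2, None)

f1 = reader([("B", "b-data"), ("A", "a-data")], key=lambda r: r[0])
f2 = reader([("A", 1), ("C", 3)], key=lambda r: r[0])
print(list(merge_join(f1, f2)))         # [("A", "a-data", 1)]
```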

A Potential For Parallelism?

So we have three DFSORT tasks operating in parallel, feeding data through pipes. In principle they could run on separate processors. The extent to which that’s useful would, I think, depend on whether these tasks are performing sorts or just reformatting copies. I say this because in the copy case I’d expect the F1 and F2 tasks to be interlocked with the main task whereas in the sort case there’s stuff to do before we get to writing through the pipes. And in the latter case we’re probably only effectively driving two separate processors. But this is a fine point.

In any case we derive I/O Parallelism because the F1 and F2 tasks run in parallel. Again its usefulness depends on timing.

Managing The Sorts

You can specify whether the F1 and F2 tasks perform a sort. So you could declare that F1 was already sorted, whereas F2 wasn’t.

You can decide whether DFSORT will terminate if the F1 or F2 files are not in order. (This only applies and makes sense if you’ve claimed the data was already sorted.)

You can specify whether the main task sorts the results of the joined F1 and F2 files.

More on why sort avoidance might be important in a minute.

Join Order

As I mentioned earlier, you can use repeated invocations of JOINKEYS (most readily using ICETOOL) to join more than two files together.

Now this is where some DB2 SQL tuning background comes in handy…

You have a choice which order to join the files in. As this isn’t DB2 you don’t have the Optimizer making such decisions for you. So you have to decide for yourself. But think about it: If you joined a large file to a small file in Step 1 and then joined the large resulting intermediate file to another small file in Step 2 you’ve chucked a lot of data around – twice. If you could arrange to join the large file in Step 2 to the results of joining the small files in Step 1 there would be less data chucking around. It ought to run faster.
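
The data-chucking argument above can be made concrete with a rough cost model – counting records read and written in each step. The record counts are invented for illustration, and I’m assuming the big-file join produces a roughly big-file-sized intermediate result:

```python
# "Cost" of one join step: records read from both inputs plus records
# written to the output.
def step_cost(in1, in2, out):
    return in1 + in2 + out

big, small1, small2 = 10_000_000, 10_000, 10_000

# Order A: join(big, small1) first - intermediate result assumed
# big-sized - then join that with small2.
order_a = step_cost(big, small1, big) + step_cost(big, small2, big)

# Order B: join(small1, small2) first - intermediate result stays
# small - then join that with big.
order_b = step_cost(small1, small2, small1) + step_cost(small1, big, big)

print(order_a > order_b)   # True: small-files-first moves far less data
```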

Cutting Down The Data

As with all DFSORT invocations, cutting down the data early is important: Joining two large files together, only to throw away large amounts of the result is inefficient: If you can throw away unwanted records on the way in, or can throw away unwanted fields, the join will be more efficient. In the F1 and F2 tasks you can.

In the F1 and F2 tasks you can supply file size estimates – as they each have their own control files – by default “JNF1CNTL” and “JNF2CNTL”. You could do this for the main sort, too. In the F1 and F2 case this is more important when you cut down the files on the way in.

Avoiding Unnecessary Sorts

If you know the files you are joining are already sorted in an appropriate order for the join you can avoid sorts on the way into the join. And this will obviously be more efficient. If you can live with the order DFSORT writes the records from JOINKEYS you can use COPY rather than SORT in the main task.

Memory Usage

In the worst case – where F1 and F2 files are sorted in parallel and where the main task also sorts data – you have the potential for large amounts of memory being necessary. You need to cater for that.

In Summary

I really like this function. It removes the need for much fiddliness – and it does it in a simple way. (I’m conscious I’ve shown no examples but the documentation linked to above is replete with them.)

My perspective is as a performance guy who has some knowledge of how DB2 does joins. This isn’t the same code so the lessons from the DB2 Optimizer have to be applied sparingly. And note we don’t even have indexes on sequential files (though you could simulate an “index scan” by retrieving only the join keys).

I’d like to do some performance runs that illustrate the points above. I’m a little tight on time right now – so that’ll have to wait. And I’m sure there’s more thinking that could be done on how to tune JOINKEYS invocations.

Channel Performance Reporting

(Originally posted 2009-11-22.)

Our channel reporting has consisted forever of a single chart. Before I tell you what the chart looked like I’ll hazard that your channel reporting was about as bad. 🙂

See, it’s not something people tend to put much effort into.

Our one-chart report basically listed the top channels, from the perspective of the z/OS system under study, ranked by total channel utilisation descending – as a bar chart. The raw data for this is SMF Type 73. Actually there were two refinements people had made over the decades:

  • Someone acknowledged the existence of the (then-called) EMIF capability to share channels between LPARs in the same machine. So stacked on top of this partition’s busy they added other partitions’ busy.
  • Someone supported FICON by using the new FICON instrumentation to derive channel utilisation. (Of course if the channel’s not FICON we still use the old calculation, with some smart copying involved.)

And that’s where we left it until I got my hands on the code…

  • The first thing I did, some months ago, was to add the channel path acronym (for example “FC_S” for “FICON Switched”). This is also in SMF 73.
  • The second thing was much more significant:

    The “other partitions’ busy” number is all other partitions’ use of the channel, without breaking down which other partitions these are.

  • The third thing was a nice “fit and finish” item: Listing which controllers were attached to which channel.

Which LPARs Share This Channel

Each z/OS image can create its own SMF 73 records. For me I’m hostage to which systems my clients send in data for. Also I have to cut down the potential LPARs in the data. I do this using the following rules:

  • The channel number (in Type 73) has to match.
  • For multiple Logical Channel Subsystem (LCSS) machines (System z9 and System z10) the LCSS number must match. (This can be gleaned from Type 73. Actually Type 70 as well – as each LPAR has only one LCSS.)
  • The machine serial number has to match. (Machine serial number isn’t in Type 73. You have to go to the Type 70 for it.)
  • (I do a “belt and braces” check that the Channel Path Acronym (in Type 73) matches.)

So that set of checks tells you which LPARs really share the channel. And so you can then stack up their utilisations to gain a better picture of the channel. It’s quite nice when you do.
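
The matching rules can be sketched like this. The dictionary field names are illustrative only – they’re not the real SMF record layouts – and the sample values are invented:

```python
# Two SMF 73 channel records count as the same physical channel only if
# CHPID, LCSS number, machine serial (gleaned from the matching Type 70)
# and channel path acronym all agree.
def same_channel(rec_a, rec_b):
    return (rec_a["chpid"] == rec_b["chpid"]
            and rec_a["lcss"] == rec_b["lcss"]
            and rec_a["serial"] == rec_b["serial"]      # from Type 70
            and rec_a["acronym"] == rec_b["acronym"])   # belt and braces

lpar1 = {"chpid": 0x41, "lcss": 0, "serial": "02-ABCDE", "acronym": "FC_S"}
lpar2 = {"chpid": 0x41, "lcss": 0, "serial": "02-ABCDE", "acronym": "FC_S"}
other_machine = dict(lpar1, serial="02-FFFFF")

print(same_channel(lpar1, lpar2))          # True: genuinely shared
print(same_channel(lpar1, other_machine))  # False: different machine
```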

One other thing: Because I don’t necessarily see all the LPARs sharing a channel I compute an “Other Busy” number and add that to the stacked bar. In fact my test data showed all the major channels were missing LPARs’ contributions.

Which Controllers Are Accessed Using This Channel

To me a channel isn’t really interesting until you know what’s attached to it. (In my current set of data my test LPAR’s data shows one group of four channels attached to five controllers and another group of eight attached to two controllers.)

Working out which controllers are attached is quite fiddly:

  1. Use SMF 78 Subtype 3 (I/O Queuing) records to list the Logical Control Units (LCUs) attached to this channel.
  2. Use some magic code we have to relate LCUs to Cache Controller IDs. Basically it does clever stuff with SMF 74-5 (Cache) and 74-1 (Device) records to tie the two together.

I made a design decision not to annotate the graph with LCU names as there are usually many in a Cache Controller. It would be very cluttered if I had. (I do have another report that lists them and the channels attached to them.) Instead I list the Cache Controller IDs. You can probably relate to Controller IDs. If we’ve done our homework (and as we use your cache controller serial numbers we generally have) you’ll recognise the IDs.

So, if you’re one of my customers and I throw up a chart that shows channels and systems sharing them and the controllers attached it may look serene and slick. But believe me, there’s a lot of furious paddling that’s gone on under the surface. 🙂

But I tell you all this in case you’re wondering about how to improve your channel reporting. And I still think there’s more I can do in this area – particularly with the (more exotic) SMF 74-7 record, which brings FICON Director topology into play. And everything I’ve said above applies equally to whichever tools you use to crunch RMF SMF, I’m quite sure.

A Few Thoughts On Parallel Sysplex Test Environments

(Originally posted 2009-11-09.)

There’s a pattern I’ve seen over a number of test Parallel Sysplex environments over the past few years, a couple of them in situations this year:

It’s not much use drawing performance inferences from test environments if they’re not set up properly for performance tests.

Sounds obvious, doesn’t it?

There are two problem areas I want to draw your attention to:

  1. Shared Coupling Facility Images

    If you run a performance test in an environment with shared coupling facility images you stand to get horrendous request response times and the vast majority of requests going async (given a chance). I’ve even seen environments where XCF refuses to use coupling facility structures and routes ALL the traffic over CTCs. (And I’ve seen a couple of environments where there are no CTCs to route it over and XCF traffic is then reduced to a crawl.)
  2. "Short Engine" z/OS Coupled Images

    In a recent customer situation I saw the effect of this: The customer was testing DB2 loads – actually a bunch of SQL inserts. They were also duplexing the LOCK1 structure for the data sharing group. The Coupling Facility setup was perfect, but still response times became really bad once duplexing was established for the LOCK1 structure. Two salient facts: Because of duplexing all the LOCK1 requests were async. XCF list structure request response times were always awful.

    The answer to why this problem occurred lies in understanding how async requests are handled: The coupled z/OS CPU doesn’t spin in the async case. In the "low LPAR weight relative to logical engines online" case the z/OS LPAR’s logical engines were only rarely dispatched on physical engines. This meant there was a substantial delay in z/OS detecting the completion of an async request. Hence the elongated async response times. As I said, the LOCK1 structure went async once it was duplexed.

    As it happens the physical machine wasn’t all that busy: Allowing the LPAR to exceed share – using a soaker job – ensured logical engines remained dispatched on physical engines longer. And, perhaps paradoxically, the async request response times went right down. This, I hope, reassured the customer that in Production (with "longer-engine" coupled z/OS LPARs) async coupling facility response times ought to be OK.

Now, this is just Test. But it could unnecessarily freak people out. But, hopefully, it’s easy to see why Test Parallel Sysplex environments might perform much worse than Production ones.

(I’m guessing you’re going "duh, I knew Test would be worse than Prod". 🙂 But these two cases are specifics of why Test might be even worse compared to Prod than expected.)

Anyhow, I thought they were interesting. I’ve seen case 1 quite a few times now; case 2 not so much – in fact only once so far.

DDF Performance – Version 3 – but still highly relevant

(Originally posted 2009-11-08.)

I’ve just submitted a set of slides to Slideshare. They’re not mine, they’re not new, they’re not even in a modern format. But they are a very good presentation worth preserving…

In 1993 Curt Cotner presented a set of slides on the new DDF Inactive Thread support in Version 3 of DB2. It’s still highly relevant and this support was the base on which the Version 4 WLM classification line item was built.

You can find the slides here.

I’d also recommend you went on to read John Arwe’s paper on Preemptible-Class SRBs.

European System z Tech Conference – Brussels 4-8 May 2009

(Originally posted 2009-05-07.)

I’m reporting what I’m learning (or think is significant) in conference sessions on Twitter. My ID is “MartinPacker” and I’m using the hashtag “#zOS09” to tag my posts. Feel free to follow along. In principle you don’t even need to sign up to Twitter to do this.

It seems more immediate than posting here.

Oh, and feel free to comment on Twitter using the same tag.