Of interest to performance people in DB2 Stored Procedures environments

(Originally posted 2005-05-04.)

I just saw the following APAR close. PQ99525 fixes a couple of problems for “nested” DB2 access, i.e. stored procedures, triggers and user-defined functions (UDFs).

For anyone thinking “this is not my shop” I think you should consider that these functions are appearing in many modern applications and your Applications people probably won’t think to tell you when they start using them.

To quote from the APAR description:

  1. Class 1 accounting logic for UDFs, stored procedures and triggers could capture and record small fractions of class 2 in-DB2 time. This would result in QWACASC and QWACAJST having non-zero values when class 2 accounting was NOT active.
  2. UDFs and stored procedures require in-DB2 time to connect and disconnect UDF or stored procedure tasks to DB2. This time was not being accounted for in the class 2 in-DB2 times (QWACSPTT, QWACSPEB, QWACUDTT, QWACUDEB). Class 3 suspension time is clocked during this connect and disconnect processing and thus class 3 time could be significantly greater than class 2 time.

It’s hard enough keeping track of the nested time without problems like this. We described how to do this in the “Stored Procedures: Through The Call And Beyond” Red Book.

ESCON vs FICON and LPAR IDs

(Originally posted 2005-05-03.)

My thanks to Greg Dyck for pointing out the following on IBM-MAIN:

“The Store-CPU-Identifier now allows for an 8-bit partition identifier but the ESCON I/O protocols only allow for a 4-bit identifier. This is the reason that multiple channel subsystems must be implemented if you want more than 15 partitions on a CEC. FICON does not have this limitation.”

And to think ESCON’s only 15 years old. 🙂 I remember at the time having the EMIF support explained to me – including the LPAR number field. At the time 15 LPARs seemed an awfully large number, particularly as people were still grappling with how to configure machines with 2 or 3 LPARs for performance.

LLA and Measuring its use of VLF

(Originally posted 2005-04-28.)

I’m reminded by a question on MXG-L Listserver that many people don’t understand how LLA works – and in particular how to interpret the statistics in VLF’s SMF 41 Subtype 3 record.

You really have to understand the exploiter to make sense of the statistics. Here’s how it applies to LLA…

LLA (when it was Linklist Lookaside, prior to becoming Library Lookaside) could cache load library directories in its own address space. You get no statistics for this behaviour. 😦

Then in December 1988, when MVS/ESA 3.1.0e was released, LLA got renamed to “Library Lookaside”. The key difference was that you could (selectively) enable exploitation of VLF caching of modules. Compare this with “XA” LLA which only cached directories. You enabled module caching by specifying in your COFVLFxx member a new class (CSVLLA) with an EMAJ of LLA and a MAXVIRT (which defaulted to 4096 pages or 16MB). 16MB was large in 1988. Now it’s puny.

Now here’s why the VLF statistics are deemed “meaningless” for (CSV)LLA: LLA only asks VLF for something if it knows it’s going to get it. So you always get something close to 100% hits with LLA’s exploitation of VLF – even though you might otherwise regard the “successful find” rate as a reasonable metric of benefit.

It’s much better in my opinion to look at the load library EXCPs or SSCHs. After all, those are what you’re trying to get rid of. I know this is old news but a few years ago I got into a dialogue with the LLA component owners. They suggested that installations start with a VLF specification of 128MB for LLA – and then work up from there.

So that’s what we do – when it’s deemed relevant to look at this sort of thing. (I wrote this particular piece of our code in 1994.)
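
For anyone who hasn’t looked at this in a while, here’s roughly what the COFVLFxx definition looks like. Treat it as a sketch: CSVLLA and EMAJ(LLA) are the standard names, but the MAXVIRT shown is just the 128MB starting point mentioned above expressed in 4KB blocks (32768 x 4KB = 128MB), not a recommendation for any particular installation.

  CLASS NAME(CSVLLA)     /* VLF class LLA uses for module caching            */
        EMAJ(LLA)        /* major name                                       */
        MAXVIRT(32768)   /* 32768 4KB blocks = 128MB; the default 4096 = 16MB */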

Coupling Facility Capacity Planning

(Originally posted 2005-04-26.)

A friend of mine from the ITSO in Poughkeepsie is contemplating a Red Paper on Coupling Facility Capacity Planning. I think this is a splendid idea – and would be willing to contribute myself.

But I’m wondering what y’all think of the idea. And of what you’d like to see in such a Red Paper.

(Red Papers are less formal than Red Books – which we all know and love.)

Innsbruck z/OS and Storage Conference – Day 4

(Originally posted 2005-04-14.)

Session TSP11: Performance of the TotalStorage DS8000, Speaker: Lee La Frese

Of course a new disk controller from IBM is going to appear awesome. 🙂 DS8000 is no exception.

Currently 2-way and 4-way POWER5 processors. Statement of Direction for an 8-way machine. POWER5 can go up to a 64-way.

I/O access density has levelled out over the last five years (2000 to 2005), having historically decreased (over the 1980 to 2000 period).

There’s a new adaptive caching algorithm (called ARC). Very effective at low hit ratios (for Open). Likely to have less benefit at higher cache hit ratios (eg z/OS workloads).

A small number of customers in the room have FICON Express2. There is a paper from Poughkeepsie on this.

Channel Measurement Block granularity has recently decreased from 128us to 5us. This has actually been known to cause an increase in response times reported by RMF. But it doesn’t affect Lee’s performance numbers, which come from the controller itself and aren’t subject to mainframe processor considerations.

PPRC (Sync Remote Copy) performance has improved dramatically. The big reductions were in ESS. But a 0.38ms cost for DS8000 over 8km has been measured. This isn’t really showing what happens at distance: 303km showed as about 4ms and 0km as about 0.5ms. Lee noted that you could interpolate reasonably well as the elongation with distance is pretty much linear. The above numbers are at low load. Load doesn’t alter the picture very much – until you hit the knee of the curve.

Session Z16: Sharing Resources in a Parallel Sysplex, Speaker: Joan Kelley

Actually this was rather more about CF performance than actual resources. But it was very useful nonetheless.

Joan talked about shared CFs. Her “classic” scenario is a “Test” CF whose performance is not important sharing with a “Prod” CF whose performance is important. Recall: each CF needs its own Receiver links – so you need to configure accordingly.

Delayed requests can even break Duplexing – as a consequence of timing out a request.

Dynamic Dispatch puts the CF to sleep for 20ms at low utilisations, less time for higher utilisations.

A good design point for this sharing case is to turn Dynamic Dispatch on for “Test” and off for “Prod”. (D DYNDISP shows whether Dynamic Dispatch is on for a CF; obviously you can also infer that from RMF.)
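
To make that concrete, Dynamic Dispatch is controlled with CFCC commands entered at each CF LPAR’s console (the Operating System Messages task on the HMC). A sketch for the sharing design point just described – check the exact syntax against your CF level:

  DYNDISP ON       (on the “Test” CF, so it gives up the shared engine when idle)
  DYNDISP OFF      (on the “Prod” CF, so it keeps actively polling for work)
  D DYNDISP        (on either, to confirm the current setting)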

She had a good foil on responsiveness for different “Test”/”Prod” LPAR weight regimes. It showed that at low Test request rates the weights don’t matter much. At higher rates, though, weights of eg 5/95 produce much better responsiveness than eg 30/70. With System-Managed Duplexing you should set weights so that the Prod (duplexed) CF is dispatched 95% of the time – to avoid the timeouts I mentioned earlier.

With Dynamic ICF Expansion the CF can expand to additionally use a shared engine. If the shared engine is totally available – i.e no other LPAR wants it – the performance is close to that additional engine being dedicated.

Because ICF engines, IFL engines and IFA/zAAP engines share the same pool it is possible for an IFL or IFA to acquire cycles at the expense of the ICF LPARs.

There were several foils on CF links. I’m going to have to learn up on this as well: It’s got a lot more complicated. 🙂

Session ZP12: DB2 Memory Management in a 64-Bit World, Speaker: Me

This went reasonably well. I did get one question which was about what value to set MEMLIMIT at (which I think translates into what REGION value to have for the DBM1 address space). At present the answer has to be “I don’t know.” 😦 That’s because I don’t know if Extended High Private users can expand down into memory below the REGION value. If that makes any kind of sense. 🙂 I clearly need to research how REGION interacts with Extended High Private (which is what DB2 uses – mainly).
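
For reference – and this isn’t an answer to the question – MEMLIMIT itself can come from a few places: an installation default in SMFPRMxx, the JOB or EXEC statement, or the IEFUSI exit. A sketch, with purely illustrative values and a made-up program and step name:

  /* SMFPRMxx: installation default for address spaces with no explicit value */
  MEMLIMIT(2G)

  //* JCL: per-step override (program name and value are made up)
  //STEP1  EXEC PGM=MYPROG,MEMLIMIT=4G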

Session Z20: What’s New in DFSORT?, Speaker: Me

A small but reasonably engaged audience. One good question was essentially “if I’ve set my DFSORT parameters in particular ways years ago, which do I need to change now?” Basically it would take a review of the parameters, but most of them wouldn’t need to change. Some of the newer ones, though, are worth looking at to see how helpful they could be.

Innsbruck z/OS and Storage Conference – Day 3

(Originally posted 2005-04-13.)

Session ZP05: Parallel Sysplex Tuning, Presenter: Joan Kelley

Joan gave a good summary of CF levels and general-but-detailed tuning recommendations.

Enhanced Patch Apply on z890 and z990 allows different CF levels on the same CEC. But a customer suggested that in their environment re-IPLing one of the CF LPARs brought it automatically up to the newer level. Joan will try to reproduce in the lab.

CF Level 14 dispatcher changes are to help with scaling to larger request rates. For example, duplexing CF-to-CF signalling does not get stuck behind other requests.

SMF 74-4 CF utilisation no longer includes polling for work. This is a more realistic view of CF busy. SMF 70 CF
LPAR busy is not changed.

CF busy rules of thumb (ROTs) are based on the utilisation at which the service time doubles.

Joan positioned duplexing as having the most benefit for structures that don’t support rebuild. Also for IXGLOGR
structures duplexing reduces the need to have staging (disk) data sets, which should improve Logger performance.

Joan offered a rule of thumb of duplexing recovery being 40 times faster compared to log recovery, and 4 times
faster compared to rebuilding the structure. Of course this comes at a cost but Joan reiterated what we already
know: Smart installations have some “white space”, especially on CPU and memory for the failure takeover case.

ISGLOCK is a good structure to do performance tests with as it’s the fastest structure: requests carry no actual data.

IC Links (internal to the processor complex) are processed by any physical processor – whether Pool 1 (CPs) or Pool 2 (ICF/IFL/zAAP). Two links are usually enough. Limit the number to the sum of Pool 1 and Pool 2 engines minus 1 (so, for example, 6 CPs plus 3 ICFs means at most 8 IC links). More than that can cause performance degradation.

If you increase the ISGLOCK structure and get no more entries look at APAR OA10868. It fixes a bug.

Joan mentioned that XCF traffic is increasing in general – because of new exploiters. So it might be worth keeping a
closer eye on it.

Session TSP08: New RMF Measurements for the DS6000 and DS8000, Speaker: Siebo Friesenborg

Siebo used to work in our worldwide consulting team but now he works for the Storage product division.

If you haven’t seen Siebo present I suggest you do – whatever the topic. Just take your sense of humour along with you. It’s wonderful to listen to / watch him. And Siebo was on top form today.

If you specify ESS in RMF, 74-8 records will get written with a series of counters for ECKD, PPRC or SCSI. NOESS is the default. Siebo strongly recommends you specify ESS.
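
For what it’s worth, ESS is a Monitor I gatherer option, so (as I understand it) it goes in your ERBRMFxx Monitor I member – something like the sketch below. The member name and the other option shown are just illustrative:

  /* ERBRMF00 - Monitor I gatherer options (sketch) */
  ESS         /* gather ESS statistics  -> SMF 74-8 */
  CACHE       /* cache statistics       -> SMF 74-5 */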

74-5 has enhanced RAID rank statistics, giving numbers for each physical volume – for DS6000 and DS8000 (2107).

Session G10: zSeries vs UNIX vs AMD/Intel Platforms: Which one is best?, Speaker: Phil Dalton

This was a cheering marketing presentation but with lots of good information. Here are just a couple of factoids:

In the last few years the zSeries application portfolio has grown dramatically (particularly if you include Linux on
zSeries).

Phil knows of customers with in excess of 8,000 Linux servers under z/VM on a single machine. Obviously it’s not nearly as many as the “gee whiz” demo we used to do. But it is a real customer with a mind-bogglingly large number of Linux servers on a single footprint.

Session TSS18: DFSMS Catalog Overview and Diagnostics, Speaker: Becky Ballard


This was for me a good review of ICF Catalogs. (It’s been a while.)

Catalog Address Space (CAS) normally is ASID 000C or 000D. If it isn’t, that’s not necessarily a problem but it does indicate a restart. Restarting CAS actually is only minimally disruptive: it shouldn’t harm users that already have a data set open. Catalog requests that were in flight will be redriven by CAS on restart.

Becky uttered the words “experience the IDC3009I message”. I have nothing against that particular message – I just thought it was a lovely turn of phrase.

LISTCAT basically formats the VVRs (VSAM Volume Records). There are two types of VVR: Primary for the cluster and
secondary for each component. LISTCAT draws information from both.

F CATALOG can be very useful in checking maintenance levels, performance information, etc. Also can actually modiFy things. 🙂 You can even modiFy the CAS service task limit – but I’m not clear why you’d need to.
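
A couple of the variants I find useful – the exact output varies by release, so treat these as a sketch:

  F CATALOG,REPORT                  CAS level, start time and current settings
  F CATALOG,REPORT,PERFORMANCE      counts and average times for catalog events
  F CATALOG,LIST                    currently active service tasks / requests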

Informational APAR II10752 gives good advice on Catalog performance.

Innsbruck z/OS and Storage Conference – Day 2

(Originally posted 2005-04-13.)

Session z05: z/OS Sysprog Goody Bag, Presenter: Bob Rogers

Bob is always good value. This time was no exception.

The Consoles Restructure was undertaken because a problem cause analysis for Parallel Sysplex showed Consoles as the weakest link. There was also a need to remove some constraints, such as the 1-byte console ID, plus the need to reduce the scaling inhibitor of metadata in large sysplexes. z/OS R.7 introduces Phase 1B. This is the last release to support programs that use 1-byte console IDs, but it will only support them if they were assembled on a previous release. R.5 provided a program to track 1-byte console ID usage.

In R.7 Program Objects’ metadata can be compressed using hardware compression. The program itself isn’t, which allows the executable to be run on prior releases. But you can’t reprocess it (using the Binder APIs) on a prior release – because the metadata can’t be decompressed there. Also the Binder can resolve relative branches between CSECTs in Program Objects. These are instructions like BRC, BRAS, BRCL, BRASL and LARL. This became important with the introduction of “Long Relative Branch”. BRC and BRAS are of course “old” 2-byte immediate operand instructions. The others use 4-byte immediate operands. LARL is in fact not a branch but a Load Address Relative (Long). Note: this is “Program Object” only.

In R.7 SVC Dump does not dump unreferenced pages – which improves its performance. Also it puts out performance statistics. This is obviously in support of the fact that we’re going to have to dump more and more data.

In R.7 SDSF has a new JES2 Resources dialog. This lets you display resource (eg BERT) usage, see the limits in force
(and change them) and see historic detail. It should help avoid running out of resources.

Peter Relson (who I know to be excellent) worked for 6 months to add robustness to the Health Checker (robustness which was missing from the “free download” version). So the R.7 version should be more robust. Also a “check interface” was added that should allow other products (both IBM and OEM) to add their own checks. SDSF now has a panel to display and control active checks. New checks are in the areas of GRS, Consoles, SVC Dump, XCF, Unix System Services and RACF.

Bob asked if anyone was playing with zAAPs. At this point no-one in his audience admitted to it. Which I think chimes with my view that people will start later this year.

Bob confirmed that 32-way images will be supported from June 2005. DB2 will need PTFs for Query Parallelism – so
that its code to split queries according to the number of engines will cope with more than 16 engines.

Bob is hearing that some other software vendors are not including zAAPs in the capacity they base software licence
charges on. He did not name them but I mention this because it means that it’s not just IBM that’s doing this.

In R.6 TSO TEST has been enhanced, after a long period of inactivity. For example it now uses an IPCS-supplied table
to ensure current instructions are supported. Similarly new bits in the PSW (bits 12 and 31, in particular) are
supported. Its instructions are also displayed in a more readable format.

An installation can now specify the SMF buffer size using BUFSIZMAX(nnnnM), which ranges from 128MB to 1GB. Also BUFUSEWARN(percent), in the range 10 to 100%, lets you vary when the buffer usage warning message is produced.
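
In SMFPRMxx terms that’s something like the following – the values are purely illustrative, picked from within the stated ranges:

  BUFSIZMAX(0512M)   /* cap SMF buffer storage at 512MB (valid 128M to 1G)  */
  BUFUSEWARN(25)     /* start warning when 25% of the buffer is in use      */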

Bob also discussed the R.6 implementation of additional Linkage Indexes (LX) for z890 and z990. MQ and DB2 have PTFs to support this. A subsequent stage is Reusable LX’s. R.6 has the support but it will take a little longer for the subsystems to take advantage of this. Basically a sequence number is associated with a given LX so that the right version is used.

Also mentioned was the R.6 WLM enhancement to support nested DB2 Stored Procedures better: WLM attempts to start
server regions more aggressively if dependent requests have to wait. I’ve been involved in a couple of situations
this would’ve helped with. Also WLM (rolled back into R.4) has enhancements for placement of WebSphere stateful sessions.

One customer admitted to running DB2 Version 8. I think the same question next year would have many more takers.

In a previous release Format 0 virtual channel programs were translated into Format 1 real channel programs. In R.6
Format 1 virtual channel programs are also translated into Format 1 real channel programs.

In R.6 C/C++ applications can now exploit 64 bit virtual.


Session TSS09: VSAM RLS 64 Bit Buffering Enhancement, Speaker: Terri Menendez.

A good presentation on work in progress to convert RLS from using the SMSVSAM dataspace to 64-bit virtual. You will
be able to implement this selectively. Also the management algorithms are enhanced.


After lunch I have to admit I took off for the hills. Innsbruck has a nice bus service that gets you out of the city into the countryside very quickly. And after walking a mile or so Innsbruck disappeared behind some hills. So you can get some solitude and beautiful views very quickly.

Innsbruck z/OS and Storage Conference – Day 1

(Originally posted 2005-04-11.)

Here’s a summary of MY first day at the z/OS and Storage conference in Innsbruck. I’ll be reporting from a PERSONAL point of view (using Notepad to capture thoughts). Hopefully some of these items will be of some use to you. I admit this is all a bit raw.

Already I’ve run into several UK customers and a lot of developer friends (and fellow presenters I’ve known a long time).

Session Z01: What’s New In z/OS? Speaker: Garry Geokdjian

This was an introductory session to z/OS R.6. It also touched on R.7. As a performance person it’s hard to keep up with the more minor details of each release. So this was a good chance to fill in the gaps.

Garry previewed the “New Face of z/OS” initiative, which proposes a Web User Interface that is consistent across tasks. These tasks would be automated and simplified and would have integrated user assistance.

z/OS Load Balancing Advisor is a new feature in R.7. Improved Dynamic Virtual IP Addressing came in R.4 and R.6. You can rename an LPAR without an IPL in R.6 (z890 and z990 required).

There is a PTF to R.4 to change CPU speed of z800 and z890 without an IPL.

In R.6 you can reserve LPARs with a “*” for the LPAR Name.

A 32-way z/OS image was previewed as a PTF for z/OS R.6, for later in 2005.

64-bit Java 2 1.4.1 was made available September 2004.

IBM Communication Controller for Linux (CCL) for zSeries 1.1 emulates a 3745 Comms Controller, so you can run (most of) NCP under Linux. Already available – but this is an announcement I’d not spotted.

Initial z/OS support for Enterprise Workload Manager (EWLM) was made available December 2004. This today provides
reporting of business performance objectives and breaks down response times across the whole environment. In the
future it will provide workload balancing recommendations.

From R.7 the root File System will be zFS.

z/OS Load Balancing Advisor will use SASP protocol to provide routing recommendations to a SASP-compliant router to help with load balancing.

IBM HealthChecker for z/OS and Sysplex has been very successful and will be incorporated into R.7. Additional checks will be added.

XRC+ makes System Logger more attractive in a GDPS environment.

Statement of Direction for The VSAM Connector for z/OS, a JDBC connector for VSAM.


Session TSS06: VSAM RLS Overview, Speaker: Terri Menendez

With RLS, SHAREOPTIONS(2,x) allows some level of sharing between RLS (read/write) and non-RLS (read only).
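
As a reminder of where that gets specified: SHAREOPTIONS goes on the IDCAMS DEFINE (or ALTER). A minimal sketch – the data set name, sizes and storage class below are made up, and it’s the SMS storage class that ties the data set to a cache set for RLS:

  DEFINE CLUSTER ( NAME(PROD.EXAMPLE.KSDS)  -
                   INDEXED                  -
                   KEYS(16 0)               -
                   RECORDSIZE(200 400)      -
                   CYLINDERS(50 10)         -
                   SHAREOPTIONS(2 3)        -
                   STORAGECLASS(RLSSC) )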

The SMSVSAM address space is where RLS runs. Control blocks and buffer pools are in a dataspace.

To use RLS you have to have a CF even for single-system operation. RLS uses cache structures and a lock structure
(IGWLOCK00). The default lock structure sizing is generous. The cache structures might be a little small. SMS assigns data sets to Cache Sets, each of which is associated with a CF cache structure.

RLS Development has put a very great deal of effort into Reliability Availability and Serviceability (RAS). Much of
this has come out through APARs.

RLS does deadlock detection and can supply a bad return code to the caller in the event of a deadlock.

D SMS,SMSVSAM is a useful operator command for displaying the status of the RLS infrastructure.

Catalog calls SMSVSAM to delete a data set, because there might be retained locks associated with the data set. Likewise DFSMSdss.

Automation between CICS and VSAM RLS. eg F cicsname,CEMT SET DSN(RLSADSW>VFA1D.*),QUI to quiesce on all CICS regions
sharing the data set. (It was news to me you could do a CEMT by modiFying the CICS address space.)


Session G04: zSeries Processors Migration Considerations, Speaker: Parwez Hamid

ESCON channels cannot be spanned across LCSS’s. FICON can but needs the same CHPID for each LCSS.

z990 I/O Configuration “Plan Ahead” can be used to install additional cages when installing the z990 – to avoid
outages when upgrading the I/O configuration later on.

I’m reminded that the Bimodal Accommodation Offering is not available for z/OS R.5 and subsequent releases.

z/OS R.4 z990 Exploitation code supports more than 1 LCSS, more than 15 LPARs and 2-digit LPAR IDs. WSC Flash 10236
describes this.

Parwez reminded us of the good reasons why eg 9672 and z990/z890 LSPR numbers are not directly comparable.

z990 GA3 allows conversions of engines to Unassigned and between types.

Adding another z990 book (perhaps for more memory) and POR’ing is quite likely to cause PR/SM to re-evaluate which
physical engines to use and hence it’s pretty likely some engines on the new book will be used.

z990 GA3 is required for dynamic LPAR renaming (mentioned above).

If going to CF Level 14 use the CFSIZER tool to determine if you need more CF storage – it’s quite likely you
will.

Session ZP03: Much Ado About CPU, Speaker: Me 🙂


I feel I rushed this a little – but it WAS the first time I’d delivered the material. There were a couple of
questions. One related to not being able to treat multiple clusters as one. I think the customer has multiple
parallel sysplexes which touched the one machine and therefore more than one cluster. IRD will not manage between
clusters. The other question was a comment that for bureaux the need is to limit an LPAR’s CPU consumption. My only
answer to that is that LPAR design needs to take that carefully into account.

DB2 UDB for z/OS Performance Topics

(Originally posted 2005-03-09.)

I always look forward to the publication of the “DB2 Performance Topics” red book for each release of DB2. This time was no exception. Except: I was lucky enough to participate in reviewing this one – for Version 8. Most of the credit, though, goes to a great team of residents and developers (my role being minor).

So here’s a link to a draft:

http://w3.itso.ibm.com/redpieces/abstracts/sg246465.html

It’s expected to be published by the end of March, but is only a draft at this stage.