IBM System z Technical University 21-25 May, Berlin

(Originally posted 2012-05-12.)

With just over a week to go I’ve got my presentation materials in for this great conference: IBM System z Technical University 21-25 May in Berlin. I hope to see many friends – old and new (old and young) 🙂 there.

 

For the record my three sessions are:

  • zZS08 – I Know What You Did Last Summer
  • zZS18 – Optimizing z/OS Batch (repeats)
  • zZS21 – Parallel Sysplex Performance Topics

And, as well as seeing me present (which I presume you’d want to, else why are you reading this blog?) 🙂 there are lots of great sessions – at all levels of complexity, on all kinds of topics.

See you there

(And if you’re not going all of these are, I think, on Slideshare.)

He Picks On CICS

(Originally posted 2012-04-29.)

If you think this title is obscure bear in mind the original working title was "Send In The Hobgoblins". 1 🙂

When I started to write – actually before the "mind mapping" stage – it was going to be all about inconsistency in the way bits of systems are named. You'll see some of that reflected in the finished article (pun intended) but the post has mostly gone in a different direction.

I'd maintain this one is a slightly less obscure title. But I accept it depends on your pronunciation of "CICS". I've heard many nice variants 2 but I'm depending heavily on just one. (And, obviously, it's my preferred one.)

I thought it'd be interesting to do a "thought experiment" 3 on what you can glean about CICS from SMF. This is a necessarily brief discussion – though it might be worth working up into a presentation one day – and I've probably touched on some of this before. If I have I hope I don't contradict myself too badly here. (Strike One for consistency.) 🙂

I'm going to do this two different ways: I'll talk about

  • Data
  • Themes

This isn't meant to be an exhaustive survey but is more intended to get you thinking. And in particular in the Themes section you can probably think of your own themes.

Data

As with every application address space, CICS regions can be looked at using standard SMF 30 Interval records:4

  • Most notably, you can identify CICS regions from the program name – DFHSIP – and can establish usage patterns such as CPU and memory.
  • From RMF Workload Activity Report data (SMF 72 Subtype 3) you get WLM setup and goal attainment information. The SMF 30 record also contains the WLM workload, service class and report class names so you can easily figure out which CICS regions are in which service class, etc.

Obviously generic address space information can only get you so far. To go further you need more specific information. I'm going to divide it into three categories:

  • CICS-Specific
  • Other Middleware
  • I/O

CICS-Specific

CICS can create SMF 110 records at both the subsystem and the transaction level – and both can be reported on using specialist tools such as CICS Performance Analyzer (CICS PA) or more general SMF reporting tools.

This information covers subsystem performance, response time components for transactions, and virtual storage.

Other Middleware

You can get very good information about when CICS transactions access other middleware:5

  • For DB2, SMF 101 Accounting Trace gives you lots of information about application performance – as we all know. For CICS transactions the Transaction ID is the middle portion of the Correlation ID (QWHCCV) and the Region is the Connection Name (QWHCCN).6 (There's a small sketch of this just after this list.)
  • Similarly, WebSphere MQ writes application information in the SMF Type 116 record, which can be related to specific CICS regions and transactions.
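
To make the DB2 point a little more concrete, here's a minimal REXX sketch of pulling the CICS pieces out of those two fields. The values are invented, the fields would really come from your own SMF processing, and the assumption that the transaction ID sits in bytes 5 to 8 of the correlation ID should be checked against your own data:

/* REXX - sketch: relating a DB2 accounting record back to CICS       */
/* corrid (QWHCCV) and conname (QWHCCN) would really come from your   */
/* SMF processing; the values below are invented                      */
corrid  = 'ENTRTRN10001'
conname = 'CICSAB03'

tranid = strip(substr(corrid, 5, 4))   /* middle portion = transaction */
region = strip(conname)                /* connection name = region     */

say 'Region' region 'ran transaction' tranid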

I/O

Most performance people know about SMF 42 Subtype 6 Data Set Performance records. For data sets OPENed by the CICS region, these records are cut on an interval basis and when the data set is CLOSEd. (This obviously isn't true, for example, for DB2 data.) These records can be used with the File Control information in CICS 110 to see how, for example, LSR buffering and physical I/O performance interact for a VSAM file.


Themes

That was a very brief survey of the most important instrumentation related to CICS. Much of it is not produced by CICS itself. I kept it brief as it's perhaps not the most interesting part of the story: I hope some of the following themes bring it to life.

Naming Convention

(Strike Two for Consistency coming up.) As someone who doesn't know your systems very well it's interesting to me to figure out what your CICS regions are called. And which service classes they're in, etc.

So, to take a recent example, a customer has two major sets of CICS regions cloned across two LPARs. In one case SYSA has CICSAB00 to CICSAB07 and SYSB has clones CICSAB08 to CICSAB15. In the other case SYSA has CICSXY1, 3 and 5 while SYSB has CICSXY2, 4 and 6. Each of these sets happens to be in its own service class.7

You'll've spotted what I like to call "consistency hobgoblins" 🙂 in this:

  • One alternates between systems. The other has ranges on each system.
  • One starts at zero. The other starts at 1.

The customer took my teasing them about this inconsistency very well – so I don't think they'll mind me mentioning it here (particularly as, apart from them, nobody will recognise the customer).

And actually it doesn't matter – with one minor exception: The application that uses ranges (rather than alternating) would have to perform a naming "shuffle up" if they were ever to add clones. And this is not just a hypothetical scenario.

AOR vs TOR vs QOR vs DOR

You may well be able to tell this from SMF 30 – from the "lightness" of the address space. But it's better to use some of the other instrumentation:

  • Certainly there are "footprints in the sand" for things like File Control in SMF 110 so you could detect a File-Owning Region (FOR).
  • A CICS region that shows up in DB2 Accounting Trace obviously uses DB2 and looks more like a Data-Owning Region (DOR).
  • Likewise for SMF 116 and a Queue-Owning Region (QOR).

Now, regions come in all shapes and sizes and the terms "TOR", "AOR", "FOR" and "DOR" strike me as informal terms – and regions could be playing more than one of these roles so these terms aren't mutually exclusive. But the data is there.

XCF traffic (from SMF 74 Subtype 2) can be interesting:8 I noticed one application's CICS regions showed up in the job name field for XCF group DFHIR000, but not for the other application. I was informed there was a VSAM file this application shared – using CICS Function Shipping I guess.

With most topologies there is a unique correlator passed for the life of a transaction through the CICS regions. This correlator (in mangled form) even shows up in DB2. So you can tie together transactions and regions: CICS PA can apparently do this, and the next time I get some CICS data in I'm going to learn how. In any case transaction names like "CSMI" (the CICS Mirror transaction) tend to suggest Multi-Region Operation (MRO).

Virtual Storage

I'm reminded of this because at one customer I was able to demonstrate that, while two applications each had Allocated virtual storage of 1500MB, the memory backed in one was half that and in the other almost all of it. You might deem the former region set moderately loaded and the latter heavily loaded.

The virtual storage numbers – actually both 24-bit and 31-bit – come from Type 30 Interval records. The real storage numbers come from the same records too, but with some "interpretational help"9 from RMF 72-3 records.

But Allocated is a z/OS virtual storage concept: As with DB2 DBM1 address space virtual storage it is generally not the same as used. If it were it'd indicate a subsystem or region in trouble. So we need better information on which to make judgements. Fortunately we have it in the CICS 110 Statistics Trace records: You can do a good job of analysing and managing CICS virtual storage with this (just as you can with IFCID 225 data for DB2).

For one of these two applications virtual storage may well be the thing that determines when the regions need to be split.

Workload Balancing

You can see workload balancing in action at a number of levels:

  • At the region level (given a naming convention that lets you identify clones, as above) you can see in Type 30 whether CPU numbers, EXCP counts etc. are even across the clones. If they aren't, given supposed clones, you can conclude there isn't some kind of balancing or "round robin" in action – but some other kind of work distribution.
  • From CICS SMF 110 (Monitor Trace) you can see transaction volumes and can aggregate by Transaction ID. So an imbalance could be explained – perhaps because the supposed clones run different transactions or some transaction is present in all but at different rates in each clone. Or some other explanation.
  • Even without SMF 110 (which a lot of installations don't collect) DB2 Accounting SMF 101 could give you a similar picture (as might MQ's SMF 116).

So the "work distribution and balancing" theme can be addressed readily.

QR TCB vs Others

I mentioned above that virtual storage can sometimes drive the requirement to split CICS regions (whether cloned or not). The Quasi-Reentrant (QR) TCB can be another driver.10

Traditionally all work in a CICS region ran on the single QR TCB therein. And then File Control was offloaded from it. And the rest, as they say, is history.11

Then as now, if the QR TCB approaches 70% of a processor, performance can begin to degrade markedly. For this reason TCB times are documented in the SMF 110 CICS Statistics Trace record. I regularly see CICS regions with more than 70% of an engine (from SMF 30) but to interpret this an installation needs to understand (using the 110) how much of it is really the QR TCB.
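
To put some purely invented numbers on that: suppose SMF 30 shows a region consuming 0.9 of an engine in an interval, but the 110 Statistics show only 0.5 of an engine on the QR TCB, with the rest on other (for example, open) TCBs. That region is busy but the QR TCB still has headroom. A region whose 110 data showed, say, 0.75 of an engine on the QR TCB alone would be the one to worry about.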

 

Without the 110's, again you could work with SMF 101 and 116 for DB2 and MQ, respectively. In fact I often do.


So, I've tried to give you a flavour of what you can learn about a CICS installation from SMF – i.e. without going near the actual regions themselves. This is indeed just a flavour.

On the "inconsistency" point, consistency isn't vital but good naming conventions have real value. It's an old joke that goes "we like naming conventions so much we have lots of them, some of which contradict each other". 🙂

There are plenty of other examples where there are inconsistencies. A good one is LPAR / z/OS system names. I've seen several customers with the following kind of scenario: "Our systems are called things like A158, SYSC, DSYS, Z001 and MVS1." And it's not just LPAR names and CICS region names, of course.

The inconsistencies in installations often reflect history. And a notable category is Mergers and Acquisitions. (The LPAR names example above is often caused by this.) I'm really impressed at what customers manage to achieve when they do something like this: Getting it to work reliably is the most important thing. Homogenisation of names should be and is secondary.

I really like to see traces of the history in the systems I examine. Some of you reading this have been with me on the journey of your systems' lifetimes for a long time now: I wonder how much history we each remember. 🙂 Next time you see me ask me to pull out some slides from previous engagements: When I do this people are astonished by how much hasn't changed and how much has.

As you possibly spotted that was "Strike Three" for consistency in this post so I guess I'm out. 🙂 This was indeed going to be a post about consistency but took a different direction, as I said. I hope you found the "CICS nosiness" aspect interesting and useful. If you do I might well turn it into a set of slides and add some more material. If you have anything to add I'd be interested in hearing about it – whether you're from Hursley12 or not.


Footnotes

1 The reference here is, of course, 🙂 to Ralph Waldo Emerson's essay "Self-Reliance" where he wrote "a foolish consistency is the hobgoblin of little minds".

2 Such as "kicks", "chicks", "thicks", "six" and "sex" (no, really). 🙂 And my least preferred one is "see eye see ess".

3 If you think I'm self-consciously channelling Einstein here you'd be wrong: It's actually Mao. 🙂 Because the thought experiment is no substitute for experience – according to "On Practice".

4 Actually I doubt the utility of SMF 30 Interval records for batch jobs.

5 I believe you can get data from IMS relatable to CICS transactions – but I know relatively little about IMS.

6 And you can tell a CICS-related 101 record because the value of the QWHCATYP (Connection Type) is QWHCCICS. Further, you can tell things about sign ons from the QWACRINV field value.

7 You might not know this but the SMF 72-3 record has the Service Class Description character string – from the WLM policy. I'm slowly evolving my charting to use the description. Time to clean it up, folks. 🙂

8 While you get member name in 74-2 (and I'm proud to say I got job name in as a more useful counterpart) you don't get "point to point" information: You just get the messages sent from and to the XCF member. Figuring the actual topology out by matching message rates is fraught. I'd love an algorithm that was effective (or efficient) at this.

9 What I mean by this will have to await another post – some time.

10 26 years ago I worked on CICS Virtual Storage at a Banking customer. Not a lot has changed. 🙂 20 years ago I was involved in enabling customers to take advantage of multiple processors by splitting regions as described in this section. Again, not a lot has changed. 🙂 But this is unfair because the Virtual Storage and CPU pictures have changed a lot.

11 Or is it hysteria? 🙂

12 Home of CICS and WebSphere MQ Development

Guest Post – z/OS Release 13 ISPF Editor Enhancements

(Originally posted 2012-04-12.)

I was pleased when Julian Bridges (who I worked with in IBM Global Services for a number of years) told me he had access to a z/OS Release 13 system. He agreed to write a blog post on the enhancements to the ISPF editor in Release 13 and this is that blog post. Enjoy!

Julian Bridges

It comes as a surprise to many how flexible the ISPF editor can be. Many times sitting with clients typing away with them at your shoulder you hear, “I didn’t know you could do that”. It’s certainly worth hitting F1 in the edit screen or reading “ISPF Edit and Edit Macros” and spending a while trying to understand the power of the commands available.

Whilst much of the power is in the primary commands, in the past few releases of z/OS functionality has been added to the line commands as well.

First is simply the ability to (C)opy or (M)ove data to multiple lines. Previously you could copy or move lines to a single destination but since z/OS 1.10 this has been extended to allow multiple destinations.

For example, I’ve missed a comma from the end of the SYSUT2 DD statement and then repeated the line – and hence the mistake. I can now use the move overlay line command to add a comma in to each of the lines with the error, as follows:

m 0100                                                   , 
000700 //PACK     EXEC PGM=AMATERSE,PARM='PACK'           
000800 //SYSPRINT DD   SYSOUT=*                           
000900 //SYSUT1   DD   DISP=SHR,DSN=JULIAN.TZOSC01.DUMP   
ok 100 //SYSUT2   DD DISP=(,CATLG),DSN=JULIAN.TZOSC01.TRS 
001200 //         SPACE=(CYL,(1000,1000),RLSE),VOL=(,,,3) 
001300 //*                                                 
001400 //PACK     EXEC PGM=AMATERSE,PARM='PACK'           
001500 //SYSPRINT DD   SYSOUT=*                           
001600 //SYSUT1   DD   DISP=SHR,DSN=JULIAN.TZOSC02.DUMP   
ok 700 //SYSUT2   DD DISP=(,CATLG),DSN=JULIAN.TZOSC02.TRS 
001800 //         SPACE=(CYL,(1000,1000),RLSE),VOL=(,,,3) 
001900 //*                                                 
002000 //PACK     EXEC PGM=AMATERSE,PARM='PACK'           
002100 //SYSPRINT DD   SYSOUT=*                           
002200 //SYSUT1   DD   DISP=SHR,DSN=JULIAN.TZOSC03.DUMP   
ok 300 //SYSUT2   DD DISP=(,CATLG),DSN=JULIAN.TZOSC03.TRS 
002400 //         SPACE=(CYL,(1000,1000),RLSE),VOL=(,,,3) 
002500 //*                                                 
002600 //PACK     EXEC PGM=AMATERSE,PARM='PACK'           
002700 //SYSPRINT DD   SYSOUT=*                           
002800 //SYSUT1   DD   DISP=SHR,DSN=JULIAN.TZOSC04.DUMP   
o 2900 //SYSUT2   DD DISP=(,CATLG),DSN=JULIAN.TZOSC04.TRS 
003000 //         SPACE=(CYL,(1000,1000),RLSE),VOL=(,,,3) 
003100 //*                                                 

Note the addition of the “k” on the overlay command to indicate that there are multiple destinations. The last destination in the file is indicated by omitting this “k” – it is just the normal overlay “o”. The same is true for “a” (after) and “b” (before) destinations as well.

Of course, in this case, it would probably be easier just to type the comma in the correct place but you get the idea.

Secondly, with z/OS 1.13, the ability to write your own line command macros has been made available.

This does involve a few steps but basically you now have the ability to do pretty much anything you wish:

  1. Define an ISPF table to associate a line command with a macro.
  2. Write your macro.
  3. Associate the defined table with your edit session.
  4. Run the macro.

Define An ISPF Table To Associate A Line Command With A Macro

Fortunately the ISPF table utility, option 3.16, has been enhanced to make this straightforward. An option at the bottom of the screen now asks if this “Table is an EDIT line command table”.

When selected it creates the table in the necessary format and you just have to fill in the blanks. The examples below show what the options mean for existing line commands.

  • User command – The line command.
  • MACRO – The macro which will run when you run this line command.
  • Program Macro – Is this a program macro.
  • Block format – Does this macro allow you to select multiple lines by repeating the last char of the command, e.g. CC? CC would copy a block of text.
  • Multi line – Does this macro allow you to select multiple lines by providing a numeric suffix on the end of the command, e.g. C6 will copy the next 6 lines.
  • Dest Used – Does this macro allow a destination? e.g. C or M must have a destination whereas R doesn’t.

e.g.

User     MACRO    Program  Block    Multi    Dest     
Command           Macro    format   line     Used     
----+--- ----+--- ----+--- ----+--- ----+--- ----+--- 
CL       CLINE    N        Y        Y        Y       

This table must then be saved to a table library allocated to your ISPTLIB concatenation.

Write Your Macro

A few things to bear in mind. You have to use the PROCESS macro instruction to populate the range and destination variables within the macro. This is best illustrated by an example.

/* REXX */
Address ISREDIT
"macro NOPROCESS"                   /* defer operand processing            */
"process range CL"                  /* resolve the selected range for CL   */
dw = 72                             /* width to centre within              */
"(srange) = LINENUM .zfrange"       /* first line of the selected range    */
"(erange) = LINENUM .zlrange"       /* last line of the selected range     */
do i = srange to erange
  "(LINE) = LINE " i                /* fetch the line                      */
  line = centre(strip(line),dw)     /* strip blanks and centre in 72 cols  */
  "LINE " i " = (LINE)"             /* put the line back                   */
end

This macro will centre the lines selected.

Process takes the arguments range, dest or both, plus the line command being entered. It gives return codes if, when called, a range or dest is missing.

This macro should then be saved in your SYSEXEC or SYSPROC concatenation.
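
For a line command whose table entry has “Dest Used” set to Y the pattern is similar. Here’s a minimal sketch – not from Julian’s example, with a hypothetical CM command and invented behaviour – showing where the destination comes from:

/* REXX - sketch: a hypothetical CM line command that needs a destination */
Address ISREDIT
"macro NOPROCESS"
"process dest range CM"             /* sets .zdest, .zfrange, .zlrange     */
if rc <> 0 then exit 12             /* non-zero: range or dest was missing */
"(srange) = LINENUM .zfrange"       /* first selected line                 */
"(erange) = LINENUM .zlrange"       /* last selected line                  */
"(dline)  = LINENUM .zdest"         /* the destination line                */
say 'Would process lines' srange 'to' erange 'relative to line' dline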

Associate The Defined Table With Your Edit Session

Select ISPF option 2 and enter the name of the table in the “Line Command Table” field at the bottom of the screen.

This is now remembered whether you edit via option 2 or using “E” from 3.4.

Run The Macro

Single line

****** ***************************** Top of Data ****************************** 
cl 100 I wandered lonely as a cloud                                             
000200 That floats on high o'er vales and hills,                               
000300 When all at once I saw a crowd,                                         
000400 A host, of golden daffodils;                                             
000500 Beside the lake, beneath the trees,                                     
000600 Fluttering and dancing in the breeze.                                   
****** **************************** Bottom of Data **************************** 

Results in

****** ***************************** Top of Data ****************************** 
000100                       I wandered lonely as a cloud                       
000200 That floats on high o'er vales and hills,                               
000300 When all at once I saw a crowd,                                         
000400 A host, of golden daffodils;                                             
000500 Beside the lake, beneath the trees,                                     
000600 Fluttering and dancing in the breeze.                                   
****** **************************** Bottom of Data **************************** 

Block format

****** ***************************** Top of Data ****************************** 
cll 00 I wandered lonely as a cloud                                             
000200 That floats on high o'er vales and hills,                               
000300 When all at once I saw a crowd,                                         
000400 A host, of golden daffodils;                                             
cll 00 Beside the lake, beneath the trees,                                     
000600 Fluttering and dancing in the breeze.                                   
****** **************************** Bottom of Data **************************** 

Results in

****** ***************************** Top of Data ****************************** 
000100                       I wandered lonely as a cloud                       
000200                That floats on high o'er vales and hills,                 
000300                     When all at once I saw a crowd,                     
000400                       A host, of golden daffodils;                       
000500                   Beside the lake, beneath the trees,                   
000600 Fluttering and dancing in the breeze.                                   
****** **************************** Bottom of Data **************************** 

Multi line

****** ***************************** Top of Data ****************************** 
000100 I wandered lonely as a cloud                                             
000200 That floats on high o'er vales and hills,                               
cl99 0 When all at once I saw a crowd,                                         
000400 A host, of golden daffodils;                                             
000500 Beside the lake, beneath the trees,                                     
000600 Fluttering and dancing in the breeze.                                   
****** **************************** Bottom of Data **************************** 

Results in

****** ***************************** Top of Data ****************************** 
000100 I wandered lonely as a cloud                                             
000200 That floats on high o'er vales and hills,                               
000300                     When all at once I saw a crowd,                     
000400                       A host, of golden daffodils;                       
000500                   Beside the lake, beneath the trees,                   
000600                  Fluttering and dancing in the breeze.                   
****** **************************** Bottom of Data **************************** 

Have a play and see how you get on.

You Might Just Be A Clone If…

(Originally posted 2012-03-25.)

As previously discussed I’m often in a situation of trying to make sense of a set of job-related SMF data. Even though it may be your own installation’s data, you’re probably confronted with what I like to call “a journey of discovery” occasionally, too.

I’m always looking for what I can discern from the data.1 And, when confronted with a set of data about batch jobs, I go into overdrive. 🙂

This post is about how to tell if a set of batch jobs really are clones of each other. It’s an exercise in pattern definition, albeit loosely.

But first, why would you want to know which jobs form a clone set? Remember these are near-identical jobs that run in parallel against subsets of the data. First, if something’s cloned you might be able to clone it further.2 Second, if it isn’t cloned you need to recognise that and think about the effort involved to even start with cloning.3

The process of detecting clones is easy to describe but not so easy to do. Here are the steps:

  1. Look for similarities in SMF 30 Step- and Job-End records.
  2. Likewise in SMF 101 DB2 Accounting Trace.
  3. And similarly for data access.

Steps 2 and 3 could be done in either order. And indeed Step 2 would only be relevant for DB2.

Let’s think about these in a bit more detail…

Step-End And Job-End Evidence

I would expect cloned jobs to run more-or-less alongside each other – though they might be set off in groups. Of course imbalance between the clones would mean they wouldn’t end at the same time.

Additionally the jobs would have the same “step profile”. By this I mean the number of steps is consistent, the same steps in each job are the big ones. The program names are the same. And the performance profile of each step is similar across the clones, so the CPU intensiveness and the EXCP counts are similar.

I would expect also to see a sensible job-naming convention. For example “all the jobs beginning PLCD50 are clones and the suffix is 00, 01, 02 and so on”. From this you get job names like PLCD5000, PLCD5001 etc.

Generally I spot groups of jobs meeting these criteria pretty easily – using SMF Type 30 subtypes 4 (Step) and 5 (Job).
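
As a flavour of how mechanical the job-name part can be, here’s a minimal REXX sketch that groups job names into candidate clone sets by stripping a trailing two-digit stream number. The job names are invented, a two-digit suffix is just one possible convention, and real detection would of course also compare the step profiles described above:

/* REXX - sketch: group job names into candidate clone sets by       */
/* stripping a trailing two-digit stream number (job names invented) */
jobs = 'PLCD5000 PLCD5001 PLCD5002 PLCD5100 GLXR0001'

sets. = ''
roots = ''
do i = 1 to words(jobs)
  job  = word(jobs, i)
  root = left(job, length(job) - 2)        /* e.g. PLCD5000 -> PLCD50 */
  sets.root = sets.root job
  if wordpos(root, roots) = 0 then roots = roots root
end

do i = 1 to words(roots)
  root = word(roots, i)
  if words(sets.root) > 1 then
    say 'Candidate clone set' root'* :' strip(sets.root)
end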

DB2 Invocation Evidence

For DB2 jobs I’d expect corroboration from DB2 Accounting Trace (SMF Type 101):

  • Plan names and package names4 should be the same.

    In many cases I’ve seen a single DB2 plan name for an entire application, and sometimes crossing application boundaries. Similarly packages are sometimes widely used – for example in the “I/O module” or Stored Procedure cases. Taken together this is a necessary but not sufficient condition.

  • DB2 Accounting Trace, as you probably know, can give a very detailed breakdown of where a step’s time goes5 – down to the package level. Again, you’d expect to see a similar profile across all the clones.

For any serious DB2 Batch analysis I’d be looking at this data anyway. I’ve written extensively about DB2 Batch, most recently here.

Data Access Evidence

This is where consistency is slightly less to be expected: Most probably DD names will be the same across the cloned jobs. But very often the data set names are slightly different. For example the clone stream number might be encoded in the data set name – probably in one of the lower level qualifiers.

For DB2 it’s more difficult to assess which tables a job step accesses – and probably you need to look at the DB2 Catalog for insight. When you do you may well find the cloned jobs accessing partitions of the same table (in some cases).

There is other evidence of interest here:

In many cases clone jobs (or streams) are preceded by a job whose role is to split the data to feed the clones. Similarly there’s often a follow-on job to merge the results. Detecting these – in the non-DB2 case – is usually pretty straightforward. (Even in the DB2 case the scheduler should tell you.) My point here is there’s value in seeing how cloning is working, not least in understanding why there might be imbalance between the clones.

As I said at the outset it’s useful to figure out which jobs in a suite or a window are part of a cloning implementation. And as I hinted in a couple of places there’s also value in understanding balance (or imbalance). In this post I’ve given some tips on the kinds of patterns to look for. Some of this could be codified, I’m sure. In any case the human mind is a wonderful instrument for pattern recognition6.


1 I’ve talked about this sort of thing before. Most recently in Published on Slideshare: I Know What You Did Last Summer.

2 Recall my recommendation to clone 2, 4, 8, 16 … or else 3, 6, 12, 24… – unless you know differently.

3 See this part and this part especially of the ‘I Said "Parallelise" Not "Paralyse"’ series of blog posts for more on this.

4 You only get package-level statistics if you specify Accounting Trace classes 7 and 8.

5 You only get the detailed break down if you specify Accounting Trace classes 1, 2 and 3. (And see 4.)

6 This footnote is a wholly gratuitous reference to the excellent Pattern Recognition, a novel by the excellent William Gibson. 🙂

Drawing The Line

(Originally posted 2012-03-23.)

You’d think it would be pretty simple to draw a line. Right?

This post discusses an enhancement I’d like to make to my current reporting – and I’m pretty sure that technically I can do it. The question is whether I should.

Consider my current "Memory by address space within Service Class" graph. Here’s a sample:

And here’s what I think I might like it to look like:

Obviously the line’s been drawn on by hand. I haven’t written any code to achieve the enhancement. And, yes, the data’s real – apart from the drawn-on line. I feel pretty safe (on behalf of the customer) in showing you this as it’s VERY generic. But, no, I can’t promise the drawn-on line’s in the right place.

Let’s talk about:

  • Motivation and Usage
  • Mechanics

Motivation and Usage

When I throw graphs at you I see myself as "story telling". Hopefully an accurate story, certainly one I believe in. So, when working on my code I ask the question "how does this affect the story telling?"

Here’s how I normally tell the (e.g) CPU story:

  1. Talk about CPU usage by processor pool by LPAR1 and stacked up to give the machine view.
  2. Break down CPU usage by WLM Workload and the Service Class2 – again by pool.
  3. Likewise by address space within a Service Class.
  4. Possibly break down address space CPU to e.g. Transaction – assuming CICS or DB2 are "in play".

When you’ve done that you certainly know where the CPU is going. You do the same thing for memory – right until you get to Step 4.

The concept of "capture ratio" is well known and bridges the gap between Step 1 and Step 2 – for CPU3. It doesn’t make sense to draw the proposed line for this case.

To bridge between the Service Class level and the Address Space level (Step 2 to Step 3) I think a different treatment is required. There are a number of reasons for this:

  • Some service classes have no address spaces. And hence no memory. "Capture Ratio" may be 100% but unlikely to be computed that way. 🙂
  • The chart I’m proposing has up to 15 address spaces on it. (We could make it more but then it becomes markedly less readable.) So, for a Service Class with more than 15 address spaces we miss some – as in this particular example. I’d like to show we had good (or bad) coverage of the "headline" Service Class number in these 15 address spaces. This works fine for CPU, memory and EXCPs.
  • Type 30 memory numbers behave badly and it would be nice to see how badly compared to the Service Class total. (Type 30 CPU numbers don’t behave badly.)

So I think the line that says what the total "should" be is ideal for this. Hence my proposal4.
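
As a toy illustration of the coverage idea – all numbers invented, and nothing to do with my actual SLR / GDDM code – the arithmetic is no more than this:

/* REXX - sketch: how much of the service class total the charted     */
/* address spaces cover in one interval (all numbers invented)        */
sctotal = 1500                        /* MB, from the 72-3 based table */
asmb.0 = 3                            /* top address spaces, from 30s  */
asmb.1 = 600; asmb.2 = 450; asmb.3 = 200

shown = 0
do i = 1 to asmb.0
  shown = shown + asmb.i
end
say 'Charted address spaces cover' shown 'of' sctotal 'MB -',
    format(100 * shown / sctotal, , 1)'% coverage'

The line on the chart is just sctotal plotted per interval; the coverage figure tells you how honest the "top 15" view is.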

Mechanics

Today the data is in two tables: A Service Class (Period) table and an Address Space table – both summarised at an interval level5. The former comes from RMF SMF 72 Subtype 3. The latter comes from SMF 30 Subtypes 2 and 3. It’s always interesting handling two different data sources as if they might magically corroborate each other. How naive. 🙂

I use standard SLR “PRINT CHART” and similar commands against these tables. Not so long ago I learnt how to drive GDDM graphing direct from REXX. Because I can do other things in the REXX (like adjust address space names to add e.g. “CICS”) I might take that route rather than using PRINT CHART. And there are some other cases I would want REXX’s sophistication to take care of – like either the 30’s or the 72’s being missing.

In your case you can probably bring the two together quite neatly. Anyone know if MXG already does this?

Conclusion

So, why am I blogging about this? Two reasons:

  • Because you might want to try the same depiction idea.
  • Because I’d like to know if you think this is a good idea.

So I’d like your input on this. (Commenting here would be fine or any other way you want.) And maybe next time I crunch your data the story will be told just that little bit better. At least that’s the plan. 🙂


1 Nowadays those pools are: GCP, ICF, zIIP, zAAP, and IFL.

2 I’ve not found much value in breaking CPU usage down by Service Class Period.

3 For memory I handle it differently – because there are reported-on memory usages that are outside of the Workload / Service Class hierarchy. And I explicitly calculate an "Other" category – which has never turned out to be negative.

4 Today I’d be showing you two charts and inviting you to do the comparison. I hope my proposal makes this quicker and smoother.

5 This interval may be different in the SMF 30 and 72 records but it’s summarised to the same interval in the code. This might be 15 minutes, 30 minutes or (most usually) 1 hour. And that’s all summarised at the "shift" level for even broader brush work.

Whocasting?

(Originally posted 2012-03-22.)

Don’t reach for your dictionary to look up the single word in the title: I just made it up. 🙂 But I hope it’ll make sense as a term once you’ve read this post.

Once in a while I like to post on Social Media. Looking back it seems to be about every 6-9 months. Some recent examples are:

In this post I want to talk about what I’ll call "communities". You might prefer the term "constituencies". So I’ll use them interchangeably. Take a look at the following graphic. It’s a pair of Venn diagrams. It’s a gross oversimplification but I hope it illustrates a point or two.

Let’s continue but from a different standpoint: Why is it that what I say only sometimes resonates with you? (I’m doing well if most of what I say hits home.) 🙂

This question and the Venn diagrams are related:

You’ll see I’m in all the purple sets and you’re in one of them. And conversely with the green ones: You’re in all of them and I’m in only one.

Each set represents a community. So you might be in my "school friends" constituency – and the converse is probably true. When I talk about z/OS and SMF that probably won’t resonate with you. And when you talk about skateboarding it probably won’t resonate very strongly with me – as I’m not a skateboarder. But each of these topics undoubtedly resonates with several people – because they are in the appropriate community. It’s a happy accident if both "z/OS and SMF" and "skateboarding" resonate with the same person.

Which brings me on to some interesting points about the diagram:

  • It doesn’t describe all our contacts – otherwise it would be impossibly crowded.
  • It doesn’t – as drawn – admit to the possibility of us being in multiple communities we’re both members of. But that’s just because I chose to keep it simple.
  • As I mentioned before, it’s possible for communities (topologists would perhaps call them "neighbourhoods"1) of 2 different people to unexpectedly interest a third person. That’s, of course, very common. When it happens it’s magic, I think.

So far so "ho hum": We all know this stuff. And it’s really laying the groundwork for what I really want to talk about: Who we communicate with and why. Hence the "whocasting" neologism.

It’s actually my friend Bill Seubert who prompted me (perhaps unwittingly) to write this post: He talks a lot about interactions between communities, often disjoint or worse. At least that’s my "take home" of some of what he says.

I think it’s a fairly common experience to have someone say "I didn’t understand a word of that last post". I think we all get feedback somewhat along those lines. (I also think we get quite a lot of "I liked the way you put that".) I’m sure it’s not just me and (perhaps defensively) I claim it’s not (always) 🙂 just a complaint about obscurity: I think it’s a fact of life when you have so many constituencies.

If I start thinking about all the constituencies I have it’s quite a long list. Some would be: Family, School Friends, College Friends (and a subset The Pi Collective), IBM Training Stream Friends, Mainframers, Social Media Fellow Conspirators :-), and so on. That’s pretty diverse. And, again, I’m sure that’s not just me.

I think sometimes it’s fun to figure out exactly for whom a message is created. Sometimes it’s only one person. Sometimes it’s a well defined group. Sometimes it’s an ill-defined group. And, my favourite, sometimes it’s just tossed into the void to see who bites / giggles / reacts / whatever. While it might be "narrowcasting" or "broadcasting" it often isn’t either of these. The "whocasting" relates to the game of figuring it out.

Let’s talk about constituencies / communities some more:

For a start you don’t see my communities as sharply as I do. And vice versa. (I’ll leave the debate as to whether a community really has sharp boundaries out of this post, for brevity2.) If that’s true then that might make the game of spotting which community someone else is communicating with a whole lot trickier.

What’s very interesting to me is the dynamics between communities:

  • Groups of communities can "triangulate" on you. If you were now in a community whose position were opposite to that of a community you grew up in that would be interesting, wouldn’t it? 🙂
  • If those two communities got to fighting with each other – and I’ve seen it happen to other people – that might be stressful and destructive. I’d sell ringside tickets to that one. 🙂 If one were conciliatory one might exclaim "couldn’t we all just get a bong?"3 🙂
  • If two communities came together through their common member(s) that’d be really good to see.

The fact that these inter-community effects are real and happen quite often reinforces my view that there’s nowhere to hide. All you can do is be yourself and speak with your "authentic voice". This is very much like a good party: There are lots of conversations going on and it’s really rather noisy. You can duck in and out of the conversations and generally nobody much minds if you invite yourself into a conversation. If there is a difference it might be in who can overhear the conversations: I’d say it’s easier to overhear in Social Media than in many parties. We’ve even got tools that help us do that.

So, as I said, there’s really nowhere to hide. But that’s, in my opinion, really very nice. Yes, there are a few awkward moments, but generally it’s good. I was at one point going to call this post "Not Afraid"4 but I think I’ve said that already. What might I be afraid of? I suppose criticism and looking like a fool. I think I can learn from the former and it’s perhaps too late for the latter: Living openly means looking like a fool is inevitable and learning is usually the result. Oh, and fear itself. 🙂

To sum up, it’s important in Social Media to consider your constituencies and keep track of which communities you’re in, centred on each contact. And to use this information to cultivate what flows from it. But when I say that, like so many things in Social Media, it’s very easy to over-track and analyse. Perhaps, in this post, I’ve done just that (but without the tracking). Oh well. 🙂


1 The allusion here is that in topology neighbourhoods are always neighbourhoods of an element. (I won’t stretch the analogy to consider the other parts of the definition of a neighbourhood as it’s perhaps not useful here.)

2 "For brevity" is about as credible as when a mathematician says "clearly". 🙂

3 An old joke which will make some people giggle, some get annoyed, and some just fail to understand it. Which rather makes my point, doesn’t it?

4 I realise a reference to Eminem will get up some noses. If it does I’d invite you to look beyond the (potential) offensiveness and see the deftness with which he operates and to consider that much of his bile is directed at himself. I also realise he isn’t cool and doesn’t make me look cool. 🙂 Anyway, the reference is to this song (lyrics here.) In particular I think it’s the line "Holla if you feel you’ve been down the same road" that resonates with this post.

C’mon In The Water’s Lovely :-)

(Originally posted 2012-03-14.)

It’s not often I write a blog post that’s essentially a link to a web page. But on this occasion I will.

Here’s the link: Packer Advocates the Human Side of Social Business.

I hope you also read Willie Favero’s similar piece: Favero Shares His Secrets for Social Media Success.

I don’t know whether it’s the done thing to point to an article about you. Anyhow, I hope you enjoy it – and find it encouraging. I hope you’re not put off by the style that uses the surname rather than the first name. That’s not my personal style but is in keeping with the magazine.

Published on Slideshare: I Know What You Did Last Summer

(Originally posted 2012-03-10.)

I’ve just published this presentation on Slideshare. You can get it from here.

Normally I’d give a presentation at least once before publishing it. Unfortunately the event I was going to present it at earlier this week was cancelled. So I’m experimenting a little by publishing it first. As with all presentations it’ll probably evolve. What I’ve not done before is let it begin the evolution before I present it.

I hope, having seen the slides, you’re more inclined to hear me present it: There are quite a few things that you will either scratch your head at in the slides or will know are going to come alive when I present.

Actually the whole thing’s been an experiment in some ways. For a flavour of this see:

Anyhow, it’s been interesting to think and write about some slightly different stuff – and at a less detailed level. It’ll be even more interesting than usual to give the presentation because of this. I’m certainly doing it at IBM System z Technical University in Berlin May 21-25.

I Said “Parallelise” Not “Paralyse” Part 4 – Implementation

(Originally posted 2012-03-08.)

Now with free map :-), this is the concluding part of a four part series on batch parallelisation, with especial focus on cloning.

In previous parts I discussed:

  1. Motivation
  2. Classification
  3. Issues

This part wraps up with thoughts on implementation. I’m going to break it down into:

  1. Analysis
  2. Making Changes
  3. Monitoring

While there probably are iterations of this, this is the essential 1-2-3 sequence within the cycle.

I’m repeating the example from Part 3, partly because I raised some issues in relation to this diagram I want to cover here:

Analysis

Finding good places to use cloning is the same as finding good jobs to tune, with one further consideration: Because cloning is riskier and more difficult to do you’d want to be sure it was the right tuning action.

If you can find an easier tuning action that makes the batch speed up enough for now, while being scalable for the future, do it in preference to cloning. If you’re "future proofing" an application to the extent where other tuning methods aren’t going to do enough then consider cloning.

Modifying this advice only slightly, consider that you might be able to postpone cloning for a year or two. In this case keep a list of jobs that might need to be cloned eventually.

A couple of examples of where cloning might be indicated are:

  • Single-task high CPU burning steps
  • Database I/O intensive steps

Of course, feasibility of cloning comes into it. I’d view this as the last stage in the analysis process. As I like to pun: "the last thing I’m going to do is ask you to change your program code". While there may be some cases where application program1 change can be avoided, the majority of cases will require code surgery. The cases where surgery isn’t required are where the data can be partitioned and the existing program operates just fine on a subset of the data.

Making Changes

(This whole post is "Implementation" but this is the bit where the real implementation happens.)

Let’s divide this into six pieces, with reference to the diagram above:

  • Splitting the transaction file
  • Changing the program to expect a subset of the data
  • Merging the results
  • Refactoring JCL
  • Changing the Schedule
  • Reducing data contention

Splitting

As noted in Part 3, the transaction file drives the loop: Each cycle round it is triggered by reading a single record from this file. Suppose we wanted to clone "4-up" i.e. to create four identical parallel jobs. There are a number of ways we could do this:

  1. Use a "card dealer" like DFSORT’s OUTFIL SPLIT to deal four hands.
  2. "Chunk" the file, perhaps with DFSORT’s OUTFIL with STARTREC and ENDREC.
  3. Split based on criteria: You could use DFSORT OUTFIL with INCLUDE= or OMIT=. Or else you could use an application program.

There are considerations with all of these:

  • The card dealer (1) ensures (practically) equal numbers of records in each transaction file, but there is no sense of logical partitioning. So it could provide balance but at the expense of cross-clone contention. (A sketch of the card-dealer idea follows this list.)
  • Neither 2 nor 3 guarantees balance across the clones. For example, Method 3 might divide records into those for North, East, South and West regions – where that division could be decidedly unequal.
  • Method 3 might not be scalable to 8-up or 16-up, simply based on the difficulty of finding 8-way or 16-way split criteria.
  • Method 2 could allow some clones of the original application program to start earlier than others. In some cases this is a good thing, in others a problem.
  • Method 2 would need occasional adjustment to rebalance.
  • Method 3 implies non-trivial application coding to effect the split but provides the best chance of minimising contention between streams. (One neat coding shortcut if you’re using DFSORT to do the split is OUTFIL SAVE – which provides a "none of the above" bucket.) Whether you use DFSORT or a home-grown split program depends on the precise split logic – but DFSORT is much simpler and scales slightly more easily to e.g. 8-way and 16-way.
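
To make the card-dealer idea (Method 1) concrete, here’s a minimal REXX sketch of dealing records round-robin across four streams. DFSORT’s OUTFIL SPLIT does this far more efficiently; the sketch just shows the shape of the thing, and the DD names MASTER and OUT1 to OUT4 are invented:

/* REXX - sketch: deal records from DD MASTER round-robin to OUT1-OUT4 */
address TSO
streams = 4
count = 0
do forever
  "EXECIO 1 DISKR MASTER (STEM rec."      /* read one record           */
  if rc <> 0 then leave                   /* end of file (or an error) */
  dd = 'OUT' || (count // streams + 1)    /* OUT1, OUT2, OUT3, OUT4... */
  "EXECIO 1 DISKW" dd "(STEM rec."        /* deal it to the next hand  */
  count = count + 1
end
"EXECIO 0 DISKR MASTER (FINIS"            /* close everything          */
do i = 1 to streams
  "EXECIO 0 DISKW OUT"i" (FINIS"
end
say count 'records dealt across' streams 'streams'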

Changing Programs To Expect Subsets Of The Data

In our example the original program processed all the data. It could make the "I am the Alpha and the Omega" assumption. If we split the transaction file we forgo this. The most obvious result is that any report the program would have written needs to be rethought: We probably will only be able to write out a file that feeds into a new report writer (which we’ll talk about below).

Merging

Batch steps produce, amongst other things, output transaction files and reports. For the sake of (relative) brevity let’s concentrate on these two:

  • Output Transaction Files

    Somehow we need to merge these files (though an actual sort is unlikely). It’s important to know what the sensitivity is and cater for it.

  • Reports

    Reports usually require some calculations, extractions to form headings, and so on. Sometimes a simple merge of the report output from the cloned program is enough. My expectation, however, is that serious reworking is usually required. Totalling and averaging would be typical examples of where it gets complex (but not impossible) – see the small worked example after this list.

    I would remove the reporting from the original program and think about where it fits best in the merge. There are advantages to separating the data merge from the "presentation": If today the report is a flat file (1403 format?2) you could enhance the report to also3 produce a PDF or HTML version. That might be a nice "modernisation".
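
To illustrate the averaging point with invented numbers: if clone 1 processes 600,000 records at an average of 4ms each and clone 2 processes 200,000 records at 10ms each, the overall average is not (4 + 10) / 2 = 7ms; it is (600,000 × 4 + 200,000 × 10) / 800,000 = 5.5ms. So the merge step needs the clones to pass it totals and counts, not pre-computed averages.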

Refactoring the JCL

JCL Management is not my forte but it’s obvious to me it’s worth examining the JCL for any job that’s going to be cloned to see how it can best be managed.

It might not be feasible to keep a single piece of JCL in Production libraries for the schedule to submit as multiple parallel jobs. If you can then parameterisation is the way to go. For instance it wouldn’t be helpful to have the job name hardcoded in the JCL. Similarly data set names which differ only by the stream number need care, as would control cards you pass into a program.

Changing The Schedule

You have to change the schedule to accept new job names – for the clones of existing jobs – insert new jobs (for splitting, merging and reporting), and wire this all up with a reworked set of dependencies.

There are decisions to make, such as whether (in TWS terms) each stream should be its own Application, and what the dependencies should be. For instance, do you keep the streams in lockstep?

One of the key things is planning for recovery: Whereas, in our example, it was all one job step, you now have three (or maybe four) phases of execution. Where do you recover from?

Reducing Data Contention

In our example, File A and File B were originally each read by the original application program. If they were keyed VSAM, for example, buffering might’ve been highly effective – particularly with VSAM LSR (Local Shared Resources) buffering. Four clones reading these two data sets will have to do more physical I/O. In the VSAM LSR case some hefty buffer pools could help reduce the contention going 4-up might introduce. In a database manager like DB2 things ought to be better: Data is buffered for the common good.

Dare I mention Hiperbatch? 🙂 For the smallish4 Sequential or VSAM NSR (Non-Shared Resources) case this might work well – but it would be a very uncommon approach.

As I hinted above, one of the things that might condition how you split the transaction file is what effect it would have on data contention. If you found a split regime where all the data the clones processed was split the contention could be very low.

If you got to the point where the split regime was "universal" or at least widespread enough some of the contention (and indeed Merging) issues would disappear completely.

Tape is particularly fraught: You can’t have two jobs read the same tape data set at the same time. I’ll indulge myself by mentioning BatchPipes/MVS here 🙂 as it provides a potential solution: A "tape reader" job (probably DFSORT COPY OUTFIL) copying the data to the clones through pipes.

However you do it, the point is you have to manage the contention you could introduce with cloning.

Monitoring

Monitoring isn’t terribly different from any other batch monitoring. You have the usual tools, including:

  • Scheduler-based monitoring tools – for how the clones are progressing against the planned schedule.
  • SMF – for timings, etc.
  • Logs

If you can develop a sensible naming convention for jobs and applications your tools might be easier to use.

One other thing: You need to be able to demonstrate that the application still functions correctly. This is not a new concept, of course, but application testing is going to be challenged by the level of change being introduced.

This concludes the four-part series. If you’ve read all four all the way through thanks for your persistence! The purpose in each was to spur thought, rather than be a complete treatise. My next task is to turn this into a presentation – as the need has arisen to do so. One final thought: If this long (but necessarily very sketchy) post has put you off please re-read Part 1 as I talk there about why this could be necessary.


1 This usage of "application program" most centrally refers to programs written in programming languages such as COBOL. It could also refer to things like DFSORT invocations. The point is these are difficult things to understand and to change.

2 You do know about DFSORT’s REMOVECC, don’t you? It tells DFSORT to remove ANSI control characters – such as page breaks. When separating data preparation from presentation you may well find it useful.

3 I bolded "also" here because the original report probably has a consumer – whether human or not – who’d get upset if it didn’t continue to be produced but might like a more modern format. And if it doesn’t… 🙂

4 While technically still supported, Hiperbatch has functional limitations, such as not being supported for Extended Format data sets (whether Sequential or VSAM). Further, the only way to process Sequential data with Hiperbatch is QSAM. (For DFSORT you’d have to write an appropriate exit – E15, E32 or E35 – to read or write the data set.)

I Said “Parallelise” Not “Paralyse” Part 3 – Issues

(Originally posted 2012-03-04.)

Part 1 and Part 2 were, in my opinion, a little abstract. But I think they needed to be:

  • They set the scene for why parallelising your batch can be important.
  • They gave some vocabulary and semantics to help structure our thoughts.

Now we need to go a little deeper.

Let’s start with how to think about the problem of making a job or set of jobs more parallel.

Heterogeneous Parallelism

Here the trick is to remove dependencies – and that’s where most of the issues are.

I covered a lot of this in Batch Architecture, Part Two.

Homogeneous Parallelism (Cloning)

This is where it can get really tricky. And that’s why the bulk of this post is about the homogeneous case. (Some of the following will also have relevance to the heterogeneous case.)

There are a number of issues to work through when cloning jobs. Here are some of them:

  • Converting the serial bulk processing model to something more parallel.
  • Handling inter-clone cross-talk
  • Resource provisioning
  • Scheduling

Parallelising The Bulk Processing Model

The reasons for using batch include the advantage of using a “bulk processing model”: Doing the same thing to lots of data in one job is much more efficient than breaking it up into a huge number of one-datum transactions. But, just because it’s more efficient to process 10 million records in a single batch job doesn’t mean it’s much more efficient than doing it in ten 1 million record jobs.

The trick with cloning is to find a way of breaking up an (e.g.) 10 million record job into ten parallel 1 million record jobs.

Consider the following diagram1:

A lot of bulk processing looks like this. The salient features are:

  • Reading a Master file, one record at a time.

    This could be a sequential file, a concatenation of these, a VSAM file, rows returned by a DB2 query, or any one of a number of other similar “files”. The point is it’s a large number of records or rows – perhaps the 10 million mentioned above. And any serious attempt to parallelise this job is going to have to split this file.

  • File A is read to provide detail.

    Hopefully this is a keyed (direct and bufferable) read. You don’t want to have to read the whole file to find a match to the record from the Master file.

  • Likewise File B.
  • The detail that is being filled in – in this case totals – is held in memory by the program.
  • When the Master file has been completely processed a report is written – using the summarisation information in memory.

I find this notion of a circuit, driven by a Master file, useful. If you find you can’t draw it that tells you something in itself. I’m sure it’s not the only bulk processing pattern, but it’s a very common one.2

(It would be difficult to attach timings to the activities in the loop. A reasonable stab could be made, under some circumstances, at the data set accesses’ proportions of the overall run time using SMF 42 Subtype 6 records.)3

This is only an example but it illustrates some issues that cloning needs to resolve:

  • The Master file needs to somehow be split into 10.
  • The ten sub-reports need to be reworked to produce a coherent and correct final report.

We could talk about resolutions of these – and I probably will in Part 4 – but the important thing is to acknowledge these are issues that have to be addressed.

Resources

If you’re going to run more jobs in parallel you could easily “spike up” resource usage, most notably CPU consumption. Memory use might increase also, though some usage patterns (such as DB2) tend to have a noticeable memory impact. I/O bandwidth and initiators are two more things to think about. In the I/O case it could be seen as a case of this. In any case, we know how to monitor resource usage, don’t we?

Handling Inter-Clone Cross Talk

While logically it might be easy to clone a job, things like locking can make it really difficult.

For example, in (a modified version of) the case above, File A and File B might be updated. These updates might be one per record in the Master file, or just at the end. In either case cloning would introduce some locking issues. It might be possible to resolve these issues – perhaps through partitioning.

Even in the unmodified version clones reading from File A and File B might create I/O bottlenecks. In the DB2 case you’d hope this would have a happy ending.

Scheduling

If you’re going to run multiple copies of a job in parallel you need to adjust the schedule.

There are design decisions like whether you keep clones in lockstep:

Consider this example: Suppose you have a pair of cloned streams – consisting of A0, B0 and C0 in Stream 0 and A1, B1 and C1 in Stream 1. Each Bn logically follows its corresponding An and each Cn follows the corresponding Bn, based on data flows.

  • If you release B0 and B1 only when both An jobs have completed it’s more controlled but probably takes longer.
  • If you release each Bn when the corresponding An completes it’s less controlled but probably takes less time.

The term “Recovery Boundary” is probably useful here as recovering from job failures is the thing that makes the complexity introduced by cloning really matter.

I advocate cloning in powers of two: 2-up then 4-up then 8-up, and so on. A modification to this is 3-up, then 6-up, then 12-up – which has a fairly obvious appeal, when you consider typical data sets.

Automating cloning is a useful aim, whether you want to “dynamically” partition the work or just want to be able to move from, say, 4 streams to 8 without too much trouble. I put the word “dynamically” in quotes as realistically the latest you could decide on the number of clone streams is just before you kick them off. In reality it’s probably much earlier than that.

So you need to solve the problem of how you might do this: The first step would be to decide what’s realistic. The second would be to decide what’s needed.



In this post I’ve highlighted some of the issues – mainly for cloning. They’re not terribly different from those you’d encounter if you were pursuing heterogeneous parallelism (which you may also need to do). Part 4 will round out the series with some thoughts on implementation.


1 Made with Diagrammix for Mac. I like this particular style, untidy though it may be. A nice demo is here.

2 There probably are formal diagrams of this type, probably with the letters “UML” attached to them. I don’t claim to be the kind of person who would use one. I just think teasing out the circuit like this is helpful for cloning. It’s the circuit itself I’m attached to.

3 Quite apart from the incompleteness of this approach there are issues with overlap between the data sets (and with CPU). A discussion of double buffering, access methods and I/O scheduling is well beyond the scope of this post.