(Originally posted 2016-01-24.)
DDF and Batch sound like two opposite ends of the spectrum, don’t they?
Well, it turns out they’re not.
I said in DDF Counts I might well have more to say about DDF. I was right.
I’ve known for a long time that some DDF work can come in from other z/OS DB2 subsystems, but I’d not really thought much about it.
Until now. And I don’t really know why now. 🙂 Maybe it’s just because I’m “in the neighbourhood”.
Why Is Batch DDF An Important Topic?
We look at batch jobs in lots of ways, but until now we’ve not considered the case where a batch job goes to DB2 for data but the data is really in a different DB2.1
But if a DB2 job does go elsewhere for its data, the performance of getting it clearly affects the job’s run time.
There are at least two different aspects to this:
- The network traffic.
- The remote DB2 access time.
How Do You Understand A Job’s Remote DB2 Performance?
First you have to detect an external DB2 batch job. Then you need to analyse its performance.
The latter is much the same as for any other DB2 batch job, so I won’t dwell on it here. Instead, let’s consider how to detect batch jobs that come in through DDF.
Detecting An External DB2 Batch Job
Let’s assume you have a bunch of SMF 101 (DB2 Accounting Trace) records with QWHCATYP of QWHCRUW or QWHCDUW – denoting DDF.
If field QMDAATYP contains “DSN”, the DDF 101 record relates to a remote z/OS system. But that alone isn’t conclusive: the record could be, for example, from a remote CICS transaction rather than a batch job.
You can detect remote batch jobs from the SMF 101 record by checking whether field QMDACTYP contains “BATCH”. QMDACNAM will typically contain “BATCH” or “DB2CALL”.
If it is remote DB2 batch, the first eight characters of the remote Correlation ID (QMDACORR)2 are the job name.
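As an illustrative sketch (not working SMF-parsing code), here’s how that detection logic might look once the 101 records have been decoded into field-name/value mappings. Note this assumes already-decoded values: in the raw record QWHCATYP is a numeric code, and QWHCRUW/QWHCDUW are the mapping-macro constant names for those codes.

```python
# Hypothetical sketch: classify decoded SMF 101 (DB2 Accounting Trace)
# records. Assumes each record is a dict of field name -> decoded value;
# real parsing of the raw SMF record is out of scope here.

def is_remote_zos_batch(rec):
    """True if this 101 record looks like a batch job arriving via DDF
    from another z/OS DB2 subsystem."""
    if rec.get("QWHCATYP") not in ("QWHCRUW", "QWHCDUW"):
        return False                      # not a DDF record at all
    if rec.get("QMDAATYP", "").strip() != "DSN":
        return False                      # remote requester isn't z/OS DB2
    return rec.get("QMDACTYP", "").strip() == "BATCH"

def remote_job_name(rec):
    """First eight characters of the remote Correlation ID are the job name."""
    return rec["QMDACORR"][:8].strip()

# Two made-up example records: one remote batch job, one remote CICS transaction.
records = [
    {"QWHCATYP": "QWHCDUW", "QMDAATYP": "DSN",
     "QMDACTYP": "BATCH", "QMDACORR": "PAYROLL1STEP0001"},
    {"QWHCATYP": "QWHCDUW", "QMDAATYP": "DSN",
     "QMDACTYP": "CICS", "QMDACORR": "CICSPROD"},
]

remote_batch = [remote_job_name(r) for r in records if is_remote_zos_batch(r)]
print(remote_batch)   # ['PAYROLL1']
```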
You can obtain the step number and name through timestamp analysis: compare this record’s timestamps to the SMF 30 records for the job on its originating system.
One snag: the 101 record doesn’t actually tell you the originating system’s SMF ID. But it does give you some network information, from which you can probably work it out.
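That timestamp matching might be sketched like this, assuming you’ve already extracted the job’s step start and end times from SMF 30 on the originating system. The dictionary keys here are placeholders of my own, not real SMF 30 field names.

```python
# Hypothetical sketch: attribute a DDF 101 record to a job step by comparing
# its timestamp with SMF 30 step start/end intervals from the originating
# system. The step-record layout below is an assumption for illustration.

from datetime import datetime

def find_step(smf30_steps, job_name, event_time):
    """Return (step_number, step_name) for the step whose start/end interval
    contains event_time, or None if no step matches."""
    for step in smf30_steps:
        if (step["job"] == job_name
                and step["start"] <= event_time <= step["end"]):
            return step["number"], step["name"]
    return None

# Made-up step intervals for a two-step job.
steps = [
    {"job": "PAYROLL1", "number": 1, "name": "EXTRACT",
     "start": datetime(2016, 1, 24, 2, 0), "end": datetime(2016, 1, 24, 2, 30)},
    {"job": "PAYROLL1", "number": 2, "name": "UPDATE",
     "start": datetime(2016, 1, 24, 2, 30), "end": datetime(2016, 1, 24, 3, 15)},
]

print(find_step(steps, "PAYROLL1", datetime(2016, 1, 24, 2, 45)))
# (2, 'UPDATE')
```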
Now We Have Two Records To Analyse. Is This Better Than One?
So now we have two SMF 101 records for the job3:
- The one on the job’s originating system.
- The DDF one on the system it connects to via DDF.
As I pointed out at the end of this discussion thread in 2005, the originating job’s 101 record might contain substantial DB2 Services Wait Other time – which would be the time spent over in the system whose data it accessed.
So I would advocate a two-step process:

1. Analyse the job’s home DB2 101 record to discover the big buckets of time, and tune them down – as usual.
2. If the DB2 Services Wait Other time is substantial, understand the time buckets in the other 101 record (the one on the system it connects to via DDF).
Actually, there is a third aspect: if your concern is the CPU time this job causes on the system it connects to via DDF, then obviously the DDF 101 is the one you want.
So I think you can do good work with the pair of 101 records – so long as you’re collecting 101s from both DB2 subsystems and processing them appropriately.
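As a sketch of that two-step triage, assuming the home 101’s time buckets have already been decoded into seconds – the bucket names and the 20% threshold are my own illustrative choices, not anything the record prescribes:

```python
# Hypothetical triage sketch: given the home-system 101's time buckets
# (in seconds), decide whether the DDF-side 101 is worth pulling in.
# Bucket names and the 20% threshold are illustrative assumptions.

def needs_remote_analysis(home_buckets, elapsed, threshold=0.20):
    """True if DB2 Services Wait Other is a substantial share of elapsed time."""
    wait_other = home_buckets.get("db2_services_wait_other", 0.0)
    return elapsed > 0 and (wait_other / elapsed) >= threshold

# Made-up numbers: 450s of Wait Other in a 1200s elapsed job.
home = {"cpu": 120.0, "sync_io_wait": 300.0, "db2_services_wait_other": 450.0}
print(needs_remote_analysis(home, elapsed=1200.0))  # True: 450/1200 = 37.5%
```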
What About The Network Traffic?
While you can’t directly see the network time, you can see the traffic: the QLAC section in the 101 record gives you such things as SQL statements transmitted, rows transferred, and bytes transferred.
I think this is useful information – and you might actually be able to do something about it.
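For example, a sketch of totting up those counters per remote job – the dictionary keys here are stand-ins of my own for the actual QLAC fields:

```python
# Hypothetical sketch: summarise distributed-traffic counters from the
# remote-batch 101 records to gauge network traffic per job. The counter
# names are placeholders for the real QLAC section fields.

from collections import defaultdict

def traffic_by_job(records):
    totals = defaultdict(lambda: {"sql": 0, "rows": 0, "bytes": 0})
    for rec in records:
        t = totals[rec["job"]]
        t["sql"] += rec["qlac_sql_sent"]
        t["rows"] += rec["qlac_rows_sent"]
        t["bytes"] += rec["qlac_bytes_sent"]
    return dict(totals)

# Two made-up 101 records for the same remote batch job.
recs = [
    {"job": "PAYROLL1", "qlac_sql_sent": 500, "qlac_rows_sent": 20000,
     "qlac_bytes_sent": 4_000_000},
    {"job": "PAYROLL1", "qlac_sql_sent": 300, "qlac_rows_sent": 12000,
     "qlac_bytes_sent": 2_500_000},
]

print(traffic_by_job(recs)["PAYROLL1"]["bytes"])  # 6500000
```

High bytes-per-row or statements-per-job ratios here might point at chattier SQL than necessary – something you can sometimes actually fix.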
Part of the purpose of this post was to sensitise Performance people to the possibility that their batch might be using DDF (and indeed that some of the DDF traffic might be coming from remote z/OS batch jobs).
The other part of the purpose was to outline how you might go about analysing the performance of such batch jobs.
In my code I have a new report that covers this ground. Naturally it’ll evolve – and I expect I’ll be asking customers whose DB2 Batch I study for SMF 101 data from any DB2 subsystems they think it accesses remotely.