Every so often I get to work with a SAP customer. I’m pleased to say I’ve worked with four in recent years. I work with them sufficiently infrequently that my DDF SMF 101 Analysis code has evolved somewhat in the meantime.
The customer situation I’m working with now is a good case in point. And so I want to share a few things from it. There is no “throwing anyone under the bus” but I think what I’ve learnt is interesting. I’m sure it’s not everything there is to learn about SAP, so I won’t pretend it is.
The Structure Of SAP Db2 Correlation IDs
In Db2 a correlation ID (or corrid) is a 12-character name. Decoding it takes some care. For example:
- For a batch job, up to the first 8 characters are the job name.
- For a CICS transaction, characters 5 to 8 are the transaction ID.
In this set of data the correlation ID is interesting and useful:
- The first three characters are the Db2 Datasharing group name (or SAP application name).
- The next three are “DIA” or “BTC” – usually. Occasionally we get something else in these 3 positions.
- Characters 7 to 9 are a number – but encoded in EBCDIC so you can read them.
I wouldn’t say that all SAP implementations are like this, but there will be something similar about them – and that’s good enough. We can do useful work with this.
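To make that layout concrete, here is a minimal Python sketch of splitting such a correlation ID into its three parts. It is not my DFSORT-based analysis code. The names `SapCorrid` and `parse_sap_corrid` are made up for illustration, and the choice of the `cp037` EBCDIC code page and blank padding are assumptions.

```python
from typing import NamedTuple, Optional


class SapCorrid(NamedTuple):
    prefix: str            # Datasharing group name or SAP application name
    work_type: str         # usually "DIA" or "BTC"
    suffix: Optional[int]  # numeric suffix, or None if not all digits


def parse_sap_corrid(raw: bytes) -> SapCorrid:
    """Split a 12-byte EBCDIC correlation ID into the three parts."""
    # Assumption: cp037 is a suitable EBCDIC code page and the field
    # is padded on the right with blanks (or nulls).
    text = raw.decode("cp037").rstrip(" \x00")
    digits = text[6:9]
    suffix = int(digits) if digits.isdigit() else None
    return SapCorrid(text[0:3], text[3:6], suffix)


example = "XYZBTC083".ljust(12).encode("cp037")
print(parse_sap_corrid(example))
```

A correlation ID that does not follow the convention simply comes back with a suffix of `None`, so it can be counted rather than silently misread.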
Exploring – Using Correlation IDs
Gaul might indeed be divided into three parts. (“Gallia est omnis divisa in partes tres”). So let’s take the three parts of the SAP Correlation ID:
Db2 Datasharing Group Name / Application Name
To be honest, this one isn’t exciting – unless the Datasharing Group Name is different from the SAP Application Name. This is because:
- Each SAP application has at most one Datasharing Group.
- Accounting Trace already contains the Datasharing Group Name.
In my DDF SMF 101 Analysis code I’m largely ignoring this part of the Correlation ID, therefore.
BTC Versus DIA
The vast majority of the records have “BTC” or “DIA” in them, and this post will ignore the very few others. Consider the words “have “BTC” or “DIA” in them”. I chose my words carefully: these strings might not be at offsets 3 to 5. Here’s a technique that makes that not matter.
I could use exact equality in DFSORT, meaning the match has to happen at a specific position. However, DFSORT also supports substring search.
Here is the syntax for an exact match condition:
Here I’ve had to remap the ID field to map positions 4 to 6 (offsets 3 to 5). That’s a symbol I don’t really want and it isn’t flexible enough.
Here’s how it would look using a substring search condition:
This is much better as I don’t need an extra symbol definition and the string could be anywhere in the 12 bytes of the CORRID field.
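For readers who don’t speak DFSORT, the difference between the two kinds of condition can be sketched in Python. This is an analogue only, not DFSORT syntax, and the function names are hypothetical.

```python
def match_at_position(corrid: str, needle: str, offset: int = 3) -> bool:
    """Exact match: the string must sit at one fixed offset."""
    return corrid[offset:offset + len(needle)] == needle


def match_anywhere(corrid: str, needle: str) -> bool:
    """Substring search: the string may be anywhere in the 12 bytes."""
    return needle in corrid[:12]


# A correlation ID with a four-character prefix defeats the fixed offset
# but is still found by the substring search.
shifted = "WXYZBTC083"
print(match_at_position("XYZBTC083", "BTC"))
print(match_at_position(shifted, "BTC"))
print(match_anywhere(shifted, "BTC"))
```

One trade-off worth keeping in mind: a substring search would also match if the prefix itself happened to contain “BTC” or “DIA”.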
If we can distinguish between Batch (“BTC”) and Dialog (“DIA”) we can do useful things. We can show commits and CPU by time of day – by Batch versus Dialog. We could do Time Of Day anyway, without this distinction. (My DDF SMF 101 Analysis code can go down to 100th of a second granularity – because that’s the SMF time stamp granularity – so I regularly summarise by time of day.) But this distinction allows us to see a Batch Window, or times when Batch is prevalent. If we are trying to understand the operating regime, such distinctions can be handy.
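As a rough illustration of that kind of summary, here is a Python sketch that totals commits and CPU by hour, split into Batch and Dialog. The record layout (`timestamp`, `corrid`, `commits`, `cpu_seconds`) is invented for the example and the sample values are made up. Real SMF 101 records need mapping first.

```python
from collections import defaultdict
from datetime import datetime


def summarise_by_hour(records):
    """Total commits and CPU seconds per (hour, work type)."""
    totals = defaultdict(lambda: {"commits": 0, "cpu_seconds": 0.0})
    for rec in records:
        corrid = rec["corrid"]
        if "BTC" in corrid:
            kind = "Batch"
        elif "DIA" in corrid:
            kind = "Dialog"
        else:
            continue  # ignore the very few others, as this post does
        bucket = totals[(rec["timestamp"].hour, kind)]
        bucket["commits"] += rec["commits"]
        bucket["cpu_seconds"] += rec["cpu_seconds"]
    return dict(totals)


# Made-up sample records, purely to show the shape of the input.
sample = [
    {"timestamp": datetime(2024, 1, 1, 2, 15), "corrid": "XYZBTC083",
     "commits": 10, "cpu_seconds": 0.5},
    {"timestamp": datetime(2024, 1, 1, 2, 45), "corrid": "XYZBTC084",
     "commits": 5, "cpu_seconds": 0.25},
    {"timestamp": datetime(2024, 1, 1, 9, 5), "corrid": "XYZDIA001",
     "commits": 3, "cpu_seconds": 0.125},
]
for key, value in sorted(summarise_by_hour(sample).items()):
    print(key, value)
```

Hourly buckets are just a convenient choice here. The same grouping works at any finer interval the time stamps support.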
Numeric Suffix
This is the tricky one. Let’s take an example: “XYZBTC083”
We’re talking about the “083” part. It looks like a batch job identifier within a suite. But it isn’t. For a start, such a naming convention would not survive in a busy shop. So what is it?
There are a few clues:
- “XYZBTC083” occurs throughout the set of data, day and night. So it’s not a finite-runtime batch job.
- In the (QWHS) Standard Header the Logical Unit Of Work ID fields for “XYZBTC083” change.
- the “083” is one value in a contiguous range of suffixes.
What we really have here are SAP Application Server processes, each with their own threads. These threads appear to get renewed every so often. Work (somehow) runs in these processes and, when it goes to Db2, it uses these threads. It’s probably controllable when these threads get terminated and replaced – but I don’t see compelling evidence in the data for that control.
This “083” suffix is interesting: In one SAP application I see a range of “XYZDIA000” – “XYZDIA049”. Then I see “XYZBTC050” – “XYZBTC089”. So, in this example, that’s 50 Dialog processes and 40 Batch processes. So that’s some architectural information right there. What I don’t know is whether lowering the number of processes is an effective workload throttle, nor whether there are other controls in the SAP Application Server layer on threads into Db2. I do know – in other DDF applications – it’s better to queue in the middle tier (or client application) than queue too much in Db2.
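The process counting above can be sketched in a few lines of Python. The function name `process_ranges` is hypothetical, and the sample list is generated to mimic the ranges described, using the three-digit suffix form of “XYZBTC083”. It is not real data.

```python
from collections import defaultdict


def process_ranges(corrids):
    """For each work type, return (lowest suffix, highest suffix, distinct count)."""
    suffixes = defaultdict(set)
    for corrid in corrids:
        kind = corrid[3:6]
        digits = corrid[6:9].strip()
        if digits.isdigit():
            suffixes[kind].add(int(digits))
    return {kind: (min(s), max(s), len(s)) for kind, s in suffixes.items()}


# Generated sample mimicking the ranges described in the text.
observed = [f"XYZDIA{n:03d}" for n in range(50)]
observed += [f"XYZBTC{n:03d}" for n in range(50, 90)]
print(process_ranges(observed))
```

Comparing the distinct count with the width of the range is a quick check on whether the suffixes really are contiguous.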
IP Addresses And Client Software
Every SMF 101 record has an IP Address (or LU Name). In this case I see a consistent set of a small number of IP addresses. These I consider to be the Application Servers. I also see Linux on X86 64-Bit (“Linux/X8664”) as the client software. I also see its level.
So we’re building up a sense of the application landscape, albeit rudimentary. In this case client machines. (Middle tier machines, often – if we’re taking the more general DDF case than SAP.)
Towards Applications With QMDAAPPL
When a client connects to Db2 via DDF it can pass certain identifying strings. One of these shows up in SMF 101 in a 20-byte field – QMDAAPPL.
SAP sets this string, so it’s possible to see quite a high degree of fine detail in what’s coming over the wire. It’s early days in my exploration of this – with my DDF SMF 101 Analysis code – but here are two things I’ve noticed, looking at two SAP applications:
- Each application has a very few QMDAAPPL values that burn the bulk of the CPU.
- Each application has a distinctly different (though probably not totally disjoint) set of QMDAAPPL values.
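Both observations are easy to check once CPU has been totalled by QMDAAPPL value. Here is a hedged Python sketch. The function names and the placeholder values such as `APPL_A` are invented for illustration and are not real SAP names or real measurements.

```python
from collections import Counter


def cpu_share_by_appl(records, top=5):
    """Return the top QMDAAPPL values with their share of total CPU."""
    cpu = Counter()
    for appl, seconds in records:
        cpu[appl.strip()] += seconds  # the 20-byte field is assumed blank padded
    total = sum(cpu.values())
    if total == 0:
        return []
    return [(appl, secs / total) for appl, secs in cpu.most_common(top)]


def overlap(records_a, records_b):
    """QMDAAPPL values that appear in both applications."""
    names_a = {appl.strip() for appl, _ in records_a}
    names_b = {appl.strip() for appl, _ in records_b}
    return names_a & names_b


# Placeholder (name, CPU seconds) pairs for two applications.
app_one = [("APPL_A", 6.0), ("APPL_B", 1.0), ("APPL_A", 2.0), ("APPL_C", 1.0)]
app_two = [("APPL_C", 4.0), ("APPL_D", 1.0)]
print(cpu_share_by_appl(app_one, top=1))
print(overlap(app_one, app_two))
```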
I’ve looked up a few of the names on the web. I’ve seen enough to convince me I could tell what the purpose of a given SAP application is, just from these names. Expect that as a future “stunt”. 🙂
I think I’ve shown you can do useful work – with Db2 Accounting Trace (SMF 101) – in understanding SAP accessing Db2 via DDF.
SAP is different from many other types of DDF work – and you’ve seen evidence of that in this long post.
One final point: SAP work comes in short commits / transactions – which makes it especially difficult for WLM to manage. In this set of data, for instance, there is relatively little scope for period aging. We have to consider other mechanisms – such as:
- Using the Correlation ID structure to separate Batch from Dialog.
- Using DDF Profiles to manage inbound work.
- (Shudder) using WLM resource groups.
And, as I mentioned above,
- Using SAP’s own mechanisms for managing work.
I’ve learnt a fair bit from this customer situation, building as it does on previous ones. Yes, I’m still learning at pace. One day I might even feel competent. 🙂
And it inspires me even more to consider releasing my DDF SMF 101 Analysis code. Stay tuned!