I loftily talk about “architecture” a lot. What I’m really getting at is gleaning an understanding of an installation’s components – hardware and software – and some appreciation of what they’re for, as well as how they behave.
When I started doing Performance and Capacity – many years ago – I was less sensitive to the uses to which the machines were put. In fact, I’d argue “mainstream” Performance and Capacity doesn’t really encourage much understanding of what I call architecture.
To be fair, the techniques for gleaning architectural insight haven’t been well developed. Much more has been written and spoken about how to tune things.
Don’t get me wrong, I love tuning things. But my origin story is about something else: Perhaps programming, certainly tinkering. Doing stuff with SMF satisfies the former (though I have other projects to scratch that itch). Tinkering, though, takes me closer to use cases.
What’s this got to do with Db2?
First, I should say I’ve been pretending to know Db2 for almost 30 years. 🙂 I used to tune Db2 – but then we got team mates who actually did tune Db2. And I never lost my affinity for Db2, but I got out of practice. And the tools I was using got rusty, some of them not working at all now.
I’m heading back into Db2 because I know there is an interesting story to tell from an architectural point of view. Essentially, one could morph tuning into asking a simple question: “What is this Db2 infrastructure for and how well suited is the configuration to that purpose?” That question allows us to see the components, their interrelationships, their performance characteristics, and aspects of resilience.
So let me give you two chunks of thinking, and I’ll try to give you a little motivation for each:
- Buffer pools
I am, of course, talking mainly about Db2 SMF. I have in the past also taken DSNZPARM and Db2 Catalog from customers. I expect to do so again. (On the DSNZPARM question, Db2 Statistics Trace actually is a better counterpart – so that’s one I probably won’t bother asking customers to send.)
I’m experimenting with the structure in my two examples. For each I think two subsections are helpful:
- Technique Outline
If this structure is useful future posts might retain it.
He’re we’re talking about both local pools (specific to each Db2 subsystem) and group (shared by the whole datasharing group, but maybe differentially accessed).
Some Datasharing environments comprise Db2 subsystems that look identical. If you see one of these you hope the work processed by each Db2 member (subsystem) in the group is meant to be the same. The idea here is that the subsystems together provide resilience for the workload. If the Db2 subsystems don’t look identical you hope it’s because they’re processing different kinds of work (despite sharing the data).
I think that distinction is useful for architectural discussions.
More rarely, whole Datasharing groups might be expected to resemble each other. For example, if a parallel sysplex is a backup for another (or else shares a partitioned portion of the workload). Again, a useful architectural fact to find (or not find).
Db2 Statistics Trace IFCID 202 data gives a lot of useful information about individual buffer pools – at the subsystem level. In particular QBST section gives:
- Buffer pool sizes
- Buffer pool thresholds – whether read or write or for parallelism
- Page frame sizes
At the moment I’m creating CSV files for each of these. And trialing it with each customer I work with. I’m finding cases where different members are set up differently – often radically. And also some where cloning is evident. From SMF I don’t think I’m going to see what the partitioning scheme is across clones – though some skew in terms of traffic might help tell the story.
Let me give one very recent example, which the customer might recognise but doesn’t expose them: They have two machines and each application group has a pair of LPARs, one on each machine. On each of these LPARs there is a Db2 subsystem. Each LPAR’s Db2 subsystem has identical buffer pool setups – which are different from other applications’ Db2’s.
Db2 Statistics Trace IFCID 230 gives a similar view for whole Datasharing groups. Here, of course, the distinction is between groups, rather than within a group.
IDAA is IBM’s hardware accelerator for queries, coming in two flavours:
- Stand-alone, based on System P.
- Using System Z IFLs.
The purpose of IDAA servers is to speed up SQL queries (and, I suppose, to offload some CPU). Therefore I would like to know if a Db2 subsystem uses an IDAA server. Also whether Db2 subsystems share one.
IDAA is becoming increasingly common so sensitivity to the theme is topical.
Db2 Statistics Trace IFCID 2 has a section Q8ST which describes the IDAA servers a Db2 subsystem is connected to. (These are variable length sections so, perhaps unhelpfully the SMF triplet that describes them has 0 for length – but there is a technique for navigating them.)
A few notes:
- The field Q8STTATE describes whether the IDAA server is online to the Db2 subsystem.
- The field Q8STCORS is said to be core count but really you have to divide by 4 (the SMT threads per core) to get a credible core count – and hence model.
- There can be multiple servers per physical machine. But we don’t have a machine serial number in Statistics Trace to tie the servers on the same machine together. But some fields behave as if they are one per machine, rather than one per server. So we might be able to deduce which servers are on which machine. For example Q8STDSKA – which also helps distinguish between generations (eg 48TB vs 81TB).
I’m sure there’s much more I can do with SMF, from a Db2 architecture point of view. So expect more posts eventually. Hence the “Part 1” in the title. And, I think it’s going to be a lot of fun continuing to explore Db2 SMF in this way.
And, of course, I’m going to keep doing the same thing for non-Db2 infrastructure.
One other note: I seem to be biased towards “names in frames” rather than traffic at the moment. The sources of data do indeed allow analysis of eg “major user” rather than “minor user”. This is particularly relevant here in the case of IDAA. One should be conscious of “uses heavily” versus “is connected to but hardly uses at all”. That story can be told from the data.
I’m continuing with the idea that the “making of” might be interesting (as I said in Coupling Facility Structure Versions). It occurs to me it might show people that you don’t have to have a perfect period of time and place to write. That might be encouraging for budding writers. But it might be stating “the bleedin’ obvious”. 🙂
This one was largely written on a flight to Toronto, for a customer workshop. Again, it poured out of my head and the structure naturally emerged. There might be a pattern here. 🙂
As an aside, I’m writing this on a 2021 12.9” iPad Pro – using Drafts. I’m in Economy – as always. I’m not finding it difficult to wield the iPad, complete with Magic Keyboard, in Economy Class. I’m certain I would find my 16” MacBook Pro cumbersome in the extreme.
And, of course, there was tinkering after I got home, just before publishing (but after a period of reflection).