(Originally posted 2015-02-15.)
In “He Picks On CICS” I mentioned XCF traffic and CICS. This post is about a customer situation where looking at this traffic was important.
Often I’m looking for topology (maybe “tourist information” to some of you). This time I have another motivation: Performance. For this customer, saving z/OS CPU is important.
So the important question is “which XCF groups and members are driving this traffic, and this CPU?”
But we are in “the point, however, is to change it”  mode.
So, hard on the heels of the first question is another one: “What conversations between members are driving the traffic?” This question is a prelude to discussions about how to actually reduce the traffic, which is the eventual aim.
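As a rough illustration of the first question, here is a minimal Python sketch that ranks group members by total signal count. The row layout and the numbers are invented for illustration (the group OTHERGRP and member MEMBERX are hypothetical); real SMF 74–2 records would need proper parsing first, and this is not the reporting code used for the study.

```python
from collections import defaultdict

# Hypothetical, made-up rows: (system, group, member, signals_out, signals_in).
# Real SMF 74-2 data carries more fields and needs parsing before this step.
rows = [
    ("SYSA", "DFHIR000", "CICSF", 900, 905),
    ("SYSB", "DFHIR000", "CICS1", 200, 198),
    ("SYSB", "DFHIR000", "CICS2", 500, 497),
    ("SYSB", "DFHIR000", "CICS3", 205, 203),
    ("SYSA", "OTHERGRP", "MEMBERX", 40, 42),
]


def rank_members(rows):
    """Total signals (out + in) per (group, member, system), largest first."""
    totals = defaultdict(int)
    for system, group, member, sent, received in rows:
        totals[(group, member, system)] += sent + received
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for (group, member, system), total in rank_members(rows):
    print(f"{group:8} {member:8} {system:4} {total:6}")
```

Sorting by total signals puts the heaviest members at the top, which is usually where a discussion about reducing traffic wants to start.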
Customer Case Study
I’m going to simplify the customer situation without, I hope, any loss of fidelity.
The major members of the DFHIR000 group on each system are:
- On SYSA is a region I’m calling CICSF. From SMF 30 I can see it does a lot of I/O. From SMF 74–2 I can see a lot of traffic between it and SYSB in group DFHIR000.
- On SYSB are regions I’m calling CICS1, CICS2 and CICS3. These are the ones with the vast majority of the DFHIR000 traffic to SYSA. They also perform very little I/O.
I don’t see much traffic from SYSA or SYSB to other systems in the DFHIR000 group.
The graph below plots, across a day, the traffic for these four members (and it’s genuine, from the data).
We can make the following observations:
- CICSF traffic more or less matches the sum of traffic for CICS1, 2 and 3. But not quite. And it tracks well across the day.
- CICS1 and 3 traffic is pretty evenly matched. So they can be viewed as clones.
- CICS2 has much more traffic than CICS1 and 3. So it’s doing something different. (Or at least more.)
- The traffic has peaks at specific times of day. This might be significant.
- The CICS regions don’t go down overnight.  They merely slow. 
One thing the graph doesn’t show, but the 74–2 data does, is that XCF traffic is even in each direction.
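The “matches, but not quite” observation and the evenness in each direction can both be checked numerically. The sketch below assumes per-interval signal counts have already been extracted for each member; the figures are invented, and the function names are my own rather than anything from an existing tool.

```python
# Hypothetical per-interval signal counts, invented for illustration only.
traffic = {
    "CICSF": [1000, 2400, 1800, 600],
    "CICS1": [200, 500, 380, 120],
    "CICS2": [520, 1250, 930, 310],
    "CICS3": [210, 510, 370, 125],
}


def residuals(owner, requesters, traffic):
    """Per interval: the owner's traffic minus the sum of the requesters' traffic."""
    sums = [sum(vals) for vals in zip(*(traffic[name] for name in requesters))]
    return [o - s for o, s in zip(traffic[owner], sums)]


def balance(sent, received):
    """Ratio of signals sent to received; close to 1.0 means even traffic."""
    return sent / received if received else float("inf")


unexplained = residuals("CICSF", ["CICS1", "CICS2", "CICS3"], traffic)
print(unexplained)        # traffic to or from CICSF not accounted for by CICS1-3
print(balance(900, 905))  # made-up sent/received counts for one member
```

A consistently positive residual would hint that other regions are also talking to CICSF over XCF.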
So here are some, admittedly tentative, conclusions:
- CICSF is in some sense data owning. The others aren’t.
- CICS1, 2 and 3 ship requests for data to CICSF. The one-to-one ratio of inbound to outbound traffic supports that: A request for data followed by the data being returned.
- While the traffic match is pretty good, it isn’t exact, so there are probably other CICS regions involved.
- We wouldn’t see requests to CICSF from other regions on SYSA – as they wouldn’t be using XCF.
This Is Not Topology
We can’t claim we’re seeing the whole topology this way, for two reasons:
- The traffic doesn’t entirely match, as the graph shows.
- Traffic from other CICS regions on SYSA to CICSF isn’t detectable.
Yes, these are restatements of points made earlier. The first one is perhaps resolvable with more processing of the 74–2 data. The second would require different sources of data. Maybe a (guessed) naming convention could help me here. It has before. 🙂
We could, for example, be only seeing part of this topology:
In the above the dashed lines are not XCF. But we could probably guess them just from the existence of regions CICS4, 5, 6 – especially if they behaved like CICS1, 2, 3. 
The purpose of this, remember, is to begin to discuss tuning actions that can reduce the XCF traffic between SYSA and SYSB and hence the cost.
I think “begin” is right: Obviously deeper discussions are needed, for example on which region should own the data, whether VSAM RLS is the answer, and so on. But at least this is better, I claim, than just saying “try to get DFHIR000 XCF traffic down”.
And note the matching I did is what I call a “guessing game”. It really is. But one day I’d like some code that helps me do the guessing. Maybe I’ll have to build it myself. 🙂
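To make the guessing game concrete, here is one way such code might start. This is a brute-force sketch, not an existing tool: it tries every subset of candidate members and keeps the one whose summed traffic is closest to the target member’s traffic. All names and numbers are hypothetical (OTHER1 is an invented unrelated member), and the search is exponential in the number of candidates, so it only suits a handful of regions.

```python
from itertools import combinations


def best_match(target, candidates):
    """Return (names, error) for the candidate subset whose summed series is
    closest to the target series, measured by sum of absolute differences."""
    best = None
    names = sorted(candidates)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            summed = [sum(vals) for vals in zip(*(candidates[n] for n in subset))]
            error = sum(abs(t - s) for t, s in zip(target, summed))
            if best is None or error < best[1]:
                best = (subset, error)
    return best


# Hypothetical per-interval signal counts, invented for illustration only.
cicsf = [1000, 2400, 1800, 600]
others = {
    "CICS1": [200, 500, 380, 120],
    "CICS2": [520, 1250, 930, 310],
    "CICS3": [210, 510, 370, 125],
    "OTHER1": [900, 100, 50, 700],  # unrelated member with a different daily shape
}

print(best_match(cicsf, others))
```

With this made-up data the best match is the three regions CICS1, CICS2 and CICS3, with OTHER1 excluded. On real data the result would still only be a guess to take to the customer, not proof of who talks to whom.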
For which customers isn’t that the case? 🙂 ↩
Actually reducing Coupling Facility CPU would be handy for the customer, but it isn’t the primary goal. ↩
Called “XCFAS” in fact. ↩
One of my reports uses XCF traffic (all groups) to determine which systems really talk to each other. ↩
There are other systems in the sysplex. But their XCF traffic to SYSA and SYSB is minimal, especially in group DFHIR000. ↩
This is a customer on Eastern Standard Time servicing multiple timezones across North America and further afield. Make what you will of the traffic pattern by time of day. ↩
I have code that finds all the CICS regions in the data, with enough information in the report about each of them to make such matching feasible, so this is not far-fetched. ↩