I Said “Parallelise” Not “Paralyse” Part 4 – Implementation

(Originally posted 2012-03-08.)

Now with free map :-), this is the concluding part of a four-part series on batch parallelisation, with a particular focus on cloning.

In previous parts I discussed:

  1. Motivation
  2. Classification
  3. Issues

This part wraps up with thoughts on implementation. I’m going to break it down into:

  1. Analysis
  2. Making Changes
  3. Monitoring

While there probably are iterations, this is the essential 1-2-3 sequence within each cycle.

I’m repeating the example from Part 3, partly because I raised some issues in relation to this diagram I want to cover here:

Analysis

Finding good places to use cloning is the same as finding good jobs to tune, with one further consideration: Because cloning is riskier and more difficult to do, you’d want to be sure it was the right tuning action.

If you can find an easier tuning action that makes the batch speed up enough for now, while being scalable for the future, do it in preference to cloning. If you’re "future proofing" an application to the extent where other tuning methods aren’t going to do enough then consider cloning.

Modifying this advice only slightly, consider that you might be able to postpone cloning for a year or two. In this case keep a list of jobs that might need to be cloned eventually.

A couple of examples of where cloning might be indicated are:

  • Single-task high CPU burning steps
  • Database I/O intensive steps

Of course, feasibility of cloning comes into it. I’d view this as the last stage in the analysis process. As I like to pun: "the last thing I’m going to do is ask you to change your program code". While there may be some cases where application program1 change can be avoided, the majority of cases will require code surgery. The cases where surgery isn’t required are where the data can be partitioned and the existing program operates just fine on a subset of the data.

Making Changes

(This whole post is "Implementation" but this is the bit where the real implementation happens.)

Let’s divide this into six pieces, with reference to the diagram above:

  • Splitting the transaction file
  • Changing the program to expect a subset of the data
  • Merging the results
  • Refactoring JCL
  • Changing the Schedule
  • Reducing data contention

Splitting

As noted in Part 3, the transaction file drives the loop: Each cycle round it is triggered by reading a single record from this file. Suppose we wanted to clone "4-up" i.e. to create four identical parallel jobs. There are a number of ways we could do this:

  1. Use a "card dealer" like DFSORT’s OUTFIL SPLIT to deal four hands.
  2. "Chunk" the file, perhaps with DFSORT’s OUTFIL with STARTREC and ENDREC.
  3. Split based on criteria. You could use DFSORT OUTFIL with INCLUDE= or OMIT=, or else an application program.

There are considerations with all of these:

  • The card dealer (1) ensures (practically) equal numbers of records in each transaction file, but there is no sense of logical partitioning. So it could provide balance but at the expense of cross-clone contention.
  • Neither 2 nor 3 guarantees balance across the clones. For example, Method 3 might divide records into those for North, East, South and West regions – where that division could be decidedly unequal.
  • Method 3 might not be scalable to 8-up or 16-up, simply based on the difficulty of finding 8-way or 16-way split criteria.
  • Method 2 could allow some clones of the original application program to start earlier than others. In some cases this is a good thing, in others a problem.
  • Method 2 would need occasional adjustment to rebalance.
  • Method 3 implies non-trivial application coding to effect the split but provides the best chance of minimising contention between streams. (One neat coding shortcut if you’re using DFSORT to do the split is OUTFIL SAVE – which provides a "none of the above" bucket.) Whether you use DFSORT or a home-grown split program depends on the precise split logic – but DFSORT is much simpler and scales slightly more easily to e.g. 8-way and 16-way. (There’s a sketch of both the card-dealer and criteria-based approaches below.)
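
To make this a little more concrete, here are rough sketches of Methods 1 and 3 as DFSORT control statements. The DD names, field positions and region codes are all invented for illustration – the real split criteria would have to come from your own data. Method 1 is essentially a one-liner:

      OPTION COPY
    * Deal records in rotation across four output files - one per clone
      OUTFIL FNAMES=(TRAN1,TRAN2,TRAN3,TRAN4),SPLIT

Method 3 might look more like this, with SAVE as the "none of the above" bucket:

      OPTION COPY
    * Split by a (made-up) 2-byte region code at position 10
      OUTFIL FNAMES=NORTH,INCLUDE=(10,2,CH,EQ,C'NO')
      OUTFIL FNAMES=EAST,INCLUDE=(10,2,CH,EQ,C'EA')
      OUTFIL FNAMES=SOUTH,INCLUDE=(10,2,CH,EQ,C'SO')
      OUTFIL FNAMES=WEST,INCLUDE=(10,2,CH,EQ,C'WE')
    * Anything the INCLUDEs miss lands here
      OUTFIL FNAMES=SPARE,SAVE

Each name in FNAMES corresponds to an output DD statement in the splitting job’s JCL.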

Changing Programs To Expect Subsets Of The Data

In our example the original program processed all the data. It could make the "I am the Alpha and the Omega" assumption. If we split the transaction file we forgo this. The most obvious result is that any report the program would have written needs to be rethought: We probably will only be able to write out a file that feeds into a new report writer (which we’ll talk about below).

Merging

Batch steps produce, amongst other things, output transaction files and reports. For the sake of (relative) brevity let’s concentrate on these two:

  • Output Transaction Files

    Somehow we need to merge these files (though an actual sort is unlikely to be needed). It’s important to know how sensitive downstream processing is to record sequence and cater for it. (There’s a sketch of a simple merge after this list.)

  • Reports

    Reports usually require some calculations, extractions to form headings, and so on. Sometimes a simple merge of the report output from the cloned program is enough. My expectation, however, is that serious reworking is usually required. Totalling and averaging would be typical examples of where it gets complex (but not impossible).

    I would remove the reporting from the original program and think about where it fits best in the merge. There are advantages to separating the data merge from the "presentation": If today the report is a flat file (1403 format2?) you could enhance the report to also3 produce a PDF or HTML version. That might be a nice "modernisation".
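
As a sketch of the simpler end of the data merge: if downstream processing needs the records back in key sequence and each clone’s output is already in that sequence, a DFSORT MERGE will interleave the streams without a full sort. (The data set names and the 10-byte key at position 1 are invented.)

    //MERGE    EXEC PGM=ICEMAN
    //SYSOUT   DD SYSOUT=*
    //SORTIN01 DD DISP=SHR,DSN=PROD.TRANS.OUT.S1
    //SORTIN02 DD DISP=SHR,DSN=PROD.TRANS.OUT.S2
    //SORTIN03 DD DISP=SHR,DSN=PROD.TRANS.OUT.S3
    //SORTIN04 DD DISP=SHR,DSN=PROD.TRANS.OUT.S4
    //SORTOUT  DD DSN=PROD.TRANS.OUT.MERGED,DISP=(NEW,CATLG,DELETE),
    //            UNIT=SYSDA,SPACE=(CYL,(100,100),RLSE)
    //SYSIN    DD *
      MERGE FIELDS=(1,10,CH,A)
    /*

If sequence doesn’t matter at all, a plain OPTION COPY with the clones’ outputs concatenated under a single SORTIN DD is even simpler.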

Refactoring the JCL

JCL Management is not my forte but it’s obvious to me it’s worth examining the JCL for any job that’s going to be cloned to see how it can best be managed.

It might not be feasible to keep a single piece of JCL in Production libraries for the schedule to submit as multiple parallel jobs. If you can, parameterisation is the way to go. For instance it wouldn’t be helpful to have the job name hardcoded in the JCL. Similarly data set names which differ only by the stream number need care, as do control cards you pass into a program.
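
As a sketch of that parameterisation (the program name, data set names and member names are all invented): a single cataloged procedure could take the stream number as a symbolic parameter, so the only thing that differs between clones is the EXEC statement. The job name itself is more naturally handled by the scheduler’s own JCL tailoring.

    //CLONE    PROC STRM=1
    //* One procedure, cloned many times: the stream number is the only variable
    //RUN      EXEC PGM=TRANPGM
    //TRANIN   DD DISP=SHR,DSN=PROD.TRANS.SPLIT.S&STRM
    //TRANOUT  DD DSN=PROD.TRANS.OUT.S&STRM,DISP=(NEW,CATLG,DELETE),
    //            UNIT=SYSDA,SPACE=(CYL,(100,100),RLSE)
    //SYSIN    DD DISP=SHR,DSN=PROD.PARMLIB(TRANPRM&STRM)

Each clone job then just executes the procedure with its own stream number (EXEC CLONE,STRM=3, say).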

Changing The Schedule

You have to change the schedule to accept new job names – for the clones of existing jobs – insert new jobs (for splitting, merging and reporting), and wire this all up with a reworked set of dependencies.

There are decisions to make, such as whether (in TWS terms) each stream should be its own Application, and what the dependencies should be. For instance, do you keep the streams in lockstep?

One of the key things is planning for recovery: Whereas, in our example, it was all one job step, you now have three (or maybe four) phases of execution. Where do you recover from?

Reducing Data Contention

In our example, File A and File B were originally each read by the original application program. If they were keyed VSAM, for example, buffering might’ve been highly effective – particularly with VSAM LSR (Local Shared Resources) buffering. Four clones reading these two data sets will have to do more physical I/O. In the VSAM LSR case some hefty buffer pools could help reduce the contention going 4-up might introduce. In a database manager like DB2 things ought to be better: Data is buffered for the common good.

Dare I mention Hiperbatch? 🙂 For the smallish4 Sequential or VSAM NSR (Non-Shared Resources) case this might work well – but it would be a very uncommon approach.

As I hinted above, one of the things that might condition how you split the transaction file is what effect it would have on data contention. If you found a split regime under which all the data the clones process was split the same way, the contention could be very low.

If you got to the point where the split regime was "universal", or at least widespread enough, some of the contention (and indeed Merging) issues would disappear completely.

Tape is particularly fraught: You can’t have two jobs read the same tape data set at the same time. I’ll indulge myself by mentioning BatchPipes/MVS here 🙂 as it provides a potential solution: A "tape reader" job (probably DFSORT COPY OUTFIL) copying the data to the clones through pipes.
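
A sketch of what that "tape reader" might look like (the data set names and record attributes are invented, and I’m assuming a BatchPipes subsystem called BP01): one DFSORT COPY step reads the tape once and writes every record to four pipes, one per clone. Note there’s no SPLIT here – each clone sees the whole file.

    //READER   EXEC PGM=ICEMAN
    //SYSOUT   DD SYSOUT=*
    //SORTIN   DD DISP=SHR,DSN=PROD.MASTER.ONTAPE
    //PIPE1    DD DSN=PROD.MASTER.PIPE1,SUBSYS=BP01,LRECL=200,RECFM=FB
    //PIPE2    DD DSN=PROD.MASTER.PIPE2,SUBSYS=BP01,LRECL=200,RECFM=FB
    //PIPE3    DD DSN=PROD.MASTER.PIPE3,SUBSYS=BP01,LRECL=200,RECFM=FB
    //PIPE4    DD DSN=PROD.MASTER.PIPE4,SUBSYS=BP01,LRECL=200,RECFM=FB
    //SYSIN    DD *
      OPTION COPY
    * Every record is written to all four pipes
      OUTFIL FNAMES=(PIPE1,PIPE2,PIPE3,PIPE4)
    /*

The four clones would each read one of the pipes (with a matching SUBSYS= DD) instead of the tape.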

However you do it, the point is you have to manage the contention you could introduce with cloning.

Monitoring

Monitoring isn’t terribly different from any other batch monitoring. You have the usual tools, including:

  • Scheduler-based monitoring tools – for how the clones are progressing against the planned schedule.
  • SMF – for timings, etc.
  • Logs

If you can develop a sensible naming convention for jobs and applications your tools might be easier to use.

One other thing: You need to be able to demonstrate that the application still functions correctly. This is not a new concept, of course, but application testing is going to be challenged by the level of change being introduced.

This concludes the four-part series. If you’ve read all four all the way through, thanks for your persistence! The purpose in each was to spur thought, rather than be a complete treatise. My next task is to turn this into a presentation – as the need has arisen to do so. One final thought: If this long (but necessarily very sketchy) post has put you off please re-read Part 1 as I talk there about why this could be necessary.


1 This usage of "application program" most centrally refers to programs written in programming languages such as COBOL. It could also refer to things like DFSORT invocations. The point is these are difficult things to understand and to change.

2 You do know about DFSORT’s REMOVECC, don’t you? It tells DFSORT to remove ANSI control characters – such as page breaks. When separating data preparation from presentation you may well find it useful.

3 I bolded "also" here because the original report probably has a consumer – whether human or not – who’d get upset if it didn’t continue to be produced but might like a more modern format. And if it doesn’t… 🙂

4 While technically still supported, Hiperbatch has functional limitations, such as not being supported for Extended Format data sets (whether Sequential or VSAM). Further, the only way to process Sequential data with Hiperbatch is QSAM. (For DFSORT you’d have to write an appropriate exit – E15, E32 or E35 – to read or write the data set.)

I Said “Parallelise” Not “Paralyse” Part 3 – Issues

(Originally posted 2012-03-04.)

Part 1 and Part 2 were, in my opinion, a little abstract. But I think they needed to be:

  • They set the scene for why parallelising your batch can be important.
  • They gave some vocabulary and semantics to help structure our thoughts.

Now we need to go a little deeper.

Let’s start with how to think about the problem of making a job or set of jobs more parallel.

Heterogeneous Parallelism

Here the trick is to remove dependencies – and that’s where most of the issues are.

I covered a lot of this in Batch Architecture, Part Two.

Homogeneous Parallelism (Cloning)

This is where it can get really tricky. And that’s why the bulk of this post is about the homogeneous case. (Some of the following will also have relevance to the heterogeneous case.)

There are a number of issues to work through when cloning jobs. Here are some of them:

  • Converting the serial bulk processing model to something more parallel.
  • Handling inter-clone cross-talk
  • Resource provisioning
  • Scheduling

Parallelising The Bulk Processing Model

The reasons for using batch include the advantage of using a “bulk processing model”: Doing the same thing to lots of data in one job is much more efficient than breaking it up into a huge number of one-datum transactions. But, just because it’s more efficient to process 10 million records in a single batch job doesn’t mean it’s much more efficient than doing it in ten 1-million-record jobs.

The trick with cloning is to find a way of breaking up an (e.g.) 10 million record job into ten parallel 1 million record jobs.

Consider the following diagram1:

A lot of bulk processing looks like this. The salient features are:

  • Reading a Master file, one record at a time.

    This could be a sequential file, a concatenation of these, a VSAM file, rows returned by a DB2 query, or any one of a number of other similar “files”. The point is it’s a large number of records or rows – perhaps the 10 million mentioned above. And any serious attempt to parallelise this job is going to have to split this file.

  • File A is read to provide detail.

    Hopefully this is a keyed (direct and bufferable) read. You don’t want to have to read the whole file to find a match to the record from the Master file.

  • Likewise File B.
  • The detail that is being filled in – in this case totals – is held in memory by the program.
  • When the Master file has been completely processed a report is written – using the summarisation information in memory.

I find this notion of a circuit, driven by a Master file, useful. If you find you can’t draw it that tells you something in itself. I’m sure it’s not the only bulk processing pattern, but it’s a very common one.2

(It would be difficult to attach timings to the activities in the loop. A reasonable stab could be made, under some circumstances, at the data set accesses’ proportions of the overall run time using SMF 42 Subtype 6 records.)3

This is only an example but it illustrates some issues that cloning needs to resolve:

  • The Master file needs to somehow be split into 10.
  • The ten sub-reports need to be reworked to produce a coherent and correct final report.

We could talk about resolutions of these – and I probably will in Part 4 – but the important thing is to acknowledge these are issues that have to be addressed.

Resources

If you’re going to run more jobs in parallel you could easily “spike up” resource usage, most notably CPU consumption. Memory use might increase also, though some usage patterns (such as DB2) tend to have a noticeable memory impact. I/O bandwidth and initiators are two more things to think about. In the I/O case it could be seen as a case of this. In any case, we know how to monitor resource usage, don’t we?

Handling Inter-Clone Cross Talk

While logically it might be easy to clone a job, things like locking can make it really difficult.

For example, in (a modified version of) the case above, File A and File B might be updated. These updates might be one per record in the Master file, or just at the end. In either case cloning would introduce some locking issues. It might be possible to resolve these issues – perhaps through partitioning.

Even in the unmodified version clones reading from File A and File B might create I/O bottlenecks. In the DB2 case you’d hope this would have a happy ending.

Scheduling

If you’re going to run multiple copies of a job in parallel you need to adjust the schedule.

There are design decisions like whether you keep clones in lockstep:

Consider this example: Suppose you have a pair of cloned streams – consisting of A0, B0 and C0 in Stream 0 and A1, B1 and C1 in Stream 1. Each Bn logically follows its corresponding An and each Cn follows the corresponding Bn, based on data flows.

  • If you release B0 and B1 only when both An jobs have completed it’s more controlled but probably takes longer.
  • If you release each Bn when the corresponding An completes it’s less controlled but probably takes less time.

The term “Recovery Boundary” is probably useful here as recovering from job failures is the thing that makes the complexity introduced by cloning really matter.

I advocate cloning in powers of two: 2-up then 4-up then 8-up, and so on. A modification to this is 3-up, then 6-up, then 12-up – which has a fairly obvious appeal, when you consider typical data sets.

Automating cloning is a useful aim, whether you want to “dynamically” partition the work or just want to be able to move from, say, 4 streams to 8 without too much trouble. I put the word “dynamically” in quotes as realistically the latest you could decide on the number of clone streams is just before you kick them off. In reality it’s probably much earlier than that.

So you need to solve the problem of how you might do this: The first step would be to decide what’s realistic. The second would be to decide what’s needed.



In this post I’ve highlighted some of the issues – mainly for cloning. They’re not terribly different from those you’d encounter if you were pursuing heterogeneous parallelism (which you may also need to do). Part 4 will round out the series with some thoughts on implementation.


1 Made with Diagrammix for Mac. I like this particular style, untidy though it may be. A nice demo is here.

2 There probably are formal diagrams of this type, probably with the letters “UML” attached to them. I don’t claim to be the kind of person who would use one. I just think teasing out the circuit like this is helpful for cloning. It’s the circuit itself I’m attached to.

3 Quite apart from the incompleteness of this approach there are issues with overlap between the data sets (and with CPU). A discussion of double buffering, access methods and I/O scheduling is well beyond the scope of this post.

The Sign Of The Four – How Mind-Mapping Turned One Blog Post Into More

(Originally posted 2012-02-26.)

Given I’m not paid to blog, and given I’ve no real motivation to maximise my blog post count, the frequency of posting is just "what it happens to be".

In that spirit this post isn’t about how to "game" blogging statistics (and it itself isn’t a gratuitous attempt to increment the count by one). 🙂

What I want to convey is my experience with mind-mapping software, in the hope it’s useful to you. It’s also not a review of a specific piece of software, though it’s inevitable my take on one particular piece of software will come into it.

Let’s get the "piece of software" element out of the way…

I use MindNode Pro on both Mac and iPhone. It’s two separate pieces of software that talk to each other. (I think the iPhone software would work well on the iPad but I haven’t tried it so don’t know if it takes advantage of the better "screen real estate".) The "that talk to each other" piece is significant in that you can work on a mind map on the move on the iPhone and then transfer it to a perhaps more powerful environment via Wireless. And back again. Successful "round-tripping" is very important.

So what’s "Mind Mapping" and how am I using it?

To me a mind map is a hierarchical organisation of ideas – in a branching tree format. The term suggests literally mapping your mind which is a rhetorical stretch and a half. But to me it’s just an attempt at organising a set of thoughts. Not being a particularly linear thinker some structure like a tree is reasonable – but a more generalised net is probably better for me.

My experiment was to see if I could create a better-structured blog post: At the time the post I had in mind was, frankly, a mess. It may still be – but it’s a better-organised mess. 🙂

So I started MindNode Pro up on the Mac and proceeded to dump ideas – breaking each idea down into sub-ideas. In the diagrams that follow you’ll see each idea is expressed in very few words.

When I "folded" up sub-trees I ended up with this:

The result was unexpected – a set of four sub-trees, each of which it was clear to me could be its own blog post – but only if I had enough material to make each worthwhile. The whole point here is there’s a natural division into four.

If you "unfold" the second sub-tree you get:

This is a fairly typical sub-tree – although the biggest one is sub-tree one (which creates web real estate issues so I chose the second one instead). So it actually reinforced my view that the material would be too much for a single blog post. Hence the four-parter.1

Trees are all very well but how do you make a post flow? This is part of what I’m going to term "Mind Map Debugging". That’s because there’s a little more to it than that:

  • Flow

    With (at least) the software I’m using you can cross-link between tree nodes. I contemplate doing that – perhaps using dashed lines – to see if I traverse all the important nodes with a reasonable flow. (This is sounding awfully close to the Seven Bridges of Königsberg Problem but I think that’s taking it too seriously.) 🙂

    What I actually did was to eyeball the mind map and see if it flowed. Each sub-tree seems to flow well and the entirety feels right, too.

  • Rebalancing

    While doing formal tree balancing is pretty much pointless here, I do think that I might’ve needed to "rebalance" the four sub-trees if they were seriously imbalanced. Otherwise the four blog posts might’ve been one large one and three small ones.

    But this isn’t arbitrary data but rather blog posts fitting into a conceptual whole: The root node needs to stay the root node, in this case.

You can see the way my mind is working here:

  • The tree paradigm has some (admittedly) weak contribution to make – in thinking about mind mapping.
  • You could apply mind mapping software to other forms of tree depiction: I might well do that for data centres with machines in, which in turn have LPARs, then workloads, then address spaces, then transactions…

So there are lots of possibilities here.

I actually did use the ability to transfer mind maps between Mac and iPhone – adding a small number of nodes on the phone. This post was made possible by the fact you can selectively export parts of the mind map to a bitmap. So, I think the idea of mind maps and this particular implementation worked well. Next stop: FreeMind, which I’ve installed on my Linux work laptop.


1 and hence the "The Sign Of The Four" reference in this post’s title. It’s taken from Sir Arthur Conan Doyle’s second Sherlock Holmes novel. (I’ve tended to call him "SirACD" on Twitter.) 🙂

I Said “Parallelise” Not “Paralyse” Part 2 – Classification

(Originally posted 2012-02-26.)

I hope you don’t get the idea I’m overly into rigour, talking about Classification. But I think it has to be done – to provide terminology for this series of posts.

This is the second of four posts on Batch Parallelism, following on from Motivation.

If I think about how parallelism works in batch it broadly falls into two camps:

  • Heterogeneous
  • Homogeneous

(If you look these two terms up in Wikipedia (possibly for the spelling) 🙂 you get to see under a rather tasty 🙂 graphic the words "Clam chowder, a heterogeneous material".) 🙂

Let me explain what I mean by these two, in terms of batch classification.

Heterogeneous

Almost all customers run more than one batch job at a time. Personally, I’ve never seen anyone feeding through a single job at a time.

But a lot of the time it’s separate suites (or applications, if you prefer). Or certainly it’s running dissimilar jobs alongside each other.

You can further divide this case – in a way which actually makes it less abstract:1

  • Unlinked

    This would be the case with totally separate suites, possibly from different lines of business.

  • Weakly Linked

    Again, these are separate suites, but this time the suites feed into each other – at least occasionally. These are less likely to be from separate lines of business – though a thoroughly integrated enterprise might have more cross-suite linkages.

  • Strongly Linked

    This would typically be the case of a single suite – where the whole point is to do related things, such that data flows between the jobs (and even steps).

By "linked" I’m mainly talking about data flows, though it could be operational cohesiveness.

Homogeneous

This is the case where work is very strongly related. There are two subcases:

  • Cloning

    It’s quite common for applications to be (re-)engineered so that identical jobs run against subsets of the data. This is commonly termed "cloning".

  • Within-Step Parallelism

    An example of this is DB2 CP Query Parallelism – where DB2 splits the task up into, effectively, clones – but manages them as a single unit of work.

    Not quite the same, but possibly best fitting here, is substep parallelism.

Which Do YOU Do?

I think most customers do "heterogeneous" to a very considerable degree. That’s because it comes naturally and is the way the business has grown and driven things.

Less common (and I was recently pressed to give a view on how common) is "homogeneous". That’s because it takes real effort.

The answer I gave was something along the lines of "I don’t know for certain but I guess about 30% of customers do homogeneous".2 The reason I gave that answer is because I suspect homogeneous parallelism gets added to applications to make them perform.

It’s my view that applications are going to have to become more homogeneously parallel in the future – because of the dynamic I described in Part 1: Over time the speed up required of individual actors (typically batch jobs) is likely to outstrip that delivered by technology.

To become more homogeneously parallel we’re going to have to understand the batch applications much better. (Actually that’s true of efforts for more heterogeneous parallelism as well.) Parts 3 and 4 of this series will address some of this understanding – and provide some guidance on what’s going to need to be understood. And hopefully will make this classification seem less dry and more helpful. 🙂


1 There’s probably a rule that says the leaf nodes of classification schemes yield a higher proportion of concrete examples.

2 The "I don’t know for certain" part of it is because I recognise I see a "self-selecting group" or "biased sample" of customer situations: Those that are particularly thorny or exceptionally critical.

I Said “Parallelise” Not “Paralyse” Part 1 – Motivation

(Originally posted 2012-02-19.)

I have enormous trouble pronouncing "parallelise" right – and not saying "paralyse". It’s true, and I bet many of you have the same trouble (sober or not). It’s on a par with "red lorry yellow lorry" or "the Leith Police dismisseth us". 🙂

But it’s a word I think we’re going to have to get used to pronouncing right. And this post will explain why.

This looks to me like a 4-part series of blog posts on increasing Batch Parallelism. (It started off looking like one but the way it turned into four is perhaps material for another post.)

So, why will parallelising batch become increasingly important? There are really three main reasons:

  1. Increased "Window Challenge"
  2. Resilience
  3. Taking Advantage Of Capacity

There is some overlap between these but I think they’re sufficiently distinct to draw out separately – which is what the rest of this blog post does.

Increased "Window Challenge"

From a business perspective this is the big one. I’m seeing a number of business trends that are leading to one inexorable conclusion: The delivered growth in speed of "single actors" (batch jobs) will be outstripped – over time – by the need. In other words, you can’t long-term just buy yourself out of trouble, whether we’re talking about processor speed, disk subsystem or tape speed, transmission line speed, or anything else for that matter.

It’s true this varies by installation, and even between applications (or suites) or business lines in an individual organisation. But this is the general pattern. It’s also a fact that the pressure comes in waves – because of the nature of the underlying business requirements.

Amongst the business drivers I’ve seen:

  • Business volume increases.

    Hopefully these are driven by success.

  • Mergers and acquisitions.

    Typically I’m seeing the same application having to cope with more data as one or other party’s application is adopted.

    A similar trend is "standardisation of procedures", where existing lines of business come together to use a single application.

  • More processing.

    In the merger scenario above I’ve seen cases where taking two companies’ data and passing it through the "ongoing" applications means these applications have to be modified (with generally greater pathlength). And decommissioning the "offgoing" applications is another complicating factor.

    External pressures such as regulation often lead to more work per unit of business volume.

    Modern techniques such as Analytics get injected.

    And of course our old friend "just because" i.e. processing grows for all sorts of reasons.

  • Shortened Window.

    Much has been said about running batch and online concurrently. But shortening the batch window itself remains important for a number of reasons, amongst which are:

    • Even if you overlap everything there are still only 24 hours in the day.

      In other words the work still has to get done in the cycle, whatever that cycle may be.

    • Running online and batch together increases the aggregate resource requirement.
    • Batch jobs taking locks (or causing database I/O) can still interfere with transactions.
    • Batch and online concurrency is still a difficult feat to achieve.
    • There are often deadlines within the batch and sometimes these get tightened up.

Resilience

With a single-threaded job stream just one broken application data record can hold up the whole thing. Or the loss of an LPAR or DB2 subsystem or VSAM file.

Partitioned data can mean an increase in resilience. For example:

  • If the data were processed by geographic region (and you had, say, 5 regions) the damage of a broken record is limited to that region.

    This, of course, depends on region-level separation. And, naturally, any failure is unwanted – but the business impact could be much reduced.

  • If the LPAR were to fail in a correctly-set-up multi-image environment, again the impact could be limited.

    There’s a lot to this one. For example, retained locks by a DB2 datasharing member could limit the benefit.

Taking Advantage Of Capacity

Businesses have tended to size machines by online day requirements and there remains the view that generally it is online that’s the peak use of resources. My experience is that about half of installations have batch as the real CPU driver (but probably not the memory driver) and more than half have a bigger I/O bandwidth challenge overnight than during the day.

Where the online day is still the main resource driver an increase in parallelism can usefully absorb the spare capacity overnight.

Where Next?

I contemplate this being a four-part series of blog posts. This part has concentrated on business drivers, almost to the exclusion of technology. The other three posts I expect to be, in order:

  1. Classification.
  2. Issues.
  3. Implementation.

The titles and scope might change a little bit as I flesh them out. I’ll leave you in suspense 🙂 as to what "Classification" might be.

I Know What You Did Last Summer – Some Structure At Last

(Originally posted 2012-02-13.)

Way back in April of last year I started to talk about a presentation I hoped to write: "I Know What You Did Last Summer" and I showed a brain dump of ideas. Then in June I blogged the abstract (complete with a revision in a subsequent comment). Despite the occasional comment on Twitter it all went quiet until today.

Now the more cynical among you will be remarking that I forgot all about it. Actually that’s not true. Two things needed to occur before I was going to make much progress:

  1. There needed to be a compelling deadline to work to. (Doesn’t there always?) 🙂
  2. I needed a narrative framework.

I sort of have 1 – this presentation really will have to be completed before the May timeframe if I’m to present it at a couple of conferences in Europe.

What I want to talk about today is the fact I have 2 – a narrative framework that I think will work.

What I had all along was a message. It goes something like this:

"While we traditionally value the instrumentation on the z/OS mainframe for Performance and Capacity, there are other ways of using what we have – most notably for Inventorying, Gleaning System Understanding, and Talking to IT Architects."

That was the abstract notion I walked in with and, if anything, it’s amplified now rather than attenuated.

The following two graphics from the presentation are the first and last in a layered sequence that provides the narrative framework:

We start with a very high level "Physical Resources" view:

and proceed down until we reach a much more logical "Application Componentry" view:

I won’t spoil your page-loading enjoyment by showing the graphics for the intermediate layers in this blog post. Suffice it to say the colours represent layers. Let’s talk a little more about layers…

Untidiness Of Layering

The layers I present aren’t strictly hierarchical: Without padding out the presentation I’m not going to make them so. But here they are and you’ll see what I mean:

  1. Physical – Blue
  2. LPAR – Turquoise
  3. WLM Constructs – Red
  4. Address Space and Coupling Facility Structure and XCF Group / Member – Purple
  5. Application – Green

If I really did treat Layer 4 as three separate layers where would it end? It would certainly make the presentation more turgid.

What I can say is that all the elements of Layer 4 belong below Layer 3 and above Layer 5. And that when I look at systems I try to do it in this sequence.

Sparseness Of Style

You’ll notice a lack of words and a lack of connectors. In the real world, of course, there’d be things like CF links and LPARs would have names. But the message isn’t helped by adding any of these. And a certain sparseness of style feels right to me.

Gratuitous Graphics?

You might ask "Why have these graphics at all?" Generally that’s an acid test I apply – possibly to excess. Those of you who’ve seen me present know typically the only graphics in my presentations are graphs. In this case I think a sequence like this helps.

It should be noted I’m under no pressure to "jolly it up" with lots of pretty graphics. In fact this isn’t a commissioned presentation at all: It’s one I think is important. So it gets whatever style I choose to give it, perhaps with advice from others such as you.

Flexibility Of Timing

I joked today on Twitter:

Question: “How long is a piece of string?”

Answer: “Fifty minutes plus questions to One Hour plus questions, depending”.

OK, not a very funny joke but it makes a point:

When I present I generally get 1 hour slots or 1 hour 15 minute ones. For any presenter it’s tough taking a presentation and shrinking / stretching it appropriately. This structure gives me quite a lot of flexibility, I think. I foresee no difficulty adjusting to any time slot.

Conclusion

This structure enables me to survey the ground in a structured fashion – drawing on instrumentation from a diverse set of sources. And then it provides me a launch pad to make the other points.

For example, the “Inventorying” and “Talking to IT Architects” points flow naturally from this.

So now I’ve got a structure I can get going with the rest of the presentation. I think at last I can say I actually have a show. The rest is just details, inspiration and perspiration. And believe what you will about the proportions of the last two. 🙂

Now if anyone can tell me how in OpenOffice.org to make it honour a PNG file’s transparency I’d be grateful. The original graphs were made using Diagrammix on a Mac and exported as PNG files with a transparent background. When composing this post Firefox was entirely happy to honour that but it seems OOo isn’t. 😦

Would You Like More WLM Information In DB2 Accounting Trace – And How Would You Use It?

(Originally posted 2012-02-06.)

I was lucky enough to be in Silicon Valley Lab for DB2 BootCamp last week. There I ran into a DB2 developer I’ve worked very successfully with in the past – John Tobler.

(He’s the guy I look to for questions and issues with DB2 SMF data.)

We had a good discussion about something I’d personally like to see in DB2 Accounting Trace – more WLM information – and this post is as a result of this conversation.

Two salient pieces of information:

  1. Accounting Trace already has a field for WLM Service Class (QWACWLME) but it’s only filled in for DDF work.
  2. As Willie Favero pointed out in "APAR Friday: WLM information is now part of the DISPLAY THREAD command", the command now has some WLM information in it.

Putting these two together you come to the conclusion it might technically be possible to get more WLM information into Accounting Trace. That, of course, doesn’t mean it’s going to happen. I have to stress that before going any further. But it’s worthwhile thinking about what’s needed and how useful that would be to customers.

What Should Be Added?

Uncontroversially, I think, QWACWLME should be filled in with Service Class for all work types. I say "uncontroversially" because – if it can be done cheaply – it’s just using space that’s already in the record. I don’t know if it can be done cheaply, though.

More controversial – because, taken together, they represent 18 additional bytes in each 101 record – are:

  • WLM Workload
  • WLM Report Class
  • WLM Service Class Period

I think I could live without Workload but it seems a shame to exclude it.

As Willie points out Performance Index (PI) is also in the DISPLAY THREAD command but I think we can get that from RMF Workload Activity Report (SMF 72) and that’s probably a better place to get it from.

But the key question is “how useful and important would this extra information be to you?”

Let me outline three areas of use I can immediately see…

Understanding Not Accounted For Time

This time bucket is what you get when you subtract all the time buckets we know about from the headline response time. The two most important causes for this are CPU Queuing and Paging Delay.
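
(In DB2 Accounting Class 2 / Class 3 terms that’s roughly Class 2 elapsed time minus Class 2 CPU time minus the sum of the Class 3 suspension times – assuming both trace classes are on.)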

If we calculate this time for a record and we know the (behaviour of) the WLM Service Class it’s in we can understand this time better. A bugbear of doing DB2 performance is just this: understanding whether work is subject to queuing or not. (For Paging Delay as a cause of Not Accounted For Time we could do much the same thing.)

Understanding The WLM Aspects Of DB2 Work

It would be useful to be able to break down the work coming into a DB2 subsystem by Service Class, Goal and Importance, wouldn’t it? In particular it would be nice to see the hierarchy of goals and importances, and to be able to relate the work’s WLM attributes to those of address spaces such as DIST and DBM1. (In the former case discovering that the TCBs in the DIST address space were subject to pre-emption by the DDF work would be a blow.)

Correlating Service Class And Report Class For DDF Work

For non-enclave work I use the Report Class and Service Class in Type 30 to establish how these relate to each other (and what kind of work has which RC and which SC). I can’t do it for DDF work because there’s no usable Type 30 (i.e. with this kind of information in). If the 101 record had these both in you could extend the method.

(In case you wonder what I’m talking about see What’s In A Name?.)

This still doesn’t help us in the non-DDF enclave cases, of course.

Over To You

What do you think? I’ve listed three categories of value that immediately spring to mind (and that’s with the disbenefit of jetlag so maybe not that articulately expressed). But I’d really like to know if this would be of value to you – and to modify the proposal if you think you’d like something slightly different.

There’s no guarantee this will get done – and it’s a bit of an attempt at a “Social Requirements Gathering” process. But it’s worth debating in public, I think.

Haven’t We Been Here Before?

(Originally posted 2012-01-28.)

Well, some of us have. 🙂

Well before we announced zEnterprise I thought it would be rolled out and adopted in a similar manner to Parallel Sysplex (and to many other technologies – whether mainframe or otherwise).

Reading zEnterprise Use Cases Start Rolling In I still think I’m right. And I will admit I needed to see something encouraging like this.

Back in the mid 1990’s we introduced Parallel Sysplex. In fact we started with Sysplex and then added the "Parallel" elements to it.

Adoption of Parallel Sysplex took a while. And hence the folklore and confidence in the value proposition took a while to take root.

If I were to list the things that needed working on to make Parallel Sysplex mainstream you might mistakenly think the same list (or even a similar sized list) of "to do’s" applied to zEnterprise. You can’t draw that conclusion. You can draw the "appropriately speedy train coming" parallel but that’s all.

But let’s revisit (a subset of) that list:

  • Performance and efficiency improvements.
  • More exploiters
  • More function
  • Enhanced Availability
  • Extra Instrumentation
  • Field – whether IBMer or customer or consultant or third-party vendor – experience

As I said, don’t take that list as a template for the way zEnterprise is going to evolve. But if you "squint" at the list some familiar themes emerge.

And the referenced blog post addresses one of these: Customer experience. Though I don’t manage the agendas for conferences, it wouldn’t surprise me if we saw some "customer experience" presentations soon.

As a young Systems Engineer in the late 1980’s I saw a number of considerably simpler product function introductions. As those of us who were around all know there was a hurry on – at least from IBM’s perspective: Our competitive differentiator (and new product vs old differentiator) was new function we hoped customers would adopt quickly and really value. You can think of Hiperbatch if you like. But if you do I’d prefer you to think of the MVPG instruction (the hardware function it relied on) which was used by a number of other functions to cut CPU. I’m thinking primarily of VSAM LSR Hiperspace buffers here. And, while we’re at it how about ADMF? Both MVPG and ADMF were used together by DB2 Hiperpools – again to cut CPU.

The reason for detailing MVPG and ADMF is they had clear advantages for many customers – and still they took in excess of 18 months from announcement to widespread adoption. I’d say they were simple to implement as well.

I don’t think anyone would claim Parallel Sysplex or zEnterprise full functionality are quick or simple to implement: If you’re looking at the sheer sweep of what we’re doing I think that’s appropriate.

So, I think we’re in good shape: We’re now seeing implementations and I’m sure we’re going to see many more. And I do think the Parallel Sysplex analogy is a good one – in terms of choreography of adoption.

Sometimes I think those of us who have been around have only the “we’ve been here before” perspective to offer. Actually I think we do have that. But, of course, I think we have a lot else besides to offer: Thinking about Systems and value as well as the “calmness” 🙂 of knowing “this is how it goes”.

This is going to be fun – and fun soon. 🙂

A Better Calibre of Kindling

(Originally posted 2012-01-23.)

You might consider it showing off if I mention I got a Kindle for Xmas. Feel free to. 🙂 But I’d like to share my experience with you – as you might find it useful anyway.

First, I really like the Kindle as it stands. Mine is a Keyboard 3G one. I felt both the “keyboard” and 3G elements were important:

  • I surmised (correctly) I’d want to take notes.
  • I surmised (equally correctly) I’d want to be able to do things wherever I was that would need access to “Kindle Central”. (Actually, access at 35,000 feet will have to wait.)

I’ve found the basic act of reading on the Kindle to be at least as rewarding as reading paper books. I also appreciate putting an end to being engulfed by the rising tide of new books.

(In the house I seem to be the one that wants to keep books once I’ve read them. I’m also the one who doesn’t feel the need to finish one book before starting another. So I have several books on the go at the same time on the Kindle and it’s kept track of where I am with them all. Yes, I know it’s called a bookmark so no distinct advantage there.)

I also appreciate the social aspect:

  • Sharing snippets via the Kindle website and posting links to them on Twitter. Some of you will have seen that – probably most of you given I propagate tweets to Facebook and LinkedIn.
  • I’m re-reading Terry Pratchett’s “The Colour Of Magic” and it’s nice to see “you and 5 people” against key quotes. I don’t know who these people are but already I feel kinship with them. 🙂

Book delivery is pretty swift – which is much more than can be said of ordering paper books. And I’ve used the “try a sample” capability several times: With both positive and negative buying outcomes. I’m using the Amazon “Wish List” as my queue for acquiring books so I don’t necessarily buy immediately.

Calibre

There isn’t much need for curation but my tool of choice for doing so is Calibre which is available for Windows, Linux and OS X. (I run it on Linux and OS X, though others in the house have Windows and there’s one other Kindle in the house.) It’s free and it’s very good. One tip: If you’re using it on Linux it’s probably best to install it directly, rather than going through e.g. Debian repositories. I say this because it’s frequently updated and the repositories seem to be way behind.

I used Calibre with my old Sony PRS-700 eBook reader – which I found to be unusably slow and hard to read. (The Kindle is neither of these.)

Calibre does a number of things for me. Most notably it lets me:

  • Convert books from other formats e.g. EPUB.
  • Download RSS / Atom “news” feeds and convert them to MOBI so I can read them on Kindle.
  • Edit metadata for books – such as titles and authors. (Mainly this is worthwhile for books that weren’t from the Kindle Store – as some of them have dubious spellings etc.)
  • (I actually don’t feel the need to have Calibre back up my Kindle – though it will do that as well)

Calibre has a lot of sophistication built into its conversion. I’ve yet to fully explore what it can do, for instance, to tidy up conversion of PDF documents. Page footers, for one, need removing on conversion.

One other thing: You can use Calibre in Batch Mode. That might well help with automation.

Project Gutenberg

I’ve known for a long time about Project Gutenberg. To quote from their website:

“Project Gutenberg offers over 38,000 free ebooks: choose among free epub books, free kindle books, download them or read them online.”

Two good things to note:

  • Project Gutenberg has a rigorous copyright checking process – so everything is out of copyright or otherwise in the public domain. I’m against ripping off authors, so this is a good thing.
  • The books are well formatted: eBook quality can vary enormously, to the point where books can be frustratingly hard to read (in the worst case).

Without listing the catalog I’d say you can find many classics there. The “usual suspects” like Chaucer, Shakespeare and Oscar Wilde are represented (all of which I have on my Kindle), along with many others. (I wish Raymond Chandler were there but the absence of his works probably means they’re still under copyright protection.)

Distributed Proofreaders

So, where do Project Gutenberg books come from? I can’t say this is true of all of them but many come from Distributed Proofreaders. The idea of this is that people sign up to proofread OCR’ed pages – one page at a time. I signed up to do this and worked on the first proofreading of two books. I’d never heard of the books before and the actual process was good as I found the books interesting in their own right.

The OCR process was pretty accurate but the proofreading was absolutely necessary. I think it might be possible to codify many of the errors in the OCR process as they were repeated.

There are several rounds of proofreading and so the results – books in Project Gutenberg – are very good. There’s a lot of emphasis on not correcting the spelling or punctuation, and on not editorialising.

More volunteers are needed. As I say I’ve enjoyed doing it.

Hacking

If you connect a Kindle to a PC or Mac (and I’ve done both) the Kindle shows up as a removable drive. The most useful thing you can do with it is to extract the ‘My Clippings.txt’ file. This contains all your bookmarks and annotations. It’s reasonably hackable: While it’s not XML (and I really wish it were) it has a simple-to-understand and easy-to-parse format in plain text.

One challenge I’d like to see someone meet is processing this file and creating Evernote notes. True you can get at your annotations etc from Amazon but I think there’s value in easing getting marked up passages into Evernote. Indeed I’d be pleased if Amazon and Evernote worked together to provide a slick “clip to Evernote” function for Kindles other than the Fire.

I have other hacking challenges, else I’d work on this one – processing the file – myself. I know that doing it for Windows (and Linux under Wine) and for OS X would mean two separate pieces of code.

So Why Am I Still Carrying Around Paper Books?

It turns out I still have a few books to get through in paper format before I go “all electronic”. I also expect there to be incidences where someone gives me a book. I consider those to be “beyond my control”. 🙂

One final thing: For another view (although a corroborative one) see Susan Visser’s blog posts on the subject.

Rough And Ready?

(Originally posted 2012-01-20.)

A couple of items from the world of music caught my attention recently – and there’s some commonality between them:

  • According to Dave Grohl of Foo Fighters’ blog post ‘Hey everybody, Dave here’:

    “From day one, the idea for this record was to make something completely simple and honest, to capture that thing that happens when you put the 5 of us in a small room. No big production, just real rock and roll music: That’s why we decided to do it in my garage. We wanted to retain that human element, keep all of those beautiful imperfections: That’s why we went completely analog.”

and

Of course I have both the Foo Fighters album (Wasting Light) and Beyond Magnetic. I thoroughly enjoy them and their roughness in no way detracts from the value I get from them. In fact both these comments surprised me.

Now granted neither Foo Fighters nor Metallica are known for their subtlety. 🙂 But they are known for being amongst the best bands active today.

There is of course another band of exceedingly high effectiveness: Queen. Now they are known for their subtlety (mostly). 🙂 But they’ve not been all that active for many years – for obvious reasons. 😦

It turns out there’s quite a lot of stuff in the Queen vaults that never officially saw the light of day. The suggestion is it’s unfinished and therefore not to be released. I, like many other fans, have heard some of this. We tend to think most of it meets our releasability criteria. Take for instance a song called I Guess We’re Falling Out. If you listen to it it’s clearly unfinished but absolutely exquisite. Now whether it should be released finished or unfinished is a good question. But I think it should certainly see the light of day.

Now this post isn’t just a rail against Queen Productions. It is that 🙂 but it’s also about the wider point:

When is something good enough to see the light of day?

I’m obviously not advocating shoddy work – and none of these three examples from music represent that. But sometimes throwing something Rough And Ready (the title of this post, complete with pun) out there is the right way to go. And sometimes it’s not.

  • When I put together new analysis code it’s prototypical. And it’s the commitment to refine it in the light of experience that’s key here. As is the appropriate level of tentativeness involved.
  • When I’m doing something where quality is critical it’s a different matter entirely.

This post isn’t profoundly philosophical 🙂 but it’s an area I did some thinking about over the holiday season. This time no new code of any value emerged from the holiday. But this and a couple of other lines of thinking did. Maybe I’ll post about those soon.