XML, XSLT and DFSORT, Part Two – DFSORT

(Originally posted 2011-05-20.)

Following on from this post and this one, this post discusses the DFSORT piece.

The DFSORT code in this post parses the Comma-Separated Values (CSV) file produced by XSLT processing. In this simple example it merely produces a flat file report, but the post has a few additional details you might find valuable.

First, here’s the SORTIN DD JCL statement. It’s not like a regular sequential file statement as it has to access the zFS file system we wrote the data to with Saxon:

JCL SORTIN DD Statement
//SORTIN    DD  PATHOPTS=(ORDONLY),RECFM=VB,LRECL=255,BLKSIZE=32760,  
//          PATH='/u/userzfs/myuserid/testXSL.txt',FILEDATA=TEXT 

Of particular note in this DD statement are the record format (VB), the logical record length (255) and the block size (32760). This is definitely VB data. I’ve found that an LRECL greater than the longest record Saxon has produced is fine, as is any sensible block size. FILEDATA=TEXT is also needed, so that the file is treated as text with each line becoming a record.

Here’s the SYMNAMES file:

Contents of the SYMNAMES File
RDW,1,4,BI                                                            
Row,%01 
a,%02 

You’ll need an accompanying SYMNOUT DD – for the messages DFSORT (or ICETOOL) produces when the SYMNAMES file is processed.
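As a minimal sketch of how the two DDs might sit in the job, assuming the symbols are coded in-stream and the symbol messages go to SYSOUT (neither choice comes from the original job; a data set would do just as well for either):

```jcl
//SYMNAMES DD  *
RDW,1,4,BI
Row,%01
a,%02
/*
//SYMNOUT  DD  SYSOUT=*
```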

I’m showing you this first so you can understand the main DFSORT control statements: everywhere you see the symbol "Row" in these statements you can read it as "%01", the first parsed field. Similarly, "a" stands for "%02", the second parsed field. The "RDW" symbol maps the Record Descriptor Word that we need for variable-length record processing. (DFSORT can convert from variable- to fixed-record format but we won’t do that here.)

Now for the control statements:

Contents of the SYSIN File
  OPTION COPY,VLSHRT                                                1
  INCLUDE COND=(1,2,BI,GE,+12)                                      1 
  INREC IFOUTLEN=70,                                                2
    IFTHEN=(WHEN=INIT,                                              3
      PARSE=(%01=(STARTAFT=C'"',ENDBEFR=C'",',FIXLEN=10),           4
             %02=(FIXLEN=8)),                                       5
      BUILD=(1,4,%01,%02)),                                         6
    IFTHEN=(WHEN=INIT,BUILD=(RDW,Row,X,a,SFF,EDIT=(I,IIT)))         7

This is a very simple case of using DFSORT. So, for example, there’s no SORT, no OUTFIL, nor any ICETOOL sophistication. It’s meant to show how you can get the data into a format DFSORT can use. Let me explain how it works:

  1. VLSHRT and the INCLUDE statement will, between them, remove the blank lines Saxon created.
  2. IFOUTLEN sets the output record length (from INREC) to 70 bytes.
  3. This WHEN=INIT parses the input (CSV) data.
  4. The %01 field is filled with the data after the first " and before the following ", (a double quote immediately followed by a comma). It becomes a fixed-length character field of 10 bytes, padded with blanks if the data is shorter.
  5. The %02 field is filled with the remainder of the data in the record – for a length of 8 bytes.
  6. We write out the RDW and both parsed fields.
  7. This WHEN=INIT produces the report lines. We print the %01 field ("Row"), a space, and the %02 numeric field ("a"). For the numeric field we parse the characters to extract the numeric value (with SFF) and then immediately reformat it (with EDIT=(I,IIT)) to insert a comma as the thousands separator. No comma appears in the output here because every value is below 1,000.
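To show where the pieces fit, here is a sketch of a complete job step. The step name, PGM=SORT, and routing SYSOUT and SORTOUT to the spool are my assumptions for illustration, not details of the original job; the control statements are the ones above with the callout numbers removed.

```jcl
//PARSECSV EXEC PGM=SORT
//SYSOUT   DD  SYSOUT=*
//* The SORTIN, SYMNAMES and SYMNOUT DD statements go here
//SORTOUT  DD  SYSOUT=*
//SYSIN    DD  *
  OPTION COPY,VLSHRT
  INCLUDE COND=(1,2,BI,GE,+12)
  INREC IFOUTLEN=70,
    IFTHEN=(WHEN=INIT,
      PARSE=(%01=(STARTAFT=C'"',ENDBEFR=C'",',FIXLEN=10),
             %02=(FIXLEN=8)),
      BUILD=(1,4,%01,%02)),
    IFTHEN=(WHEN=INIT,BUILD=(RDW,Row,X,a,SFF,EDIT=(I,IIT)))
/*
```

Writing SORTOUT to the spool suits a report like this one; a real data set would be the natural choice if a later step were to read the result.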

And here’s the output:

The Resultant Output
One            1                                                      
Two           12
Three        903

Of course we needn’t have just printed the data, as I’ve indicated. With a more interesting data set you could do a lot more.

The use of symbols ("Row" and "a") was largely gratuitous here. It just shows you can use them. If you’re a regular DFSORT or ICETOOL user you’ll know their value.
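For comparison, this is a sketch of how the second IFTHEN clause would read if coded by hand without symbols, writing the RDW as position 1, length 4, and naming the parsed fields directly. With it, the SYMNAMES and SYMNOUT DDs would not be needed.

```jcl
    IFTHEN=(WHEN=INIT,BUILD=(1,4,%01,X,%02,SFF,EDIT=(I,IIT)))
```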

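If you were to strip this down to the bare essentials the first WHEN=INIT does most of the work – parsing the data into fixed positions. (The one really useful thing the second WHEN=INIT does is to interpret the free-form numeric characters as a number, with SFF. Here the number is immediately re-edited for printing; specifying an output format such as TO=PD in place of EDIT would instead turn it into a packed decimal number.)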
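If a packed decimal field is what you are after, for example to feed a later SORT or SUM, a variant of the second clause along these lines should do it. This is an untested sketch: the LENGTH=5 (room for nine digits) is my assumption, and I have dropped the separating blank since the result is no longer meant to be printed.

```jcl
    IFTHEN=(WHEN=INIT,BUILD=(RDW,Row,a,SFF,TO=PD,LENGTH=5))
```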

So, over these three posts I’ve shown how you can use XSLT to half tame XML data and DFSORT to complete the taming. I have a couple of other things I want to talk about in relation to this. But those belong in a separate post.

Published by Martin Packer

I'm a mainframe performance guy and have been for the past 35 years. But I play with lots of other technologies as well.
