(Originally posted 2011-05-20.)
Following on from this post and this one, this post discusses the DFSORT piece.
The DFSORT code in this post parses the Comma-Separated Variable (CSV) file produced by XSLT processing. In this simple example it merely produces a flat file report, but the post has a few additional details you might find valuable.
First, here’s the SORTIN DD JCL statement. It’s not like a regular sequential file statement as it has to access the zFS file system we wrote the data to with Saxon:
JCL SORTIN DD Statement
//SORTIN   DD PATHOPTS=(ORDONLY),RECFM=VB,LRECL=255,BLKSIZE=32760,
//            PATH='/u/userzfs/myuserid/testXSL.txt',FILEDATA=TEXT
Of particular note in this DD statement are the record format (VB), the logical record length (255) and the block size (32760). This is definitely variable-length data, so RECFM=VB is the right choice. I’ve found that any LRECL greater than the longest record Saxon has produced is fine. Similarly, any sensible block size works. FILEDATA=TEXT is also needed, so that each newline-delimited line in the file is treated as a record.
Here’s the SYMNAMES file:
Contents of the SYMNAMES File
RDW,1,4,BI
Row,%01
a,%02
You’ll need an accompanying SYMNOUT DD for the messages DFSORT (or ICETOOL) produces when the SYMNAMES file is processed.
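A minimal sketch of how these two DD statements might sit in the job step. Coding SYMNAMES in-stream and sending SYMNOUT to SYSOUT are assumptions made for illustration; the original job may well have used data sets instead.

```jcl
//* Sketch only: symbol definitions coded in-stream
//SYMNAMES DD *
RDW,1,4,BI
Row,%01
a,%02
/*
//* SYMNOUT receives the symbol listing and any symbol-related messages
//SYMNOUT  DD SYSOUT=*
```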
I’m showing you this first so you can understand the main DFSORT control statements file. Everywhere you see the symbol "Row" in these statements you can read it as "%01", which is the first parsed field set up in the control statements. Similarly, "a" stands for "%02". The "RDW" symbol maps the Record Descriptor Word that we need for variable-length record processing. (DFSORT can convert from variable- to fixed-record format but we won’t do that here.)
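As an aside, here is a minimal sketch of what that variable-to-fixed conversion could look like, should you want it. It is not part of this post’s job. The 16-byte length is an assumption: it matches the 10-byte "Row" field, one blank and the 5-byte edited number that this post’s INREC statement builds after the 4-byte RDW.

```jcl
* Sketch only: write fixed-length records instead of variable-length ones.
* VTOF drops the RDW; BUILD copies the 16 data bytes that follow it.
* VLFILL pads with blanks should a record be shorter than expected.
  OUTFIL VTOF,BUILD=(5,16),VLFILL=C' '
```

With no FNAMES operand the OUTFIL output goes to SORTOUT.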
Now for the control statements:
Contents of the SYSIN File
  OPTION COPY,VLSHRT                                          1
  INCLUDE COND=(1,2,BI,GE,+12)                                1
  INREC IFOUTLEN=70,                                          2
    IFTHEN=(WHEN=INIT,                                        3
      PARSE=(%01=(STARTAFT=C'"',ENDBEFR=C'",',FIXLEN=10),     4
             %02=(FIXLEN=8)),                                 5
      BUILD=(1,4,%01,%02)),                                   6
    IFTHEN=(WHEN=INIT,BUILD=(RDW,Row,X,a,SFF,EDIT=(I,IIT)))   7
This is a very simple case of using DFSORT. So, for example, there’s no SORT, no OUTFIL, nor any ICETOOL sophistication. It’s meant to show how you can get the data into a format DFSORT can use. Let me explain how it works:
1. VLSHRT and the INCLUDE statement will, between them, remove the blank lines Saxon created. INCLUDE keeps only records whose RDW length field (bytes 1 and 2) is at least 12.
2. IFOUTLEN sets the output record length (from INREC) to 70 bytes.
3. This first WHEN=INIT clause parses the input (CSV) data.
4. The %01 field is filled with the text after the first double quote and before the closing double quote that is followed by a comma. It becomes a fixed-length character field of 10 bytes.
5. The %02 field is filled with the remainder of the data in the record, for a fixed length of 8 bytes.
6. We write out the RDW and both parsed fields.
7. This second WHEN=INIT clause produces the report lines. We print the %01 field ("Row"), a space, and the %02 numeric field ("a"). For the numeric field ("a") we parse the characters to extract the numeric value (with SFF) and then immediately reformat it (with EDIT=(I,IIT)) to insert commas.
And here’s the output:
The Resultant Output
One            1
Two           12
Three        903
Of course we needn’t have just printed the data, as I’ve indicated. With a more interesting data set you could do a lot more.
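For instance, the job could sort the report rather than just copy it. A minimal sketch, assuming the record layout the INREC statement builds (RDW in bytes 1 to 4, "Row" in bytes 5 to 14, a blank, then the edited number in bytes 16 to 20). The descending order is purely illustrative.

```jcl
* Sketch only: replace OPTION COPY with a SORT on the edited number.
* UFF extracts the digits from the field, ignoring blanks and commas.
  OPTION VLSHRT
  SORT FIELDS=(16,5,UFF,D)
```

The INCLUDE and INREC statements would stay as they are, because SORT sees the records after INREC has reformatted them.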
The use of symbols ("Row" and "a") was largely gratuitous here. It just shows you can use them. If you’re a regular DFSORT or ICETOOL user you’ll know their value.
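If you were to strip this down to the bare essentials the first WHEN=INIT does most of the work: parsing the data into fixed positions. (The one really useful thing the second WHEN=INIT does is to interpret the characters of the numeric field as a number, with SFF. Here that number is immediately edited back into printable form, but the same conversion could just as well produce a packed decimal field.)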
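If a packed decimal field is what you want, one way might be to use TO=PD in place of EDIT in the second clause. A minimal sketch under that assumption: only the last line differs from the statements in this post, and the 4-byte length is illustrative rather than taken from the original job.

```jcl
* Sketch only: same parsing as before, but the number is written as a
* 4-byte packed decimal (PD) field instead of an edited character field.
  OPTION COPY,VLSHRT
  INCLUDE COND=(1,2,BI,GE,+12)
  INREC IFOUTLEN=70,
    IFTHEN=(WHEN=INIT,
      PARSE=(%01=(STARTAFT=C'"',ENDBEFR=C'",',FIXLEN=10),
             %02=(FIXLEN=8)),
      BUILD=(1,4,%01,%02)),
    IFTHEN=(WHEN=INIT,BUILD=(RDW,Row,a,SFF,TO=PD,LENGTH=4))
```

The output would no longer read as a report, so this form only makes sense when the records feed a later step that expects packed decimal data.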
So, over these three posts I’ve shown how you can use XSLT to half tame XML data and DFSORT to complete the taming. I have a couple of other things I want to talk about in relation to this. But those belong in a separate post.