(Originally posted 2006-05-13.)
(Scroll down if you’re not interested in DB2)
This past week I’ve been updating our SQL Analysis code to cope with Unicode SQL statements. (If you bind a package or plan once you’ve gone to DB2 Version 8 New Function Mode the statement text is stored in the Catalog in Unicode.) These statements are in Unicode UTF-8 which is extremely similar to ASCII. So I snaffled some existing code of mine that converts from ASCII to EBCDIC and pressed it into service…
If the first 10 bytes of the statement text (minus the 2 4-byte header fields) contain X’40’ (an EBCDIC null (space)) then I consider the text to be in EBCDIC. Otherwise I deem it to be ASCII – and do the conversion. (You might ask “what about statements like ROLLBACK?” Well, it turns out that in SYSPACKSTMT there is always a trailing blank on the statement text – so that’s alright. 🙂
(In fact my code decodes all the structural information in the SYSPACKSTMT STMT column – including extracting information on host variables.)
Now, my colleagues sometimes just want to take a look at the statement text sometimes – without bothering to run my code. So here’s what I learned when creating a DFSORT job I set up for them to run, converting the statement text from ASCII to EBCDIC.
Converting ASCII to EBCDIC With DFSORT
First, a disclaimer: Certain codepoints might need slight adjustment in some countries – for example if you have dollar versus pound sterling issues.
First you code an ALTSEQ statement:
ALTSEQ CODE=(0000,0101,0202,0303,0437,052D,062E,072F, 0816,0905,0A25,0B0B,0C0C,0D0D,0E0E,0F0F, 1010,1111,1212,1313,143C,153D,1632,1726, 1818,1919,1A3F,1B27,1C22,1D1D,1E35,1F1F, 2040,215A,227F,237B,245B,256C,2650,277D, 284D,295D,2A5C,2B4E,2C6B,2D60,2E4B,2F61, 30F0,31F1,32F2,33F3,34F4,35F5,36F6,37F7, 38F8,39F9,3A7A,3B5E,3C4C,3D7E,3E6E,3F6F, 407C,41C1,42C2,43C3,44C4,45C5,46C6,47C7, 48C8,49C9,4AD1,4BD2,4CD3,4DD4,4ED5,4FD6, 50D7,51D8,52D9,53E2,54E3,55E4,56E5,57E6, 58E7,59E8,5AE9,5BAD,5CE0,5DBD,5E5F,5F6D, 6079,6181,6282,6383,6484,6585,6686,6787, 6888,6989,6A91,6B92,6C93,6D94,6E95,6F96, 7097,7198,7299,73A2,74A3,75A4,76A5,77A6, 78A7,79A8,7AA9,7BC0,7C4F,7DD0,7EA1,7F07, 8043,8120,8221,831C,8423,85EB,8624,879B, 8871,8928,8A38,8B49,8C90,8DBA,8EEC,8FDF, 9045,9129,922A,939D,9472,952B,968A,979A, 9867,9956,9A64,9B4A,9C53,9D68,9E59,9F46, A0EA,A1DA,A22C,A3DE,A48B,A555,A641,A7FE, A858,A951,AA52,AB48,AC69,ADDB,AE8E,AF8D, B073,B174,B275,B3FA,B415,B5B0,B6B1,B7B3, B8B4,B9B5,BA6A,BBB7,BCB8,BDB9,BECC,BFBC, C0AB,C13E,C23B,C30A,C4BF,C58F,C63A,C714, C8A0,C917,CACB,CBCA,CC1A,CD1B,CE9C,CF04, D034,D1EF,D21E,D306,D408,D509,D677,D770, D8BE,D9BB,DAAC,DB54,DC63,DD65,DE66,DF62, E030,E142,E247,E357,E4EE,E533,E6B6,E7E1, E8CD,E9ED,EA36,EB44,ECCE,EDCF,EE31,EFAA, F0FC,F19E,F2AE,F38C,F4DD,F5DC,F639,F7FB, F880,F9AF,FAFD,FB78,FC76,FDB2,FE9F,FFFF)
Then you refer to this translation table in your control statements:
In this case the field STMT (which happens to be a DFSORT Symbol name) is translated using the ALTSEQ symbol table. (There is one translation table but it can be used by selected fields – those for which TRAN=ALTSEQ is specified.)
TRAN=ALTSEQ has been supported by DFSORT for a number of years. As documented in the Reformatting: Translation section of Beyond Sorting also mentions TRAN=UTOL and TRAN=LTOU, which allow translation between lower and upper case.
In our code I have placed the ALTSEQ translation table in a data set pointed to by a (separate) DFSPARM DD. That makes it easier to manage. Indeed, if you do have code page issues you can address them using several such data sets, each with their own customisation for “local difficulties”.:-)
One final note: In Production I’ll probably use IFTHEN to automate the “blank in the first 10 bytes” test I mentioned above – as you can, with IFTHEN (UQ95213 / UQ95214), reformat records in different ways dependent on the input data. That includes choosing whether to use TRAN=ALTSEQ on individual fields.