(Originally posted 2011-12-12.)
As I said in this post, I recently came across the need to handle Unicode when processing DB2 Accounting Trace (SMF 101) records. I was astonished not to have run into it before in all my many sets of customer data. So I had two things to do:
- Understand the circumstances under which it happens – which isn’t just "be on Version 8 and it will happen automatically."
and
- Figure out how to handle it when I see it (i.e. when QWHSFLAG has the value x’80’, as I mentioned in the other post).
As you’d expect, I asked the customer what they had done to cause the generation of 101 records containing Unicode fields. The answer is that they’ve set parameter UIFCIDS in DSNZPARM to "YES". It turns out my friend Willie Favero had mentioned it in this blog post some time ago. Because the DB2 Catalog has Unicode in it in Version 8, it actually takes cycles to create 101 records without Unicode in them: all the fields marked "%U" in the mappings in SDSNMACS have to be translated from Unicode to EBCDIC. If you code "UIFCIDS=YES" you avoid the cost of that translation.
But there’s an obvious downside: Any reporting against those fields (the ones marked "%U") needs to take that into account. But if you never (or rarely) look at e.g. the Package-level stuff you might prefer to write it in Unicode (or suppress IFCID 239 entirely). It’s probably a net saving in CPU, albeit a small one.
Which leads on to the second part of this: How did I handle the translation into something readable (EBCDIC being my primary encoding, at least on z/OS)? My interim take is to fix up my reporting REXX to check QWHSFLAG and do the right thing. You can readily do that with the built-in TRANSLATE function. There is code knocking around on the Internet for the purpose. That got me through this study, and I now have reusable code for wherever I need to do the translation.
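To make the table-driven idea concrete, here is a minimal sketch in Python rather than REXX (so it can be tried off-platform). It is not the code I actually run. It assumes the Unicode fields are single-byte (7-bit ASCII range) characters, that the target EBCDIC code page is 037, and that the flag byte has already been pulled out of the record; the function and table names are made up for illustration.

```python
# Sketch only: names are hypothetical, cp037 is an assumed EBCDIC code page.
UNICODE_FLAG = 0x80  # the QWHSFLAG value the post associates with Unicode fields


def build_ascii_to_ebcdic_table(codepage="cp037"):
    """Build a 256-byte table indexed by input byte value.

    Entries 0x00-0x7F hold the EBCDIC equivalent of the ASCII character.
    Entries 0x80-0xFF have no single-byte meaning here, so they map to '?'.
    """
    fallback = "?".encode(codepage)[0]
    table = bytearray(256)
    for value in range(256):
        if value < 0x80:
            table[value] = chr(value).encode(codepage)[0]
        else:
            table[value] = fallback
    return bytes(table)


ASCII_TO_EBCDIC = build_ascii_to_ebcdic_table()


def field_to_ebcdic(field, qwhsflag):
    """Translate a %U field to EBCDIC only when the Unicode flag bit is on."""
    if qwhsflag & UNICODE_FLAG:
        # bytes.translate does a byte-for-byte table lookup, much as
        # REXX TRANSLATE does with a 256-byte output table.
        return field.translate(ASCII_TO_EBCDIC)
    return field
```

For example, `field_to_ebcdic(b"PKG1", 0x80)` yields the EBCDIC bytes for "PKG1", while a field from a record without the flag set is returned untouched.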
But this is not the only way, and perhaps not the best: My code reformats records (for historical reasons, mainly) in an assembler exit – as part of my database build process. It’s entirely feasible to do the translation there, and then my database has everything readable. To do that you can use the TR (Translate) instruction. Of course you can use the same translation table as in the REXX (albeit with different syntax – an edit or so away). But that’s a whole load more effort and potential fragility. I think I’ll defer that.
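One way to keep the two copies of the table from drifting apart would be to generate both from a single source. The following Python sketch (again an illustration, not something from my toolkit) builds the 256-byte ASCII-to-EBCDIC table and prints it as assembler DC statements and as REXX hex-string assignments. The code page (037), the label A2ETAB and the REXX variable name a2e are all assumptions.

```python
# Sketch only: cp037, the label and the variable name are assumptions.
def table_rows(codepage="cp037", per_row=16):
    """Return the 256-byte table as rows of uppercase hex digits."""
    fallback = "?".encode(codepage)[0]
    table = bytes(
        chr(v).encode(codepage)[0] if v < 0x80 else fallback
        for v in range(256)
    )
    return [table[i:i + per_row].hex().upper() for i in range(0, 256, per_row)]


def as_assembler(rows, label="A2ETAB"):
    """Format rows as DC statements; only the first row carries the label."""
    lines = []
    for index, row in enumerate(rows):
        name = label if index == 0 else ""
        lines.append(f"{name:<8} DC    X'{row}'")
    return lines


def as_rexx(rows, name="a2e"):
    """Format rows as REXX statements that append hex strings to a variable."""
    lines = [f"{name} = ''"]
    lines += [f"{name} = {name} || '{row}'x" for row in rows]
    return lines


if __name__ == "__main__":
    rows = table_rows()
    print("\n".join(as_assembler(rows)))
    print("\n".join(as_rexx(rows)))
```

The same 256 bytes serve both cases because TR and TRANSLATE (with its default input table) each use the source byte's value as the index into the table.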
One other thing: Unicode isn’t necessarily 1-byte characters. But the sample I have is. Neither the REXX TRANSLATE BIF nor the TR instruction will handle multi-byte characters, because both map one byte to one byte. So I could eventually come unstuck. And, no, I don’t know if these fields will contain multi-byte characters any time soon. Anyone in a position to comment?
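A cheap safeguard would be to check whether a field really is single-byte before trusting a byte-for-byte table. The Python sketch below assumes the fields are UTF-8 (in which any byte of x'80' or above belongs to a multi-byte character) and that the target is EBCDIC code page 037; both are assumptions on my part, and the function names are hypothetical.

```python
# Sketch only: assumes UTF-8 input and cp037 output; names are hypothetical.
def is_single_byte_utf8(field):
    """True when every byte is 7-bit ASCII, so a 1:1 table lookup is safe."""
    return all(byte < 0x80 for byte in field)


def utf8_field_to_ebcdic(field, codepage="cp037"):
    """Decode as UTF-8, then encode to EBCDIC.

    Characters the code page lacks, and invalid byte sequences, become '?'.
    Note the result can be shorter than the input, which matters for
    fixed-width fields.
    """
    text = field.decode("utf-8", errors="replace")
    return text.encode(codepage, errors="replace")
```

A report could call `is_single_byte_utf8` on each "%U" field and flag the record when it returns False, rather than silently printing garbage.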
| Fixed a glitch in the bulleted list at the top.