Sorting In REXX Made Easy

(Originally posted 2013-07-15.)

In REXX sorting is a fraught exercise – and you tend to have to resort to other programs to do it:

  • DFSORT for the “heavy lifting”
  • UNIX Sort

but you might decide your sort is low volume enough to do in REXX. The trouble is it’s difficult to write a routine that isn’t specific on how the data is sorted, or rather how two items are compared – as we shall see.

Following on from Dragging REXX Into The 21st Century? here’s a technique that is quite general. It uses the REXX INTERPRET instruction.


The problem of sorting comprises two challenges:

  1. Comparing two items.
  2. Deciding which two items to compare when, and acting on the results of the comparison. Much has been written about this, perhaps most notably in Donald Knuth’s Sorting & Searching.

This post is mostly about Challenge 1, though the code presented also addresses Challenge 2.

Comparing Two Items

It’s easy to do a straight numeric comparison in REXX, such as

if a<b then do
  ...
end

but more complex comparisons are, ahem, more complex. šŸ™‚ But, more to the point, a generalised comparison is difficult to do, which is where INTERPRET comes in.

Sample Sort Code

In what follows the program works. That isn’t to say the sorting algorithm is the most efficient over all sets of data, but this post isn’t really about the sort’s efficiency.

/* REXX */ 

/* Set up stem variable to sort */ 
sortin.0=5 
sortin.1="AXXX XS" 
sortin.2="HXXXX BSS" 
sortin.3="FXXX IS" 
sortin.4="XSSSSSSSS ZS" 
sortin.5="A JSS" 

/* Sort the stem variable */ 
call sortStem "comp1" 

/* Print the results */ 
do s=1 to sortin.0 
  say sortin.s 
end 

exit 

/* Sort comparison routine based solely on length of second word */
comp1: procedure 
parse arg x,y 

/* Calculate lengths of second word of x and y */ 
lx=length(word(x,2)) 
ly=length(word(y,2)) 

select 
when lx=ly then return 0 
when lx>ly then return 1 
otherwise return -1 
end 

sortStem: 
parse arg comparison_routine 
flip_flop=0 
do i=sortin.0 to 1 by -1 
  flip_flop=1 
  do j=1 to i-1 
    /* Decide whether to swap */ 
    interpret "compRes="comparison_routine"('"sortin.i"','"sortin.j"')"
    if compRes<0 then do 
      should_swap=1 
    end 
    else do 
      should_swap=0 
    end 

    if should_swap=1 then do
      /* Swap values */ 
      xchg=sortin.j 
      sortin.j=sortin.i 
      sortin.i=xchg 

      flip_flop=0 
    end 
  end 
end 
return

The item comparison routine in this example is, of course, somewhat artificial – and not one I’m likely to use. It compares the length of the second token in each stem variable. I chose it as an example of something that’s not easy to do in, say, DFSORT. The real point is you can compare items using arbitrarily complex criteria. If your collection of stem variables was more complex than this – essentially a list of objects – you’d want the sophistication this approach allows.


Incidentally when it comes to INTERPRET performance I think it depends on which statements are actually interpreted and which have already been tokenised:

  • If your “heavy lifting” is the code passed to the INTERPRET routine that would be expensive.
  • If you pass a call to a routine to INTERPRET that isn’t – so long as the actual routine is “static”, meaning appears in your source code where it can be tokenised.

In the above example code you’ll notice I’m using a hard-coded set of stem variable names. In my actual production code I’ve generalised to allow any stem variable names – so I’m actually much heavier in my use of INTERPRET.

I’m leaning towards INTERPRET in more cases – provided it doesn’t obscure the meaning.

At the start of this post I mentioned DFSORT and Unix sort as alternatives. I think you should consider them, though their limitations and the need to work hard at “marshalling” to use them will limit their appeal. So, I hope you’ll find the technique presented in this post useful for the cases where they’re not appealing.

The most important piece of this post – and the main reason for writing it – is the notion of using a single sorting routine and handing it the name of an item comparison routine. This should lead to code that’s more maintainable. Its long been the staple of other programming languages. You can do it with REXX, too!

Published by Martin Packer

.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: