2-19 Exchanging Registers

A very old trick is that of exchanging the contents of two registers without using a third [IBM]:

    x ← x ⊕ y
    y ← y ⊕ x
    x ← x ⊕ y
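As a rough illustration, the three-exclusive-or exchange can be sketched in C as below. The function name `xor_swap` and the 32-bit unsigned operand type are assumptions made for this example only.

```c
#include <stdint.h>

/* Exchange *x and *y with three exclusive-or operations and no temporary.
   Assumes x and y point to distinct objects: if they alias, the first
   XOR clears the shared value and the original contents are lost. */
void xor_swap(uint32_t *x, uint32_t *y)
{
    *x ^= *y;   /* x = x0 ^ y0             */
    *y ^= *x;   /* y = y0 ^ (x0 ^ y0) = x0 */
    *x ^= *y;   /* x = (x0 ^ y0) ^ x0 = y0 */
}
```

Each statement has the two-address form "destination op= source," which is why the sequence suits a two-address machine.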

This works well on a two-address machine. The trick also works if ⊕ is replaced by the logical operation ≡ (complement of exclusive or), and can be made to work in various ways with add's and subtract's:

    x ← x + y        x ← x − y        x ← y − x
    y ← x − y        y ← y + x        y ← y − x
    x ← x − y        x ← y − x        x ← x + y

Unfortunately, each of these has an instruction that is unsuitable for a two-address machine, unless the machine has "reverse subtract."
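The equivalence and add/subtract variants might look as follows in C. This is only a sketch: the function names are hypothetical, the add/subtract arrangement shown is one of several that work, and unsigned 32-bit operands are assumed so that overflow wraps modulo 2^32 rather than being undefined.

```c
#include <stdint.h>

/* Exchange using equivalence, ~(a ^ b), in place of exclusive or. */
void eqv_swap(uint32_t *x, uint32_t *y)
{
    *x = ~(*x ^ *y);
    *y = ~(*y ^ *x);   /* y = x0 */
    *x = ~(*x ^ *y);   /* x = y0 */
}

/* Exchange using one add and two subtracts (arithmetic is mod 2^32). */
void addsub_swap(uint32_t *x, uint32_t *y)
{
    *x = *x + *y;      /* x = x0 + y0 */
    *y = *x - *y;      /* y = x0; the destination is the subtrahend, so a
                          two-address machine needs "reverse subtract" here */
    *x = *x - *y;      /* x = y0 */
}
```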

This little trick can actually be useful in the application of double buffering, in which two pointers are swapped. The first instruction can be factored out of the loop in which the swap is done (although this negates the advantage of saving a register):

    Outside the loop:   t ← x ⊕ y
    Inside the loop:    x ← x ⊕ t
                        y ← y ⊕ t
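A minimal sketch of the double-buffering idea follows. It assumes the two buffer addresses are held as `uintptr_t` integers; the function `front_after_swaps` is hypothetical and exists only to make the loop observable. The hoisting is valid because exchanging the two values leaves their exclusive or unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the roles of two buffers n times and return the address that
   ends up as the front buffer. */
uintptr_t front_after_swaps(uintptr_t front, uintptr_t back, size_t n)
{
    uintptr_t t = front ^ back;   /* loop-invariant, computed once */
    for (size_t i = 0; i < n; i++) {
        /* ... fill the back buffer, consume the front buffer ... */
        front ^= t;               /* front takes the old back address */
        back  ^= t;               /* back takes the old front address */
    }
    (void)back;                   /* back is not needed after the loop */
    return front;
}
```

Inside the loop the exchange costs two instructions instead of three, at the price of keeping `t` live in a register.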

Exchanging Corresponding Fields of Registers

The problem here is to exchange the contents of two registers x and y wherever a mask bit m_i = 1, and to leave x and y unaltered wherever m_i = 0. By "corresponding" fields, we mean that no shifting is required. The 1-bits of m need not be contiguous. The straightforward method is as follows:

    x' ← (x & m̄) | (y & m)
    y  ← (y & m̄) | (x & m)
    x  ← x'

By using "temporaries" for the four and expressions, this can be seen to require seven instructions, assuming that either m or m̄ can be loaded with a single instruction and the machine has and not as a single instruction. If the machine is capable of executing the four (independent) and expressions in parallel, the execution time is only three cycles.
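The straightforward method can be sketched in C as below, assuming 32-bit unsigned registers; the name `swap_fields_direct` is made up for this example. The four and expressions are independent of one another, which is what allows them to execute in parallel.

```c
#include <stdint.h>

/* Exchange the bits of *x and *y selected by the 1-bits of m;
   bits where m is 0 are left unchanged. */
void swap_fields_direct(uint32_t *x, uint32_t *y, uint32_t m)
{
    uint32_t x_new = (*x & ~m) | (*y & m);   /* x outside m, y inside m */
    *y = (*y & ~m) | (*x & m);               /* uses the old value of x */
    *x = x_new;
}
```

For instance, with x = 0xAAAA5555, y = 0x1234ABCD, and m = 0x00FFFF00, the middle sixteen bits trade places, giving x = 0xAA34AB55 and y = 0x12AA55CD.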

A method that is probably better (five instructions, but four cycles on a machine with unlimited instruction-level parallelism) is shown in column (a) below. It is suggested by the "three exclusive or" code for exchanging registers.

    (a)                    (b)                     (c)
    x ← x ⊕ y              x ← x ≡ y               t ← (x ⊕ y) & m
    y ← y ⊕ (x & m)        y ← y ≡ (x | m̄)         x ← x ⊕ t
    x ← x ⊕ y              x ← x ≡ y               y ← y ⊕ t

The steps in column (b) do the same exchange as that of column (a), but column (b) is useful if m does not fit in an immediate field but m̄ does, and the machine has the equivalence instruction.

Still another method is shown in column (c) above [GLS1]. It also takes five instructions (again assuming one instruction must be used to load m into a register), but executes in only three cycles on a machine with sufficient instruction-level parallelism.
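One way the three exclusive-or-based variants could be written in C is sketched here. The function names are hypothetical, 32-bit unsigned registers are assumed, and variant (b) takes the complemented mask as its argument to mirror the case where only m̄ is cheap to load.

```c
#include <stdint.h>

/* (a) Three-XOR exchange with the middle step restricted by the mask. */
void swap_fields_a(uint32_t *x, uint32_t *y, uint32_t m)
{
    *x ^= *y;
    *y ^= *x & m;      /* y becomes x0 where m = 1, stays y0 where m = 0 */
    *x ^= *y;
}

/* (b) Same exchange using equivalence and the complemented mask. */
void swap_fields_b(uint32_t *x, uint32_t *y, uint32_t m_bar)
{
    *x = ~(*x ^ *y);
    *y = ~(*y ^ (*x | m_bar));
    *x = ~(*x ^ *y);
}

/* (c) Compute the masked difference once, then apply it to both.
   The two final XORs do not depend on each other. */
void swap_fields_c(uint32_t *x, uint32_t *y, uint32_t m)
{
    uint32_t t = (*x ^ *y) & m;
    *x ^= t;
    *y ^= t;
}
```

In (a) and (b) each step depends on the one before it, whereas in (c) the last two operations are independent, which is consistent with the shorter cycle count quoted for (c).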

Exchanging Two Fields of the Same Register

Assume a register x has two fields (of the same length) that are to be swapped, without altering other bits in the register. That is, the object is to swap fields B and D, without altering fields A, C, and E, in the computer word illustrated below. The fields are separated by a shift distance k.

    x:   | A | B | C | D | E |

Straightforward code would shift D and B to their new positions, and combine the words with and and or operations, as follows:

    t₁ ← (x & m) << k
    t₂ ← (x >> k) & m          (logical right shift)
    x' ← (x & m') | t₁ | t₂

Here, m is a mask with 1's in field D (and 0's elsewhere), and m' is a mask with 1's in fields A, C, and E. This code requires nine instructions and four cycles on a machine with unlimited instruction-level parallelism, allowing for two instructions to load the two masks.
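The straightforward shift-and-mask approach might be sketched in C as follows. The function name is hypothetical. The sketch assumes a 32-bit word, 0 < k < 32, that field B lies exactly k bits to the left of field D, and that the two fields do not overlap. It derives m' from m for convenience, rather than loading it as a second constant.

```c
#include <stdint.h>

/* Swap field D (selected by m) with field B (selected by m << k). */
uint32_t swap_same_reg_direct(uint32_t x, uint32_t m, unsigned k)
{
    uint32_t m_keep = ~(m | (m << k));   /* m': 1's in fields A, C, and E */
    uint32_t t1 = (x & m) << k;          /* D moved up into B's position  */
    uint32_t t2 = (x >> k) & m;          /* B moved down into D's position */
    return (x & m_keep) | t1 | t2;
}
```

With x = 0x12345678, m = 0x000000F0, and k = 16, the hex digits 3 and 7 trade places, giving 0x12745638.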

A method that requires only seven instructions and executes in five cycles, under the same assumptions, is shown below [GLS1]. It is similar to the code in column (c) above for interchanging corresponding fields of two registers. Again, m is a mask that isolates field D.

    t₁ ← [x ⊕ (x >> k)] & m    (logical right shift)
    t₂ ← t₁ << k
    x' ← x ⊕ t₁ ⊕ t₂

The idea is that t₁ contains B ⊕ D in position D (and 0's elsewhere), and t₂ contains B ⊕ D in position B. This code, and the straightforward code given earlier, work correctly if B and D are "split fields," that is, if the 1-bits of mask m are not contiguous.
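A C sketch of the exclusive-or method, under the assumptions of a 32-bit word, 0 < k < 32, and field B lying exactly k bits to the left of field D with no overlap. The function name is invented for the example.

```c
#include <stdint.h>

/* Swap field D (selected by m) with field B (selected by m << k). */
uint32_t swap_same_reg_xor(uint32_t x, uint32_t m, unsigned k)
{
    uint32_t t1 = (x ^ (x >> k)) & m;   /* B ^ D in position D, 0 elsewhere */
    uint32_t t2 = t1 << k;              /* B ^ D in position B, 0 elsewhere */
    return x ^ t1 ^ t2;                 /* D ^ (B ^ D) = B and B ^ (B ^ D) = D */
}
```

With x = 0x12345678, m = 0x000000F0, and k = 16, t1 is 0x00000040 (that is, 3 ⊕ 7 in D's position) and the result is 0x12745638. Applying the function a second time restores x, and a mask with non-contiguous 1-bits such as 0x00000909 is handled the same way.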

Conditional Exchange

The exchange methods of the preceding two sections, which are based on exclusive or, degenerate into no-operations if the mask m is 0. Hence, they can perform a conditional exchange of entire registers, of corresponding fields of two registers, or of two fields of the same register, provided m is set to all 1's when some condition c is true and to all 0's when c is false. This gives branch-free code if m can be set up without branching.
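A minimal sketch of a conditional exchange of two whole registers, assuming 32-bit unsigned values. The name `cond_swap` is hypothetical, and negating a 0/1 flag is only one possible way to build the mask. Whether the generated machine code is actually free of branches depends on the compiler and target.

```c
#include <stdint.h>

/* Exchange *x and *y if c is nonzero; otherwise leave both unchanged. */
void cond_swap(uint32_t *x, uint32_t *y, int c)
{
    uint32_t m = -(uint32_t)(c != 0);   /* all 1's if c is true, else 0 */
    uint32_t t = (*x ^ *y) & m;         /* 0 when m is 0, so no change  */
    *x ^= t;
    *y ^= t;
}
```

For example, `cond_swap(&a, &b, a > b)` leaves the smaller value in `a` and the larger in `b`.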