Rather than coding algorithm magic, we can provide a table that gives the magic numbers and shift amounts for a few small divisors. Divisors equal to the tabulated ones multiplied by a power of 2 are easily handled as follows:
1. Count the number of trailing 0's in d, and let this be denoted by k.
2. Use as the lookup argument d/2k (shift right k).
3. Use the magic number found in the table.
4. Use the shift amount found in the table, increased by k.
Thus, if the table contains the divisors 3, 5, 25, and so on, divisors of 6, 10, 100, and so forth can be handled.
This procedure usually gives the smallest magic number, but not always. The smallest positive divisor for which it fails in this respect for W = 32 is d = 334,972, for which it computes m = 3,361,176,179 and s = 18. However, the minimal magic number for d = 334,972 is m = 840,294,045, with s = 16. The procedure also fails to give the minimal magic number for d = -6. In both these cases, output code quality is affected.
Alverson [Alv] is the first
known to the author to state that the method described here works with complete
accuracy for all divisors. Using our notation, his method for unsigned integer
division by d is to set the shift amount and the multiplier
and then do the division by
(that is, multiply and shift right).
He proves that the multiplier m is less than 2W + 1, and that the method gets
the exact quotient for all n expressible in W bits.
Alverson's method is a simpler variation of ours in that it doesn't require trial and error to determine p, and is thus more suitable for building in hardware, which is his primary interest. His multiplier m, however, is always greater than or equal to 2W, and thus for the software application always gives the code illustrated by the "divide by 7" example (that is, always has the add and shrxi, or the alternative four instructions). Because most small divisors can be handled with a multiplier less than 2W it seems worthwhile to look for these cases.
For signed division, Alverson suggests
finding the multiplier for |d| and a word
length of W - 1 (then 2W - 1 m < 2W),
multiplying the dividend by it, and negating the result if the operands have
opposite signs. (The multiplier must be such that it gives the correct result
when the dividend is 2W - 1,
the absolute value of the maximum negative number). It seems possible that this
suggestion might give better code than what has been given here in the case
that the multiplier m
2W. Applying it to signed division by 7
gives the following code, where we have used the relation -x = x?/span> + 1 to avoid a branch:
abs牋 an,n
li牋?M,0x92492493?Magic number, (2**34+5)/7.
mulhu q,M,an牋牋牋?q = floor(M*an/2**32).
shri?q,q,2
shrsi t,n,31牋牋牋?These three instructions
xor牋 q,q,t牋牋牋牋 negate q if n is
sub牋 q,q,t牋牋牋牋 negative.
This is not quite as good as the code we gave for signed division by 7 (six vs. seven instructions), but it would be useful on a machine that has abs and mulhu but not mulhs.