In order to compute r^n efficiently, we write

    n = d_m * 2^m + d_(m-1) * 2^(m-1) + ... + d_1 * 2 + d_0 * 1,

where each d_i is either 0 or 1; in other words, the values d_i are the digits in the base-2 expansion of n.
Now it follows that

    r^n = r^(d_m * 2^m) * r^(d_(m-1) * 2^(m-1)) * ... * r^(d_1 * 2) * r^(d_0).

So we have to compute the m powers r^2, r^4, ..., r^(2^m), each one the square of the one before it, and we have to multiply together those powers r^(2^i) for which d_i is not zero.
For example, to compute r^43 we note that

    43 = 32 + 8 + 2 + 1,

hence

    r^43 = r^32 * r^8 * r^2 * r^1,

and our calculation might actually go:
    result := 1
    temp := r
    result := result * temp    (result = r)
    temp := temp * temp        (temp = r^2)
    result := result * temp    (result = r^3)
    temp := temp * temp        (temp = r^4)
    temp := temp * temp        (temp = r^8)
    result := result * temp    (result = r^11)
    temp := temp * temp        (temp = r^16)
    temp := temp * temp        (temp = r^32)
    result := result * temp    (result = r^43)

for a total of 9 multiplications.
Letting m = int(log2(n)), you should see that we need m multiplications to compute the higher powers (one squaring for each of r^2 through r^(2^m)), and in the worst case, we might need m more steps to multiply each power into the result. Therefore, this method requires about 2*log2(n) multiplications to compute an integer power.
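Here is a minimal sketch of this method in Python (the language and the name power are my own choices; the text above gives only pseudocode). It keeps exactly the two quantities from the trace, result and temp, reading the binary digits of n from low to high:

    def power(r, n):
        # Compute r**n in about 2*log2(n) multiplications.
        # temp runs through r, r^2, r^4, ...; result accumulates the
        # powers r^(2^i) whose binary digit d_i is 1.
        result = 1
        temp = r
        while n > 0:
            if n & 1:                  # low-order digit d_i is 1
                result = result * temp
            temp = temp * temp         # advance to r^(2^(i+1))
            n >>= 1
        return result

    assert power(3, 43) == 3 ** 43

Note that this version wastes one final squaring of temp after the last digit is consumed; the hand trace above stops early, which is how it gets away with 9 multiplications for r^43.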
Except possibly for a few small values of n, this method beats the pants off the simple-minded method of repeated multiplication by a fixed factor r. However, surprisingly, this method is not guaranteed to be optimal. For instance, you should see a simple way to compute r^9 in 4 multiplications (two to form r^3, two more to cube it), whereas the method we described above takes at least 4 and possibly 5, depending on whether you're clever about initializing the accumulator to r instead of 1. (No scheme can beat 4 here, since each multiplication can at most double the largest exponent on hand. The smallest power where the binary method truly loses is r^15: it needs 6 multiplications even at its cleverest, while computing r^3 and then raising it to the fifth power takes only 5.)
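To make "optimal" precise: the fewest multiplications that can produce r^n is the length of the shortest addition chain for n, a sequence 1 = a_0, a_1, ..., a_k = n in which every term is the sum of two earlier terms (each sum corresponds to one multiplication). A small exhaustive search, my own illustration rather than anything from the original puzzle, confirms the counts above:

    def shortest_chain_length(n, limit=10):
        # Length of the shortest addition chain for n, by iterative
        # deepening: try chains of length 0, 1, 2, ... until one reaches n.
        def extend(chain, depth):
            last = chain[-1]
            if last == n:
                return True
            # prune: even doubling at every remaining step can't reach n
            if depth == 0 or last * 2 ** depth < n:
                return False
            sums = sorted({a + b for a in chain for b in chain
                           if last < a + b <= n}, reverse=True)
            return any(extend(chain + [s], depth - 1) for s in sums)
        for k in range(limit + 1):
            if extend([1], k):
                return k
        return None

    assert shortest_chain_length(9) == 4    # e.g. 1, 2, 4, 8, 9
    assert shortest_chain_length(15) == 5   # e.g. 1, 2, 3, 6, 12, 15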
As I was explaining the details of this last puzzle to my officemate Greg, he interrupted me and said, "But why are you doing it that way? Can't you keep all the intermediate products and use them whenever you want?" I realized that we were thinking of the problem's unspoken restrictions differently. Greg assumed that we had lots of memory, as though we were writing our calculations on paper, and could refer back to an older result at any time. I was used to the programming idea that you have limited memory, and you should overwrite old results with the most recent ones, unless you are willing to set aside special storage for a few valuable quantities. How you choose to understand the restrictions on the problem will certainly affect your results!
An interesting extension of this puzzle occurs when you allow r to be a matrix. First of all, this makes it much harder to justify storing all the intermediate results. It's more likely that you're going to have to commit to keeping 2 or 3 temporary matrices, but not an arbitrary set.
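The two-temporary routine sketched earlier carries over unchanged, since it never stores more than result, temp, and their product. Here is a sketch with 2x2 matrices as nested tuples (again my own illustration; matmul2 is a hypothetical helper, and the Fibonacci matrix is just a convenient test case):

    def matmul2(a, b):
        # Multiply two 2x2 matrices given as nested tuples.
        return ((a[0][0] * b[0][0] + a[0][1] * b[1][0],
                 a[0][0] * b[0][1] + a[0][1] * b[1][1]),
                (a[1][0] * b[0][0] + a[1][1] * b[1][0],
                 a[1][0] * b[0][1] + a[1][1] * b[1][1]))

    def matrix_power(m, n):
        # Same binary method, same two temporaries. Powers of a single
        # matrix commute with one another, so the order of the products
        # doesn't matter even though matrix multiplication in general
        # is not commutative.
        result = ((1, 0), (0, 1))    # the 2x2 identity
        temp = m
        while n > 0:
            if n & 1:
                result = matmul2(result, temp)
            temp = matmul2(temp, temp)
            n >>= 1
        return result

    # Powers of ((1, 1), (1, 0)) generate the Fibonacci numbers:
    assert matrix_power(((1, 1), (1, 0)), 10)[0][1] == 55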
A related puzzle concerns the problem of efficiently multiplying a sequence of rectangular matrices of varying (but compatible) dimensions. Matrix multiplication is associative. To compute the value of A*B*C, for instance, you can compute either A*(B*C) or (A*B)*C. However, the total amount of work done, measured by the number of numerical multiplications, can vary enormously depending on the order chosen. Can you determine a pattern to this work, or reasonable rules for trying to minimize the cost of the multiplication?
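One well-known answer to that closing question, not spelled out in the original, is the classic dynamic program: the cheapest parenthesization of a chain is built from the cheapest parenthesizations of its contiguous sub-chains. A sketch, counting a p-by-q times q-by-s product as p*q*s numerical multiplications:

    def matrix_chain_cost(dims):
        # Minimum number of numerical multiplications needed to compute a
        # product of matrices, where matrix i has shape dims[i] x dims[i+1].
        k = len(dims) - 1                    # number of matrices
        cost = [[0] * k for _ in range(k)]   # cost[i][j]: multiply i..j
        for span in range(1, k):             # sub-chains of growing length
            for i in range(k - span):
                j = i + span
                cost[i][j] = min(
                    cost[i][m] + cost[m + 1][j]
                    + dims[i] * dims[m + 1] * dims[j + 1]
                    for m in range(i, j))
        return cost[0][k - 1]

    # A is 10x100, B is 100x5, C is 5x50:
    # (A*B)*C costs 5000 + 2500 = 7500 multiplications,
    # A*(B*C) costs 25000 + 50000 = 75000.
    assert matrix_chain_cost([10, 100, 5, 50]) == 7500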