When we defined the Rational
class, we didn't
reduce the fractions, so we ended up with numbers like 10/8
and 4/6. To avoid this, we could reduce the fractions in the
constructor by dividing both the numerator and the denominator
by the GCD of the two.
The mathematical definition is:
The greatest common divisor (GCD) of two integers m and n is the greatest integer that divides both m and n with no remainder.So, we'd like a procedure
Thus, the problem is:
Given integers m and n such that m >= n > 0, find the GCD of m and n.To solve this problem, the mathematical definition isn't enough. It tells us what we want, but not how to get it. Before we can write a procedure, we need to know how to compute the value.
We need an algorithm: a method for computing process.
PROCEDURE | vs. | ALGORITHM | vs. | PROCESS |
step by step description of the activity | the idea of how to perform the activity | activity |
So far, the procedures we have written contained only simple formulae and occasional conditional statements. We have essentially translated the specifications directly into code. Now we need to come up with an algorithm, a way to compute the results that does not fall out immediately from the statement of the problem.
Before considering possible GCD algorithms, let's design
algorithms for some simpler problems.
Example:
Recall that n! = n*(n-1)*(n-2)*...*2*1, and that 0! = 1. In
other words,factorial
We want to build a procedure
Let's try to devise an algorithm straight from the mathematical definition.
int factorial(int n) { if (n == 0) return 1; else return (n * factorial(n-1)); }Will this work? Let's try using the substitution model.
factorial(0) => 1 factorial(3) 3 * factorial(2) 3 * 2 * factorial(1) 3 * 2 * 1 * factorial(0) 3 * 2 * 1 * 1 => 6This looks good. We've done a reduction of the factorial problem to a smaller instance of the same problem. This idea of reducing a problem to itself is known as recursion.
Let's pick apart the code:
int factorial(int n) { if (n == 0) // Termination condition (base case) return 1; // Result for base case else return (n * factorial(n-1)); // ^ -------------- // | ^ // | |___ Recursive call // | // |___ Combining step (used to combine other values with // results of the recursive call) }Hints in writing recursive procedures:
For example,
This corresponds very closely to what actually happens on the execution stack in the computer's memory.
Example:
Let's do another example. Suppose that addition were not
built into Java, and all you had to use was the increment
operator, add
++
, and decrement operator,
--
. (++j
increments j
by 1.) Can you build an addition procedure?
Let's assume i
>= 0. Then we can rewrite
i+j
as (i-1)+(j+1)
. This way, we
bring i
closer and closer to 0 until it reaches
0. Then we can return j
.
Algorithm idea: At each step, subtract one from
i
and add one to j
until
i
is 0. Then return j
. More
formally,
int add(int i, int j) { if (i == 0) return j; else return add(--i, ++j); }Using the substitution model,
add(3, 7) add(2, 8) add(1, 9) add(0, 10) => 10Notice that there is no combining. The result of the recursive call is the final result. This is known as tail recursion.
Algorithm idea: BRUTE FORCE -- test every integer from 2 up to n-1 to see if any divides n with no remainder.
In order to try values in the range 2 to n-1, we'll need a procedure that tests for factors of n in a range, say k to n-1.
boolean factorInRange(int k, int n) { if (k >= n) // is the range empty? return false; // the range is empty so there are // no factors in the range else if ((n % k) == 0) // is n divisible by k? return true; // yes, we found a factor, namely k else // k is not a factor return factorInRange(k+1, n); // so let's see if there's a factor // among the values in the rest // of the range }Given this, we can write:
boolean isPrime(int n) { return ((n > 1) && !factorInRange(2, n)); } isPrime(16) (16 > 1) && !factorInRange(2, 16) true && !true => false isPrime(5) true && !factorInRange(2, 5) factorInRange(3, 5) factorInRange(4, 5) factorInRange(5, 5) false true && !false => true isPrime(2) true && !factorInRange(2, 2) true && !false => true isPrime(35) true && !factorInRange(2,35) factorInRange(3,35) factorInRange(4,35) factorInRange(5,35) true true && !true false isPrime(0) false (doesn't even call factorInRange)
The series is 0 1 1 2 3 5 8 13 21 ...
Let's write a recursive procedure for Fibonacci numbers directly from the definition:
int fib(int n) { if (n <= 1) return n; else return (fib(n-1) + fib(n-2)); }This combines results from 2 different recursive calls. This is sometimes known as "deep" recursion, or in other cases "divide and conquer."
Let's consider all the recursive calls for fib(5).
This returns the correct answer, but it takes a long time, since there are many calls. This is because there's a lot of repeated work. For example, the call to fib(4) repeats the calculation of fib(3) (see the circled regions of the tree). In general, when n increases by 1, we roughly double the work; that makes about 2n calls!
Let's try to think of another algorithm that is less wasteful. When we compute the series on paper, what do we do?
0 1 1 2 3 5 8 13 21 34 ...We look at the previous two numbers and add them to get the next number. We don't recompute the previous two, we just write them down (or remember them).
Let's use this algorithm. We'll start the implementation by writing a helper procedure whose parameters are n, a number k, and the values of fib(k) and fib(k-1). Think of k as the place we've come so far in writing out the series. When k reaches n, we're done.
int helpFib(int n, int k, int fibk, int fibk1) { if (n == k) return fibk; else return helpFib(n, k+1, fibk+fibk1, fibk); }Now we can write (assuming n>=1):
int fib(int n) { return helpFib(n, 1, 1, 0); } fib(6) helpFib(6, 1, 1, 0) helpFib(6, 2, 1, 1) helpFib(6, 3, 2, 1) helpFib(6, 4, 3, 2) helpFib(6, 5, 5, 3) helpFib(6, 6, 8, 5) => 8This is much better -- this tail recursive procedure needs n iterations to compute fib(n). That is much better than 2n.
The lesson here is that being clever about the algorithm can
yield significant savings.
Example: GCD
Now let's return to the problem of computing GCD's.
(Here, we assume m >= n > 0.) Recall: The
greatest common divisor (GCD) of m and n is the
largest integer that divides both m and n with
no remainder.
GCD Algorithm 1: Brute Force
The idea is to try all integers from n down until
finding one that divides m and n evenly.
First, define tryDivisor
that takes in m,
n, and a guess. If the guess works, then it returns
the guess. Otherwise, it tries a smaller guess.
int tryDivisor(int m, int n, int g) { if (((m % g) == 0) && ((n % g) == 0)) return g; else return tryDivisor(m, n, g - 1); }Now we can reduce GCD to
tryDivisor
:
int gcd(int m, int n) { return tryDivisor(m, n, n); // use n as our first guess } gcd(6, 4) tryDivisor(6, 4, 4) tryDivisor(6, 4, 3) tryDivisor(6, 4, 2) => 2This works, but for large numbers, this could take a while. Let's consider another algorithm.
The basis of the algorithm is the following fact:
Why is this true? We can rewrite m as follows:
Now any divisor d common to m and n must divide the first term with no remainder, since it is the product of n and an integer. Therefore, d must also divide the second term since d divides m and m is the sum of the two terms.
Since any divisor common to m and n must divide the remainder of m/n, we know that, in particular, the gcd does, since it is a common divisor. It just happens to be the greatest such divisor.
So by taking the GCD(n, remainder of m/n), we can "close in quickly" on the GCD of m and n. (This is extremely clever -- you wouldn't be expected to come up with something like this in an algorithm question on a CS101 exam.)
Now we can write:
int gcd(int m, int n) { if ((m % n) == 0) return n; else return gcd(n, m % n); } gcd(468, 24) gcd(24, 12) => 12 gcd(135, 19) gcd(19, 2) gcd(2, 1) => 1Euclid's GCD algorithm is very fast, but divison (taking remainders) is a more time-consuming operation than simple addition and subtraction. Can we devise a GCD algorithm that doesn't use division?
The idea: If m>n, GCD(m,n) is the same as GCD(m-n,n).
Why? If m/d and n/d both leave no remainder, then (m-n)/d leaves no remainder. (Again, this is clever. Most graduate students probably couldn't come up with this if they haven't already seen it.)
This leads to the following algorithm:
int gcd(int m, int n) { if(m == n) return m; else if (m > n) return gcd(m-n, n); else return gcd(m, n-m); }An example using the substitution model:
gcd(468, 24) gcd(444, 24) gcd(420, 24) ... gcd(36, 24) gcd(12, 24) (Now n is bigger) gcd(12, 12) (Same) => 12This does accomplish the calculation with no division.