SICP 1.2 - Procedures and the Processes They Generate, Part 2

Covering SICP Exercises 1.11 through 1.13

Table of Contents

Exercise 1.11 ✅

Exercise 1.11: A function f is defined by the rule that $f(n)=n$ if $n<3$ and $f(n)=f(n−1)+2f(n−2)+3f(n−3)$ if $n≥3$. Write a procedure that computes f by means of a recursive process. Write a procedure that computes f by means of an iterative process.

Recursive Process ✅

(define (fr n)
  (if (< n 3) n
      (+
       (fr (- n 1))
       (* 2 (fr (- n 2)))
       (* 3 (fr (- n 3))))))

Iterative Process ✅

Discussion

Let's start with (fi 3) (the "i" standing for iterative).

For (fi n) where n = 3:

(fi (- n 1)) = (fi 2) = 2

(fi (- n 2)) = (fi 1) = 1

(fi (- n 3)) = (fi 0) = 0

To figure out (fi 3), we want sum up (fi 2) (or 2), 2 multiplied by (fi 1) (or 2), and 3 times (fi 0) (or 0). We get a total of 4 from doing this.

Suppose we want to figure out (fi 4). To figure out (fi 4), we'll need the values of (fi 3), (fi 2), and (fi 1). When n = 3 for (fi n), those values are:

for (fi 3), the sum of (fi 2), 2 times (fi 1), and 3 times (fi 0). This is 4.

for (fi 2), the value of (fi (- n 1), which is 2.

for (fi 1), the value of (fi (- n 2), which is 1.

So within (fi 3), we can generate all the values we need for calculating (fi 4).

We could think of there being three key values. For a given n, there is the value of fi one step back (fi (- n 1), and the value of fi two step backs, and the value of fi three steps back ((fi (- n 2)).

We might call these 1ago, 2ago, and 3ago.

For (fi 3), these values are 2, 1, 0.

For (fi 4), these values are 4, 2, 1.

For (fi 5), these values are 11, 4, 2.

For (fi 6), these values are 25, 11, 4.

For (fi 7), these values are 50, 25, 11.

Notice that a new value is generated using the existing values in the first slot (for 1ago), and then that value recurs as we increase n by 1, first as 2ago, then as 3ago.

In order to have an iterative procedure that can solve Exercise 1.11 for arbitrary values, we will want to do a recursive call to that iterative procedure with appropriate transformations of these values.

So, suppose we start at a calculation of (fi 3) as our initial starting place with initial values of 2, 1, 0 for 1ago, 2ago, 3ago. How will we transform these values into a suitable form for calculating (fi 4)?

As our new 1ago argument, we'll want to provide (+ 1ago (* 2 2ago)(3 3ago)). This is 4.

As our new 2ago argument, we'll want to provide our current 1ago argument, which is (fi 2) or 2.

As our new 3ago argument, we'll want to provide our current 2ago argument, which is (fi 1) or 1.

With these values provided, our procedure will be able to calculate (fi 4) by engaging in the appropriate transformations of them per the rule specified in the exercise.

First Attempt (wrong) ❌

I went awry partially cuz I was doing unnecessary operations because I hadn't thought in sufficient detail about which operations were necessary. He's some mistaken code that that I wrote:

;bad wrong code; not the answer!

(define (fi n)
  (if (< n 3) n
      (fi-helper 2 1 0 3 n)))

(define (fi-helper 1ago 2ago 3ago current max)
  (if (= current max)
      (+ 1ago (* 2 2ago) (* 3 3ago))
      (fi-helper
       (+ 1ago 2ago 3ago) ;WRONG!
       (* 2 1ago)   	  ;WRONG!
       (* 3 2ago)         ;WRONG!
       (+ current 1)
       max)))

Multiple problems here.

One is that, for some reason, I'm doing a different operation on the numbers in my recursive call to fi-helper (specifically, (+ 1ago 2ago 3ago)) than I do as the consequent for the if (specifically, (+ 1ago (* 2 2ago) (* 3 3ago)). It was only after going through and writing all the stuff above that I realized clearly that these should be the same operation, and specifically that they should be (+ 1ago (* 2 2ago) (* 3 3ago)).

The second issue is that I am unnecessarily multiplying the 1ago and 2ago values by 2 and 3, respectively, in providing them as arguments for my recursive call. Going through the above analysis helped me realize that these values should be left untouched.

Working Solution ✅

Below is a fixed and functioning fi program. Note that I also keep track of which n value we're currently calculating (using current) and the final n we need to reach (with max).

(define (fi n)
  (if (< n 3) n
      (fi-helper 2 1 0 3 n)))

(define (fi-helper 1ago 2ago 3ago current max)
  (if (= current max)
      (+ 1ago (* 2 2ago) (* 3 3ago))
      (fi-helper
       (+ 1ago (* 2 2ago) (* 3 3ago))
       1ago
       2ago
       (+ current 1)
       max)))
Looking at Other Solutions

I compared my solution to other people's.

Here's one that's similarly structured:

define (f n)
  (if (< n 3) n
      (f-iter 2 1 0 (- n 2))))

(define (f-iter n1 n2 n3 left)
  (if (= left 0) n1
      (f-iter (+ n1 (* 2 n2) (* 3 n3))
              n1
              n2
              (- left 1))))

Note, though, that instead of using two separate current and max values and increasing the current value until it is equal to the max value, this guy just has 1 value, left, that increments down to 0. That's less complicated, so it's better.

Other solutions nested the helper procedure inside the main procedure, like Anne's and Sebastien's. I'm not sure how I feel about that style yet.

Exercise 1.12 ✅

The following pattern of numbers is called Pascal’s triangle.

The numbers at the edge of the triangle are all 1, and each number inside the triangle is the sum of the two numbers above it. Write a procedure that computes elements of Pascal’s triangle by means of a recursive process.

Discussion

I found the book's specification of the problem really vague. I found a Reddit thread by a guy who said he had a Bachelor's in math and was struggling about what it was asking for and gave a bunch of possibilities, and the reply he got was basically that you're supposed to figure it out and that's part of the exercise.

I watched Brian Harvey talking about a Pascal's triangle procedure a bit but didn't really look at the body of his procedure and stopped the video when it came on screen (cuz I wanted to figure things out for myself, just needed a bit of a hint to get started).

Harvey said that the pascal procedure he was demonstrating takes a row and a column. If column = 0, then the you're on the left edge of the triangle and the answer is 1. That makes sense since the 0 column will always be the first column in any given row.

He also says that if row = column, then you're on the right edge of the triangle and the answer is 1.

I think that the "tip" of the triangle - the 1 at the summit - would be defined by such a procedure as being located at (0, 0).

So anyways from this description, I got the sense that basically Harvey's procedure would take a column and row value and give you the index at that location. I decided to try implementing this approach in my ownprocedure.

So let's assume the format is (column, row). For the row that appears below the summit...

...the left 1 would be at (0, 1) and the right 1 would be at (1, 1).

And then the next row down would have the range of values from(0, 2) through (2, 2), and the row after that would have the range of possible values from (0, 3) through (3, 3), and so on.

So let's say we want to figure out a particular value at a point. Let's say we want (1, 2):

While, for some values, there will be lots of recursive calls, at any particular step, we're only going to have to worry about 2 values.

So to figure out that, we've got to look at the values in row 1.

So we are starting at (1, 2) and we're going to want to look at (0, 1) (the left 1) and (1, 1) (the right 1).

So we started at an initial index we can call (c, r). To calculate the value at that index, we look to two places:

the value at the place where column and row are both reduced by 1, and

the value at the place where just the row has been reduced by 1.

And then we want to add these two values.

Another way of saying this is to calculate some given (c, r) value we want to figure out the value at the index where c = (- c 1) and r = (- r 1), and also figure out the value at the index where c is the same and r = (- r 1).

There are "edges" that to our triangle that are relevant to our calculaions. Suppose we tried to get the value of (0, 1) in our hypothetical Pascal's triangle procedure, which would be the 1 in the lower left on the following diagram:

(Side note: this site has decent visuals on Pascal's triangle)

If we follow the method I laid out and try to get the value of the index where c = (- c 1) and r = (- r 1) for a (c,r) of (0, 1), then we wind up trying to calculate the value of (-1, 0), which represents the location of the 0 in the upper left of the diagram. I think whenever we have a negative value in our index values, we want to return 0 to have everything come out okay. One thing that came out in testing was that we also want to return 0 whenever column exceeds row. A negative column indicates we've gone "off" the left side of our triangle in looking for something to add together to get the value underneath, but a column exceeding the row indicates we've gone "off" the right side in the same way, so we want to return 0 for that.

In the diagram above, the 0 depicted on the right side would be at index (2, 1), but the only valid indices in that row are where the 1 values are - these would be indices (0,1) and (1,1) respectively. If we don't handle this kind of case, then we'll have problems.

So, boiling down the above stuff to a simple statement in English:

  1. If either column or row are less than 0, or if column is greater than row, we're looking "past" the "edge" of our triangle, and so we want to return 0 as the value of such a location.
  2. In the summit or peak, at location (0,0), we want to return 1.
  3. For all other indices at some (c, r), we determine their value by summing the value of the indices directly "above" the current value. The first of these indices is calculated by subtracting 1 from both the current c and r, and the second of these values is calculated by using the current c and subtracting 1 from the current r.

Solution

In code, this is:

(define (pascal column row)
  (cond  ((or(< column 0)
             (< row 0)
             (> column row)) 0)
         ((and (= row 0)(= column 0)) 1)
        (else (+ (pascal (- column 1)(- row 1))
                 (pascal column (- row 1))))))

Looking at Other Solutions

I looked at other people's solutions and noticed some differences. I was curious if I could come up with a solution that was more precise, so I investigated what the differences were.

Here's aelanteno's:

(define (pascal row column)
  (cond ((= column 1) 1)
        ((= column row) 1)
        (else (+ (pascal (- row 1) (- column 1))
                 (pascal (- row 1) column)))))

This solution starts out the counting at 1 and has the arguments in a different order.

The ((= column 1) 1) defines the left boundary of the triangle. Whenever the column is 1, indicating that you're at the left most side of the triangle, then the value returned should be 1.

((= column row) 1) defines the right boundary of the triangle - column and row being equal indicates you're at the rightmost side of the triangle in whatever row you are in and thus that the value should be 1.

The recursive case is essentially identical to mine except for the order.

I tried "translating" this procedure to something closer to mine by changing the order of the arguments and having it start at 0, so I could compare the solutions "side by side" more and examine the differences. Here's my "translated" version:

(define (pascal column row)
  (cond ((= column 0) 1)
        ((= column row) 1)
        (else (+ (pascal (- column 1) (- row 1))
                 (pascal column (- row 1))))))

And here is my current solution:

(define (pascal column row)
  (cond  ((or (< column 0)
             (< row 0)
             (> column row)) 0)
         ((and (= row 0)(= column 0)) 1)
        (else (+ (pascal (- column 1)(- row 1))
                 (pascal column (- row 1))))))

So we can see that the recursive case is identical, and the only real differences are in the non-recursive cases.

So I think a short summary of how I was thinking of the problem is helpful. My first cond clause, the big or, is all about defining no-go zones "outside" the triangle for which the procedure should return 0. I think ruling out negative numbers for column and row makes intuitive sense, and as I said earlier, if column is greater than row then that indicates we've gone off the edge. After that, in the second line, I explicitly defined the summit of the triangle. with ((and (= row 0)(= column 0)) 1). So my conception of the problem was this: we have the summit. That's the really significant/important base case. Then we have some no-go zones outside the triangle's boundaries where we want to return 0. And for everything else, we'll return a value based on the recursive definition of pascal in else, with the number values ultimately "flowing down" from recursive calculations that ultimately originate in that summit with the value of 1.

A different way of thinking about the problem, which leads to a simpler implementation in code that seems more common than how I went about it, is to define the entire left and right boundaries of the triangles as 1 explicitly by returning a 1 when the indices of column or column+row indicate that you're at the boundary of the triangle. So the triangle's entire boundaries become the base case. These base cases include the summit. And you don't have to worry about going "outside" the triangle because whenever you hit an edge you just return 1, full stop.

So one thing worth noting after this comparison is that while in my head I was thinking of the summit of the triangle as the "important" base case, what I effectively did was define the coordinates just outside the triangle's edge as another base case - one for which I would return 0. This led to a bit of extra complexity. OTOH, my program is more robust at handling out-of-range inputs without looping, since while I had the coordinates just beyond the "edge" of the triangle in mind as the important case, I actually handled a whole range of other cases in how I implemented my solution.

Exercise 1.13 ❌🤯

(Needed this one explained to me in detail and at length - thanks Max. While I'm not giving myself "credit" for having figured this out without help, I persisted quite a lot and figured a bunch of stuff out 🙂).

For a discussion of mathematical induction in the context of a simpler example, see this earlier post.

The definition of Fibonacci numbers:

Starting with the Hint

The problem starts by asking you to prove that $Fib(n)$ is the closest integer to $\frac{\psi^n}{\sqrt5}$ but then gives you a big "hint". We start with the hint in our analysis. So we start with using mathematical induction to try to prove that $Fib(n)= \frac{\phi^n - \psi^n}{\sqrt{5}}$.

Some Initial Facts

So we start with the definition of Fibonacci numbers:

$Fib(n) = Fib(n - 1) + Fib (n - 1)$

And we also know

$\psi = \frac{1 - \sqrt{5}}{2}$

$\phi = \frac{1 + \sqrt{5}}{2}$

Some other stuff we can figure out:

$\phi^2 = \phi + 1$

This is shown from:

$\phi^2 = (\frac{1 + \sqrt{5}}{2})^2 $

$= \frac{1 + 2\sqrt{5} + 5}{4}$

$= \frac{6 + 2\sqrt{5}}{4}$

$= \frac{3 + \sqrt{5}}{2}$

$= \frac{1 + \sqrt{5}}{2} + 1$

$= \phi + 1$

And we can by similar means show that

$\psi^2 = \psi + 1$

So those are some things we can know up front.

Base Cases

With Max's help I moved onto evaluating some base cases.

$Fib(1) = 1$

So we want to show that

$\frac{\phi^1 - \psi^1}{\sqrt{5}} = 1$

Step 1: $\frac{\phi^1 - \psi^1}{\sqrt{5}}$

Step 2: $\frac{1}{\sqrt{5}}(\frac{1 + \sqrt{5}}{2} - \frac{1 - \sqrt{5}}{2})$

Step 3: $\frac{1}{\sqrt{5}}{(\frac{2\sqrt{5}}{2})}$

Step 4: 1

So $Fib(2) = 1$ per our definition of $Fib(n)$.

So we want to show that:

$\frac{\phi^2 - \psi^2}{\sqrt{5}} = 1$

Step 1: $\frac{\phi^2 - \psi^2}{\sqrt{5}}$

Step 2: $\frac{(\phi - \psi)({\phi + \psi})}{\sqrt{5}}$

Step 3: $\frac{1}{\sqrt{5}}{(\phi - \psi)({\phi + \psi})}$

Step 4: $\frac{1}{\sqrt{5}}{( \frac{1 + \sqrt{5}}{2} - \frac{1 - \sqrt{5}}{2})({\frac{1 + \sqrt{5}}{2} + \frac{1 - \sqrt{5}}{2}})}$

Step 5: $\frac{1}{\sqrt{5}}{(\frac{2\sqrt{5}}{2})({\frac{2}{2}})}$

Step 6: 1

Inductive Hypothesis

This is where I started to have lots of difficulty.

Suppose that $n$ is some $k$. Assuming that $Fib(k)= \frac{\phi^k \,- \,\psi^k}{\sqrt{5}}$ where $k = n - 1$ and $k = n - 2$, show that it works when $k = n$.

In other words, show that the following is true:


$\frac{\phi^{k-1} \,- \,\psi^{k - 1}}{\sqrt{5}} + \frac{\phi^{k-2} \,-\, \psi^{k - 2}}{\sqrt{5}} = \frac{\phi^{k} \,-\, \psi^{k}}{\sqrt{5}} $

Inductive Step

$\frac{\phi^{k-1} \,-\, \psi^{k - 1}}{\sqrt{5}} + \frac{\phi^{k-2} \,-\, \psi^{k - 2}}{\sqrt{5}}$

$= \frac{\phi^{k-1} \,-\, \psi^{k - 1} \, + \, \phi^{k-2} \,-\, \psi^{k - 2}}{\sqrt{5}}$

$= \frac{\phi^{k-1} \, + \, \phi^{k-2} \,-\, \psi^{k - 1} \,-\, \psi^{k - 2}}{\sqrt{5}}$

$= \frac{\phi^{k-2}\phi^1 \, + \, \phi^{k-2} \,-\, \psi^{k - 2}\psi^1 \,-\, \psi^{k - 2}}{\sqrt{5}}$

(note: the step above follows from the fact that $a^m = a^{m + 1 - 1} = a^{m - 1} \times a^1$)

$= \frac{\phi^{k-2}(\phi^1 + 1) \,-\, \psi^{k - 2}(\psi^1+\, 1)}{\sqrt{5}}$

$= \frac{\phi^{k-2}(\phi^1 + 1) \,-\, \psi^{k - 2}(\psi^1+\, 1)}{\sqrt{5}}$

Next, per some stuff we partially proved in "Some Initial Facts",

$= \frac{\phi^{k-2}(\phi^2) \,-\, \psi^{k - 2}(\psi^2)}{\sqrt{5}}$

$= \frac{\phi^{k-2+2} \,-\, \psi^{k - 2 + 2}}{\sqrt{5}}$

$= \frac{\phi^{k} \,-\, \psi^{k}}{\sqrt{5}}$

Doubling Back

Ok, now that we've shown that $\frac{\phi^{k-1} \,- \,\psi^{k - 1}}{\sqrt{5}} + \frac{\phi^{k-2} \,-\, \psi^{k - 2}}{\sqrt{5}} = \frac{\phi^{k} \,-\, \psi^{k}}{\sqrt{5}} $, we can turn back to the initial question, which asked us to prove that $Fib(n)$ is the closest integer to $\frac{\phi^n}{\sqrt{5}}$.

So for every n we can give to $Fib(n)$, we get an integer out, cuz that's how $Fib(n)$ is defined - it adds up integers. Now, if we want to say that $Fib(n)$ is the closest integer to $\frac{\phi^n}{\sqrt{5}}$, we have to show that the absolute value of the difference between $Fib(n)$ and $\frac{\phi^n}{\sqrt{5}}$ is less than or equal to 1/2. At this point I'm going to link a solid explanation I found for Exercise 1.13 which appears in a Google Group. The figure below comes from that explanation:

So again, $fib(n)$ is always an integer. If $fib(n)$ for some given $n$ were to be greater than $\frac{1}{2}$ away from $\frac{\phi^n}{\sqrt{5}}$ for some given $n$, then that would mean that there was some other integer, besides $fib(n)$, which was the closest integer.

Consider an example: $fib(10)$. That's 55. And $\frac{\phi^{10}}{\sqrt{5}}\approx 55$ (it's a tiny bit over 55). But if $\frac{\phi^{10}}{\sqrt{5}}\approx 55.6$, that would mean another integer was closer to $\frac{\phi^{10}}{\sqrt{5}}$ than $fib(10)$ (namely, 56).

So that's all by way of saying that we want to prove that $\left| fib(n) - \frac{\phi^n}{\sqrt{5}} \right| \leq \frac{1}{2}$, which will show that $Fib(n)$ is the closest integer to $\frac{\phi^n}{\sqrt{5}}$.

So let's prove that $\left| fib(n) - \frac{\phi^n}{\sqrt{5}} \right| \leq \frac{1}{2}$

This is where that whole business we went through in the previous section "Starting with the Hint" comes in handy. Because we know already know that $Fib(n)= \frac{\phi^n - \psi^n}{\sqrt{5}}$. So we can substitute the left side:

$\left| \frac{\phi^n - \psi^n}{\sqrt{5}} - \frac{\phi^n}{\sqrt{5}} \right| \leq \frac{1}{2}$

And then we're left with:

$\left| -\frac{\psi^n}{\sqrt{5}} \right| \leq \frac{1}{2}$

and since we don't have to worry about the negative inside absolute value bars:

$\left|\frac{\psi^n}{\sqrt{5}} \right| \leq \frac{1}{2}$

The $\psi^n$ part is the part whose sign varies depending on the $n$. We don't have to worry about negative signs hiding in the square root of 5, so we can do:

$\frac {\left|\psi^n \right| }{\sqrt{5}} \leq \frac{1}{2}$

${\left|\psi^n \right| } \leq \frac{\sqrt{5}}{2}$

The highest value ${\left|\psi^n \right| }$ achieves is 1, when n = 0. Past that, for each increase in $n$, ${\left|\psi^n \right| }$ becomes smaller. $\left|\psi^1 \right| = \frac{1 - \sqrt{5}}{2} \approx 0.61$, and from there on we're just multiplying more and more sub-1 values together as we increase $n$. So ${\left|\psi^n \right| }$ is always $\leq \frac{\sqrt{5}}{2}$

Why Do We Care Again?

On the Critical Fallibilism Discourse forum, there was some discussion of an unanswered question from a previous post starting around here. I had asked why the book was talking about the golden ratio and the inefficiency of the recursive fibonacci procedure at an earlier point in the book (before Exercise 1.13) and Max said:

So the point was just to rigorously show the complexity of the fib procedure as we had implemented it.

Following Up

More on Exercise 1.10

I realized that I forgot to check my answers against other people's in my last post.

Turns out I was right, but there was one point that I thought warranted further clarification.

For the part of the problem that asked for a concise mathematical definition of (define (h n) (A 2 n)), I saw a couple of answers along the lines of aelanteno's here that used notation that looks more standard, using standard exponentiation and a composition of functions instead of the Rudy Rucker notation for tetration:

Basically I think the analysis is like:

(g n) = $g(n)$ = $2^n$

(h n) = $h(n)$ =(A 2 n)

$g(h(n)) $

I liked the Rudy Rucker notation where you do e.g. ${^{3}2}$ for $2^{2^{2}}$ because it matches my intuitions better. I basically want to include the "bottom" 2 as part of the "tower" of stuff that is getting repeatedly exponentiated (or maybe the word is tetrated). If I used regular exponents, I'd have to use something like $2^{n-1}$