Iterative vs Recursive Methods for Problems that Benefit from Backtracking

In many an interview I have been asked to write a function that returns the ith element of the Fibonacci sequence.

Not having memorised an implementation by heart, the first thing that comes to mind is a loop. A basic version could look like this:
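Something along these lines, as a rough sketch (the function name and variable names are illustrative):

```swift
// Iterative Fibonacci: two base cases, then a loop that builds the sequence up to the ith element.
func fibonacci(_ i: Int) -> Int {
    if i == 1 { return 0 }          // 1st element
    if i == 2 { return 1 }          // 2nd element
    var previous = 0
    var current = 1
    for _ in 3...i {                // from the 3rd element onwards
        let next = previous + current
        previous = current
        current = next
    }
    return current
}
```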

The first two conditionals handle the base cases: the 1st and 2nd elements are given as 0 and 1 respectively, and the loop starts from the 3rd element onwards.

At this point the interviewer will rest his hand on his chin and ask: “Could it be done differently?” There is only one correct answer: “Yes! Recursively.” Then you write the implementation:
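Again as a sketch:

```swift
// Recursive Fibonacci: each call spawns two more until the base cases are hit.
func fibonacci(_ i: Int) -> Int {
    if i == 1 { return 0 }
    if i == 2 { return 1 }
    return fibonacci(i - 1) + fibonacci(i - 2)
}
```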

It looks a lot simpler and more elegant than the first version, so when the interviewer asks which one is better you are tempted to say “the second”. But no, you should say the first (after explaining that, from a results perspective, it does not really matter). That is because the two have different complexities*: the iterative version is O(n) while the recursive one is O(2^n).

* For an in-depth explanation as to why and for a better implementation of the Fibonacci algorithm see this.

What does all this mean?

Well, N is a variable that denotes the size of the problem an algorithm is asked to solve. If we were asked to write a function that prints all the integers from a given value down to 0, then N would be equal to that value, because that is how many steps we need to take to reach 0. A basic implementation could consist of nothing more than a variable and a loop that prints the current value, subtracts one and continues.
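As a rough sketch, assuming a countdown(from:) signature:

```swift
// Iterative countdown: one variable, one loop, no extra memory per iteration.
func countdown(from start: Int) {
    var current = start
    while current >= 0 {
        print(current)
        current -= 1    // reuse the same variable for the next value
    }
}
```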

How many times will the loop run? N times! If the start value is 3, we perform the printing and subtraction 3 times; if it is 100, then 100 times, and so on. Since we are talking about time, this defines the time complexity of the algorithm. We also notice that the function does not allocate any new memory across its iterations and simply reuses the same variable to store the new value, so it needs as much memory for a thousand iterations as it does for one. This means that the space complexity is constant.


We can write the same function recursively…
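A sketch, keeping the same signature:

```swift
// Recursive countdown: every value gets its own function call (and its own stack frame).
func countdown(from start: Int) {
    let current = start              // local variable alongside the start argument
    print(current)
    if current > 0 {                 // escape condition: stop once we reach 0
        countdown(from: current - 1)
    }
}
```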

Again, it looks simpler and it will still execute as many times as the value of start. But is that all there is to it? It turns out we often need to account for space complexity too, as it can change from one implementation to another, as it does here, and crash the program when it runs out of memory to allocate. Our recursive solution might not allocate any memory dynamically (just as the iterative one did not), but each function call grows the call stack, where the local variables and arguments of each function are stored. This is an expensive resource and must be accounted for. The number of times the function will be called is proportional to the number we are counting down from, and there is one stack frame per call… so:


The stack

Without going into too much detail, every time a function executes, a chunk of memory is allocated to hold its local variables as well as the arguments passed to it. In the countdown function we initialise one variable, current, and pass another to the function, start, both being integers (64-bit signed integers in Swift), so the stack frame should be at least 16 bytes in size to accommodate them. After every iteration, for an input of 3, the stack could look something like this:

Above we see that all the action essentially happens in one frame: the frame of the countdown function, which is called only once (one call, one frame). What happens in the recursive solution, however, is different.


As the function calls itself recursively, a new stack frame is prepared for each call. There, the value passed as an argument is copied and space is provided for the local variable current. When the escape condition is met, the function that was called last (the top of the stack) returns, which triggers a cascade of returns until we reach the original caller and continue with the rest of our program.

So in the iterative version the stack does not grow, while in the recursive one it does. It grows in proportion to the number of times the function calls itself, which is proportional to the value we are counting down from, hence O(N) space complexity.

At this point we might be ready to stop and conclude that iterative is always better. If we take another look at the diagrams above, however, we will notice one more thing. The recursive version does not use that memory for nothing: it keeps the state of the previous function calls intact, something the iterative version does not do, and it is therefore possible to return to one of them! Why would we want to do that? Why would we want to move backwards in our solution (and make it less solved) rather than forward?

NOTE: The example above is theoretical and could be handled in a number of ways depending on the compiler. More specifically, there is something called tail call optimisation which, for our purposes, is simply a rewrite of the code (by the compiler) in which the recursive call is turned into an iteration, for the space complexity reasons we discussed. It can only happen when the recursive call is the last statement of the function, which means that backtracking is not possible; it therefore cannot happen in the examples below.
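To make the distinction concrete, here is a small illustrative sketch: in the first function the recursive call is in tail position, in the second it is not, because work remains to be done after the call returns.

```swift
// Tail call: the recursive call is the last thing the function does, so a compiler
// may rewrite it into a loop. Nothing can run after the call returns.
func countdownTail(from start: Int) {
    print(start)
    guard start > 0 else { return }
    countdownTail(from: start - 1)
}

// Not a tail call: the second print runs after the recursive call returns, so each
// frame must be kept alive. This is the shape that makes backtracking possible.
func countdownAndBack(from start: Int) {
    print(start)
    if start > 0 {
        countdownAndBack(from: start - 1)
    }
    print(start)    // runs on the way back up the stack
}
```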

The ‘Correct Change’ problem (and the need for Backtracking)


When moving through the space of a given problem there is always the chance we might hit a dead end. The simplest way to visualise this is an actual maze: whatever algorithm we use to solve it, there is always a chance we will reach a (this time physical) dead end. We then need to get back to a point we know was fine and continue from there along another path.

To illustrate, take the following example (I tried really hard to find the simplest possible one and still did not manage to get it down to something basic, so bear with me).

Imagine the following problem: working behind the till, we would like to give back the correct amount of change using the coins we have at hand*. Let’s represent our set of coins with a list of positive integers [1, 2, 3, 4, 5, 6, 7, 8, 9] and say that we are trying to reach a sum of 10. An iterative solution might look like the sketch below.

* This is a simplification of the Change-making problem, where we do not care about the number of coins we give out.

Swift-wise, I have created an extension on Array that picks an element at random, removes it from the array and returns it.
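Both pieces could look something like the following sketch (the removeRandom() name and the exact structure are assumptions rather than the original code):

```swift
extension Array {
    /// Picks an element at random, removes it from the array and returns it.
    /// Assumes the array is not empty.
    mutating func removeRandom() -> Element {
        let index = Int.random(in: 0..<count)
        return remove(at: index)
    }
}

// Iterative attempt: keep drawing coins at random and add each one to the sum,
// as long as it does not overshoot the target.
func makeChangeIteratively(from coins: [Int], target: Int) -> Int {
    var remaining = coins
    var sum = 0
    while remaining.count > 0 && sum < target {
        let pick = remaining.removeRandom()
        if sum + pick <= target {
            sum += pick
        }
    }
    return sum    // our best guess: it may fall short of the target (a 9 or an 8 instead of 10)
}

print(makeChangeIteratively(from: [1, 2, 3, 4, 5, 6, 7, 8, 9], target: 10))
```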

If we run the code above we might get a 10. If we run it multiple times, however, we will see that the value fluctuates and that we sometimes get a 9 or even an 8, two whole steps away from the optimal solution. Why is this happening? Imagine that the first number we pick is the 1 and the second is the 8. We have reached 9, but there is no other number that can take us closer to 10, as the smallest we have left is the 2. In other words, we are stuck. The iterative solution will try every other number in the array and, when it has exhausted its options, return its best guess. The problem occurs because of our (un)lucky start. If we could backtrack from this dead end and try a different second pick, we might get a 7 instead of the 8. Then we would eventually hit the 2 that remains in the list and reach our beloved 10!

This is what the recursive solution does:
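A sketch of what it might look like, reusing removeRandom() from before; the inner _solve function, its candidates and current parameters and the crt variable follow the description below, while the exact details are illustrative:

```swift
func makeChange(from coins: [Int], target: Int) -> Int {

    func _solve(_ candidates: [Int], current: Int) -> Int {
        var candidates = candidates   // value type: this call gets its own copy of the coins
        var crt = current             // the sum reached so far along this path
        while candidates.count > 0 {
            let pick = candidates.removeRandom()
            if current + pick > target { continue }   // this coin would overshoot, skip it
            crt = current + pick
            if crt == target { return target }        // exact change reached
            // Recurse with the coins that are left. If that path dead-ends, control
            // comes back here and the loop tries another draw: this is the backtracking.
            if _solve(candidates, current: crt) == target {
                return target
            }
        }
        return crt   // all draws exhausted: give this path up and hand back what we reached
    }

    return _solve(coins, current: 0)
}

print(makeChange(from: [1, 2, 3, 4, 5, 6, 7, 8, 9], target: 10))   // always reaches 10
```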

Our inner function keeps making recursive calls until it reaches the target value. If it fails after having exhausted all possible draws, it returns its best attempt (the crt variable), handing control back to the previous caller which, after failing the “if _solve(candidates, current: crt) == target {…” check, will attempt a new draw from the array.

The array is copied over because in Swift it is a value type (a struct) rather than a reference, so every time we make a draw and remove a value, the arrays of the previous callers are not affected.

At a very high level, the stack might be represented like so:

For simplicity I represented only the two variables passed to the _solve inner function after its first recursive call. Obviously this does not capture everything that happens on the physical stack, which depends on many low-level factors all the way down to the hardware. The diagram above is an abstract view of a much more complex process, but it should capture the essence of the logical part of the computation.

Sample execution: at the second call of _solve we draw an 8 after we already had a 1, and the loop below it exhausts all its values (every remaining coin would overshoot), at which point it returns its best attempt (candidates.count == 0). Returning from a function pops the frame and takes us back to the previous one, where the current value was still 1. Because the returned value does not equal the target, the if _solve(… condition is not met and the loop continues. On the next iteration we draw a 7 and call recursively. This time we draw a 2, which brings us to 10. The function returns the target, causing a cascade of successful checks back down the stack to the calling function, which receives a 10.

What Happened to our Complexity?

Let’s briefly calculate the complexity of the two solutions. In the iterative one we pick a coin (number) at random and, whether or not we can add it to the sum without going over the target, we continue by picking another until the coins are exhausted. So we can say that the loop will run as many times as there are coins, in other words N times, and the complexity is linear at O(N).

What about the second solution? We can think of it this way: each call only loops over its available coins, since every pick removes a coin from the array. The worst possible case is the one where every call picks a path that eventually leads to a dead end and is forced (through backtracking) to try again with the next pick. Each call will therefore loop over the number of coins it has been handed and make a recursive call passing an array with one coin fewer, until the array is exhausted. This is essentially the problem of enumerating all subsets of a given set, so the complexity is O(2^N).