2  Analysis of Algorithms

“Programming is the art of telling another human being what one wants the computer to do.”
– Donald E. Knuth

2.1 Computational Problems

Let us first define a computational problem. A (computational) problem is a well-specified task stated in general terms, specifying the desired output for any acceptable input to the problem.

For example, sorting a list of numbers is a computational problem. It stipulates that an acceptable input must be a list or collection of real numbers, and the desired output must be a list with the same numbers but arranged in increasing (equivalently decreasing) order.

Keep in mind that the task of sorting a specific list (e.g. [0, -1, 2, 0]) should not be called a computational problem, since it is not given in general terms. When considering such a specific input to the problem, we may rather call it an instance of the sorting problem as discussed in the previous paragraph.

Formally, we define the sorting problem as follows:

  • Input: an array of numbers A=\langle a_1, a_2, \ldots, a_n\rangle.

  • Output: A reordering or permutation \langle a_1', a_2', \ldots, a_n'\rangle of the array A such that a_1'\leq a_2'\leq\ldots\leq a_n.

Exercise 2.1 Name five computational problems that we come across in our daily lives.

Solution:
  • adding two numbers

  • sorting an array

  • finding the minimum value is an array

  • searching a query value in an array

  • finding the shortest path between two vertices in a graph

We use algorithms to solve computational problems.

2.2 What is an Algorithm?

An algorithm is any well-defined computational procedure that takes some value (or set of values) as input and produces some value as output. Thus, an algorithm is a sequence of (finite) computational steps that transform its input to an output.

We may have different algorithms to solve the same computational problem. However, they often differ dramatically in their efficiency.

Writing Pseudocode

In this course, we mostly rely on pseudocodes for describing algorithms. Their implementation in Python will be an less significant step. Pseudocode delineates the logic of an algorithm without explicitly using the syntax of a programming language.

Convention

We use the following conventions in our pseudocode:

  • Indentation indicates block structure.

  • The looping constructs while, for, and repeat-until and the if-else conditional construct have interpretations similar to Python.

  • The symbol // indicates that the remainder of the line is a comment.

  • Variables (such as i, j , and key) are local to the given procedure. We won’t use global variables without explicit indication.

  • We access array elements by specifying the array name followed by the index in square brackets. For example, A[i] indicates the ith element of the array A.

  • We typically organize compound data into objects, which are composed of attributes.

  • We pass parameters to a procedure by value: the called procedure receives its own copy of the parameters, and if it assigns a value to a parameter, the change is not seen by the calling procedure

  • A return statement immediately transfers control back to the point of call in the calling procedure.

  • The boolean operators and and or are short circuiting.

We give a few examples first.

2.3 Example Algorithms

Let us develop and analyze a few example algorithms.

SUM-ARRAY

Description: SUM-ARRAY is a computational problem that computes the sum of the n numbers in array A[0:n-1].

  • Input: An array or list A[0:n-1] containing n numbers.

  • Output: The sum of the numbers in A[0:n-1].

Example 2.1 If the input array is A = <1, 0, -1, 2>, then the return must be 1 + 0 + (-1) + 2 = 2.

Pseudocode
\begin{algorithm} \caption{SUM-ARRAY} \begin{algorithmic} \Procedure{SUM-ARRAY}{$A[0:n-1]$} \State $sum = 0$ \For{$i = 0$ \To $n-1$} \State $sum = sum + A[i]$ \State $i = i + 1$\Comment{This line is not needed in Python} \EndFor \Return $sum$ \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.2 (Dry)-run SUM-ARRAY on the input A=<-2, 3, 5, 0, -1>.

Exercise 2.3 Implement SUM-ARRAY in Python.

Solution
def SUM_ARRAY(arr):
    sum = 0
    n = len(arr)
    for i in range(n):
        sum = sum + arr[i]
    return sum

Exercise 2.4 Identify at least three corner cases for SUM-ARRAY. Describe how you would address them in the code.

SUM-ARRAY-POS

Description: SUM-ARRAY-POS is a computational problem that computes the sum of the positive numbers in an array A[0:n-1] of size n.

  • Input: An array or list A[0:n-1] containing n numbers.

  • Output: The sum of the positive numbers in A.

Example 2.2 If the input is the array A = <1, 0, -1, 2>, then the return will be 1 + 2 = 3.

Pseudocode
\begin{algorithm} \caption{SUM-ARRAY-POS} \begin{algorithmic} \Procedure{SUM-ARRAY-POS}{$A[0:n-1]$} \State $sum = 0$ \For{$i = 0$ \To $n-1$} \If{$A[i] > 0$} \State $sum = sum + A[i]$ \EndIf \State $i = i + 1$\Comment{This line is not needed in Python} \EndFor \Return $sum$ \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.5 (Dry)-run SUM-ARRAY-POS on the input A=<-2, 3, 5, 0, -1>.

Exercise 2.6 Implement SUM-ARRAY-POS in Python.

Solution
def SUM_ARRAY_POS(arr):
    sum = 0
    n = len(arr)
    for i in range(n):
      if arr[i] > 0:
        sum = sum + arr[i]
    return sum

Exercise 2.7 Identify at least three corner cases for SEARCH-ARRAY. Describe how you would address them in the code.

SEARCH-ARRAY

Description: SEARCH-ARRAY is a computational problem that finds an element in an array.

  • Input: An array or list A[0:n-1] containing n numbers and a query value x.

  • Output: An first index i such that A[i]=x, otherwise -1.

Example 2.3 If the inputs are the array A = <1, 0, -1, 2> and the value x = 0, then the return will be 1.

Pseudocode
\begin{algorithm} \caption{SEARCH-ARRAY} \begin{algorithmic} \Procedure{SEARCH-ARRAY}{$A[0:n-1]$, $x$} \For{$i = 0$ \To $n-1$} \If{$A[i] = x$} \Return $i$ \EndIf \State $i = i + 1$ \EndFor \Return $-1$ \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.8 (Dry)-run SEARCH-ARRAY on the input A=<-2, 3, 5, 0, -1> and x=4.

Exercise 2.9 Implement SEARCH-ARRAY in Python.

Solution
def SEARCH_ARRAY(arr, x):
    n = len(arr)
    for i in range(n):
        if arr[i] == x:
          return i
    return -1

Exercise 2.10 (MIN-ARRAY) Develop an algorithm to find the minimum value in an array A[0:n-1].

Solution
\begin{algorithm} \caption{MIN-ARRAY} \begin{algorithmic} \Procedure{MIN-ARRAY}{$A[0:n-1]$} \State $min=A[0]$ \For{$i = 1$ \To $n-1$} \If{$A[i] < min$} \State $min=A[i]$ \EndIf \State $i = i + 1$ \EndFor \Return $min$ \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.11 Identify at least three corner cases for MIN-ARRAY. Describe how you would address them in the code.

Exercise 2.12 What would happen if you replaced >=with > (or vice-versa) in MIN-ARRAY.

Exercise 2.13 (SUM-MATRIX) Develop an algorithm for summing all the elements in a square matrix A=[a_{i,j}] of size n\times n.

Solution
\begin{algorithm} \caption{SUM-MATRIX} \begin{algorithmic} \Procedure{SUM-MATRIX}{$A[0:n-1][0:n-1]$} \State $min=0$ \For{$i = 0$ \To $n-1$} \For{$j = 0$ \To $n-1$} \State $sum=sum + A[i][j]$ \State $j = j + 1$ \EndFor \State $i = i + 1$ \EndFor \Return $sum$ \EndProcedure \end{algorithmic} \end{algorithm}

The Strand Puzzle

2.4 Analyzing Algorithms

Analyzing an algorithm means to predict the computational resources (e.g. computation time, memory) needed to run the algorithm on different instances or inputs. But how do we compute the running time of an algorithm?

Exercise 2.14 Do you think it is easy to compute the time taken by an algorithm?

Although such a task is almost impossible, we can still pretty easily analyze the complexities of an algorithm—at least conceptually!

Let’s us first review our basic intuition behind running-time of an algorithm:

  • it should depend on the input size

  • the larger the input size, the more time needed

  • running-time should be machine and compiler dependent

In order to make the task easier, we sum the running times for each statement executed. We usually denote the running time of an algorithm on an input of size n by T(n).

SUM-ARRAY

For each line in the algorithm, we assign a cost symbolically, then count how many times the line is visited. This is summarized in the following table.

Line Cost Times
2 c_1 1
3 c_2 n
4 c_3 n
5 c_4 n
6 c_5 n
7 c_5 1

So, T(n)=(c_1 + c_5) + (c_2 + c_3 + c_4)n=a + bn, where a\coloneqq(c_1 + c_5) and b\coloneqq(c_2 + c_3 + c_4). This is a linear function of the input size.

SUM-MATRIX

Exercise 2.15 Analyze the running time of SUM-MATRIX.

Solution

We refer to the pseudocode of SUM-MATRIX for this analysis.

Line Cost Times
2 c_1 1
3 c_2 n
4 c_3 n\times n
5 c_4 n\times n
6 c_5 n\times n
8 c_6 n
10 c_7 1

So, the running time T(n)=(c_1 + c_7) + (c_2 + c_6)n + (c_3 + c_4 + c_5)n^2=a + bn+cn^2, where a\coloneqq(c_1 + c_7), b\coloneqq(c_2 + c_6), and c\coloneqq(c_3 + c_4 + c_5). This is a quadratic function of the input size.

SEARCH-ARRAY

For this analysis, we first note the best case and the worst case inputs.

The best case is when the value x is the first element of the input array A. The search halts in the very first iteration. In this case, the running time is T(n)=a for some constant a.

The worst case occurs when the value x is the last element of the input array A or x does not belong to A. The search, in this case, loops through all the elements of the array. In this case, the running time is T(n)=a + bn for some constants a,b.

Bubble Sort

BUBBLE-SORT is yet a sorting technique as described in the following pseudocode:

\begin{algorithm} \caption{BubbleSort} \begin{algorithmic} \Procedure{BubbleSort}{$A[0:n-1]$} \For{$i=0$ to $n-2$} \For{$j=n-1$ downto $i+1$} \If{$A[j]<A[j-1]$} \State{exchange $A[j]$ with $A[j-1]$} \EndIf \EndFor \EndFor \EndProcedure \end{algorithmic} \end{algorithm}

Exercise 2.16 In 2-3 sentences, describe the strategy of BUBBLE-SORT in plain English.

Solution. The algorithm works be iteratively pushing the smaller elements towards the front of the array. In each iteration, it maintains two properties:

  1. the subarray A[0:i] is always sorted in increasing order, and

  2. no element of the subarray A[i+1:n-1] is strictly smaller that the elements of the subarray A[0:i].

Then, each iteration of i extends the subarray A[0:i] by appending to its end the smallest element of the other subarray A[i+1:n-1]. In a way, larger elements “bubble up” in the latter.

Exercise 2.17 Why does it suffice for the outer for-loop to run from 0 to n-2, leaving out the index n-1?

Solution. When i becomes n-2, then A[0:n-3] is sorted and A[n-2:n-1] (just two elements!) is unclean.

Note that the BUBBLE-SORT procedure makes sure that that these two elements are the largest two elements of A. If A[n-2:n-1] is already sorted, we are done.

If not, i.e., A[n-2]>A[n-1], then the inner for-loop exchanges. Hence, the entire array becomes sorted.

Exercise 2.18 Describe the worst-case input to BUBBLE-SORT.

Solution. There are two ways to argue about this.

  1. Note that for any input array, the running time is still quadratic. Therefore, there is no best- or worst-case input to BUBBLE-SORT.

  2. Let us not think in terms of big-O notation, rather think of the worst input in terms of economizing on computational steps. The worst-case input is an array of numbers in decreasing order. In this case, in the inner for-loop line #5 is always executed.

Exercise 2.19 Analyze the worst-case running time of BUBBLE-SORT.

Solution. In case of the worst-case input, line #5 is always executed. And, the running time is O(n^2), due to the nested for-loops.

It is optional to produce a table with the computational costs attached to the steps in the algorithm. However, some level justification for the asymptotic running time is expected.

Exercise 2.20 Implement BUBBLE-SORT in Python.

Solution.

def bubble_sort(arr):
  n = len(arr)
  for i in range(0, n-1):
    for j in range(n-1, i+2):
      if arr[j] < arr[j-1]:
        # exchange arr[j] and arr[j-1]
        arr[j], arr[j-1] = arr[j-1], arr[j] 
  return

Turing Machine taken from Markoš, Kelemen, 2004