Algorithm analysis: how do you do it?
Learn how to do algorithm analysis. See examples and step-by-step demonstrations of analyzes performed on real algorithms.
11/09/2023
Algorithm analysis is an important discipline of Computer Science. It aims to provide more assertive methods to measure the performance of algorithms or, as it is called, their time complexity. This activity is of fundamental importance for every programmer or systems analyst, because it provides means of:
- design more efficient algorithms;
- judging and choosing efficient algorithms.
In this article, the initial concepts of the discipline of algorithm analysis will be addressed.
You will understand:
- what causes the need for this type of analysis;
- the “analytical model” to determine the time complexity of an algorithm;
- the “formal mathematical definition” for algorithm analysis.
IMPORTANT!
Before starting to study algorithm analysis, it is essential that you already know the concepts of “conditional statements” and “repetition structure”, which are basic concepts of computer programming.
Good studies!
An empirical attempt at analysis
Algorithm analysis can be easily performed by empirical methods (based on practical experience), such as measuring the time in seconds that the algorithm takes to process an input. You could measure the elapsed time before and after running an algorithm and, based on that result, make inferences about its performance.
The code below illustrates an example (using the Java language) of this empirical way of calculating time complexity. It measures the running time of the algorithm that calculates the nth term of the Fibonacci sequence (lines 15-19).
1.public static void main(String[] args) {
2. int n;
3. long start, end;
4. double cpu_time_used;
5. Scanner sc = new Scanner(System.in);
6. System.out.println("Enter the nth term of the Fibonacci sequence");
7. n = sc.nextInt();
8. start = System.nanoTime();
9. int result = fib(n);
10. end = System.nanoTime();
11. cpu_time_used = (double)(end-start) / 1_000_000_000.00;
12. System.out.println("The "+n+"th term of the Fibonacci = "+result);
13. System.out.println("Execution time was "+cpu_time_used+" seconds.");
14.}
15.public static int fib(int n) {
16. if(n==2 || n==1)
17. return 1;
18. return fib(n-1) + fib(n-2);
19.}
Dissecting the code
Let’s dissect this code into parts so you can understand it better.
On lines 15-19 there is the function/method that calculates the nth term of the Fibonacci sequence.
15.public static int fib(int n) {
16. if(n==2 || n==1)
17. return 1;
18. return fib(n-1) + fib(n-2);
19.}
In line 8, the initial time record is obtained in nanoseconds; the time before the execution of the algorithm.
8.start = System.nanoTime();
In line 9, the fib() function is invoked, which executes the Fibonacci algorithm.
9.int result = fib(n);
In line 10, the final time record is obtained, also in nanoseconds; the time after running the algorithm.
10.end = System.nanoTime();
In line 11, the execution time is calculated, comparing the before and after.
11.cpu_time_used = (double)(end-start) / 1_000_000_000.00;
Finally, on line 13, the execution time is displayed.
13.System.out.println("Execution time was "+cpu_time_used+" seconds.");
Results
Below are the execution times (in seconds) for some entries of this algorithm:
n=40: execution time around 0.4 seconds
n=41: execution time around 0.8 seconds
n=42: execution time around 1.2 seconds
n=43: execution time around 2 seconds
n=44: execution time around 3.1 seconds
n=45: execution time around 5 seconds
You might think this way of measuring performance is reasonable. However, this is a very bad idea.
These results were generated with the algorithm written in the Java language, using the Eclipse IDE, in an Intel Core i7 notebook. If these specs were to change, the results would be different, and therein lies the problem!
The running time of an algorithm, measured in units of time, can be influenced by several factors, such as:
- computer processing speed;
- used programming languages;
- compilers;
- operational system; to name a few.
These factors invalidate the confidence of the performance results, as it will never be possible to know for sure the processing time of the algorithm precisely.
See below this empirical time measurement code in several different programming languages.
1.import java.util.Scanner;
2.class Main {
3. public static void main(String[] args) {
4. int n;
5. long start, end;
6. double cpu_time_used;
7. Scanner sc = new Scanner(System.in);
8. System.out.println("Enter the nth term of the Fibonacci sequenced");
9. n = sc.nextInt();
10. start = System.nanoTime();
11. int result = fib(n);
12. end = System.nanoTime();
13. cpu_time_used = (double) (end - start) / 1_000_000_000.00;
14. System.out.println("The " + n + "th term of the Fibonacci = " + result);
15. System.out.println("Execution time was seconds " + cpu_time_used + " seconds.");
16. }
17. public static int fib(int n) {
18. if (n == 2 || n == 1)
19. return 1;
20. return fib(n - 1) + fib(n - 2);
21. }
22.}
1.#include <stdio.h>
2.#include <sys/time.h>
3.int fib(int n);
4.int main(void) {
5. int n;
6. struct timeval start_d, end_d;
7. long long start, end;
8. float cpu_time_used;
9. printf("Enter the nth term of the Fibonacci sequenced: ");
10. scanf("%d", &n);
11. gettimeofday(&start_d, NULL);
12. int result = fib(n);
13. gettimeofday(&end_d, NULL);
14. start = start_d.tv_sec * 1000LL + start_d.tv_usec / 1000;
15. end = end_d.tv_sec * 1000LL + end_d.tv_usec / 1000;
16. cpu_time_used = (end - start) / 1000.0f;
17. printf("The %dth term of the Fibonacci = %d", n, result);
18. printf("Execution time was seconds %f seconds.", cpu_time_used);
19.}
20.int fib(int n) {
21. if (n == 2 || n == 1)
22. return 1;
23. return fib(n - 1) + fib(n - 2);
24.}
1.#include <stdio.h>
2.#include <sys/time.h>
3.int fib(int n);
4.int main() {
5. int n;
6. struct timeval start_d, end_d;
7. long long start, end;
8. float cpu_time_used;
9. std::cout << "Enter the nth term of the Fibonacci sequenced: ";
10. std::cin >> n;
11. gettimeofday(&start_d, NULL);
12. int result = fib(n);
13. gettimeofday(&end_d, NULL);
14. start = start_d.tv_sec * 1000LL + start_d.tv_usec / 1000;
15. end = end_d.tv_sec * 1000LL + end_d.tv_usec / 1000;
16. cpu_time_used = (end - start) / 1000.0f;
17. std::cout << "The " << n << "th term of the Fibonacci = " << result << std::endl;
18. std::cout << "Execution time was seconds " << cpu_time_used << " seconds.";
19.}
20.int fib(int n) {
21. if (n == 2 || n == 1)
22. return 1;
23. return fib(n - 1) + fib(n - 2);
24.}
1.using System;
2.class Program{
3. public static void Main(string[] args){
4. int n;
5. long start, end;
6. double cpu_time_used;
7. Console.Write("Enter the nth term of the Fibonacci sequenced: ");
8. n = Convert.ToInt32(Console.ReadLine());
9. start = DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond;
10. int result = fib(n);
11. end = DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond;
12. cpu_time_used = (double)(end - start) / 1000.0;
13. Console.WriteLine("The " + n + "th term of the Fibonacci = " + result);
14. Console.WriteLine("Execution time was seconds " + cpu_time_used + " seconds.");
15. }
16. public static int fib(int n){
17. if (n == 2 || n == 1)
18. return 1;
19. return fib(n - 1) + fib(n - 2);
20. }
21.}
1.import time
2.
3.def fib(n):
4. if (n == 2 or n == 1):
5. return 1
6. return fib(n - 1) + fib(n - 2)
7.
8.print("Enter the nth term of the Fibonacci sequenced: ")
9.n = input()
10.start = round(time.time() * 1000)
11.result = fib(int(n))
12.end = round(time.time() * 1000)
13.cpu_time_used = (end - start) / 1000.0
14.print("The " + str(n) + "th term of the Fibonacci = " + str(result))
15.print("Execution time was seconds " + str(cpu_time_used) + " seconds.")
The analytical method for analyzing algorithms
The analytical method is a more assertive way to measure the time complexity of algorithms. It is not intended to calculate execution time in seconds, minutes, or some other real representation of time. It aims to find a mathematical expression to translate performance into amount of instructions. It is an approach that seeks to account, in the algorithm itself, the amount of instructions it executes.
In this way, it is possible to measure the performance of the algorithm completely independently of programming languages, compilers, tools and other physical factors that may invalidate the analysis.
To this end, the RAM model (Random Access Machine) is the strategy used to measure performance analytically. It counts the instructions an algorithm executes and ignores all aspects of processing speed, programming language or any other type of hardware optimization.
In the RAM model, each operation (such as assignments, comparisons and initializations) consumes a constant and unique computational cost. See below several examples of computational operations commonly found in algorithms, all with the same computational cost:
Declaration and assignment of variables
1.int value = 0; // a single computational cost
2.float sal = 2005.15; // a single computational cost
Increment or decrement of variables
1.value++; // a single computational cost
2.value--; // a single computational cost
Complex math operations
1.value*(4/Math.sqrt(9)); // a single computational cost
Access to array elements
1.list[i]; // a single computational cost
Logical operations
1.if(value > 10); // a single computational cost
2.if(value==3 || value==2); // a single computational cost
Read and write operations
1.System.out.println("Hello World!"); // a single computational cost
2.scanner.nextInt(); // a single computational cost
That is, all computational operations will have a single computational cost!
It is obvious that, in reality, these instructions have significant performance differences. Some are physically faster to execute than others. However, as we are interested in analyzing the performance of the algorithm based on the ammount of instructions, the RAM model ignores such differences. Mainly due to the fact that these differences have more to do with the computers than with the algorithm itself.
Analysis by instruction counting
To find the time complexity of an algorithm using the analytical method, just count, line by line, the number of times each instruction is executed.
Below is a code of a simple program that we will use as an example to begin to explain the analytical technique of instruction counting.
1.Scanner sc = new Scanner(System.in); // 1 time
2.int age; // 1 time
3.System.out.println("Enter your age:"); // 1 time
4.age = sc.nextInt(); // 1 time
5.if (age >= 18) // 1 time
6. System.out.println("Of legal age:"); // 1 time
7.else
8. System.out.println("Minor:"); // 1 time
After counting the code instructions above, we observe that the algorithm performs a total of 6 instructions (the if-else are mutually exclusive: either execute the if or execute the else).
In this way, we say that T(n)=6 is the time complexity formula of this algorithm.
This way of quantifying the instructions proposed by the RAM model allows comparing several algorithms with each other and stating more precisely which one is faster.
Regardless of the speed of the computers on which they are run, an algorithm that executes a smaller quantity of instructions will always perform better for large volumes of processed data.
Formal mathematical definition
Once the RAM model and the reason for its adoption have been presented, we can proceed to the formal mathematical definition of the analysis of the time complexity of an algorithm.
Given an algorithm to which:
- an entry of size n is submitted;
- which has in its syntax a set of Q instructions (conditional, variable declaration, assignment, among others);
- where each instruction q ∊ Q has the same computational cost;
- is denoting by f(q) the number of times each instruction q is executed in the algorithm (eg, in a repetition structure, which can repeat the same q instruction several times),
the objective of the time complexity analysis, denoted by T(n), is to calculate the sum of the execution quantity of all the instructions q of the algorithm for that input of size n.
Formally, the cost of the algorithm’s time complexity is represented by: ∑q∊Q f(q), that is, the sum of the total executions of each of its instructions.
Analysis of “for” command
To understand the analytical method in more depth, let’s count the instructions in an algorithm that implements a repetition structure (for statement).
Let’s imagine, for example, a very simple function/method; which takes an input of size n and determines a single instruction inside a repeating loop:
1.public static void func(int n){
2. for(int i=0; i<n; i++){
3. print(i);
4. }
5.}
In this case, the instruction count can be performed as follows:
the initialization “int i=0” is an instruction that is executed only once;
2.for(int i=0; i<n; i++){ // 1 time
the quantity of increments “i++” in a for statement is equal to the quantity of repetitions it executes, in this case, n times;
2.for(int i=0; i<n; i++){ // n times
the quantity of executions of the comparison “i<n” is equal to the quantity of executions of the increment “i++” plus 1 (this 1 refers to the comparison that leaves the loop), totaling n+1 times;
2.for(int i=0; i<n; i++){ // n+1 times
finally, inside the loop, we have a single instruction that is executed n times.
3.print(i); // n times
After counting the algorithm instructions, the sum of all executions must be made. By mathematical definition this algorithm has:
- Q = { “int i=0”, “i<n”, “i++”, “print(i)” } // the instructions
- f(“int i=0”) = 1 // amount executed by the 1st instruction
- f(“i<n”) = n+1 // amount executed by the 2nd instruction
- f(“i++”) = n // amount executed by the 3rd instruction
- f(“print(i)”) = n // amount executed by the 4th instruction
- ∑q∊Q f(q) = 1 + (n+1) + n + n = 3n+2 // time complexity cost
Therefore, it can be said that the time complexity of this algorithm is T(n) = 1 + n + (n+1) + n, resulting in the formula T(n)=3n+2.
In this way, one can accurately measure the quantity of instructions executed by the algorithm as a function of its input size n, that is, its performance as a function of input n.
What to do with the complexity formula?
Once the complexity formula of an algorithm is found, some tests can be done to observe its usefulness.
Let’s go back to the example of “for” command analysis.
1.public static void func(int n){
2. for(int i=0; i<n; i++){
3. print(i);
4. }
5.}
We could see in the previous section that this algorithm generates a time complexity formula T(n)=3n+2
From this formula, we can accurately discover the number of instructions performed by the algorithm for each of the n inputs submitted to it. Let’s look at some examples:
If we submit to the algorithm the value n=5, according to the formula, the number of instructions performed would be T(5)=3*5+2 ► T(5)=17.
This means that for an input of size 5, the algorithm executes 17 instructions, that is, T(5)=17.
Following the same reasoning, one can calculate that T(10)=32, as well as that T(15)=47.
Graphical analysis of the algorithm complexity
The greatest utility of a time complexity function is its graphical representation. The graphical representation allows viewing the behavior of the algorithm: the evolution of the number of instructions executed as a function of the size of input n.
See the graph below that illustrates the results for T(5), T(10) and T(15). It is possible to observe how the performance of the algorithm behaves when the value of n tends to grow.
A graph is an excellent way to visualize the T(n) complexity formula of an algorithm.
Exercises
To demonstrate that you have understood well the technique of “analysis by instructions counting” and the “formal mathematical definition” of the time complexity of an algorithm, solve the proposed exercises:
Exercise A
Using the instruction counting technique, find the time complexity formula T(n) for the algorithm below.
1.public static void func(int n){
2. for(int i=0; i<n; i++){
3. print(i);
4. }
5. for(int i=0; i<n; i++){
6. print(i);
7. }
8.}
Exercise B
Using the instruction counting technique, find the time complexity formula T(n) for the algorithm below.
1.public static void func(int n){
2. for(int i=0; i<n/2; i++){
3. print(i);
4. }
5.}
Continue your studies in this course and read the next article about “best and worst case” in algorithm analysis.
David Santiago
Master in Systems and Computing. Graduated in Information Systems. Professor of Programming Language, Algorithms, Data Structures and Development of Digital Games.