© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. IndenPython Challengeshttps://doi.org/10.1007/978-1-4842-7398-2_13

Quick Start O-Notation

Michael Inden1  
(1)
Zurich, Switzerland
 

In this book, the so-called O-notation is used to classify the running time of algorithms. This allows a more formal classification of the complexity of algorithms.

C.1 Estimations with O-Notation

To estimate and describe the complexity of algorithms and classify their running time behavior, it would be impractical to always take measurements. In addition, measurements only reflect the running time behavior under certain restrictions of the hardware (processor clock, memory, etc.). To be able to classify the consequences of design decisions independently of such details and on a more abstract level, computer science uses the so-called O-notation, which indicates the upper bound for the complexity of an algorithm. To do so, you are able to answer the following question: how does a program perform when instead of 1,000 input values, for example, 10,000 or 100,000 input values are processed? To answer this question, the individual steps of an algorithm must be considered and classified. The aim is to formalize the calculation of complexity to estimate the effects of changes in the number of input data on the program running time.

Consider the following while loop as an introductory example:
i = 0                         // O(1)
while i < n:                  // O(n)
    create_person_in_db(i)    // O(1)
    i += 1                    // O(1)

Any single instruction is assigned a complexity of O(1). The loop itself is assigned the complexity O(n) due to the n executions of the loop body.1 Adding these values together, the cost of running the program is thus O(1) + O(n) (O(1) + O(1)) = O(1) + O(n) 2. For an estimation of complexity, constant summands and factors do not matter. Only the highest power of n is of interest. Thus, we get a complexity of O(n) for the program’s illustrated piece. This simplification is permissible since, for larger values of n, the influence of factors and smaller complexity classes is insignificant. For the understanding of the considerations in the following sections, this informal definition should be sufficient.

I would like to quote two sentences by Robert Sedgewick that characterize the O-notation, from his standard work Algorithms [Sed92]: “[...] the O-notation is a useful tool for specifying upper bounds on the running time, which are independent of the input data’s details and the implementation. […] The O-notation proves extremely useful in helping analysts to classify algorithms according to their performance, and by helping algorithms in their search for the ‘best’ algorithms.” (translated from the German book).

C.1.1 Complexity Classes

To be able to compare the running time behavior of different algorithms with each other, seven different complexity classes are usually sufficient. The following bullet points names the respective complexity class and some examples:
  • O(1): The constant complexity results in a complexity that is independent of the number of input data n. This complexity often represents an instruction or a simple computation that consists of a few computational steps.

  • O(log(n)): With logarithmic complexity, the running time doubles when the input data set n is squared. A well-known example of this complexity is binary search.

  • O(n): In the case of linear complexity, the running time grows proportionally to the number of elements n. This is the case for simple loops and iterations, such as a search in an array or a list.

  • O(n x log(n)): This complexity is a combination of linear and logarithmic growth. Some of the fastest sorting algorithms (e. g. Merge Sort) show this complexity.

  • O(n2): When doubling the amount of input data n, the quadratic complexity leads to a quadrupling of the running time. A tenfold increase in the input data already leads to a hundredfold increase in running time. In practice, this complexity is found with two nested for or while loops. Simple sorting algorithms usually have this complexity.

  • O(n3): With cubic complexity, a doubling of n already leads to an eightfold increase of the running time. The naive multiplication of matrices is an example of this complexity class.

  • O(2n): The exponential complexity results for a doubling of n in a squaring of the running time. At first, this does not sound like much. But with a tenfold increase, the running time increases by a factor of 20 billion! The exponential complexity occurs frequently with optimization problems such as the Traveling Salesman Problem, where the goal is to find the shortest path between different cities while visiting all cities.

To cope with the problem of exorbitant running time, the program uses heuristics, which may not find the optimal solution, just an approximation of it, but have much lower complexity and a significantly shorter running time.

Table 13-1 shows impressively which effects the mentioned complexity classes have for different sets of input data n.2
Table 13-1

Effects of Different Time Complexities

n

O(log(n))

O(n)

O(n x log(n))

O(n2)

O(n3)

10

1

10

10

100

1.000

100

2

100

200

10.000

1.000.000

1.000

3

1.000

3.000

1.000.000

1.000.000.000

10.000

4

10.000

40.000

100.000.000

1.000.000.000.000

100.000

5

100.000

500.000

10.000.000.000

1.000.000.000.000.000

1.000.000

6

1.000.000

6.000.000

1.000.000.000.000

1.000.000.000.000.000.000

Based on the values shown, you get a feeling for the effects of different complexities. Up to about O(n x log(n)) the complexity classes are favorable. Optimal and desirable, although not achievable for many algorithms, are the complexities O(1) and O(log(n)). Already O(n2) is usually not favorable for larger input sets, but it can be used for simple computations and smaller values for n without any problems.

Note: Influence of Input Data

Some algorithms behave differently depending on the input data. For Quick Sort, the average case results in a complexity of n x log(n), but this can increase to n2 in the extreme case. Since the O-notation describes the “worst case,” Quick Sort is assigned a complexity of O(n2).

C.1.2 Complexity and Program Running Time

The numbers calculated by a special O-complexity for a set of input values n may sometimes be daunting. Still, they say nothing about the actual execution time, only about its growth when the input set increases. Based on the introductory example, the O-notation makes no statement about the duration of individual calculation steps: The increment i += 1 and the database access create_person_in_db(i) were both rated O(1), even though the database access is several orders of magnitude more expensive than the increment concerning execution time.

For “normal” instructions without accesses to external systems, such as file systems, networks or databases (i.e., additions, assignments, etc.), the impact of n is in many cases not decisive for today’s computers for typical business applications with user interactions. The impact on actual runtime hardly really matters for small n (< 1000) at complexities O(n) or O(n2) and even sometimes at O(n3) nowadays—but this does not mean that you should not use algorithms that are as optimal as possible. Rather, the reverse is true: You can also start with a functionally correct implementation and put it into production. The optimized version may be rolled out sometime later.

All in all, I would like to emphasize once again that even multiple nested loops with the complexity O(n2) or O(n3) are often executed much faster in absolute terms than some database queries over a network with complexity O(n). Similar is true for a search in an array (O(n)) and access to an element of a hash-based data structure (O(1)). For small n, the computation of the hash values can take longer than a linear search. However, the larger n gets, the more the impact of the worse complexity class affects the actual running time.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset