© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. IndenPython Challengeshttps://doi.org/10.1007/978-1-4842-7398-2_5

5. Basic Data Structures: Lists, Sets, and Dictionaries

Michael Inden1  
(1)
Zurich, Switzerland
 

In Python, lists, sets, and key-value mappings (dictionaries) are provided directly in the language as container classes . They manage objects of other classes (also potentially of different data types).

5.1 Introduction

In the following, I first describe sequential data types and especially their operations. After that, you briefly look at lists, sets, and dictionaries.

5.1.1 Sequential Data Types

In Python, sequential data types exist as the basis for various data containers, such as lists, tuples, and strings. The name derives from the fact that these data containers combine sequences of elements. This means that the elements have a defined order and can be addressed via an index. Among other things, the following operations are defined:
  • in: elem in values checks if the element is in the sequence.

  • not in: elem not in values checks if the element is not in the sequence.

  • +/+=: values1 + values2 and values1 += values2 appends the sequence values2 to the other sequence and returns a new sequence.

  • *: Repeats the sequence n times.

  • [index]: values[index] leads to an indexed access and returns the ith element from values. Specifically, [-1] can be used to access the last element.

  • [start:end]: values[start:end] performs slicing and returns the elements from position start to exclusive end from values as a new sequence. There are two interesting variants. On the one hand, [:] (i. e. without range specification) returns a copy of the entire sequence. On the other hand, [start:] and [:end] respectively return the parts starting at start to the end or from the beginning to the index end exclusive.

  • [start:end:step]: values[start:end:step] results in slicing and returns the elements from position start to exclusive end with a step size of step from values as a new sequence. There is an interesting variant [::-1] without range specification and with negative step size, which creates a new sequence in reverse order of the original sequence.

  • len(): len(values) returns the size (i. e. the number of elements in the data container).

  • min()/max(): Calls to min(values) resp. max(values) gets the element with the smallest or largest value from values.

  • sum(): sum(values) sums the values from values.

Example

Let’s look at an example of these operations because understanding them is essential for solving tasks and everyday programming in Python. First, you define some lists and then perform indexed accesses and slicing on them:
names1 = ["Micha", "Tim", "Tom", "Willi"]
names2 = ["Marcello", "Karthi", "Michael"]
names = names1 + names2
print(names)
print(names[-1])
print(names[::-1])
print(names[::2])
print("len: %d, min: %s, max: %s" % (len(names), min(names), max(names)))
This results in the following output:
['Micha', 'Tim', 'Tom', 'Willi', 'Marcello', 'Karthi', 'Michael']
Michael
['Michael', 'Karthi', 'Marcello', 'Willi', 'Tom', 'Tim', 'Micha']
['Micha', 'Tom', 'Marcello', 'Michael']
len: 7, min: Karthi, max: Willi
Custom implementation of rindex() A useful functionality, which unfortunately can only be found in strings but not in sequential containers, is the search from the end with rindex() . However, you can easily implement this as a function or lambda as follows by inverting the sequential container and getting the position there:
def rindex(values, item):
    reversed_values = values[::-1]
    return len(values) - reversed_values.index(item) - 1
last_index_of = lambda values, item: len(values) - values[::-1].index(item) - 1

5.1.2 Lists

A list is a sequence of elements ordered by their position. Duplicates are allowed. Lists provide high-performance indexed access. Since the list is a sequential data type, it possesses all the previously described operators for sequences. In addition, the following indexed, 0-based accesses and operations can be performed on lists:
  • pop(i) returns the ith element of the list and removes it from the list. By default (i. e., without specifying the index), the first element is returned.

  • list[i] = element replaces the element at position i with the passed element.

  • append(element) appends the element to the end of the list.

  • insert(i, element) inserts the element at index i in the list.

  • extend(other_list) appends the elements of other_list to the end of the list.

  • count(value) counts how many times the value value occurs in the list.

  • index(value) returns the index where the value value first occurs in the list.

  • remove(value) removes the element with the value value from the list. Only the first one will be deleted if multiple elements with this value exist. If there is no such value, a ValueError is raised.

  • reverse() reverses the order of the elements in the list.

  • sort() sorts the list in ascending order. An inverse (descending) sort is obtained with sort(reverse=True).

Example

Let’s consider an example of these operations. First, you define two lists and then add elements. You also remove one element and add several elements.
numbers = [1, 2, 3, 4]
names = ["Peter", "Tim", "Mike", "Tom", "Mike"]
names.append("Tom")
names.insert(1, "Carsten")
names.remove("Tom")
print(names)
names.extend(numbers)
names.reverse()
print(names)
print("pop:", names.pop())
print("Tom idx:", names.index("Tom"))
print("Mike count:", names.count("Mike"))
This results in the following output:
['Peter', 'Carsten', 'Tim', 'Mike', 'Mike', 'Tom']
[4, 3, 2, 1, 'Tom', 'Mike', 'Mike', 'Tim', 'Carsten', 'Peter']
pop: Peter
Tom idx: 4
Mike count: 2

List Comprehension

Python offers comprehensions as elegant possibilities for creating new data structures. A list comprehension is an expression that generates a new result list based on a sequence of values and a calculation rule to generate a new list of results:
>>> even = [n for n in range(10) if n % 2 == 0]
>>> even
[0, 2, 4, 6, 8]
More complex expressions can also be specified, such as the creation of tuples:
>>> [(x, y) for x in range(3) for y in range(5)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4),
 (2, 0), (2, 1), (2, 2), (2, 3), (2, 4)]
>>> [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
[(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0,
 2, 1), (0, 2, 2), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1),
 (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2), (2, 0, 0), (2, 0, 1), (2, 0, 2),
 (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 2, 0), (2, 2, 1), (2, 2, 2)]

In addition, you saw the specification of a condition in the initial example. In general, you should avoid the complexity getting too large. This assists in ensuring understandability and maintainability.

Regardless, comprehensions are a compelling and helpful feature of Python that you should undoubtedly master.

Variant as set and dictionary comprehension Similar possibilities exist for sets and dictionaries. Two examples follow, one for the determination of all odd numbers up to 10 and another for the mapping of even numbers to their square:
>>> {i for i in range(10) if i % 2 != 0}
{1, 3, 5, 7, 9}
>>> {n: n ** 2 for n in range(10) if n % 2 == 0}
{0: 0, 2: 4, 4: 16, 6: 36, 8: 64}

Example: Custom Implementation of remove_all( )

Removing all elements from a collection that match a certain value is a functionality that is unfortunately not built in. Different variants to solve this are shown below.

Inplace variants First, you start with inplace variants, which means that the original list gets modified and consequently nothing is returned. Therefore, you repeatedly call remove() until a ValueError occurs when the value to be deleted is no longer found.
# not optimal variant
def remove_all_inplace(list, value):
    try:
        while True:
            list.remove(value)
    except (ValueError):
        pass
With this variant, it may be perceived as unattractive that exceptions are used to regulate the control flow. Perhaps you come to the following alternative:
# not optimal variant
def remove_all_inplace_improved(values, item):
    while item in values:
        values.remove(item)

At first these solutions look quite good, but both variants iterate using the outer loop (while true/in) and a hidden inner one (by the implementation of remove()) resulting in a running time of O(n2).

Variant offering better performance There is a solution that offers a running time of O(n): You traverse through the list, and for every value, you check whether it matches the one to be deleted. If this is not the case, you continue with the next entry. If you find an entry to be deleted, you copy the successor into the position given by the write counter. The write counter is always moved on if no match is found. Finally, all values up to the write counter correspond to the desired result. You extract this with slicing.
def remove_all_fast(values, item):
    write_idx = 0
    for value in values:
        if value != item:
            values[write_idx] = value
            write_idx += 1
    return values[:write_idx]
Improved variants Next, you use a list comprehension that retains only the values that satisfy the condition specified in if. This does not change the original list but creates a new result list. Again, this solution offers a running time of O(n) but uses slightly more memory.
def remove_all_v2(values, item):
    return [value for value in values if value != item]
This also applies to the following variant. Here, you use the built-in function filter(). While in Python 2 this returns a list, in Python 3, you only get a filter object that provides a function iter () and is thus iterable. You can, therefore, easily wrap this in a list—please see the following practical tip about the implications.
def remove_all_v3(values, item):
    return list(filter(lambda x: x != item, values))
ATTENTION: WHY DOES FILTER() NO LONGER RETURN A LIST?

As convenient as it is for smaller datasets to get the data as a list, there are some reasons why this is not feasible for large datasets. First of all, it may require a lot of memory. Secondly, not all values have to be kept for different calculations. Thus, the change of the mode of operation on Lazy Evaluation in Python 3 helps to save memory and increase performance. If you want to process the data directly and need a list, a simple wrapping is appropriate, as shown above. The dataset described by the filter is converted into a list.

Example: Custom Implementation of collect_all( )

Another useful functionality is to collect elements that match a certain condition. This can be solved as follows:
# Iteration
def collect_all(values, condition):
    result = []
    for elem in values:
        if condition(elem):
            result.append(elem)
    return result
# List Comprehension
def collect_all_v2(values, condition):
    return [elem for elem in values if condition(elem)]
# Filter
def collect_all_v3(values, condition):
    return list(filter(condition, values))

Check Implementations

Let’s try some of the implementations of the functionality once. First, three variants of remove_all() are tested to delete the entry Mike. Finally, collect_all() should keep all entries with the value Mike.
>>> names = ["Tim", "Tom", "Mike", "Mike", "Mike"]
... remove_all_inplace(names, "Mike")
... print(names)
['Tim', 'Tom']
>>> print(remove_all_v2(["Tim", "Tom", "Mike", "Mike", "Mike"], "Mike"))
['Tim', 'Tom']
>>> print(remove_all_fast(["Tim", "Tom", "Mike", "Mike", "Mike"], "Mike"))
['Tim', 'Tom']
>>> names = ["Tim", "Tom", "Mike", "Mike", "Mike"]
... print(collect_all(names, lambda value: value == "Mike"))
['Mike', 'Mike', 'Mike']

5.1.3 Sets

Let’s now explore sets. The mathematical concept states that they contain no duplicates. Thus, sets form an unordered data structure that does not contain duplicates, but also does not provide indexed access. Instead, some set operations exist such as a test for containment and computation of union, intersection, difference, and symmetric difference.1 In addition, there are the following actions:
  • add(element) adds an element to the set if it does not already exist.

  • update(elements) inserts the elements elements from a list or tuple into a set if not already present in the original set.

  • remove(element)/discard(element) deletes the element from the set. If the element does not exist, remove() will raise a KeyError. With discard(), on the other hand, nonexistence will be ignored and nothing will happen.

  • pop() removes the first element from the set (as an iteration would return them).

  • clear() deletes all elements from the set.

  • copy() returns a (shallow) copy of the set.2

Example

As an example of sets, you start by defining a set of colors (more precisely, their names) in curly brackets. Then you add another set and then a single element.
color_set = {"RED", "GREEN", "BLUE"}
color_set.update(["YELLOW", "ORANGE"])
color_set.add("GOLD")
print(color_set)
This results in the following output (showing that insertion order does not get preserved in sets):
{'RED', 'BLUE', 'YELLOW', 'GOLD', 'GREEN', 'ORANGE'}
Finally, you define two sets with numbers and calculate the typical set operations such as union, intersection, and difference:
number_set1 = {1, 2, 3, 4, 5, 6, 7, 8}
number_set2 = {2, 3, 5, 7, 9, 11, 13}
print("union: %s intersection: %s diff 1-2: %s sym diff: %s" %
    ((number_set1 | number_set2), (number_set1 & number_set2),
     (number_set1 - number_set2), (number_set1 ^ number_set2)))
This results in the following output:
union: {1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13}
intersection: {2, 3, 5, 7}
diff 1-2: {8, 1, 4, 6}
sym diff: {1, 4, 6, 8, 9, 11, 13}

5.1.4 Key-Value Mappings (Dictionaries)

Let’s now turn to mappings from keys to values. They are also called dictionaries or lookup tables; other terms are associative arrays and hashes, respectively. Regardless of the name, the underlying idea is to assign a value for a unique key. An intuitive example is a telephone directory where names are mapped to telephone numbers. A search by name (key) usually returns a phone number (value) quite quickly. If there is no mapping back from phone number to name, finding a name to a phone number becomes quite laborious.

Dictionaries store key-value pairs and offer, among others, the following functions and operations:
  • dictionary[key] = value adds a mapping (from key to value) as an entry to this dictionary. If a value is already stored for the given key, it gets overwritten with the new value.

  • update(other_dictionary) inserts all entries from the passed dictionary into this dictionary. This overwrites values of already existing entries analogous to the way simple assignment works.

  • items() generates a list containing all key-value pairs of the dictionary as tuples.

  • keys()/values() returns a list containing all keys or values stored in the dictionary.

  • get(key, default_value) get the associated value for a key key. If no entry exists for the key, the default value default_value is returned. If no default value was specified in the call, None is returned.

  • pop(key) deletes an entry (key and associated value) from the dictionary. The value associated with the key key is returned. If no entry was stored for this key, a KeyError is the result.

  • clear() removes all entries of the dictionary.

Unfortunately, there are no functions like contains_key() or contains_value() to check if a special key or value is stored in the dictionary. However, this functionality can be easily recreated by querying with in, as you will see in the following example.

Example

Again, you implement an example to learn about some of the possibilities. To do this, you define a dictionary with an initial payload of names and ages. Then you add one as well as several values. For inspection, you print out the dictionary, its keys, its values, and its entries. Finally, you check for the existence of specific values in these results.
mapping = {"Micha" : 49, "Peter": 42, "Tom": 27}
mapping["NEW"] = 42
mapping.update({"Jim" : 37, "John": 55})
print(mapping)
print(mapping.items())
print(mapping.keys())
print(mapping.values())
# "contains key"
print("contains key Micha?", "Micha" in mapping)
print("Micha values:", mapping.pop("Micha"))
print("contains key Micha?", "Micha" in mapping.keys())
# "contains value"
print("contains value 55?", 55 in mapping.values())
This results in the following output:
{'Micha': 49, 'Peter': 42, 'Tom': 27, 'NEW': 42, 'Jim': 37, 'John': 55}
dict_items([('Micha', 49), ('Peter', 42), ('Tom', 27), ('NEW', 42), ('Jim', 37),
    ('John', 55)])
dict_keys(['Micha', 'Peter', 'Tom', 'NEW', 'Jim', 'John'])
dict_values([49, 42, 27, 42, 37, 55])
contains key Micha? True
Micha values: 49
contains key Micha? False
contains value 55? True

Example: Filtering Elements of a Dictionary in a General Way

Sometimes you want to find all key-value mappings whose values meet a particular condition. This can be programmed in a generally appropriate and elegant way for later reuse as follows:
def filter_dict(input_dict, key_value_condition):
    filtered_dict = dict()
    for key, value in input_dict.items():
         if key_value_condition((key, value)):
            filtered_dict[key] = value
    return filtered_dict
def filter_by_value(input_dict, value_condition):
    filtered_result = filter_dict(input_dict,
                                lambda item : value_condition(item[1]))
    return filtered_result

You can either use the more general function, which gets a filter for key and value, or the specific function designed directly for value filtering.

Let’s define a mapping of cities to (approximate) population numbers and a filter condition on larger cities between 200,000 and 700,000 inhabitants to see the whole thing in action. The last call shows how easy it is to extract only the keys if only they are of interest.
cities_sizes = {"Cologne": 1_000_000, "Kiel": 250_000, "Bremen": 550_000,
                "Zurich": 400_000, "Oldenburg": 170_000}
print(filter_dict(cities_sizes, lambda item: 200_000 <= item[1] <= 700_000))
filtered_cities = filter_by_value(cities_sizes,
                                lambda size: 200_000 <= size <= 700_000)
print(filtered_cities)
print(set(filtered_cities.keys()))
This results in the following output:
{'Kiel': 250000, 'Bremen': 550000, 'Zurich': 400000}
{'Kiel': 250000, 'Bremen': 550000, 'Zurich': 400000}
{'Bremen', 'Zurich', 'Kiel'}

5.1.5 The Stack as a LIFO Data Structure

In the following, I describe the data structure stack. It is not directly a part of Python, but it proves to be very practical for various use cases and can be easily implemented.

A stack is similar to a stack of paper or a desk tray in which you put things on top of a pile and from which you can only take things from the top. In addition, a view of the top element is possible. Beyond that, it offers size information or at least a check whether elements are present. This results in the following methods that form the API:
  1. 1.

    push(element) adds an element on top.

     
  2. 2.

    pop() picks and removes the top element.

     
  3. 3.

    peek() takes a peek at the top element.

     
  4. 4.

    is_empty() checks if the stack is empty.

     

These four methods are sufficient to use stacks profitably for various tasks in practice and for algorithms, for example, when sequences have to be reversed. This property is described in computer science by the term LIFO for Last In, First Out. Sometimes it is referred as FCFS for First Come, First Serve.

Exercise 2 is about implementing a stack on your own.

Example

After solving Exercise 2, you can put some elements on the stack, look at the top element, take elements from the top again, and finally, check if the stack is empty according to expectations:
def main():
    stack = Stack()
    stack.push("first")
    stack.push("second")
    print("PEEK: " + stack.peek())
    print("POP: " + stack.pop())
    print("POP: " + stack.pop())
    print("ISEMPTY: " + str(stack.is_empty()))
This provides the following output:
PEEK: second
POP: second
POP: first
ISEMPTY: true

5.1.6 The Queue as a FIFO Data Structure

To conclude this introduction to basic data structures, I would like to talk about queues. They are also not a part of Python. Like a stack, a queue is often very useful and can also be easily implemented.

A queue is similar to a line at a cash register. People queue up, and the person who came first is served first, known in computer science as FIFO for First In, First Out.

Normally, only a few actions, such as adding and removing elements, benefit from a queue. In addition, a look at the element at the beginning is possible. Beyond that, it offers size information or at least a check whether elements are present. This results in the following methods that form the API:
  1. 1.

    enqueue(element) adds an element to the end of the queue.

     
  2. 2.

    dequeue() takes a look at the element at the beginning of the queue.

     
  3. 3.

    peek() takes a look at the element at the beginning of the queue.

     
  4. 4.

    is_empty() checks if the queue is empty.

     

These four methods are sufficient to create queues for various tasks in practice and for algorithms, such as if you intend to transform recursive algorithms into iterative ones.

Implementation

The method names reflect the concept described earlier. For this purpose, the implementation stores its elements in a list and inserts elements at the end. By using pop(0) or shorter pop() the foremost element is provided.
class Queue:
    def __init__(self):
        self.values = []
    def enqueue(self, elem):
        self.values.append(elem)
    def dequeue(self):
        if self.is_empty():
            raise QueueIsEmptyException()
        return self.values.pop(0);
    def peek(self):
        if (self.is_empty()):
            raise QueueIsEmptyException()
        return self.values[0]
    def is_empty(self):
        return len(self.values) == 0
class QueueIsEmptyException(Exception):
    pass

Example

To understand how it works, you can insert some elements into the queue and then process them as long as there are elements. In particular, reprocessing is aimed for the entry Michael.
def main():
    waiting_persons = Queue()
    waiting_persons.enqueue("Marcello")
    waiting_persons.enqueue("Michael")
    waiting_persons.enqueue("Karthi")
    while not waiting_persons.is_empty():
        if waiting_persons.peek() == "Michael":
            # reprocess at the end
            waiting_persons.enqueue("Michael again")
        next_person = waiting_persons.dequeue()
        print("Processing " + next_person)
The small sample program provides the following output:
Processing Marcello
Processing Michael
Processing Karthi
Processing Michael again
NOTE: DOES IT NEED THE CUSTOM CREATIONS OF STACK AND QUEUE?

Let’s recap: With a stack, you get the element stored last as the first element (that is, in reverse insertion order). This is why it is called Last-In-First-Out (LIFO) . For insertion, as you know, you conceptually use a function called push() and for removal you use one called pop(). On the other hand, a queue represents a queue as we know it from shopping or ticket machines. Here, the person who was there first gets to go first. Accordingly, one speaks of First-In-First-Out (FIFO). Typically, the corresponding operations are called enqueue() and dequeue().

Easily emulate stack and queue with lists As mentioned, Python offers neither stack nor queue as a data structure by default, but does offer powerful lists in terms of functionality. With them, the functions of stack and queue previously mentioned can be implemented easily. In both cases you use the function append() to add elements, such as to emulate push() or enqueue(). In addition, the list provides the function pop(). It can optionally be passed an index that determines the position of the element to be removed—without index, simply the last element. Let’s see how to emulate a stack and a queue with a list:
# List as Stack
stack_of_tasks = []
# Add "tasks" to the stack via append()
stack_of_tasks.append("Task 1")
stack_of_tasks.append("Task 2")
stack_of_tasks.append("Task 3")
stack_of_tasks.append("Task 4")
# Take the top 2 "tasks" from the stack via pop()
last_tasks = stack_of_tasks.pop()
second_last_tasks = stack_of_tasks.pop()
print("Top most:", last_tasks)
print("2nd from top:", second_last_tasks)
# List as Queue
queue_of_numbers = []
# Add 3 elements to the queue via append()
queue_of_numbers.append("First")
queue_of_numbers.append("Second")
queue_of_numbers.append("Third")
# Remove elements via pop(0) until the queue is empty
while len(queue_of_numbers) > 0:
print("Processing:", queue_of_numbers.pop(0))
The above program produces the following output:
Top most: Task 4
2nd from top: Task 3
Processing: First
Processing: Second
Processing: Third

Discussion: What is not optimal about this? First of all, it is pretty obvious that the function names of the list do not optimally match those commonly used for stacks and queues. Worse, if you accidentally specify no index or the wrong index for pop(), this leads to confusion. Even worse, stacks and queues conceptually do not allow indexed access to elements, but lists do. And this is only the beginning of possible problems. Because the above implementations (or to be precise, usages) are lists, you can also call arbitrary functions that have nothing to do with stacks and queues at all, such as insert() or remove(). If this is not clear enough, you could also sort the elements by calling sort() and thus probably mess up the desired order a lot. As you can see, the pitfalls are many.

Conclusion

Based on this reasoning and the need for intuitive handling without risk of misuse, the definition of your own classes for providing the specific data structures stack and queue becomes obvious.

5.2 Exercises

5.2.1 Exercise 1: Common Elements (★★✩✩✩)

Find the common elements of two lists, A and B, and return them as a set. Implement this, both with and without using matching functions from Python’s sets. Write your own function find_common(values1, values2), which works like the Python function intersection().

Examples

Input A

Input B

Result

[1, 2, 4, 7, 8]

[2, 3, 7, 9]

{2, 7}

[1, 2, 7, 4, 7, 8]

[7, 7, 3, 2, 9]

{2, 7}

[2, 4, 6, 8]

[1, 3, 5, 7, 9]

= set()

5.2.2 Exercise 2: Your Own Stack (★★✩✩✩)

Define the basic requirements for a stack and implement class Stack based on these requirements using a list.

5.2.3 Exercise 3: List Reverse (★★✩✩✩)

Exercise 3a: List Reverse (★✩✩✩✩)

Write function reverse(values) that returns the elements of the original list in reverse order—of course without calling the reverse() function of the list.

Examples

Input

Result

[1, 2, 3, 4]

[4, 3, 2, 1]

[“A”, “BB”, “CCC”, “DDDD”]

[“DDDD”, “CCC”, “BB”, “A”]

Exercise 3b: List Reverse Inplace (★★✩✩✩)

What is different if you want to implement reversing the order inplace to be memory- optimal for very large datasets? What should be given then?

Exercise 3c: List Reverse Without Performant Index Access (★★✩✩✩)

Now let’s assume that no performant random index access is available. What happens if you want to reverse the order and any position-based access will result in O(n) and therefore O(n2) for the complete reversal process. How do you avoid this?

Tip

Use a stack.

5.2.4 Exercise 4: Remove Duplicates (★★✩✩✩)

You are supposed to remove duplicate entries from a list. The constraint is that the original order should be preserved. Write function remove_duplicates(values).

Examples

Input

Result

[1, 1, 2, 3, 4, 1, 2, 3]

[1, 2, 3, 4]

[7, 5, 3, 5, 1]

[7, 5, 3, 1]

[1, 1, 1, 1]

[1]

5.2.5 Exercise 5: Maximum Profit (★★★✩✩)

Imagine that you have a sequence of prices ordered in time and that you want to calculate the maximum profit. The challenge is to determine at which time (or value, in this case) it would be ideal to buy and to sell. Write function max_revenue(prices) for this purpose, where the temporal order is expressed by the index in the list.

Examples

Input

Result

[250, 270, 230, 240, 222, 260, 294, 210]

72

[0, 10, 20, 30, 40, 50, 60, 70]

70

[70, 60, 50, 40, 30, 20, 10, 0]

0

[ ]

0

5.2.6 Exercise 6: Longest Sequence (★★★✩✩)

Suppose you are modeling stock prices or altitudes of a track by a list of numbers. Find the longest sequence of numbers whose values ascend or at least stay the same. Write function find_longest_growing_sequence(values).

Examples

Input

Result

[7, 2, 7, 1, 2, 5, 7, 1]

[1, 2, 5, 7]

[7, 2, 7, 1, 2, 3, 8, 1, 2, 3, 4, 5]

[1, 2, 3, 4, 5]

[1, 1, 2, 2, 2, 3, 3, 3, 3]

[1, 1, 2, 2, 2, 3, 3, 3, 3]

5.2.7 Exercise 7: Well-Formed Braces (★★✩✩✩)

Write function check_parentheses(braces_input) that checks whether a sequence of braces is neatly nested in each case. This should accept any round, square, and curly braces but no other characters.

Examples

Input

Result

Comment

“(( ))”

True

 

“({[ ]})”

True

 

“((( ))”

False

Odd number of braces

“((a)”

False

Wrong character, no braces

“((])”

False

No matching braces

Bonus Extend the solution so that a clear assignment of error causes becomes possible. Start with the following enumeration:
from enum import Enum, auto
class CheckResult(Enum):
    OK = auto()
    ODD_LENGTH = auto()
    CLOSING_BEFORE_OPENING = auto()
    MISMATCHING_PARENTHESIS = auto()
    INVALID_CHAR = auto()
    REMAINING_OPENING = auto()

5.2.8 Exercise 8: Pascal’s Triangle (★★★✩✩)

Write function pascal(n) that computes Pascal’s triangle in terms of nested lists. As you know, each new line results from the previous one. If there are more than two elements in it, two values are added and the sums build the values of the new line. In each case, a 1 is appended to the front and back.

Example

For the value 5, the desired representation is as follows:
[1]
[1, 1]
[1, 2, 1]
[1, 3, 3, 1]
[1, 4, 6, 4, 1]

5.2.9 Exercise 9: Check Magic Triangle (★★★✩✩)

Write function is_magic_triangle(values) that checks whether a sequence of numbers forms a magic triangle. Such a triangle is defined as one where the respective sums of the three sides’ values must all be equal.

Examples

The following shows this for one triangle each of side length three and side length four:
  1                2
 6 5              8 5
2 4 3            4   9
                3 7 6 1
This results in the following sides and sums:

Input

Values 1

Values 2

side 1

1 + 5 + 3 = 9

2 + 5 + 9 + 1 = 17

side 2

3 + 4 + 2 = 9

1 + 6 + 7 + 3 = 17

side 3

2 + 6 + 1 = 9

3 + 4 + 8 + 2 = 17

Tip Model the individual sides of the triangle as sublists.

5.2.10 Exercise 10: Most Frequent Elements (★★✩✩✩)

Write function value_count(values) that determines a histogram (i. e., the distribution of the frequencies of the numbers in the given list). Also write function sort_dict_by_value(dictionary) to sort the dictionary by its values instead of by keys. Thereby a descending sorting is to be realized so that smaller values are listed at the beginning.

Examples

Input

Result

Most frequent(s)

[1, 2, 3, 4, 4, 4, 3, 3, 2, 4]

{1=1, 2=2, 3=3, 4=4}

4=4

[1, 1, 1, 2, 2, 2, 3, 3, 3]

{1=3, 2=3, 3=3}

Depending on query, logically all

5.2.11 Exercise 11: Addition of Digits (★★★✩✩)

Consider two decimal numbers that are to be added. Sounds simple, but for this assignment, the numbers are interestingly represented as a list of digits. Write function list_add(values1, values2). Also, consider the special case where there is an overflow.

Exercise 11a: Addition (★★★✩✩)

In the first part of the task, the digits are to be stored in the order of their occurrence in the list.

Examples

Input 1

Input 2

Result

123 = [1, 2, 3]

456 = [4, 5, 6]

579 = [5, 7, 9]

927 = [9, 2, 7]

135 = [1, 3, 5]

1062 = [1, 0, 6, 2]

Exercise 11b: Addition Inverse (★★★✩✩)

What changes if the digits are stored in reverse order in the list?

Examples

Input 1

Input 2

Result

123 = [3, 2, 1]

456 = [6, 5, 4]

579 = [9, 7, 5]

927 = [7, 2, 9]

135 = [5, 3, 1]

1062 = [2, 6, 0, 1]

5.2.12 Exercise 12: List Merge (★★✩✩✩)

Given two lists of numbers, each sorted in ascending order, merge them into a result list according to their order. Write function merge(values1, values2).

Examples

Input 1

Input 2

Result

1, 4, 7, 12, 20

10, 15, 17, 33

1, 4, 7, 10, 12, 15, 17, 20, 33

2, 3, 5, 7

11, 13, 17

2, 3, 5, 7, 11, 13, 17

2, 3, 5, 7, 11

7, 11, 13, 17

2, 3, 5, 7, 7, 11, 11, 13, 17

[1, 2, 3]

∅ = [ ]

[1, 2, 3]

5.2.13 Exercise 13: Excel Magic Select (★★✩✩✩)

If you have worked a little with Excel, then you have probably used the Magic Selection. It continuously populates a selected area with values based on the previous values. This works for numbers, weekdays, or dates, for example. To achieve something similar on your own, write function generate_following_values(current_value, sequence_length) that implements this for numbers. Create a variation suitable for weekdays and with the following signature: generate_following_values_for_- predefined(predefined_values, current_value, sequence_length).

Examples

Initial value

Count

Result

1

7

[1, 2, 3, 4, 5, 6, 7]

5

4

[5, 6, 7, 8]

FRIDAY

8

[FRIDAY, SATURDAY, SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY]

5.2.14 Exercise 14: Stack-Based Queue (★★✩✩✩)

You learned about stack and queue data structures in the introduction and implemented a queue based on a list. Then in exercise 2, you implemented a stack itself. Now you are asked to build a queue based on the stack data structure.

Example

Please check the functionality with the following procedure:
def main():
    waiting_persons = Queue()
    waiting_persons.enqueue("Marcello")
    waiting_persons.enqueue("Michael")
    waiting_persons.enqueue("Karthi")
    while not waiting_persons.is_empty():
        if waiting_persons.peek() == "Michael":
            # reprocess at the end
            waiting_persons.enqueue("Michael again")
        next_person = waiting_persons.dequeue()
        print("Processing " + next_person)
The small sample program should produce the following output:
Processing Marcello
Processing Michael
Processing Karthi
Processing Michael again

5.3 Solutions

5.3.1 Solution 1: Common Elements (★★✩✩✩)

Find the common elements of two lists, A and B, and return them as a set. Implement this, both with and without using matching functions from Python’s sets. Write your own function find_common(values1, values2), which works like the Python function intersection().

Examples

Input A

Input B

Result

[1, 2, 4, 7, 8]

[2, 3, 7, 9]

{2, 7}

[1, 2, 7, 4, 7, 8]

[7, 7, 3, 2, 9]

{2, 7}

[2, 4, 6, 8]

[1, 3, 5, 7, 9]

∅ = set( )

Algorithm Use dictionaries and manage a counter for being contained in list 1 or 2. You first run through all elements from list 1 and enter the value 1 in the dictionary. Now you run through all elements of the second list. You increase the counter if an entry already exists in the dictionary for the value. Thus, all elements contained in both lists receive the value 2 and with multiple occurrences, a higher value. On the other hand, elements exclusively from list 2 are never stored. Finally, you keep only those entries whose number is greater than or equal to 2.
def find_common(values1, values2):
    results = {}
    populate_from_collection1(values1, results)
    mark_if_also_in_second(values2, results)
    return remove_all_just_in_first(results)
def populate_from_collection1(values1, results):
    for elem1 in values1:
        results[elem1] = 1
def mark_if_also_in_second(values2, results):
    for elem2 in values2:
        if elem2 in results:
            results[elem2] += 1
def remove_all_just_in_first(results):
    final_result = set()
    for key, value in results.items():
        if value >= 2:
            final_result.add(key)
    return final_result
Python shortcut With the help of set comprehension, the last function becomes a one-liner:
def remove_all_just_in_first(results):
    return {key for key, value in results.items() if value >= 2}

Despite this improvement, it seems too complicated. How can you make it better?

Optimized algorithm In fact, the problem can be solved much more compactly and understandably. Check all elements from the first collection to see if they are contained in the second collection. If so, these values get included in the result set.
def find_common_short(values1, values2):
    results = set()
    for elem1 in values1:
        if elem1 in values2:
            results.add(elem1)
    return results
Python shortcut With the help of set comprehension, this becomes a one-liner:
def find_common_short_comprehension(values1, values2):
    return {elem1 for elem1 in values1 if elem1 in values2}
Built-in Python shortcut For your own projects, please use the built-in functionality in the form of intersection():
def find_common_build_in(values1, values2):
    return set(values1).intersection(values2)

Verification

Test the implementation through the following unit tests:
def inputs_and_expected():
    return [([1, 2, 4, 7, 8], [2, 3, 7, 9], {2, 7}),
            ([1, 2, 7, 4, 7, 8], [7, 7, 3, 2, 9], {2, 7}),
            ([2, 4, 6, 8], [1, 3, 5, 7, 9], set())]
@pytest.mark.parametrize("values1, values2, expected",
                         inputs_and_expected())
def test_find_common(values1, values2, expected):
    result = find_common(values1, values2)
    assert result == expected
@pytest.mark.parametrize("values1, values2, expected",
                         inputs_and_expected())
def test_find_common(values1, values2, expected):
    result = find_common_short(values1, values2)
    assert result== expected

5.3.2 Solution 2: Your Own Stack (★★✩✩✩)

Define the basic requirements for a stack and implement class Stack based on these requirements using a list.

Algorithm It is possible to implement a stack yourself, using a list as a data storage, but not providing direct access to it externally. Users just have access through the following methods typical of a stack:
  1. 1.

    push(element) adds an element on top.

     
  2. 2.

    pop() picks and removes the top element.

     
  3. 3.

    peek() takes a look at the top element.

     
  4. 4.

    is_empty() checks if the stack is empty.

     
Each call to push() adds an element at the end of the list. This way, you simulate the stack. When accessing the top element, it is checked upfront whether the stack is empty, in which case a StackIsEmptyException is thrown. Otherwise, the top element is returned.
class Stack:
    def __init__(self):
        self.__values = []
    def push(self, elem):
        self.__values.append(elem)
    def pop(self):
        if self.is_empty():
            raise StackIsEmptyException()
        return self.__values.pop()
    def peek(self):
        if self.is_empty():
            raise StackIsEmptyException()
        return self.__values[-1]
    def is_empty(self):
        return len(self.__values) == 0
class StackIsEmptyException(Exception):
    pass

Perhaps somewhat more comprehensible would be to add elements at the beginning of the list and take them from there. However, this would be unfavorable in terms of performance. Why? This would result in constant recopying of the internal data of the list.

In addition, one could argue that the Python online documentation3 describes how to use lists as stacks. Even if possible, the interface is not restricted to the above methods. In my book Der Weg zum Java-Profi [Ind20] I discuss in detail what can be problematic about this.

HINT: VISIBILITIY/ACCESSIBILITY
While languages like Java or C++ have visibilities to control and protect access to private class components, this is not possible in Python. However, there are two variants with _and to achieve something similar. What is this all about?
  • If names start with _, then by convention, the one underscore means that this method or attribute is considered private and an implementation detail of the class. However, Python does not enforce this. Especially, access is not prevented. Instead, you must rely on other programmers to observe this fact.

  • A double underscore (__) marks an internal method. When used for attributes, this attribute is no longer visible to the outside for other classes under its name. This is also true for methods.

Verification

You verify the correct working of the stack you just implemented using a predefined flow. First, you insert two elements. Then you look at the top one with peek(). After that, you remove elements twice with pop(). As expected, they are supplied in reverse order of insertion. Finally, you check to see if the stack is empty. Because this is the case, a subsequent inspection of the topmost element should throw a StackIsEmptyException—show here just as a comment.
def main():
    stack = Stack()
    stack.push("first")
    stack.push("second")
    print("PEEK: " + stack.peek())
    print("POP: " + stack.pop())
    print("POP: " + stack.pop())
    print("ISEMPTY: " + str(stack.is_empty()))
    # print("POP: " + stack.pop())
This results in the following output:
PEEK: second
POP: second
POP: first
ISEMPTY: true

5.3.3 Solution 3: List Reverse (★★✩✩✩)

Solution 3a: List Reverse (★✩✩✩✩)

Write function reverse(values) that returns the elements of the original list in reverse order—of course without calling the reverse() function of the list.

Examples

Input

Result

[1, 2, 3, 4]

[4, 3, 2, 1]

[“A”, “BB”, “CCC”, “DDDD”]

[“DDDD”, “CCC”, “BB”, “A”]

Algorithm A simple solution is to traverse a list from back to front and add the current element to a result list. This can be implemented index-based as follows:
def reverse(values):
    result = []
    for i in range(len(values) - 1, -1, -1):
        result.append(values[i])
    return result
Python shortcut Using list comprehensions, the whole thing can be written shorter and more concisely. The first variant is still based on the realization shown above, while the second relies on an inverted iterator with reversed():
def reverse_with_comprehension(values):
    return [values[i] for i in range(len(values) - 1, -1, -1)]
def reverse_with_comprehension_nicer(values):
    return [value for value in reversed(values)]
Using list() in combination with reversed() is even shorter and nicer:
def reverse_with_list_nicer(values):
    return list(reversed(values))
In fact, with slicing, the whole thing can be written briefly as follows, where again the result is a new list with the contents reversed from the original:
reversed = values[::-1]

Solution 3b: List Reverse Inplace (★★✩✩✩)

What is different if you want to implement reversing the order inplace to be memory- optimal for very large datasets? What should be given then?

Algorithm Based on indexed access, you proceed inwards from the beginning and the end, swapping the elements:
def reverse_inplace(original):
    left = 0
    right = len(original) - 1
    # run from the left and right, swap the elements based on their positions
    while left < right:
         left_elem = original[left]
         right_elem = original[right]
         # swap
        original[left] = right_elem
        original[right] = left_elem
        left += 1
        right -= 1
    return original
Python shortcut Please keep in mind that in real projects, the standard functionality reverse() should be used, which works inplace:
values.reverse()

Solution 3c: List Reverse Without Performant Index Access (★★✩✩✩)

Now let’s assume that no performant random index access is available. What happens if you want to reverse the order and any position-based access will result in O(n) and therefore O(n2) for the complete reversal process. How do you avoid this?

Tip

Use a stack.

Algorithm In the case that no performant indexed-based access is available and you still have to reverse the order with running time complexity of O(n), a stack comes into play—just as for various other algorithms, including this one. You traverse the list from front to back and put the current element on the stack each time. Afterwards, you iteratively remove the top element from the stack and add it to a result list until the stack is empty.
def list_reverse_with_stack(values):
    # Go through the list from front to back and fill a stack
    all_values = Stack()
    for element in values:
        all_values.push(element)
    # Empty the stack and fill a result list
    result = []
    while not all_values.is_empty():
        result.append(all_values.pop())
    return result

Verification

Let’s experiment with the input values from the example and invoke the function you created earlier— in the accompanying project all variants will of course be tested:
def list_reverse_inputs_and_expected():
    return [([1, 2, 3, 4], [4, 3, 2, 1]),
             (["A", "BB", "CCC", "DDDD"], ["DDDD", "CCC", "BB", "A"])]
@pytest.mark.parametrize("inputs, expected",
                         list_reverse_inputs_and_expected())
def test_reverse(inputs, expected):
    result = reverse(inputs)
    assert result == expected
@pytest.mark.parametrize("inputs, expected",
                         list_reverse_inputs_and_expected())
def test_reverse_inplace(inputs, expected):
    modifiable_inputs = list(inputs)
    reverse_inplace(modifiable_inputs)
    assert modifiable_inputs == expected

5.3.4 Solution 4: Remove Duplicates (★★✩✩✩)

You are supposed to remove duplicate entries from a list. The constraint is that the original order should be preserved. Write function remove_duplicates(values).

Examples

Input

Result

[1, 1, 2, 3, 4, 1, 2, 3]

[1, 2, 3, 4]

[7, 5, 3, 5, 1]

[7, 5, 3, 1]

[1, 1, 1, 1]

[1]

Algorithm Traverse the list from front to back and successively fill a set with the entries contained in the list. For each element of the list, check whether it is already contained in the set of entries found. If not, it will be included and also added to the result. Otherwise, the next element gets checked.
def remove_duplicates(values):
    result = []
    already_found_numbers = set()
    for elem in values:
        if elem not in already_found_numbers:
            already_found_numbers.add(elem)
            result.append(elem)
    return result
Optimized algorithm While implementing you might get the idea of simply deleting the duplicates by refilling them into a set. This works but potentially messes up the order of the elements. A workaround is to use a dictionary. Calling fromkeys() creates a dictionary based on the passed list and automatically removes duplicate keys. In addition, since Python 3.6, the insertion order is preserved. With this knowledge, implementing the removal of duplicates is a snap.
list_with_duplicates = ["a", "b", "a", "c", "d", "c", "d"]
# order may change
no_duplicates1 = list(set(list_with_duplicates))
# stable order
no_duplicates2 = list(dict.fromkeys(list_with_duplicates))
Python shortcut With this knowledge, create the following implementation of the removal of duplicates as a function:
def remove_duplicates_with_dict(values):
    return list(dict.fromkeys(values))

Verification

Again, you use the introductory example’s values to verify the implementation. The tests for the two optimized versions are not shown below because they are, apart from the function call, identical.
def inputs_and_expected():
    return [([1, 1, 2, 3, 4, 1, 2, 3], [1, 2, 3, 4]),
            ([7, 5, 3, 5, 1], [7, 5, 3, 1]),
            ([1, 1, 1, 1], [1])]
@pytest.mark.parametrize("inputs, expected",
                         inputs_and_expected())
def test_remove_duplicates(inputs, expected):
    result = remove_duplicates(inputs)
    assert result == expected
@pytest.mark.parametrize("inputs, expected",
                         inputs_and_expected())
def test_remove_duplicates_with_dict(inputs, expected):
    result = remove_duplicates_with_dict(inputs)
    assert result == expected

5.3.5 Solution 5: Maximum Profit (★★★✩✩)

Imagine that you have a sequence of prices ordered by time and you want to calculate the maximum profit. The challenge is to determine at which time (or value in this case) it would be ideal to buy and to sell. Write function max_revenue(prices) for this purpose, where the temporal order is expressed by the index in the list.

Examples

Input

Result

[250, 270, 230, 240, 222, 260, 294, 210]

72

[0, 10, 20, 30, 40, 50, 60, 70]

70

[70, 60, 50, 40, 30, 20, 10, 0]

0

[ ]

0

Algorithm Initially, you may be tempted to determine the minimum and the maximum and simply return the difference. After a short reflection, it becomes clear that a time dimension has to be considered in this case. First, a purchase and then a sale at a higher price must take place to realize a profit.

The next idea is to run through the list twice. First, all minimum values are determined by looking to see if the current value is less than the current minimum. This is then added to the list of minimum values valid for the time. In the second run, you determine the largest difference by comparing element by element. If the current value is greater than the currently valid minimum value, then the profit thus obtained is the difference between the current value and the minimum value determined at the position. Finally, the maximum profit is calculated from the maximum of the current maximum and the current profit. For the above example 1, the result is as follows:

Value

255

260

250

240

228

270

300

210

245

Minimum

255

255

250

240

228

228

228

210

210

Difference

0

5

0

0

0

42

72

0

35

Max. Difference

0

5

5

5

5

42

72

72

72

According to this idea, you express the whole thing in Python by first determining all relevant minimum values and then, based on that, the maximum:
def max_revenue(prices):
    relevant_mins = calc_relevant_mins(prices)
    return calc_max_revenue(prices, relevant_mins)
The actual work happens in the following two helper functions:
def calc_relevant_mins(prices):
    relevant_mins = []
    current_min = sys.maxsize
    for current_price in prices:
        current_min = min(current_min, current_price)
        relevant_mins.append(current_min)
    return relevant_mins
def calc_max_revenue(prices, relevant_mins):
    max_revenue = 0
    for i, price in enumerate(prices):
        if price > relevant_mins[i]:
            current_revenue = price - relevant_mins[i]
            max_revenue = max(max_revenue, current_revenue)
    return max_revenue

Optimized algorithm The variation just shown requires two passes. As long as the accesses are made in memory, this hardly plays a crucial role in the performance. The situation is somewhat different if the data is determined each time, for example, via a REST call or from a database.

In fact, the number of necessary calls and loop iterations can be reduced. However, this optimization can probably only be achieved if the previous implementation has been completed first.
def max_revenue_optimized(prices):
    current_min = sys.maxsize
    max_revenue = 0
    for current_price in prices:
        current_min = min(current_min, current_price)
        current_revenue = current_price - current_min
        max_revenue = max(max_revenue, current_revenue)
    return max_revenue

Verification

For testing, you again use the values from the introductory example:
def prices_and_expected():
    return [([0, 10, 20, 30, 40, 50, 60, 70], 70),
             ([70, 60, 50, 40, 30, 20, 10], 0),
             ([], 0)]
@pytest.mark.parametrize("prices, expected", prices_and_expected())
def test_max_revenue(prices, expected):
    result = max_revenue(prices)
    assert result == expected

5.3.6 Solution 6: Longest Sequence (★★★✩✩)

Suppose you are modeling stock prices or altitudes of a track by a list of numbers. Find the longest sequence of numbers whose values ascend or at least stay the same. Write function find_longest_growing_sequence(values).

Examples

Input

Result

[7, 2, 7, 1, 2, 5, 7, 1]

[1, 2, 5, 7]

[7, 2, 7, 1, 2, 3, 8, 1, 2, 3, 4, 5]

[1, 2, 3, 4, 5]

[1, 1, 2, 2, 2, 3, 3, 3, 3]

[1, 1, 2, 2, 2, 3, 3, 3, 3]

Algorithm Here a so-called greedy algorithm is used. The idea is to collect the subsequent elements starting from one element until the next element is smaller than the current one. A temporary list and a result list are used for this purpose. Both are initially empty and are successively filled: the temporary list at each element read that is greater than or equal to the predecessor and the result list whenever a smaller successor value is found. If a value is smaller, the temporary list is cleared and starts as a one-element list with the current value. If the result list at a flank change is shorter than the temporary list with the previously collected elements, then the temporary list becomes the new result list. This procedure is repeated until you reach the end of the initial list.

Let’s look at a procedure for the input 1272134572:

Input

Current character

Temporary list

Result list

1272134572

1

1

 

1272134572

2

12

 

1272134572

7

127

 

1272134572

2

2

127

1272134572

1

1

127

1272134572

3

13

127

1272134572

4

134

127

1272134572

5

1345

127

1272134577

7

13457

127

1272134572

2

2

13457

def find_longest_growing_sequence(values):
    longest_subsequence = []
    current_subsequence = []
    last_value = sys.maxsize
    for current_value in values:
        if current_value >= last_value:
             last_value = current_value
            current_subsequence.append(current_value)
        else:
            # end of this sequence, start new sequence
            if len(current_subsequence) >= len(longest_subsequence):
                longest_subsequence = current_subsequence
            current_subsequence = []
            last_value = current_value
            current_subsequence.append(current_value)
    # important, because otherwise the last sequence might not be considered
    if len(current_subsequence) >= len(longest_subsequence):
        longest_subsequence = current_subsequence
    return longest_subsequence

Be sure to note the additional check after the for loop—otherwise, a final sequence would not be correctly returned as a result.

Mini optimization The check should be optimized a bit further. As you can see, assigning the value and adding it to the current temporary list happens in every case. Thus, these actions can be separated from the condition and written as follows:
for current_value in values:
    if current_value < last_value:
        # end of this sequence, start new sequence
        if len(current_subsequence) >= len(longest_subsequence):
            longest_subsequence = current_subsequence
        current_subsequence = []
    last_value = current_value
    current_subsequence.append(current_value)

Procedure for sections of equal length When checking for the longest sequence, you can either compare with > or >=. If there are two or more sequences of the same length, in the first case with > the first one is taken as a result, with >= always the last one.

Alternative and optimized algorithm Sometimes creating temporary data structures can be rather undesirable, for example, when the subsections can become huge. In such a case, it offers itself to determine only the respective index borders. As a final step, you extract the appropriate part.
def find_longest_growing_sequence_optimized(values):
    if len(values) == 0:
        return values
    longest = (0, 0)
    start_current = 0
    end_current = 0
    for end_current in range(1, len(values)):
        # flank change
        if values[end_current] < values[end_current - 1]:
            if end_current - start_current > len(longest):
                longest = (start_current, end_current)
            start_current = end_current
    if end_current - start_current > len(longest):
        longest = (start_current, end_current)
    return values[longest[0] : longest[1]]

Verification

Use the sequences of values from the introduction to compare the computed results with your expectations:
@pytest.mark.parametrize("values, expected",
                         [([7, 2, 7, 1, 2, 5, 7, 1], [1, 2, 5, 7]),
                          ([7, 2, 7, 1, 2, 3, 8, 1, 2, 3, 4, 5],
                           [1, 2, 3, 4, 5]),
                          ([1, 1, 2, 2, 2, 3, 3, 3, 3],
                           [1, 1, 2, 2, 2, 3, 3, 3, 3]),
                          ([], [])])
def test_find_longest_growing_sequence(values, expected):
    result = find_longest_growing_sequence(values)
    assert result == expected

5.3.7 Solution 7: Well-Formed Braces (★★✩✩✩)

Write function check_parentheses(braces_input) that checks whether a sequence of braces is neatly nested in each case. This should accept any round, square and curly braces but no other characters.

Examples

Input

Result

Comment

“(( ))”

True

 

“({[ ]})”

True

 

“((( ))”

False

Odd number of braces

“((a)”

False

Wrong character, no braces

“((])”

False

No matching braces

Algorithm Traverse the string from front to back. If the current character is an opening brace (that is, one of the characters (, [, or {), store it in a stack. If it is a closing brace, try to match it with the last opening brace. If there is no opening brace yet, or if the brace types do not match, False is returned. If they match, the next character is read. If it is an opening brace, proceed as before. If it is a closing brace, get the top element from the stack and compare it to the character just read. Check for matching the type of braces, which are ( ), [ ], and { }. Let’s look at a flow for the input (()]:

Input

Current character

Stack

Comment

(()]

  

Start

(()]

(

(

Store

(()]

(

((

Store

[()]

)

(

Match

(()]

]

(

Mismatch

The implementation uses a stack and performs the checks and actions described above:
def check_parentheses(braces_input):
    # odd length cannot be a well-formed bracing
    if len(braces_input) % 2 != 0:
        return False
    opening_parentheses = Stack()
    for char in braces_input:
        if is_opening_parenthesis(char):
            opening_parentheses.push(char)
        elif is_closing_parenthesis(char):
            if opening_parentheses.is_empty():
                 # closing before opening brace
                return False
            last_opening_parens = opening_parentheses.pop()
            if not is_matching_parenthesis_pair(last_opening_parens, char):
                # different pairs of braces
                return False
        else:
            # invalid character
            return False
    return opening_parentheses.is_empty()
Once again, it is recommended to extract helper functions such as is_opening_parenthesis() to be able to implement the actual algorithm at a higher level of abstraction and thus more clearly. Finally, let’s take an examining look at the three helper functions—for the closing braces, here is an elegant Python variant with in for a list of characters instead of a character-by-character check with or and :
def is_opening_parenthesis(ch):
    return ch == '(' or ch == '[' or ch == '{'
def is_closing_parenthesis(ch):
    return ch in [")", "]", "}"]
def is_matching_parenthesis_pair(opening, closing):
    return (opening == '(' and closing == ')') or
           (opening == '[' and closing == ']') or
           (opening == '{' and closing == '}')
Checking for matching pair of braces can also be written more elegantly using a list of tuples containing the opening and closing braces:
# Alternative variant using tuple notation
def is_matching_parenthesis_pair(opening, closing):
    return (opening, closing) in [('(', ')'), ('[', ']'), ('{', '}')]

Verification

Use the values from the introduction to see your just-implemented functionality in action:
@pytest.mark.parametrize("values, expected",
                         [("()", True), ("()[]{}", True),
                          ("[((()[]{}))]", True),
                          ("(()", False), ("((})", False),
                          ("(()}", False), (")()(", False),
                          ("()((", False), ("()A(", False)])
def test_check_parentheses(values, expected):
    result = check_parentheses(values)
    assert result == expected

Let’s look again at the implementation of the check and the return values. Several comments exist why True or False is returned. Wouldn’t it be more intuitive to express this with a suitable enumeration as a return? Let’s take a look at that now in the bonus.

Bonus

Extend the solution so that a clear assignment of error causes becomes possible. Start with the following enumeration:
from enum import Enum, auto
class CheckResult(Enum):
    OK = auto()
    ODD_LENGTH = auto()
    CLOSING_BEFORE_OPENING = auto()
    MISMATCHING_PARENTHESIS = auto()
    INVALID_CHAR = auto()
    REMAINING_OPENING = auto()
By using the enumeration, possible error causes may be communicated more clearly. Besides, you can omit the comments on the return values in the source code since the enumeration values adequately describe them.
def check_parentheses_v2(braces_input):
    # odd length cannot be well-formed braces
    if len(braces_input) % 2 != 0:
        return CheckResult.ODD_LENGTH
    opening_parentheses = Stack()
    for current_char in braces_input:
        if is_opening_parenthesis(current_char):
           opening_parentheses.push(current_char)
        elif is_closing_parenthesis(current_char):
            if opening_parentheses.is_empty():
                return CheckResult.CLOSING_BEFORE_OPENING
            last_opening_parens = opening_parentheses.pop()
            if not is_matching_parenthesis_pair(last_opening_parens,
                                                current_char):
                return CheckResult.MISMATCHING_PARENTHESIS
        else:
            return CheckResult.INVALID_CHAR
    if opening_parentheses.is_empty():
        return CheckResult.OK
    return CheckResult.REMAINING_OPENING

Verification

Using enumeration not only increases the readability of the application’s source code but also adds clarity and conciseness to the unit test. As usual, use the values from the introductory example to see your just implemented functionality in action:
@pytest.mark.parametrize("values",
                         [("()"), ("()[]{}"), ("[((()[]{}))]")])
def test_check_parentheses_v2(values):
    result = check_parentheses_v2(values)
    assert result == CheckResult.OK
@pytest.mark.parametrize("values, expected",
                         [("(()", CheckResult.ODD_LENGTH),
                          ("((})", CheckResult.MISMATCHING_PARENTHESIS),
                          ("(()}", CheckResult.MISMATCHING_PARENTHESIS),
                          (")()(", CheckResult.CLOSING_BEFORE_OPENING),
                          ("()((", CheckResult.REMAINING_OPENING),
                          ("()A(", CheckResult.INVALID_CHAR)])
def test_check_parentheses_v2_errors(values, expected):
    result = check_parentheses_v2(values)
    assert result == expected

5.3.8 Solution 8: Pascal’s Triangle (★★★✩✩)

Write function pascal(n) that computes Pascal’s triangle in terms of nested lists. As you know, each new line results from the previous one. If there are more than two elements in it, two values are added, and the sums build the values of the new line. In each case, a 1 is appended to the front and back.

Example

For the value 5, the desired representation is as follows:
[1]
[1, 1]
[1, 2, 1]
[1, 3, 3, 1]
[1, 4, 6, 4, 1]
Algorithm The determination of the individual lines is done recursively. For the first line, a one-element list with the value 1 is generated. For all others, you calculate the values by invoking helper function calc_line(previous_line) based on the predecessor line and then add the intermediate result to the overall result. It might be a bit irritating that the call is 1-based, but the list index is, of course, 0-based.
def pascal(n):
    result = []
    __pascal_helper(n, result)
    return result
def __pascal_helper(n, results):
    if n == 1:
        # recursive termination
        results.append([1])
    else:
        # recursive descent
        previous_line = __pascal_helper(n - 1, results)
        # calculate based on previous line
        current_line = __calc_line(previous_line)
        results.append(current_line)
    return results[n - 1]

Computing a row’s values based on the predecessor row is performed for all rows with n ≥ 2 as follows: If there is more than one value stored in the predecessor row list, iterate through it and sum each. To complete the computation, the value 1 is appended at the front and the back.

Somewhat more formally it can be written as follows, where the index of the rows and columns starts from 1 and not as in Python from 0:

The implementation is done directly and is much more understandable than the purely recursive definition for each value already presented in section 3.3.9.
# each row is calculated from the values of the row above it,
# flanked in each case by a 1
def __calc_line(previous_line):
    current_line = [previous_line[i] + previous_line[i + 1]
                    for i in range(len(previous_line) - 1)]
    return [1] + current_line + [1]

Verification

To test the implementation, define a function where you compute Pascal’s triangle for the passed value and then print it appropriately:
def print_pascal(n):
    for line in pascal(n):
        print(line)
Let’s try it out:
>>> print_pascal(4)
...
[1]
[1, 1]
[1, 2, 1]
[1, 3, 3, 1]
If you like it a bit more formal, a matching unit test is provided:
@pytest.mark.parametrize("n, expected",
                         [(1, [[1]]),
                          (2, [[1], [1, 1]]),
                          (3, [[1], [1, 1], [1, 2, 1]]),
                          (4, [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]])])
def test_pascal(n, expected):
    result = pascal(n)
    assert result == expected

5.3.9 Solution 9: Check Magic Triangle (★★★✩✩)

Write function is_magic_triangle(values) that checks whether a sequence of numbers forms a magic triangle. Such a triangle is defined as one where the respective sums of the three sides’ values must all be equal.

Examples

The following shows this for one triangle each of side length three and side length four:
  1             2
 6 5           8 5
2 4 3         4   9
             3 7 6 1
This results in the following sides and sums:

Input

Values 1

Values 2

side 1

1 + 5 + 3 = 9

2 + 5 + 9 + 1 = 17

side 2

3 + 4 + 2 = 9

1 + 6 + 7 + 3 = 17

side 3

2 + 6 + 1 = 9

3 + 4 + 8 + 2 = 17

Tip

Model the individual sides of the triangle as sublists.

HINT: PROBLEM SOLVING STRATEGIES FOR THE JOB INTERVIEW

If the problem is initially unclear, it is advisable to reduce the problem to one or two concrete value assignments and to find the appropriate abstractions based on these.

Using the triangle of side length three as an example, you can build the sides shown above. If you think for a while, you will find that the sides can be expressed as sublists. However, the last side requires special treatment. For closing the figure again, the value of position 0 has to be taken into account. Still, it is not part of the sublist. Here two tricks offer themselves. The first one is to duplicate the list and extend it by the 0th element:
values_with_loop = list(values)
# close the triangle
values_with_loop.append(values[0])
side1 = values_with_loop[0:3]
side2 = values_with_loop[2:5]
side3 = values_with_loop[4:7]
Alternatively, create three slices and add the 0th element in the third to fit:
side1 = values[0:3]
side2 = values[2:5]
side3 = values[4:6]
# close the triangle
side3.append(values[0])
Algorithm: For triangles with side length three With the previous knowledge gathered, you start implementing the check for the special case of a triangle of side length three. Therefore, you first determine the sides and then build and compare the partial sums of the numbers contained there:
def is_magic6(values):
    values_with_loop = list(values)
    values_with_loop.append(values[0])  # close the triangle
    side1 = values_with_loop[0:3]
    side2 = values_with_loop[2:5]
    side3 = values_with_loop[4:7]
    return compare_sum_of_sides(side1, side2, side3)
You extract the summing of the values of the sides as well as their comparison into the following function:
def compare_sum_of_sides(side1, side2, side3):
    sum1 = sum(side1)
    sum2 = sum(side2)
    sum3 = sum(side3)
    return sum1 == sum2 and sum2 == sum3
Intermediate inspection Now you should at least check the implementation with some values before you move on to the generalization:
>>> is_magic6([1, 5, 3, 4, 2, 6])
True
>>> is_magic6([1, 2, 3, 4, 5, 6])
False
Algorithm, general variant With the knowledge gained from the concrete example, a general variant can be created. The variance resides in calculating the indices for the sides of the triangle. Additionally, you add a sanity check at the beginning of the function. This prevents you from working on potentially invalid data constellations.
def is_magic_triangle(values):
    if len(values) % 3 != 0:
       raise ValueError("Not a triangle!", len(values), "must be a factor of 3")
    side_length = 1 + len(values) // 3
    values_with_loop = list(values)
    # close the triangle
    values_with_loop.append(values[0])
    side1 = values_with_loop[0: side_length]
    side2 = values_with_loop[side_length - 1: side_length * 2 - 1]
    side3 = values_with_loop[(side_length - 1) * 2: side_length * 3 - 2]
    return compare_sum_of_sides(side1, side2, side3)

Verification

Let’s check the implementation with the following unit test:
@pytest.mark.parametrize("values, expected",
                         [([1, 5, 3, 4, 2, 6], True),
                          ([1, 2, 3, 4, 5, 6], False),
                          ([2, 5, 9, 1, 6, 7, 3, 4, 8], True),
                          ([1, 2, 3, 4, 5, 6, 7, 8, 9], False)])
def test_is_magic_triangle(values, expected):
    result = is_magic_triangle(values)
    assert result == expected
Alternative algorithm Based on the generalization already done, it is possible to omit the extraction of the sublists. Therefore, you once again use the idea of a position counter and traverse the original list in two loops. The outer loop represents the current side; in an inner loop, the respective position is handled. Two tricks are used:
  1. 1.

    The variable pos models the current position within the list. The new position is determined by adding 1. However, you need to reaccess the list’s first value at the end of the list, so a modulo operation is used here.

     
  2. 2.

    After adding up the values for one side, you must go back by one position since the end value of one side of the triangle is also the start value of the next side.

     
As usual, add a sanity check at the beginning of the method. This will prevent you from potentially invalid data constellations.
def is_magic_triangle_v2(values):
    if len(values) % 3 != 0:
        raise ValueError("Not a triangle: " + len(values))
    side_length = 1 + len(values) // 3
    pos = 0
    sum_of_sides = [0, 0, 0]
    for current_side in range(3):
        for _ in range(side_length):
            sum_of_sides[current_side] += values[pos]
            # trick 1: with modulo => no special treatment
            pos = (pos + 1) % len(values)
        # trick 2: The sides overlap, end field = next start field
         pos -= 1
    return sum_of_sides[0] == sum_of_sides[1] and
           sum_of_sides[1] == sum_of_sides[2]

Verification

The verification is performed with a unit test analogous to the previous one and therefore not shown again.

5.3.10 Solution 10: Most Frequent Elements (★★✩✩✩)

Write function value_count(values) that determines a histogram (i. e., the distribution of the frequencies of the numbers in the given list). Also, write function sort_dict_by_value(dictionary) to sort the dictionary by its values instead of by keys. Thereby a descending sorting is realized, so that smaller values are listed at the beginning.

Examples

Input

Result

Most frequent(s)

[1, 2, 3, 4, 4, 4, 3, 3, 2, 4]

{4=4, 3=3, 2=2, 1=1}

4=4

[1, 1, 1, 2, 2, 2, 3, 3, 3]

{1=3, 2=3, 3=3}

Depending on query, logically all

Algorithm Based on the input values, you compute a histogram as a dictionary with frequency values:
def value_count(values):
    value_to_count = {}
    for elem in values:
        if elem not in value_to_count:
            value_to_count[elem] = 0
        value_to_count[elem] += 1
    return value_to_count
As a final step, you still need to sort the resulting dictionary by value. Conveniently, this can be done with sorted() and specifying how the values are accessed and inverted. However, this returns a list of value pairs that you transfer to a dictionary with dict().
def sort_dict_by_value(dictionary):
    return dict(sorted(dictionary.items(), key=itemgetter(1), reverse=True))

Verification

As usual, use the values from the introduction to check your just implemented functionality with unit tests:
@pytest.mark.parametrize("values, expected",
                         [([1, 2, 3, 4, 4, 4, 3, 3, 2, 4],
                           {1: 1, 2: 2, 3: 3, 4: 4}),
                          ([1, 1, 1, 2, 2, 2, 3, 3, 3],
                           {1: 3, 2: 3, 3: 3})])
def test_value_count(values, expected):
    result = value_count(values)
    assert result == expected
@pytest.mark.parametrize("dictionary, expected",
                         [({1: 1, 2: 2, 3: 3, 4: 4},
                           {4: 4, 3: 3, 2: 2, 1: 1})])
def test_sort_dict_by_value(dictionary, expected):
    result = sort_dict_by_value(dictionary)
    assert result == expected

5.3.11 Solution 11: Addition of Digits (★★★✩✩)

Consider two decimal numbers that are to be added. Sounds simple, but for this assignment, the numbers are interestingly represented as a list of digits. Write function list_add(values1, values2). Also, consider the special case where there is an overflow.

Solution 11a: Addition (★★★✩✩)

In the first part of the task, the digits are to be stored in the order of their occurrence in the list.

Examples

Input 1

Input 2

Result

123 = [1, 2, 3]

456 = [4, 5, 6]

579 = [5, 7, 9]

927 = [9, 2, 7]

135 = [1, 3, 5]

1062 = [1, 0, 6, 2]

Algorithm Start with a simplification, namely that the numbers have the same amount of digits. Analogous to adding on the blackboard, you go from back to front from position to position and add the digits in each case. There may be a carry, which you must take into account in the following addition. If there is also a carry at the end of the processing (so for you at the front-most position), you must add the value 1 to the result at the front position. See Figure 5-1.
Figure 5-1

Example of an addition with carries

You apply this procedure to two lists of digits and traverse them from back to front—at the beginning still simplifying lists of equal length, which avoids special treatments.
def list_add(values1, values2):
    result = []
    carry = 0
    for i in range(len(values1) - 1, -1, -1):
        sum = values1[i] + values2[i] + carry
        result.insert(0, sum % 10)
        carry = 1 if sum >= 10 else 0
    # add a 1 at the front of a carryover
    if carry == 1:
        result.insert(0, 1)
    return result
A deviating implementation would be to use iterators, here not the forward variant with iter(), but the backward variant with reversed(). However, this implementation struggles with the same problem as before with input data of different lengths.
def list_add_with_iter(values1, values2):
    result = []
    carry = 0
    backiterator1 = reversed(values1)
    backiterator2 = reversed(values2)
    while True:
        try:
            value1 = next(backiterator1)
            value2 = next(backiterator2)
            sum = value1 + value2 + carry
            result.insert(0, sum % 10)
            carry = 1 if sum >= 10 else 0
        except StopIteration:
            break
    # consider carryover
    if carry == 1:
        result.insert(0, 1)
    return result
HINT: POSSIBLE ALTERNATIVE WITH ZIP()?

You could also come up with the idea of combining the two sequences of values with zip() and traversing them backwards. However, then you still need a wrapping with list(), since zip() is not reversible. However, this variant also fails since zip() restricts the combination to the smallest length of the sequences passed. Thus, again, you cannot add sequences of numbers of different lengths.

Improved algorithm If you want to provide a generally valid addition, you have to add the digits again starting from the back. However, with unequal length, it is then at some point no longer possible to access any digits because one number has fewer digits than the other. The auxiliary function safe_get_at() helps to handle a potentially failing access and provides a fallback of 0 in this case.
def list_add_improved(values1, values2):
    result = []
    carry = 0
    idx1 = len(values1) - 1
    idx2 = len(values2) - 1
    while idx1 >= 0 or idx2 >= 0:
        value1 = safe_get_at(values1, idx1)
        value2 = safe_get_at(values2, idx2)
        sum = value1 + value2 + carry
        result.insert(0, sum % 10)
        carry = 1 if sum >= 10 else 0
        idx1 -= 1
        idx2 -= 1
    # add a 1 at the front of a carryover
    if carry == 1:
        result.insert(0, 1)
    return result
Let’s take a quick look at the implementation of the safe indexed access, which maps accesses outside the allowed index range to the value 0. I use the Python feature of two comparison operators.
def safe_get_at(values, pos):
    if 0 <= pos < len(values):
        return values[pos]
    return 0

In the implementation, my Java origin can be spotted.

In Python, it is stylistically nicer to handle expected index exceptions as follows:
def safe_get_at(values, pos):
    try:
        return values[pos]
    except IndexError:
        return 0

Verification

Use unit tests to verify that the implementation produces the desired result for a given sequence of numbers:
@pytest.mark.parametrize("values1, values2, expected",
                         [([1, 2, 3], [4, 5, 6], [5, 7, 9]),
                          ([9, 2, 7], [1, 3, 5], [1, 0, 6, 2])])
def test_list_add_improved(values1, values2, expected):
    result = list_add_improved(values1, values2)
    assert result == expected
Let’s also consider the special case of unequal lengths of numbers for both implementations—only the second improved variant handles this correctly:
>>> list_add([7,2,1], [1,2,7,0,0,0])
[8, 4, 8]
>>> list_add_improved([7, 2, 1], [1, 2, 7, 0, 0, 0])
[1, 2, 7, 7, 2, 1]

Solution 11b: Addition Inverse (★★★✩✩)

What changes if the digits are stored in reverse order in the list?

Examples

Input 1

Input 2

Result

123 = [3, 2, 1]

456 = [6, 5, 4]

579 = [9, 7, 5]

927 = [7, 2, 9]

135 = [5, 3, 1]

1062 = [2, 6, 0, 1]

Algorithm If the order of the digits in the list is reversed to that within the number, things get simpler. You can then add directly, and the handling of numbers with unequal amounts of digits becomes easier. Again, you use the function safe_get_at(). Moreover, in case of an overflow, it is only necessary to add in the natural direction.
def list_add_inverse(values1, values2):
    result = []
    carry = 0
    idx = 0
    while idx < len(values1) or idx < len(values2):
        value1 = safe_get_at(values1, idx)
        value2 = safe_get_at(values2, idx)
        sum = value1 + value2 + carry
        carry = 1 if sum >= 10 else 0
        result.append(sum % 10)
        idx += 1
    # add a 1 as carry to the "front"
    if carry == 1:
        result.append(1)
    return result

Verification

Consider two numbers in the form of lists with single digits—the values are written the other way around than in the number. In particular, this variant allows the addition of numbers of different lengths without having to deal with two index values.
@pytest.mark.parametrize("values1, values2, expected",
                         [([3, 2, 1], [6, 5, 4], [9, 7, 5]),
                          ([7, 2, 9], [5, 3, 1], [2, 6, 0, 1])])
def test_list_add_inverse(values1, values2, expected):
    result = list_add_inverse(values1, values2)
    assert result == expected

5.3.12 Solution 12: List Merge (★★✩✩✩)

Given two lists of numbers, each sorted in ascending order, merge them into a result list according to their order. Write function merge(values1, values2).

Examples

Input 1

Input 2

Result

1, 4, 7, 12, 20

10, 15, 17, 33

1, 4, 7, 10, 12, 15, 17, 20, 33

2, 3, 5, 7

11, 13, 17

2, 3, 5, 7, 11, 13, 17

2, 3, 5, 7, 11

7, 11, 13, 17

2, 3, 5, 7, 7, 11, 11, 13, 17

[1, 2, 3]

= []

[1, 2, 3]

Algorithm At first, the problem seems quite easy to solve. You start at the beginning of both lists. Then you compare the respective position’s values, insert the smaller one into the result, and increase the position in the list from which the element originates. This looks like the following:
def merge_first_try(values1, values2):
    pos1 = 0
    pos2 = 0
    result = []
    while pos1 < len(values1) or pos2 < len(values2):
        value1 = values1[pos1]
        value2 = values2[pos2]
        if value1 < value2:
            result.append(value1)
            pos1 += 1
        else:
            result.append(value2)
            pos2 += 1
    return result
Although this solution seems to be intuitive and good, it still contains problems. To identify them, let’s try the function once for the second combination of values:
>>> merge_first_try([2, 3, 5, 7], [11, 13, 17])
...
IndexError: list index out of range
As a quick fix, you could replace the or with an and, which eliminates problems with exceptions. But this leads to another problem: Not all of the elements of both lists are processed any longer, usually depending on the value distribution even different numbers. So this is not a universal solution, but still a good start. You only have to cover the special needs of the elements remaining in a list appropriately. They are added to the result for this purpose.
def merge(values1, values2):
    pos1 = 0
    pos2 = 0
    result = []
    while pos1 < len(values1) and pos2 < len(values2):
        value1 = values1[pos1]
        value2 = values2[pos2]
        if value1 < value2:
            result.append(value1)
            pos1 += 1
        else:
            result.append(value2)
            pos2 += 1
    add_remaining(result, values1, pos1)
    add_remaining(result, values2, pos2)
    return result
You move the functionality of appending the remaining elements into function add_remaining(). Interestingly, no special checks are required before calling it. This is indirectly given by supplying the respective index as well as the termination condition in the for loop.
def add_remaining(result, values, idx):
    for i in range(idx, len(values)):
        result.append(values[i])
Python shortcut for add_remaining() In fact, adding the remaining elements is done in a shorter and more understandable way using slicing, as follows:
def add_remaining(result, values, idx):
    result += values[idx:]
Python shortcut The sorted merging of two lists can be easily implemented using the + operator and with the help of sorted():
def merge(values1, values2):
    return sorted(values1 + values2)

Alternative algorithm One variant is to generate the result data structure in advance. However, this leads to more index variables, and the entire thing becomes confusing. How can you avoid index access?

Instead of the potentially error-prone index accesses, try a variant with iterators. You run through the two lists from front to back and insert the elements as usual. Also, the appending of the remaining part can be transferred quite easily to iterators.
def merge_with_iter(values1, values2):
    result = []
    iterator1 = iter(values1)
    iterator2 = iter(values2)
    while True:
        try:
            value1, iterator1 = peek(iterator1)
            value2, iterator2 = peek(iterator2)
            if value1 < value2:
                result.append(value1)
                next(iterator1)
            else:
                result.append(value2)
                next(iterator2)
        except StopIteration:
            break
    add_remaining_with_iter(result, iterator1)
    add_remaining_with_iter(result, iterator2)
    return result
The last thing you implement is the addition of the remaining elements. However, this is a little bit more complex with iterators than the two variants shown before.
def add_remaining_with_iter(result, it):
    while True:
        try:
            value = next(it)
            result.append(value)
        except StopIteration:
            break
Another difficulty is that you cannot simply read out both elements via next(), since only one element is transferred to the result at a time. Therefore, you use a trick and create the function peek(), which first determines the next element and then reconstructs the iterator. In the above algorithm, you first take a look at the respective elements, and after comparing the value, you consume the element from the matching input data.
def peek(it):
    first = next(it)
    return first, itertools.chain([first], it)

The built-in function chain() is used here, which links two iterables together (i.e., make one out of two). Here it is used to restore the original dataset of the iterator.

Verification

Test the functionality with the value combinations from the introduction:
def inputs_and_expected():
    return [([1, 4, 7, 12, 20], [10, 15, 17, 33],
              [1, 4, 7, 10, 12, 15, 17, 20, 33]),
             ([2, 3, 5, 7], [11, 13, 17],
              [2, 3, 5, 7, 11, 13, 17]),
             ([2, 3, 5, 7, 11], [7, 11, 13, 17],
              [2, 3, 5, 7, 7, 11, 11, 13, 17]),
             ([1, 2, 3], [], [1, 2, 3])]
@pytest.mark.parametrize("values1, values2, expected",
                         inputs_and_expected())
def test_merge(values1, values2, expected):
    result = merge(values1, values2)
    assert result == expected
@pytest.mark.parametrize("values1, values2, expected",
                         inputs_and_expected())
def test_merge_with_iter(values1, values2, expected):
    result = merge_with_iter(values1, values2)
    assert result == expected

5.3.13 Solution 13: Excel Magic Select (★★✩✩✩)

If you have worked a little with Excel, then you have probably used the Magic Selection. It continuously populates a selected area with values based on the previous values. This works for numbers, weekdays, or dates, for example. To achieve something similar on your own, write function generate_following_values(current_value, sequence_length) that implements this for numbers. Create a variation suitable for weekdays and with the following signature: generate_following_values_for_- predefined(predefined_values, current_value, sequence_length).

Examples

Initial value

Count

Result

1

7

[1, 2, 3, 4, 5, 6, 7]

5

4

[5, 6, 7, 8]

FRIDAY

8

[FRIDAY, SATURDAY, SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY]

Algorithm At first, you might think that this is based on something very sophisticated. But when thinking a second time about the algorithm, you quickly realize that all you need is a list as the result data structure and a loop to populate it:
def generate_following_values(current_value, sequence_length):
    result = []
    while sequence_length > 0:
        result.append(current_value)
        current_value += 1
        sequence_length -= 1
    return result
Python shortcut With list comprehension, you write it briefly as follows:
def generate_following_values_v2(start_value, sequence_length):
    return [value for value in range(start_value,
                                      start_value + sequence_length)]
Alternatively, this can be implemented by combining various functionalities from the itertools module. However, I like the two previous variants much better in terms of readability and comprehensibility.
def generate_following_values_built_in(start_value, sequence_length):
    return list(itertools.islice(itertools.count(start_value), sequence_length))

Modified algorithm It is similarly easy to fill in with days of the week or via a list of predefined values, which, unlike numerical values, always repeat according to the length of the sequence.

With this knowledge, you minimally modify the previously used algorithm:
def generate_following_values_for_predefined(predefined_values,
                                             current_value, sequence_length):
    result = []
    current_pos = predefined_values.index(current_value)
    while sequence_length > 0:
        result.append(current_value)
        current_value, current_pos = next_cyclic(predefined_values, current_pos)
        sequence_length -= 1
    return result
This function is intended to allow the cyclical traversal of a list in the forward direction by starting again at the beginning after the last element:
def next_cyclic(values, current_pos):
    next_pos = (current_pos + 1) % len(values)
    return values[next_pos], next_pos

Verification

To track completion, you use a parameterized test, among other things one starting on a Friday, to generate eight values:
@pytest.mark.parametrize("start_value, sequence_length, expected",
                         [(1, 7, [1, 2, 3, 4, 5, 6, 7]),
                          (5, 4, [5, 6, 7, 8])])
def test_generate_following_values(start_value, sequence_length, expected):
    result = generate_following_values(start_value, sequence_length)
    assert result == expected
def predefined_values():
    return ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
@pytest.mark.parametrize("predefined_values, current_value, "
                         "sequence_length, expected",
                         [(predefined_values(), "Monday", 3,
                           ["Monday", "Tuesday", "Wednesday"]),
                          (predefined_values(), "Friday", 8,
                           ["Friday", "Saturday", "Sunday", "Monday",
                            "Tuesday", "Wednesday", "Thursday", "Friday"])])
def test_generate_following_values_for_predefined(predefined_values,
                                                  current_value,
                                                  sequence_length, expected):
    result = generate_following_values_for_predefined(predefined_values,
                                                      current_value,
                                                      sequence_length)
    assert result == expected

5.3.14 Solution 14: Stack-Based Queue (★★✩✩✩)

You learned about stack and queue data structures in the introduction and implemented a queue based on a list. Then in exercise 2, you implemented a stack itself. Now you are asked to build a queue based on the stack data structure.

Example

Please check the functionality with the following procedure:
def main():
    waiting_persons = Queue()
    waiting_persons.enqueue("Marcello")
    waiting_persons.enqueue("Michael")
    waiting_persons.enqueue("Karthi")
    while not waiting_persons.is_empty():
        if waiting_persons.peek() == "Michael":
            # reprocess at the end
            waiting_persons.enqueue("Michael again")
        next_person = waiting_persons.dequeue()
        print("Processing " + next_person)
The small sample program should produce the following output:
Processing Marcello
Processing Michael
Processing Karthi
Processing Michael again
Algorithm You have already learned that a stack is suitable for reversing a list’s order. Suppose you combine two stacks appropriately, one as an input buffer and one as an output buffer. In this case, you can implement a queue quite easily as follows. The only thing that’s a bit tricky is that you just transfer the data from the input buffer to the output buffer when the latter is empty.
class Queue:
    def __init__(self):
        self._inbox = Stack()
        self._outbox = Stack()
    def enqueue(self, elem):
        self._inbox.push(elem)
    def dequeue(self):
        if self.is_empty():
            raise QueueIsEmptyException()
        self._transfer_inbox_to_outbox()
        return self._outbox.pop()
    def peek(self):
        if self.is_empty():
            raise QueueIsEmptyException()
        self.__transfer_inbox_to_outbox()
        return self._outbox.peek()
    def is_empty(self):
        return self._inbox.is_empty() and self._outbox.is_empty()
    def _transfer_inbox_to_outbox(self):
        if self._outbox.is_empty():
            # transfer inbox to outbox
            while not self._inbox.is_empty():
                self._outbox.push(self._inbox.pop())

Verification

To test your implementation of the stack-based queue, execute the main() function and see if the output is as expected.

5.4 Summary: What You Learned

This chapter deepened your knowledge of basic data structures like lists, sets, and dictionaries. This knowledge is essential in business applications. These structures are useful for solving many tasks, not only individually but also in combination, such as the deletion of duplicates from lists. In addition, the exercise of the magic triangle, for example, trained abstract thinking. A small delicacy was to program the auto-completion of Excel itself. It is quite surprising what an elegant implementation this results in. Finally, you developed some functionality for merging lists. This is an elementary component for Merge Sort.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset