Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

4. Strings

Michael Inden¹

(1)

Zurich, Switzerland

Strings model character sequences and possess the type str, which offers a variety of functions. In this chapter, you will learn about this topic through various exercises.

4.1 Introduction

Strings consist of single characters and, like lists, are sequential data types (see section 5.1.1), which is why many actions can be performed analogously, such as slicing. Unlike other languages, Python does not have a data type for individual characters, so they are simply represented as strings of length 1.

Strings can be created as character sequences in double or single quotes, as shown by the following two lines:

str1 = "DOUBLE QUOTED STRING"

str2 = 'SINGLE QUOTED STRING'

4.1.1 Practically Relevant Functions

For strings, I’ll go over the most common functions that are useful in practice. Let’s assume that the variable str is a string. Then you can call the following functions:

len(str) gets the length of the string. This is a general Python function for querying the length of sequential data types such as lists or tuples, etc., but also strings.
str[index] provides index-based access to individual letters.
str[start:end]/str[start:end:step] extracts the characters between the positions start and end - 1. As a special feature, a stepwidth can be specified. Interestingly, even the range specification can be omitted and with [::-1] only a negative step size can be used, resulting in a new string with the reverse letter order of the original string.
str[:end] extracts the characters between the beginning and the position end - 1.
str[start:] extracts the characters between the position start and the end of the string.
str.lower()/str.upper() creates a new string consisting of lowercase or uppercase letters. Numbers and punctuation marks are not converted.
str.strip() removes whitespace at the beginning and end of text and returns this as a new string. As a special feature, you can pass a character that will be removed instead of whitespace.
str.isalpha()/str.isdigit()/... checks if all characters of the string are alphanumeric, digits, etc.
str.startswith(other)/str.endswith(other) checks whether the string starts or ends with the given string.
str.find(other)/str.rfind(other) searches for the supplied string and returns the index of the first occurrence or -1 on nonexistence. The function rfind() searches from the end. As a special feature, it is possible to specify an index range in both cases.
str.index(other, start, end)/str.rindex(other, start, end) returns the index of the first or last occurrence of other. Unlike find(), an exception is thrown if the index is not present.
str.count(text) counts how many times text occurs in the string.
str.replace(old, new) creates a new string in which all occurrences of old are replaced by new.
str.split(delim) returns a list of substrings resulting from splitting the original string. The delimiter is no regular expression¹. Without specifying a delimiter, a text is split with respect to whitespace.
str.join(list) does the opposite of split(). Specifically, the elements passed as a list are joined to the string as a delimiter.
str.capitalize()/str.title() converts the first character to uppercase. With title() additionally within a string, the beginning of each new word is converted to uppercase.

4.1.2 Example Conversions and Extractions

Let’s take an introductory look at simple actions on strings such as converting to lowercase or uppercase and splitting:

name = "Carl Heinz Miller"

print(name.lower())

print(name.upper())

print(name.split())

time = '20:26:45'

hrs, mins, secs = time.split(':')

print(hrs, mins, secs)

This results in the following output:

carl heinz miller

CARL HEINZ MILLER

['Carl', 'Heinz', 'Miller']

20 26 45

In addition, you can repeat text with * and remove text components, often whitespace, from the margins with strip(). As you can see, you can even pass in characters.

print("-repeater-" * 3)

with_whitespace = " --CONTENT-- "

stripped1 = with_whitespace.strip()

stripped2 = stripped1.strip("-")

print("strip1:", stripped1, "length:", len(stripped1))

print("strip2:", stripped2, "length:", len(stripped2))

This results in the following output:

-repeater--repeater-repeater-

strip1:

--CONTENT-- length: 11

strip2: CONTENT length: 7

4.1.3 Equality

Now let’s look at the definition of two strings and how they can be compared, in particular the effects of == (content equality) and is (reference equality):

str1 = 'String with same contents but different quotes'

str2 = "String with same contents but different quotes"

str3 = "String with same contents but XXX quotes".replace("XXX", "different")

print("str1:", str1)

print("str2:", str2)

print("str3:", str3)

if str1 == str2:

print("same content")

if str1 is str2:

print("same reference str1 / str2")

if str1 == str3:

print("same content")

if str1 is str3:

print("same reference str1 / str3")

You get the following output:

str1: String with same contents but different quotes

str2: String with same contents but different quotes

str3: String with same contents but different quotes

same content

same reference str1 / str2")

same content

That the output of references str1 and str2 is the same may be surprising at first. Why is that? As an optimization, Python sometimes groups identical objects together.

However, this behavior is not guaranteed. At the latest, when actions are performed on the strings, like above the replace() references are no longer the same, but the content is in this case, of course, identical.

4.1.4 Slicing—Access to Individual Characters and Substrings

In the following code you use powerful slicing operations to access single characters, whole components, and even non-contiguous ranges. After that, you count occurrences and simulate a search and research. Finally, you replace a text component.

strange_message= "a message containing only a message"

mid_chars = strange_message[10:20]

last_seven_chars = strange_message[-7:]

print("mid_chars:", mid_chars, " / last_seven_chars:", last_seven_chars)

first_char = strange_message[0]

print(first_char, "count:", strange_message.count(first_char))

print(last_seven_chars, "count:", strange_message.count(last_seven_chars))

# search and continue searching

print("find message:", strange_message.find("message"))

print("find next message:", strange_message.find("message", 3))

# replace (all)

print("replace by info:", strange_message.replace("message", "info"))

This results in the following output:

mid_chars: containing / last_seven_chars: message

a count: 5

message count: 2

find message: 2

find next message: 28

replace by info: a info containing only a info

4.1.5 Converting a String into a List of Characters

Sometimes you want to process text as single characters. A call to list() can be helpful for this:

print(list("Text als Liste"))

This results in the following outputs:

['T', 'e', 'x', 't', ' ', 'a', 'l', 's', ' ', 'L', 'i', 's', 't', 'e']

4.1.6 Iteration

There are several variants when looping through the individual characters of a string. First, it is possible to work indexed with a for loop and len() in combination with range(). However, this is the least adequate way in Python. It is better to work with enumerate(), which provides access to both the index and the value. Sometimes you don’t need access to the index at all; then the third variant with in is recommended.

message = "Python has several loop variants"

for i in range(len(message)):

print(i, message[i], end=',')

print()

for i, current_char in enumerate(message):

print(i, current_char, end=',')

print()

for current_char in message:

print(current_char, end=',')

print()

These loops produce the following output:

0 P,1 y,2 t,3 h,4 o,5 n,6 ,7 h,8 a,9 s,10 ,11 s,12 e,13 v,14 e,15 r,16 a,17 l

,18 ,19 l,20 o,21 o,22 p,23 ,24 v,25 a,26 r,27 i,28 a,29 n,30 t,31 s,

0 P,1 y,2 t,3 h,4 o,5 n,6 ,7 h,8 a,9 s,10 ,11 s,12 e,13 v,14 e,15 r,16 a,17 l

,18 ,19 l,20 o,21 o,22 p,23 ,24 v,25 a,26 r,27 i,28 a,29 n,30 t,31 s,

P,y,t,h,o,n, ,h,a,s, ,s,e,v,e,r,a,l, ,l,o,o,p, ,v,a,r,i,a,n,t,s,

4.1.7 Formatted Output

The following calls to capitalize() and title()

text = "this is a very special string"

print(text.capitalize())

print(text.title())

result in this output:

This is a very special string

This Is A Very Special String

Python offers different ways of formatting output with placeholders. In the simplest case, you specify the values in a comma-separated way in print(). Alternatively, you can specify placeholders in the text using {}, which will then be filled with the values of the call to of format(). There is also the variant with placeholders like %s and %d as well as the modulo operator in combination with a tuple that provides the values. Finally, an explicitly formatted string with f"text" and named parameters can be used.

product = "Apple iMac"

price = 3699

# variants of the formatted output

print("the", product, "costs", price)

print("the {} costs {}".format(product, price))

print(f"the %s costs %d" % (product, price))

print(f"the {product} costs {price}")

This results in the following output for all four variants:

the Apple iMac costs 3699

4.1.8 Character Processing

If you need to process single characters, the functions ord() and chr() can be useful. Here chr() converts a numerical value into a string of length 1 and ord() converts such a such a string into an int value:

>>> ord("A")

>>> chr(65)

'A'

>>> ord("0")

>>> chr(48)

'0'

4.1.9 Example: String Processing

As a final example of string processing, you want to count the number of occurrences of each letter in a string, treating lowercase and uppercase letters equally. For the text “Otto,” you expect 2 x t and 2 x o due to the conversion to lowercase letters. Such a processing is also called a histogram . This is a representation of the distribution of objects, often numerical values. It is also known from photography for the brightness distribution of a picture. The following is about the distribution or determination of the frequencies of letters for a text. To do this, you first convert the input to lowercase with lower() and then iterate through this string. By calling isalpha() you make sure that you only include letters in your count.

from operator import itemgetter

def generate_character_histogram(word):

char_count_map = {}

for current_char in list(word.lower()):

if current_char.isalpha():

if current_char in char_count_map:

char_count_map[current_char] += 1

else:

char_count_map[current_char] = 1

return dict(sorted(char_count_map.items(), key=itemgetter(0)))

Let’s try this out in the Python command line:

>>> generate_character_histogram("Otto")

{'o': 2, 't': 2}

>>> generate_character_histogram("Hello Micha")

{'a': 1, 'c': 1, 'e': 1, 'h': 2, 'i': 1, 'l': 2, 'm': 1, 'o': 1}

>>> generate_character_histogram("Python Challenges, Your Python Training")

{'a': 2, 'c': 1, 'e': 2, 'g': 2, 'h': 3, 'i': 2, 'l': 2, 'n': 5, 'o': 3, 'p': 2,

'r': 2, 's': 1, 't': 3, 'u': 1, 'y': 3}

4.2 Exercises

4.2.1 Exercise 1: Number Conversions (★★✩✩✩)

Based on a string, implement a validation for binary numbers and a conversion to it. Repeat both for hexadecimal numbers.

Note

The conversion can be solved with int(value, radix) and base 2 for binary numbers and base 16 for hexadecimal numbers. Do not use these explicitly; implement them yourself.

Examples

Input	Method	Result
“10101”	is_binary_number( )	True
“111”	binary_to_decimal( )	7
“AB”	hex_to_decimal( )	171

Exercise 1a (★✩✩✩✩)

Write function is_binary_number(number) that checks that a given string consists only of the characters 0 and 1 (i. e., represents a binary number).

Exercise 1b (★★✩✩✩)

Write function binary_to_decimal(number) that converts a (valid) binary number represented as a string to the corresponding decimal number.

Exercise 1c (★★✩✩✩)

Write the entire conversion again, but this time for hexadecimal numbers.

4.2.2 Exercise 2: Joiner (★✩✩✩✩)

Write function join (values, delimiter) that joins a list of strings with the given delimiter and returns it as one string. Implement this yourself without using any special Python functionality like join() provided by type str.

Example

Input	Separator	Result
[“hello”, “world”, “message”]	“ +++ ”	“hello +++ world +++ message”

4.2.3 Exercise 3: Reverse String (★★✩✩✩)

Write function reverse(text) that reverses the letters in a string and returns it as a result. Implement this yourself; in other words, do not use any special Python functionality, such as [::-1].

Examples

Input	Result
“ABCD”	“DCBA”
“OTTO”	“OTTO”
“PETER”	“RETEP”

4.2.4 Exercise 4: Palindrome (★★★✩✩)

Exercise 4a (★★✩✩✩)

Write function is_palindrome(text) that checks whether a given string is a palindrome regardless of case. A palindrome is a word that reads the same from the front and the back.

Note

You can easily solve the verification with [::-1]. Explicitly do not use Python components; implement the functionality yourself.

Examples

Input	Result
“Otto”	True
“ABCBX”	False
“ABCXcba”	True

Exercise 4b (★★★✩✩)

Write an extension that does not consider spaces and punctuation as relevant, allowing whole sentences to be checked, such as this one:

Was it a car or a cat I saw?

4.2.5 Exercise 5: No Duplicate Chars (★★★✩✩)

Determine if a given string contains duplicate letters. Uppercase and lowercase letters should not make any difference. Write function check_no_duplicate_chars(text) for this purpose.

Examples

Input	Result
“Otto”	False
“Adrian”	False
“Micha”	True
“ABCDEFG”	True

4.2.6 Exercise 6: Remove Duplicate Letters (★★★✩✩)

Write function remove_duplicates(text) that keeps each letter only once in a given text, thus deleting all subsequent duplicates regardless of case. However, the original order of the letters should be preserved.

Examples

Input	Result
“bananas”	“bans”
“lalalamama”	“lam”
“MICHAEL”	“MICHAEL”
“AaBbcCdD”	“ABcd“

4.2.7 Exercise 7: Capitalize (★★✩✩✩)

Exercise 7a (★★✩✩✩)

Write function capitalize(text) that converts a given text into an English title format where each word starts with a capital letter. You must explicitly not use the built-in function title() of the type str.

Examples

Input	Result
“this is a very special title”	“This Is A Very Special Title”
“effective java is great”	“Effective Java Is Great”

Exercise 7b: Modification (★★✩✩✩)

Assume now that the input is a list of strings and that a list of strings should be returned, with the individual words and then starting with a capital letter.

Exercise 7c: Special treatment (★★✩✩✩)

In headings, it is common to encounter special treatment of words. For example, “is” and “a” are not capitalized. Implement this as function capitalize_special_2(words, ignorable_words), which gets the words excluded from the conversion as the second parameter.

Example

Input	Exceptions	Result
[“this”, “is”, “a”, “title”]	[“is”, “a”]	[“This”, “is”, “a”, “Title”]

4.2.8 Exercise 8: Rotation (★★✩✩✩)

Consider two strings, str1 and str2, where the first string is supposed to be longer than the second. Figure out if the first one contains the other one. In doing so, the characters within the first string may also be rotated. Characters can be moved from the beginning or the end to the opposite position (even repeatedly). To do this, create function contains_rotation(str1, str2), which is case-insensitive during the check.

Examples

Input 1	Input 2	Result
“ABCD”	“ABC”	True
“ABCDEF	“EFAB”	True (“ABCDEF” < x 2 ⇒ “CDEFAB” contains “EFAB”)
“BCDE”	“EC”	False
“Challenge”	“GECH”	True

4.2.9 Exercise 9: Well Formed Braces (★★✩✩✩)

Write function check_braces(text) that checks whether the sequence of round braces passed as a string contains matching (properly nested) pairs of braces.

Examples

Input	Result	Comment
“(( ))”	True
“( )( )”	True
“(( )))((( ))”	False	Although it has the same amount of opening and closing braces, it is not properly nested
“((( )”	False	No suitable bracing

4.2.10 Exercise 10: Anagram (★★✩✩✩)

The term anagram is used to describe two strings that contain the same letters in the same frequency. Here, uppercase and lowercase should not make any difference. Write function is_anagram(str1, str2).

Examples

Input 1	Input 2	Result
“Otto”	“Toto”	True
“Mary	“Army”	True
“Ananas”	“Bananas”	False

4.2.11 Exercise 11: Morse Code (★★✩✩✩)

Write function to_morse_code(text) that is capable of translating a given text into Morse code characters. They consist of sequences of one to four short and long tones per letter, symbolized by a dot (.) or a dash (-). It is desirable for easier distinguishability to place a space between each tone and three spaces between each sequence of letter tones. Otherwise, S (...) and EEE (...) would not be distinguishable from each other.

For simplicity, limit yourself to the letters E, O, S, T, W with the following encoding:

Letter	Morse code
E	.
O	- - -
S	. . .
T	-
W	. - -

Examples

Input	Result
SOS	. . . - - - . . .
TWEET	- . - - . . -
WEST	. - - . . . . -

Bonus Try to find out the corresponding Morse code for all letters of the alphabet so you can convert your name. You can find the necessary hints for this at https://en.wikipedia.org/wiki/Morse_code.

4.2.12 Exercise 12: Pattern Checker (★★★✩✩)

Write function matches_pattern(pattern, text) that examines a space-separated string (second parameter) against the structure of a pattern passed in the form of individual characters as the first parameter.

Examples

Input pattern	Input text	Result
“xyyx”	“tim mike tim”	True
“xyyx”	“tim mike tom tim”	False
“xyxx”	“tim mike tim”	False
“xxxx”	“tim tim”	True

4.2.13 Exercise 13: Tennis Score (★★★✩✩)

Write function tennis_score(score, player1_name, player2_name) that makes an announcement in a familiar style such as Fifteen Love, Deuce, or Advantage Player X, based on a textual score for two players, PL1 and PL2. The score is given in the format <PL1 points>:<PL2 points>.

The following counting rules apply to a game in tennis:

A game is won (Game <PlayerX>) when a player reaches four or more points and is ahead by at least two points.
Points from zero to three are named Love, Fifteen, Thirty, and Forty.
In case of at least three points and a tie, this is called Deuce.
With at least three points and a one-point difference, this is called Advantage <PlayerX> for the one who has one more point.

Examples

Input	Score
“1:0”, “Micha”, “Tim”	“Fifteen Love”
“2:2”, “Micha”, “Tim”	“Thirty Thirty”
“2:3”, “Micha”, “Tim”	“Thirty Forty”
“3:3”, “Micha”, “Tim”	“Deuce”
“4:3”, “Micha”, “Tim”	“Advantage Micha”
“4:4”, “Micha”, “Tim”	“Deuce”
“5:4”, “Micha”, “Tim”	“Advantage Micha”
“6:4”, “Micha”, “Tim”	“Game Micha”

4.2.14 Exercise 14: Version Numbers (★★✩✩✩)

Write function compare_versions(version1, version2) that compares version numbers in the format MAJOR.MINOR.PATCH with each other. Thereby the specification of PATCH is optional. In particular, the return value should be represented in the form of the characters <, =, and >.

Examples

Version 1	Version 2	Result
1.11.17	2.3.5	<
2.1	2.1.3	<
2.3.5	2.4	<
3.1	2.4	>
3.3	3.2.9	>
7.2.71	7.2.71	=

4.2.15 Exercise 15: Conversion str_to_number (★★✩✩✩)

Convert a string into an integer. To do this, write function str_to_number(text) on your own.

Note

The conversion can be easily achieved with int(value). Do not use this explicitly, but implement the entire conversion yourself.

Examples

Input	Result
“+123”	123
“-123”	-123
“7271”	7271
“ABC”	ValueError
“0123”	83 (for bonus task)
“-0123”	-83 (for bonus task)
“0128”	ValueError (for bonus task)

Bonus Enable the parsing of octal numbers.

4.2.16 Exercise 16: Print Tower (★★★✩✩)

Write function print_tower(n) that represents a tower of n slices stacked on top of each other as ASCII graphics, symbolized by the character #. Also, draw a lower boundary line.

Example

A tower of height three should look something like this:

# |#

## |##

### |###

----------

4.2.17 Exercise 17: Filled Frame (★★✩✩✩)

Write function print_box(width, height, fillchar) that draws a rectangle of the specified size as an ASCII graphic and fills it with the passed-in fill character.

Examples

Below you see two rectangles filled differently:

+-----+ +-------+

|*****| |$$$$$$$|

+-----+ |$$$$$$$|

|$$$$$$$|

+-------+

4.2.18 Exercise 18: Guessing Vowels (★★✩✩✩)

Write function translate_vowel(text, replacement) that replaces all vowels in a given text with a character or string. This can be used for a little guessing quiz, for example, or to determine word similarities based on consonants only.

Input	Replacement	Result
“guide”	“?”	“g??d?”
“lawnmower”	“-”	“l-wnm-w-r”
“quiz”	“_”	“q z”
“lawnmower”	“”	“lwnmwr”

4.3 Solutions

4.3.1 Solution 1: Number Conversions (★★✩✩✩)

Based on a string, implement a validation for binary numbers and a conversion to it. Repeat both for hexadecimal numbers.

Note

The conversion can be solved with int(value, radix) and base 2 for binary numbers and base 16 for hexadecimal numbers. Do not use these explicitly; implement them yourself.

Examples

Input	Method	Result
“10101”	is_binary_number( )	True
“111”	binary_to_decimal( )	7
“AB ”	hex_to_decimal( )	171

Solution 1a (★✩✩✩✩)

Write function is_binary_number(number) that checks that a given string consists only of the characters 0 and 1 (i. e., represents a binary number).

Algorithm The brute force and index-based version iterates through the string character by character from the beginning to the end, checking whether the current character is 0 or 1. If another character is detected, the loop terminates and then False is returned.

def is_binary_number(number):

is_binary = True

i = 0

while i < len(number) and is_binary:

current_char = number[i]

is_binary = (current_char == "0" or current_char == "1")

i += 1

return is_binary

This can also be formulated as a search problem but needs some thought here when returning:

def is_binary_number_v2(number):

i = 0

while i < len(number) and number[i] in ["0", "1"]:

i += 1

return i >= len(number)

Python shortcut The whole thing can be implemented in an easier and more understandable way with Python specifics:

def is_binary_number_short_cut(word):

for current_char in word:

if current_char not in ["0", "1"]:

return False

return True

PYTHON STYLE: DON’T ASK FOR PERMISSION, ASK FOR FORGIVENESS

There is—as indicated in the task —still the possibility to use int(). Then you follow the Python motto of “Don’t ask for permission, ask for forgiveness.” In this case, it means trying potentially dangerous actions, like index accesses with the wrong index and reacting appropriately if they fail. With a strong Java background, I take a rather critical view of this habit—sure, the approach is often practical, but sometimes it is a bit risky. But let’s look at this stylistically perfectly good variant of the check:

def is_binary_number_v3(number):

try:

int(number, 2)

return True

except ValueError:

return False

Solution 1b (★★✩✩✩)

Write function binary_to_decimal(number) that converts a (valid) binary number represented as a string to the corresponding decimal number.

Algorithm You traverses the string character by character from left to right and processes each character as a binary digit. The current character is used to calculate the value by multiplying the previously converted value by 2 and adding the current value. It is possible to formulate the algorithm more clearly, meaning without special treatments, because a valid input is ensured by the previously implemented function is_binary_number(number).

def binary_to_decimal(number):

if not is_binary_number(number):

raise ValueError(number + " is not a binary number")

decimal_value = 0

for current_char in number:

value = int(current_char)

decimal_value = decimal_value * 2 + value

return decimal_value

Solution 1c (★★✩✩✩)

Write the entire conversion again, but this time for hexadecimal numbers.

Algorithm For hexadecimal numbers, the factor has to be changed to 16. In addition, the letters A to F are now permitted in both lowercase and uppercase. Their value is determined by a subtraction ord(current_char.upper()) - ord("A") + 10— thus forming “A” to “F” to the values 0 to 5 and add 10, which then gives the correct value.

def hex_to_decimal(number):

if not is_hex_number(number):

raise ValueError(number + " is not a hex number")

decimal_value = 0

for current_char in number:

if current_char.isdigit():

value = int(current_char)

else:

value = ord(current_char.upper()) - ord("A") + 10

decimal_value = decimal_value * 16 + value

return decimal_value

The check for valid hexadecimal numbers uses a tricky check with in under the specification of all possible digits and letters for hexadecimal numbers:

def is_hex_number(number):

for current_char in number:

if current_char not in "0123456789ABCDEFabcdef":

return False

return True

Verification

For testing, use the following inputs, which show the correct functionality:

@pytest.mark.parametrize("value, expected",

[("10101", True), ("222", False), ("12345", False)])

def test_is_binary_number(value, expected):

assert is_binary_number(value) == expected

@pytest.mark.parametrize("value, expected",

[("111", 7), ("1010", 10), ("1111", 15), ("10000", 16)])

def test_binary_to_decimal(value, expected):

assert binary_to_decimal(value) == expected

@pytest.mark.parametrize("value, expected",

[("7", 7), ("A", 10), ("F", 15), ("10", 16)])

def test_hex_to_decimal(value, expected):

assert hex_to_decimal(value) == expected

4.3.2 Solution 2: Joiner (★✩✩✩✩)

Write function join(values, delimiter) that joins a list of strings with the given delimiter and returns it as one string. Implement this yourself without using any special Python functionality like join() provided by type str.

Example

Input	Separator	Result
[“hello”, “world”, “message”]	“ +++ ”	“hello +++ world +++ message”

Algorithm Traverse the list of values from front to back. Insert the text into a string, add the separator string, and repeat this until the last value. As a special treatment, no separator string may be added after this.

def join(values, delimiter):

result = ""

for i, current_value in enumerate(values):

result += current_value

# no separator after last occurrence

if i < len(values) - 1:

result += delimiter

return result

Python shortcut String joining can be written in a compact and understandable way and without special handling using the appropriate function join():

result = delimiter.join(values)

A variant with reduce() from module functools looks like this:

import functools

result = functools.reduce(lambda str1, str2: str1 + delimiter + str2, values)

By the way, the function join() is also handy when you want to convert the values of a list into a string. For this purpose, you use an empty string as a delimiter.

"".join(values) # trick: Convert list to string

Verification

For testing, use the following inputs, which show the correct functionality:

@pytest.mark.parametrize("values, delimiter, expected",

[(["hello", "world", "message"], " +++ ",

"hello +++ world +++ message")])

def test_join(values, delimiter, expected):

assert join(values, delimiter) == expected

4.3.3 Solution 3: Reverse String (★★✩✩✩)

Write function reverse(text) that reverses the letters in a string and returns it as a result. Implement this yourself; in other words, without using any special Python functionality, such as [::-1].

Examples

Input	Result
“ABCD”	“DCBA”
“OTTO”	“OTTO”
“PETER	“RETEP”

Algorithm Initially, an idea could be to traverse the original string character by character from the end and add the respective character to the result:

def reverse(text):

reversed_text = ""

for i in range(len(text) - 1, -1, -1):

current_char = text[i]

reversed_text += current_char

return reversed_text

A bit messy is the for loop with the multiple -1. The built-in functionality reversed() allows you to run through the text character by character from back to front, which is more readable:

def reverse_nicer(text):

reversed_text = ""

for current_char in reversed(text):

reversed_text += current_char

return reversed_text

However, a small problem exists. The string concatenations with += are potentially expensive because strings are immutable in Python and thereby new string objects are created. Generally, each externally visible change creates a new string.

Optimized algorithm So how could it be more memory-efficient, for example, if very long strings are to be reversed extremely frequently?

The idea is to convert the string with list() into a list and work directly on it. In addition, you use two pointers, left and right, which initially point to the first and last character, respectively. Now you swap the individual letters, and the position pointers move inwards. Repeat the whole process as long as left < right is valid; if left >= right, the process is aborted.

Let’s illustrate the procedure for the text ABCD, where l stands for left and r for right:

A B C D

l r

D B C A

l r

D C B A

r l => End

You implement the described procedure as follows:

def reverse_inplace(text):

original_chars = list(text)

left = 0

right = len(original_chars) - 1

while left < right:

left_char = original_chars[left]

right_char = original_chars[right]

original_chars[left] = right_char

original_chars[right] = left_char

left+=1

right-=1

# trick: convert list to string

return "".join(original_chars)

Python shortcut Of course, the whole thing can be achieved much easier by the following two calls. Still, this exercise is about to get to know the character-by- character processing and possible optimizations.

reversed_text = text[::-1])

reversed_text = "".join(reversed(text)))

Verification

Let’s write a unit test to verify the desired functionality:

def input_and_expected():

return [("ABCD", "DCBA"), ("OTTO", "OTTO"), ("PETER", "RETEP")]

@pytest.mark.parametrize("input, expected",

input_and_expected())

def test_reverse(input, expected):

assert reverse(input) == expected

@pytest.mark.parametrize("input, expected",

input_and_expected())

def test_reverse_inplace(input, expected):

assert reverse_inplace(input) == expected

4.3.4 Solution 4: Palindrome (★★★✩✩)

Solution 4a (★★✩✩✩)

Write function is_palindrome(text) that checks whether a given string is a palindrome regardless of case. A palindrome is a word that reads the same from the front and the back.

Note

You can easily solve the verification with [::-1]. Explicitly do not use Python components; implement the functionality yourself.

Examples

Input	Result
“Otto”	True
“ABCBX”	False
“ABCXcba”	True

JOB INTERVIEW TIPS

In a job interview, here are possible questions you may ask to clarify the scope of the assignment:

Should it be case-sensitive?
ANSWER: No
Are spaces relevant?
ANSWER: First yes, later no, then to be ignored

Algorithm As in exercise 3, the string is represented as a list and you advance one position inward from the left and one position from the right, as long as the characters match and as long as the left position is still smaller than the right position:

def is_palindrome(text):

left = 0

right = len(text) - 1

lower_input = text.lower()

is_same_char = True

while left < right and is_same_char:

is_same_char = (lower_input[left] == lower_input[right])

left += 1

right -= 1

return is_same_char

Python shortcut Of course, the whole thing can be achieved in a straightforward way by calling the built-in functionality [::-1]. Still, this will generate an additional string.

def is_palindrome_short(text):

adjusted_input = text.lower()

return adjusted_input == adjusted_input[::-1]

Algorithm with recursion How can you solve the palindrome problem recursively and without using a list as an auxiliary data structure? After reading Chapter 3 and working through some of the exercises on recursion given there, you should be able to implement this easily. With the strategy or the idiom of the helper function in mind, the following recursive implementation emerges, which, starting from the outside, always checks two characters. This is continued inward as long as the characters match and the left position is smaller than the right one.

def is_palindrome_rec(text):

return is_palindrome_rec_in_range(text.lower(), 0, len(text) - 1)

def is_palindrome_rec_in_range(text, left, right):

if left >= right:

return True

if text[left] == text[right]:

return is_palindrome_rec_in_range(text, left + 1, right - 1)

return False

An alternative way is always to shorten the string by the characters. Why is this logical solution not so good practically? The answer is obvious: This causes many temporary string objects to be created. Besides, a large number of copy actions would have to take place.

Solution 4b (★★★✩✩)

Write an extension that does not consider spaces and punctuation as relevant, allowing whole sentences to be checked, such as this one:

Was it a car or a cat I saw?

Algorithm You can incorporate special checks for whitespace into the algorithm. Still, it is easier to create a version of the function and replace all unwanted punctuation and whitespace therein before calling the original function.

def is_palindrome_special(text, ignore_spaces_and_punctuation):

adjusted_input = text.lower()

if ignore_spaces_and_punctuation:

adjusted_input = adjusted_input.replace(" ", "")

adjusted_input = adjusted_input.replace("!", "")

adjusted_input = adjusted_input.replace(".", "")

return is_palindrome_rec(adjusted_input)

Please note that replace() unfortunately does not support regular expression to remove the special characters. Here this is the case for a space, exclamation mark, and period. Therefore, you simply call this three times appropriately.

HINT: REGULAR EXPRESSIONS WITH PYTHON

If you prefer to use a regular expression after all, you can utilize the re module as follows:

import re

def is_palindrome_special_with_reg_ex(text, ignore_spaces_and_punctuation):

adjusted_input = text.lower()

if ignore_spaces_and_punctuation:

adjusted_input = re.sub(r"[ !.?]", "", adjusted_input)

return is_palindrome_rec(adjusted_input)

Verification

To verify, you again write a unit test with the following inputs that show the correct operation:

def palindrome_inputs_and_expecteds():

return [("Otto", True),

("ABCBX", False),

("ABCXcba", True)]

@pytest.mark.parametrize("input, expected",

palindrome_inputs_and_expecteds())

def test_is_palindrome(input, expected):

assert is_palindrome(input) == expected

@pytest.mark.parametrize("input, expected",

palindrome_inputs_and_expecteds())

def test_is_palindrome_rec(input, expected):

assert is_palindrome_rec(input) == expected

@pytest.mark.parametrize("input, expected",

[("Was it a car or a cat i saw.", True),

("This is not a Palindrome!", False)])

def test_is_palindrome_special(input, expected):

ignore_spaces_and_punctuation = True

assert is_palindrome_special(input,

ignore_spaces_and_punctuation) == expected

FINDINGS: PAY ATTENTION TO COMPREHENSIBILITY

It is absolutely natural for strings to choose an iterative solution due to their API and position/index-based access. This would no longer be convenient if you had to determine the palindrome property for the digits of a number. This can be done with recursion and some consideration even without the detour via conversion shown as exercise 10 in section 3.3.10. Having developed the functionality reverse() in the previous exercise, you can profitably use it here as follows:

def is_palindrome_with_reverse(text):

adjusted_input = text.lower()

return adjusted_input == reverse(adjusted_input)

This demonstrates that problem- and context-aware programming enables the creation of comprehensible and maintainable solutions. The properties of understandability, maintainability, and changeability are of high importance in practice since source code is usually modified far more frequently due to changing or new requirements than created completely from scratch.

4.3.5 Solution 5: No Duplicate Chars (★★★✩✩)

Determine if a given string contains duplicate letters. Uppercase and lowercase letters should not make any difference. Write function check_no_duplicate_chars(text) for this purpose.

Examples

Input	Result
“Otto”	False
“Adrian”	False
“Micha”	True
“ABCDEFG”	True

Algorithm When solving the task, you might get the idea of storing the individual characters in a set. You run through the input character by character from front to back. For each character, you check whether it is already in the set. If so, you have encountered a duplicate character and abort processing. Otherwise, you insert the character into the set and continue with the next character until you reach the end of the input or detect a duplicate character.

def check_no_duplicate_chars(text):

contained_chars = set()

for current_char in text.upper():

if current_char in contained_chars:

return False

contained_chars.add(current_char)

return True

Python Shortcut Although the implementation shown is quite straightforward, other even more compact alternatives exist. They take advantage of the fact that any string can be converted into a list or set using the functions list() or set(). If there are no duplicates, the number of characters must be equal to the length of the string. Many words, but few instructions ... the whole thing can be formulated as follows:

def check_no_duplicate_chars_v2(text):

upper_case_input = text.upper()

return len(upper_case_input) == len(set(upper_case_input))

Verification

You again use a unit test to verify the desired functionality:

@pytest.mark.parametrize("input, expected",

[("Otto", False), ("Adrian", False),

("Micha", True), ("ABCDEFG", True)])

def test_check_no_duplicate_chars(input, expected):

assert check_no_duplicate_chars(input) == expected

4.3.6 Solution 6: Remove Duplicate Letters (★★★✩✩)

Examples

Input	Result
“bananas”	“bans”
“lalalamama”	“lam”
“MICHAEL”	“MICHAEL”
“AaBbcCdD”	“ABcd”

Algorithm Again, you run through the string character by character and store the respective letters in a set called already_seen. If the current character is not yet contained there, it will be included in both the set and the result text. However, if such a character already exists, you continue with the next character of the input.

def remove_duplicates(text):

result = ""

already_seen = set()

for current_char in text:

if not current_char.lower() in already_seen:

already_seen.add(current_char.lower())

result += current_char

return result

Verification

Check the removal of duplicate letters using the following unit test:

@pytest.mark.parametrize("input, expected",

[("bananas", "bans"),

("lalalamama", "lam"),

("MICHAEL", "MICHAEL"),

("AaBbcCdD", "ABcd")])

def test_remove_duplicates(input, expected):

assert remove_duplicates(input) == expected

4.3.7 Solution 7: Capitalize (★★✩✩✩)

Exercise 7a (★★✩✩✩)

Examples

Input	Result
“this is a very special title”	“This Is A Very Special Title”
“effective java is great”	“Effective Java Is Great”

Algorithm Because strings are immutable, initially you copy the contents into a list upon which you make the modifications. You traverse this list from front to back, looking for the beginning of a new word. As an indicator, you use a boolean flag capitalize_next_char. This indicates that the first letter of the next word has to be capitalized. Initially, this flag is True, so the current (first) character is converted into a capital letter. This happens only for letters, not for numbers. After the conversion, the flag gets reset and letters are skipped until a space is found. You then reset the flag to True. This procedure gets repeated until the end of the list is reached. Finally, a new string is created from the list containing the modifications.

def capitalize(text):

input_chars = list(text)

capitalize_next_char = True

for i, current_char in enumerate(input_chars):

if current_char.isspace():

capitalize_next_char = True

elif capitalize_next_char and current_char.isalpha():

input_chars[i] = current_char.upper()

capitalize_next_char = False

return "".join(input_chars)

Let’s try the whole thing in the Python command line:

>>> capitalize("seems to be okay")

'Seems To Be Okay'

Now, however, you may wonder about the behavior that is supposed to occur for letters after digits or other non-letters:

>>> capitalize("what should happen with -a +b 1c")

'What Should Happen With -A +B 1C'

HINT: SPECIAL TREATMENT VARIANT

A moment ago, I brought up a special case. It is a matter of definition how to deal with it. If letters after special characters should not be converted to uppercase, this can be achieved easily. The difference compared to before is subtle: You remove the isalpha() check and call upper() in every case. This is possible because the function can handle not only letters but also other characters.

def capitalize_special(text):

input_chars = list(text)

capitalize_next_char = True

for i, current_char in enumerate(input_chars):

if current_char.isspace():

capitalize_next_char = True

elif capitalize_next_char:

input_chars[i] = current_char.upper()

capitalize_next_char = False

return "".join(input_chars)

This then yields the following output:

>>> capitalize_special("what should happen with -a +b 1c")

'What Should Happen With -a +b 1c'

Behavior for whitespace It is also interesting to see how whitespace is handled:

print(capitalize("This is a text"))

This returns the following, not entirely without surprise—which is not to be considered further for your task:

This is a text

Python shortcut The desired functionality can be implemented as follows with list comprehension and split() :

def capitalize_shorter(text):

converted = [word[0].upper() + word[1:] for word in text.split()]

return " ".join(converted)

This provides for the two inputs the same result:

>>> print(capitalize_shorter("This is a text"))

This Is A Text

>>> print(capitalize_shorter("This is a text"))

This Is A Text

Exercise 7b: Modification (★★✩✩✩)

Assume now that the input is a list of strings and that a list of strings should be returned, with the individual words and then starting with a capital letter.

Algorithm First, create a list to store the converted words. Then iterate through all elements of the given list and process each one by calling the function capitalize_word(). To convert the first character to a capital letter, retrieve it indexed with [0] and then call upper(). The remaining characters are returned by slicing with [1:]. A new word is formed from both and inserted into the result. To make the function capitalize_word() error-tolerant, it handles an empty input with a sanity check.

def capitalize_words(words):

return [capitalize_word(word) for word in words]

def capitalize_word(word):

if not word:

return ""

return word[0].upper() + word[1:]

Exercise 7c: Special treatment (★★✩✩✩)

In headings, it is common to encounter special treatment of words. For example, “is” and “a” are not capitalized. Implement this as function capitalize_special_2(words, ignorable_words) that gets the words excluded from the conversion as the second parameter.

Example

Input	Exceptions	Result
[“this”, “is”, “a”, “title”]	[“is”, “a”]	[“This”, “is”, “a”, “Title”]

Algorithm The previously developed functionality is extended by a list of words that should not be converted. When traversing, you check if the current word is one from the negative list. If so, it is added to the result without modification. Otherwise, you perform the actions as before.

def capitalize_special_2(words, ignorable_words):

capitalized_words = []

for word in words:

if word in ignorable_words:

capitalized_words.append(word)

else:

capitalized_words.append(capitalize_word(word))

return capitalized_words

Verification

For testing, use the following inputs, which show the correct functionality:

@pytest.mark.parametrize("input, expected",

[("this is a very special title",

"This Is A Very Special Title"),

("effective java is great",

"Effective Java Is Great")])

def test_capitalize(input, expected):

assert capitalize(input) == expected

@pytest.mark.parametrize("words, expected",

[(["this", "is", "a", "very", "special", "title"],

["This", "Is", "A", "Very", "Special", "Title"]),

(["effective", "java", "is", "great"],

["Effective", "Java", "Is", "Great"])])

def test_capitalize_words(words, expected):

assert capitalize_words(words) == expected

@pytest.mark.parametrize("words, expected",

[(["this", "is", "a", "very", "special", "title"],

["This", "is", "a", "Very", "Special", "Title"]),

(["effective", "java", "is", "great"],

["Effective", "Java", "is", "Great"])])

def test_capitalize_special_2(words, expected):

assert capitalize_special_2(words, ["a", "is"]) == expected

4.3.8 Solution 8: Rotation (★★✩✩✩)

Examples

Input 1	Input 2	Result
“ABCD”	“ABC”	True
“ABCDEF	“EFAB”	True (“ABCDEF” < x 2 ⇒ “CDEFAB” contains “EFAB”)
“BCDE”	“EC”	False
“Challenge”	“GECH”	True

JOB INTERVIEW TIPS: POSSIBLE QUESTIONS AND SOLUTION IDEAS

In a job interview, here are possible questions you may ask to clarify the assignment:

Is the direction of the rotation known ← / →?
ANSWER: No, arbitrary
Should the rotation check be case-sensitive?
ANSWER: No, treat as same

Idea 1: Brute Force As a first idea, you could try all combinations. You start without rotation. Then you rotate string str1 to the left and check if this rotated string is contained in str2. In the worst case, this procedure is repeated up to n times. This is extremely inefficient.

Idea 2: First check if rotation makes sense Another idea for solving this is to collect all characters in a Set per string in advance and then use issubset() to check if all needed letters are included. But even this is laborious and does not really reflect well the problem to be solved.

Idea 3: Procedure in reality Think for a while and consider how you might solve the problem on a piece of paper. At some point, you get the idea to write the word twice in a sequence:

ABCDEF EFAB

ABCDEFABCDEF EFAB

Algorithm Checking whether one string can be present in the other if rotated can be solved very elegantly with the simple trick of writing the longer string behind the other. In the combination, you check whether the string to be searched for is contained there. With this approach, the solution is both surprisingly short and extremely simple:

def contains_rotation(str1, str2):

new_doubled_str1 = (str1 + str1).lower()

return str2.lower() in new_doubled_str1

Verification

For testing, use the following inputs, which show the correct operation:

@pytest.mark.parametrize("str1, str2, expected",

[("ABCD", "ABC", True),

("ABCDEF", "EFAB", True),

("BCDE", "EC", False),

("Challenge", "GECH", True)])

def test_contains_rotation(str1, str2, expected):

assert contains_rotation(str1, str2) == expected

4.3.9 Solution 9: Well Formed Braces (★★✩✩✩)

Write function check_braces(text) to check whether the sequence of round braces passed as string contains matching (properly nested) pairs of braces.

Examples

Input	Result	Comment
“(( ))”	True
“( )( )”	True
“(( )))((( ))”	False	Although it has the same amount of opening and closing braces, it is not properly nested.
“((( )”	False	No suitable bracing

Algorithm Without much consideration, you might be tempted to try all possible combinations. After some thinking, you probably come to the following optimization: You only count the number of opening braces and compare it with the number of closing braces. You have to consider the detail of a closing brace before an opening one. Proceed as follows: Traverse the string from front to back. If the current character is an opening brace, increase the counter for opening braces by one. If it is a closing brace, reduce the counter by one. If the counter fall below 0, you encounter a closing brace without a corresponding opening brace. In the end, the counter must be equal to 0, so that it represents a correct bracing.

def check_braces(text):

opening_count = 0

for ch in text:

if ch == "(":

opening_count += 1

elif ch == ")":

opening_count -= 1

if opening_count < 0:

return False

return opening_count == 0

Verification

Test your newly developed check for correct bracing with the following inputs for a parameterized test—using an additional hint parameter as a trick, which is not used for testing, but only for preparing an informative output:

@pytest.mark.parametrize("input, expected, hint",

[("(())", True, "ok"),

("()()", True, "ok"),

("(()))(())", False, "not properly nested"),

("(()", False, "no matching parenthesis"),

(")()", False, "starts with closing parenthesis")])

def test_check_braces(input, expected, hint):

assert check_braces(input) == expected

4.3.10 Solution 10: Anagram (★★✩✩✩)

The term “ anagram” is used to describe two strings that contain the same letters in the same frequency. Here, uppercase and lowercase should not make any difference. Write function is_anagram(str1, str2).

Examples

Input 1	Input 2	Result
“Otto”	“Toto”	True
“Mary	“Army”	True
“Ananas”	“Bananas”	False

Algorithm The description of the task provides hints on how you can proceed. First, you convert the words with function calc_char_frequencies(text) into a histogram. Here, you run character by character through the respective word and fill a dictionary. This is done for both words. Afterwards, to find a deviation, the resulting two dictionaries are compared with each other.

def is_anagram(str1, str2):

char_counts1 = calc_char_frequencies(str1)

char_counts2 = calc_char_frequencies(str2)

return char_counts1 == char_counts2

def calc_char_frequencies(text):

char_counts = {}

for current_char in text.upper():

if current_char in char_counts:

char_counts[current_char] += 1

else:

char_counts[current_char] = 1

return char_counts

Python shortcut The creation of the histogram (i. e., the counting of the letter frequencies) can be written a bit more compactly—but less comprehensibly for my taste. So let’s take advantage of the fact that setdefault() returns the current value for the key or else the default value given here in case of non-existence.

def calc_char_frequencies_shorter(text):

char_counts = {}

for current_char in text.upper():

char_counts[current_char] = char_counts.setdefault(current_char, 0) + 1

return char_counts

Verification

For testing, use the following inputs, which show the correct functioning:

@pytest.mark.parametrize("str1, str2, expected",

[("Otto", "Toto", True),

("Mary", "Army", True),

("Ananas", "Bananas", False)])

def test_is_anagram(str1, str2, expected):

assert is_anagram(str1, str2) == expected

4.3.11 Solution 11: Morse Code (★★✩✩✩)

Write function to_morse_code(text) that is capable of translating a given text into Morse code characters. They consist of sequences of one to four short and long tones per letter, symbolized by a dot (.) or dash (-). It is desirable for easier distinguishability to place a space between each tone and three spaces between each sequence of letter tones. Otherwise, S (...) and EEE (...) would not be distinguishable from each other.

For simplicity, limit yourself to the letters E, O, S, T, W with the following encoding:

Letter	Morse code
E	.
O	- - -
S	. . .
T	-
W	. - -

Examples

Input	Result
SOS	. . . - - - . . .
TWEET	- . - - . . -
WEST	. - - . . . . -

Algorithm The string is traversed character by character and the current character is mapped to the corresponding Morse code. The function convert_to_morse_code( current_char) performs this task.

def to_morse_code(text):

converted_msg = ""

for current_char in text.upper():

converted_letter = convert_to_morse_code(current_char)

converted_msg += converted_letter

converted_msg += " "

return converted_msg.strip()

While in other programming languages the mapping of single letters is accomplished using a switch, in Python there is a trick with a dictionary:

def convert_to_morse_code(current_char):

chars_to_morse = {"E": ".",

"O": "- - -",

"S": ". . .",

"T": "-",

"W": ". - -"}

return chars_to_morse[current_char]

Modern Python and match In many languages, case distinctions may be expressed using the if statement as well as the switch statement. The latter was missing in Python for a long time. With Python 3.10 comes match, an even more powerful variant for case discrimination with which we can now finally also realize the switch statement. Please consult section D.2 in Appendix D for more details.

As mentioned, you can use the new keywords match and case to formulate case distinctions like the following:

def convert_to_morse_code(current_char):

value = "?"

match current_char:

case "E": value = "."

case "O": value = "- - -"

case "S": value = ". . ."

case "T": value = "-"

case "W": value = ". - -"

return value

Bonus

Experiment and research a little bit to find out the corresponding Morse code for all letters of the alphabet so you can convert your name. You can find the necessary hints for this at https://en.wikipedia.org/wiki/Morse_code.

Verification

Let’s check it using a unit test as follows:

@pytest.mark.parametrize("input, expected",

[("SOS", ". . . - - - . . ."),

("TWEET", "- . - - . . -"),

("OST", "- - - . . . -"),

("WEST", ". - - . . . . -")])

def test_to_morse_code(input, expected):

assert to_morse_code(input) == expected

4.3.12 Solution 12: Pattern Checker (★★★✩✩)

Examples

Input pattern	Input text	Result
“xyyx”	“tim mike tim”	True
“xyyx”	“tim mike tom tim”	False
“xyxx”	“tim mike tim”	False
“xxxx”	“tim tim tim”	True

JOB INTERVIEW TIPS: PROBLEM SOLVING STRATEGIES

With exercises like this, you should always ask a few questions to clarify the context and gain a better understanding. For this example, possible questions include the following:

1.
Is the pattern limited to the characters x and y?

ANSWER: No, but only one letter each as a placeholder

2.
Is the pattern always only four characters long?

ANSWER: No, arbitrary

3.
Does the pattern ever contain spaces?

ANSWER: No

4.
Is the input always separated with exactly one space?

ANSWER: Yes

Algorithm As always, it is important first to understand the problem and identify appropriate data structures. You may recognize the pattern specification as a sequence of characters and the input values as space-separated words. They can be transformed into a corresponding list of individual values using split(). Initially, you check whether the length of the pattern and the list of input values match. Only in this case you run through the pattern character by character, as you have done so many times before. As an auxiliary data structure, you use a dictionary, which maps individual characters of the pattern to words. Now you check whether another word has already been inserted for a pattern character. With the help of this trick, you can easily detect mapping errors.

def matches_pattern(pattern, text):

# perparation

values = text.split(" ")

if len(values) != len(pattern) or (len(values) == 1 and not values[0]):

return False

placeholder_to_value_map = {}

# run through all characters

for i, pattern_char in enumerate(pattern):

value = values[i]

# add, if not already there

if pattern_char not in placeholder_to_value_map:

placeholder_to_value_map[pattern_char] = value

# does stored value match current string?

assigned_value = placeholder_to_value_map[(pattern_char)]

if not assigned_value == value:

return False

return True

In the code, before the actual check, you still need to verify the special case of an empty input explicitly, since "".split(" ") results in a list of length 1.

Verification

For testing, use the following inputs, which show the correct functionality:

@pytest.mark.parametrize("pattern, input, expected",

[("x", "", False),

("", "x", False)])

def test_matches_pattern_special_cases(pattern, input, expected):

assert matches_pattern(pattern, input) == expected

@pytest.mark.parametrize("pattern, input, expected",

[("xyyx", "tim mike mike tim", True),

("xyyx", "time mike tom tim", False),

("xyxx", "tim mike mike tim", False),

("xxxx", "tim tim tim tim", True)])

def test_matches_pattern(pattern, input, expected):

assert matches_pattern(pattern, input) == expected

4.3.13 Solution 13: Tennis Score (★★★✩✩)

Write function tennis_score(score, player1_name, player2_name) that makes an announcement in a familiar style such as Fifteen Love, Deuce, or Advantage Player X based on a textual score for two players, PL1 and PL2. The score is given in the format <PL1 points>:<PL2 points>.

The following counting rules apply to a game in tennis:

A game is won (Game <PlayerX>) when a player reaches four or more points and is ahead by at least two points.
Points from zero to three are named Love, Fifteen, Thirty, and Forty.
In case of at least three points and a tie, this is called Deuce.
With at least three points and one point difference, this is called Advantage <PlayerX> for the one who has one more point.

Examples

Input	Score
“1:0”, “Micha”, “Tim”	“Fifteen Love”
“2:2”, “Micha”, “Tim”	“Thirty Thirty”
“2:3”, “Micha”, “Tim”	“Thirty Forty”
“3:3”, “Micha”, “Tim”	“Deuce”
“4:3”, “Micha”, “Tim”	“Advantage Micha”
“4:4”, “Micha”, “Tim”	“Deuce”
“5:4”, “Micha”, “Tim”	“Advantage Micha”
“6:4”, “Micha”, “Tim”	“Game Micha”

Algorithm In this case, it is a two-step algorithm:

1.
First, a score in terms of two int values should be obtained from the textual representation.
2.
Afterwards, it is your task to generate the corresponding textual score names based on these values.

When parsing the score, you can rely on standard functionality such as split() and int(). In addition, for reusable functionality, it is reasonable to include certain security checks. First, both values should be positive. After that, the specifics on the scores are to be tested. The player who reaches four points first wins the match, but only if they lead at least with two points. If both players have three or more points, then the point difference must be less than three. Otherwise, it is not a valid state in tennis. You extract the parsing with the checks into the function extract_points(score).

def extract_points(score):

values = score.strip().split(":")

if len(values) != 2:

raise ValueError("illegal format -- score has not" +

"format <points>:<points>, e.,g. 7:6")

score1 = int(values[0])

score2 = int(values[1])

# sanity check

if score1 < 0 or score2 < 0:

raise ValueError("points must be > 0")

# prevents both e. g. 6:3 and 5:1

if (score1 > 4 or score2 > 4) and abs(score1 - score2) > 2:

raise ValueError("point difference must be < 3, " +

"otherwise invalid score")

return score1, score2

After extracting the two scores separated by ‘:’ from the input, you proceed with the conversion. Again, you use a multi-step decision procedure. According to the rules, a simple mapping comes into play for scores below three. This is perfectly described in terms of a dictionary. Starting from three points, a tie, advantage, or game win can occur. It is also possible for one player to win with four points if the other scores a maximum of two points. For the winning message, it is only necessary to determine which of the two players has more points. The described logic is implemented as follows:

def tennis_score(score, player1_name, player2_name):

points = extract_points(score)

score1 = points[0]

score2 = points[1]

if score1 >= 3 and score2 >= 3:

return generate_info(score1, score2, player1_name, player2_name)

elif score1 >= 4 or score2 >= 4:

return "Game " + (player1_name if (score1 > score2) else player2_name)

else:

# special naming

point_names = {0: "Love", 1: "Fifteen", 2: "Thirty", 3: "Forty"}

return point_names[score1] + " " + point_names[score2]

Only one last detail remains, namely the generation of the hint text for advantage or victory:

def generate_info(score1, score2, player1_name, player2_name):

score_difference = abs(score1 - score2)

player_name = player1_name if (score1 > score2) else player2_name

if score1 == score2:

return "Deuce"

if score_difference == 1:

return "Advantage " + player_name

if score_difference == 2:

return "Game " + player_name

raise ValueError("Unexpected difference: " + score_difference)

Verification

Let’s test the tennis scoring functionality with an imaginary gameplay:

@pytest.mark.parametrize("score, expected",

[("1:0", "Fifteen Love"),

("2:2", "Thirty Thirty"),

("2:3", "Thirty Forty"),

("3:3", "Deuce"),

("4:3", "Advantage Micha"),

("4:4", "Deuce"),

("5:4", "Advantage Micha"),

("6:4", "Game Micha")])

def test_tennis_score_hard_win(score, expected):

assert tennis_score(score, "Micha", "Tim") == expected

You should add more imaginary game sequences to neatly cover the edge cases of a close victory and an unchallenged victory:

@pytest.mark.parametrize("score, expected",

[("1:0", "Fifteen Love"),

("2:2", "Thirty Thirty"),

("3:2", "Forty Thirty"),

("4:2", "Game Micha")])

def test_tennis_score_normal_win(score, expected):

assert tennis_score(score, "Micha", "Tim") == expected

@pytest.mark.parametrize("score, expected",

[("1:0", "Fifteen Love"),

("2:0", "Thirty Love"),

("3:0", "Forty Love"),

("4:0", "Game Micha")])

def test_tennis_score_straight_win(score, expected):

assert tennis_score(score, "Micha", "Tim") == expected

4.3.14 Solution 14: Version Numbers (★★✩✩✩)

Write function compare_versions(version1, version2) that permits you to compare version numbers in the format MAJOR.MINOR.PATCH with each other, thereby the specification of PATCH is optional. In particular, the return value should be represented in the form of the characters <, =, and >.

Examples

Version 1	Version 2	Result
1.11.17	2.3.5	<
2.1	2.1.3	<
2.3.5	2.4	<
3.1	2.4	>
3.3	3.2.9	>
7.2.71	7.2.71	=

Algorithm Split the textual version numbers into a list containing version number components by calling split(). Loop through them and convert them to a version number using int(). Then compare in pairs using a separate function compare() starting at MAJOR, then MINOR and PATCH if necessary. If one input has more values than the other, then the single last number is not used except when the version number matches up to that component, such as for 3.1 and 3.1.7.

def compare_versions(version1, version2):

v1_numbers = version1.split(".")

v2_numbers = version2.split(".")

pos = 0

compare_result = "="

while pos < len(v1_numbers) and

pos < len(v2_numbers) and compare_result == "=":

current_v1 = int(v1_numbers[pos])

current_v2 = int(v2_numbers[pos])

compare_result = compare(current_v1, current_v2)

pos += 1

# same start about 3.1 and 3.1.7

if compare_result == "=":

return compare(len(v1_numbers), len(v2_numbers))

return compare_result

def compare(val1, val2):

if val1 < val2:

return "<"

if val1 > val2:

return ">"

return "="

Verification

Test the comparison of version numbers with the following inputs for a parameterized test:

@pytest.mark.parametrize("version1, version2, expected",

[("1.11.17", "2.3.5", "<"),

("2.3.5", "2.4", "<"),

("2.1", "2.1.3", "<"),

("3.1", "2.4", ">"),

("3.3", "3.2.9", ">"),

("7.2.71", "7.2.71", "=")])

def test_compare_versions(version1, version2, expected):

assert compare_versions(version1, version2) == expected

HINT: HANDLING OF TRAILING ZEROS

The assignment did not specify a special case, namely the treatment of additional zeros in version numbers, such as for 3.1 and 3.1.0 or 3 and 3.0.0. You will find an extension that handles these special features in the accompanying sources.

4.3.15 Solution 15: Conversion str_to_number (★★✩✩✩)

Convert a string into an integer. To do this, write function str_to_number(text) on your own.

Note

The conversion can be easily achieved with int(value). Do not use this explicitly, but implement the entire conversion yourself.

Examples

Input	Result
“+123”	123
“-123”	-123
“7271”	7271
“ABC”	ValueError
“0123”	83 (for bonus task)
“-0123”	-83 (for bonus task)
“0128”	ValueError (for bonus task)

Algorithm Let’s start brute force without looking for all details. So the initial step is checking if the first character is +/- and set flag is_negative accordingly. You also check if the first character is a sign (+ or -) to start processing the digits one position later if necessary. Then run through all characters and convert them to digits. The previous value is multiplied by 10 each time, and at the end, the corresponding numeric value results.

def str_to_number_first_try(text):

is_negative = text[0] == "-"

value = 0

startpos = 1 if starts_with_sign(text) else 0

for pos in range(startpos, len(text)):

digit_value = ord(text[pos]) - 48

value = value * 10 + digit_value

return -value if is_negative else value

def starts_with_sign(text):

return text[0] in ["-", "+"]

Corrected algorithm Even without a more thorough analysis, it is clear that the above variant does not work correctly when mixing letters with digits. In this case, it is reasonable to throw a ValueError when a check with isdigit() fails:

def str_to_number(text):

is_negative = text[0] == "-"

value = 0

startpos = 1 if starts_with_sign(text) else 0

for pos in range(startpos, len(text)):

if not text[pos].isdigit():

raise ValueError(text + " contains not only digits")

digit_value = ord(text[pos]) - 48

value = value * 10 + digit_value

return -value if is_negative else value

HINT: VARIANT

A variant would be to abort processing upon finding the first character that is not a digit, as implemented in Perl, for example. Then the number string “123ab4567” would become 123.

Verification

To test the functionality, use three numbers: one with a positive sign, one with a negative sign, and one without. The positive sign should just be ignored during the conversion. Check the reaction to input with letters instead of numbers separately and expect a ValueError.

@pytest.mark.parametrize("input, expected",

[("+123", 123), ("-123", -123),

("123", 123), ("7271", 7271)])

def test_str_to_number(input, expected):

assert str_to_number(input) == expected

def test_str_to_number_invalid_input():

with pytest.raises(ValueError):

str_to_number("ABC")

Bonus: Enable the Parsing of Octal Numbers

Octal numbers are identified in Python by a leading prefix 0 or, since Python 3.8, 0o and, according to their name, have base 8 rather than base 10. To support octal numbers, you must first determine whether the leading prefix exists. If this is the case, the factor for the positions in the number system must be changed to 8. Finally, with base 8, the two digits 8 and 9 are no longer allowed. Therefore, you add another check in the loop for processing the values. All in all, the source code is a bit bloated by the special treatments. The complexity is just manageable, especially because you use problem- adapted auxiliary functions with speaking names here.

def str_to_number_bonus(text):

is_negative = text[0] == "-"

is_octal = text[0:2] == '0o' or

(starts_with_sign(text) and text[1:3] == "0o")

value = 0

factor = 8 if is_octal else 10

startpos = calc_start_pos(text, is_octal)

for pos in range(startpos, len(text)):

if not text[pos].isdigit():

raise ValueError(text + " contains not only digits")

digit_value = ord(text[pos]) - 48

if is_octal and digit_value >= 8:

raise ValueError(text + " found digit >= 8")

value = value * factor + digit_value

return -value if is_negative else value

def calc_start_pos(text, is_octal):

pos = 0

if is_octal:

pos = 3 if starts_with_sign(text) else 2

elif starts_with_sign(text):

pos = 1

return pos

Verification

To test the functionality, use three numbers: one with a positive, one with a negative sign, and one without. The positive sign should just be ignored during the conversion. In addition, check a positive and negative octal number. In a separate test, it is ensured that digits greater than or equal to 8 must not occur in octal numbers.

@pytest.mark.parametrize("input, expected",

[("+123", 123), ("-123", -123),

("123", 123), ("7271", 7271),

("+0o77", 63), ("-0o77", -63),

("0o77", 63), ("+0o123", 83),

("-0o123", -83), ("0o123", 83)])

def test_str_to_number_bonus(input, expected):

assert str_to_number_bonus(input) == expected

def test_str_to_number_bonus_invalid_input():

with pytest.raises(ValueError) as excinfo:

str_to_number_bonus("0o128")

assert str(excinfo.value).find("found digit >= 8") != -1

4.3.16 Solution 16: Print Tower (★★★✩✩)

Write function print_tower(n) that represents a tower of n slices stacked on top of each other as ASCII graphics, symbolized by the character #. Also, draw a lower boundary line.

Example

A tower of height three should look something like this:

#|#

##|##

###|###

---------

Algorithm You can divide the drawing into three steps: draw the top bar, the slices, and then the bottom boundary. Thus, the algorithm can be described using three function calls:

def print_tower(height):

draw_top(height)

draw_slices(height)

draw_bottom(height)

You implement the drawing of the individual components of this tower in a couple of helper functions, as already indicated:

def draw_top(height):

print(" " * (height + 1) + "|")

def draw_bottom(height):

print("-" * ((height + 1) * 2 + 1))

Drawing the slices of the tower is a bit more complex due to their different sizes and the required computation of the free space on the left and right side:

def draw_slices(height):

for i in range(height - 1, -1, -1):

value = height - i

padding = i + 1

print(" " * padding + "#" * value + "|" + "#" * value)

It is obvious how the problem can be broken down into increasingly smaller subproblems. Each function becomes thereby short for itself and usually also well testable (if no console outputs, but computations with return take place).

Algorithm with recursion Interestingly, the drawing of the individual slices of the tower can also be expressed recursively as follows:

def draw_slices_rec(slice, height):

if slice > 1:

draw_slices_rec(slice - 1, height)

print(" " * (height - slice + 1) + "#" * slice + "|" + "#" * slice)

Then, the call must be minimally modified:

def print_tower_rec(height):

draw_top(height)

draw_slices_rec(height, height)

draw_bottom(height)

Verification

To check the functionality, use the Python command line interpreter one more time—here to print a tower of height four:

>>> print_tower(4)

#|#

##|##

###|###

####|####

-----------

4.3.17 Solution 17: Filled Frame (★★✩✩✩)

Write function print_box(width, height, fillchar) that draws a rectangle of the specified size as an ASCII graphic and fills it with the passed-in fill character.

Examples

Below you see two rectangles filled differently:

+-----+ +-------+

|*****| |$$$$$$$|

+-----+ |$$$$$$$|

|$$$$$$$|

+-------+

Algorithm To draw the filled frame, you traverse all lines and likewise all positions in the x-direction. In this exercise, the main concern is to correctly solve the special treatments at the corners and the edges with the index positions. In addition, with print() it is crucial to set the end character to empty to avoid the otherwise usual line break.

def print_box(width, height, fillchar):

for y in range(height):

for x in range(width):

if x == 0 and (y == 0 or y == height - 1):

print("+", end="")

elif x == width - 1 and (y == 0 or y == height - 1):

print("+", end="")

elif y == 0 or y == height - 1:

print("-", end="")

elif x == 0 or x == width - 1:

print("|", end="")

else:

print(fillchar, end="")

print()

Verification

To check the functionality, use the Python command line interpreter again:

>>> print_box(9, 7, "$")

+-------+

|$$$$$$$|

+-------+

4.3.18 Solution 18: Guessing Vowels (★★✩✩✩)

Input	Replacement	Result
“guide”	“?”	“g??d?”
“lawnmower”	“-”	“l-wnm-w-r”
“quiz”	“_”	“q z”
“lawnmower”	“”	“lwnmwr”

Algorithm To convert the given text, you traverse it character by character from front to back. You collect the result in a new string. If you find a vowel, you insert the given replacement string, otherwise the consonant (or more precisely, the original character, which could also be a digit or a punctuation mark).

def translate_vowel(text, replacement):

translated = ""

for letter in text:

if is_vowel(letter):

translated += replacement

else:

translated += letter

return translated

def is_vowel(letter):

return letter in "AÄEIOÖUüaäeioöuü"

Python shortcut Interestingly, strings in Python provide the method maketrans() to create a mapping dictionary and the function translate() to transform according to the passed mapping. Next, you implement the transformation as follows:

def translate_vowel_shorter(text, replacement):

vowels = "AÄEIOÖUüaäeioöuü"

trans_dict = text.maketrans(vowels, replacement * len(vowels))

return text.translate(trans_dict)

Verification

To check the functionality, use the Python command line interpreter:

>>> print(translate_vowel("guide", "?"))

... print(translate_vowel("guide", "-"))

... print(translate_vowel("table tennis", "_"))

... print(translate_vowel("quiz", "_"))

... print(translate_vowel("lawnmower", ""))

g??d?

g--dt_

bl_ t_nn_s

q__z

lwnmwr

The chosen algorithm is even capable of transforming entire sentences:

>>> print(translate_vowel("the guide recommends Java", "-"))

th- g--d- r-c-mm-nds J-v-

>>>

>> print(translate_vowel("fit through the Python challenge", "-"))

f-t thr--gh th- Pyth-n ch-ll-ng-

4.4 Summary: What You Learned

Strings are an integral part of almost every program. You built up a sound understanding by solving simple tasks like palindrome checking and string reversing. Other tasks can be significantly simplified using suitable auxiliary data structures, such as sets or dictionaries. They help when checking for well-formed braces, converting a word into Morse code, and other tasks. I hope you feel that solving problems is becoming easier the more basic knowledge you have in different areas, especially data structures.

That’s why you will have a deeper look into this topic in the next chapter, which introduces lists, sets, and dictionaries in more depth.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 4. Strings

Create new playlist

Sign In

Sign Up

4. Strings

4.1 Introduction

4.1.1 Practically Relevant Functions

4.1.2 Example Conversions and Extractions

4.1.3 Equality

4.1.4 Slicing—Access to Individual Characters and Substrings

4.1.5 Converting a String into a List of Characters

4.1.6 Iteration

4.1.7 Formatted Output

4.1.8 Character Processing

4.1.9 Example: String Processing

4.2 Exercises

4.2.1 Exercise 1: Number Conversions (★★✩✩✩)

Examples

Exercise 1a (★✩✩✩✩)

Exercise 1b (★★✩✩✩)

Exercise 1c (★★✩✩✩)

4.2.2 Exercise 2: Joiner (★✩✩✩✩)

Example

4.2.3 Exercise 3: Reverse String (★★✩✩✩)

Examples

4.2.4 Exercise 4: Palindrome (★★★✩✩)

Exercise 4a (★★✩✩✩)

Examples

Exercise 4b (★★★✩✩)

4.2.5 Exercise 5: No Duplicate Chars (★★★✩✩)

Examples

4.2.6 Exercise 6: Remove Duplicate Letters (★★★✩✩)

Examples

4.2.7 Exercise 7: Capitalize (★★✩✩✩)

Exercise 7a (★★✩✩✩)

Examples

Exercise 7b: Modification (★★✩✩✩)

Exercise 7c: Special treatment (★★✩✩✩)

Example

4.2.8 Exercise 8: Rotation (★★✩✩✩)

Examples

4.2.9 Exercise 9: Well Formed Braces (★★✩✩✩)

Examples

4.2.10 Exercise 10: Anagram (★★✩✩✩)

Examples

4.2.11 Exercise 11: Morse Code (★★✩✩✩)

Examples

4.2.12 Exercise 12: Pattern Checker (★★★✩✩)

Examples

4.2.13 Exercise 13: Tennis Score (★★★✩✩)

Examples

4.2.14 Exercise 14: Version Numbers (★★✩✩✩)

Examples

4.2.15 Exercise 15: Conversion str_to_number (★★✩✩✩)

Examples

4.2.16 Exercise 16: Print Tower (★★★✩✩)

Example

4.2.17 Exercise 17: Filled Frame (★★✩✩✩)

Examples

4.2.18 Exercise 18: Guessing Vowels (★★✩✩✩)

4.3 Solutions

4.3.1 Solution 1: Number Conversions (★★✩✩✩)

Examples

Solution 1a (★✩✩✩✩)

Solution 1b (★★✩✩✩)

Solution 1c (★★✩✩✩)

Verification

4.3.2 Solution 2: Joiner (★✩✩✩✩)

Example

Verification

4.3.3 Solution 3: Reverse String (★★✩✩✩)

Examples

Verification

4.3.4 Solution 4: Palindrome (★★★✩✩)

Solution 4a (★★✩✩✩)

Examples

Solution 4b (★★★✩✩)

Verification

4.3.5 Solution 5: No Duplicate Chars (★★★✩✩)

Examples

Table of Contents for
4. Strings