Before performing string comparisons, it is standard practice to convert all values to a single case, either uppercase or lowercase. The following recipe shows how to create and apply a custom function that uppercases all the values in a single column.
Let's recreate the DataFrame from the previous recipe:
import pandas as pd

lc = pd.DataFrame({
    'people': ["cole o'brien", "lise heidenreich", "zilpha skiles", "damion wisozk"],
    'age': [24, 35, 46, 57],
    'ssn': ['6439', '689 24 9939', '306-05-2792', '992245832'],
    'birth_date': ['2/15/54', '05/07/1958', '19XX-10-23', '01/26/0056'],
    'customer_loyalty_level': ['not at all', 'moderate', 'moderate', 'highly loyal']
})
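# Create the function to uppercase a string
def uppercase_string(s):
    """
    Standardizes a string by making it all caps
    :param s: string to uppercase
    :return: the uppercased string, or s unchanged if it is not a string
    """
    try:
        s = s.upper()
    except AttributeError:
        # Non-string values (such as NaN) have no upper() method; leave them as-is
        pass
    return s

# Apply the function to every value in the column
lc.customer_loyalty_level = lc.customer_loyalty_level.apply(uppercase_string)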
After creating our DataFrame, we define a function that takes a string as an argument and returns an uppercased copy of it using Python's built-in str.upper method. Values that have no upper method, such as missing values, are returned unchanged. We then apply the new function to the customer_loyalty_level column of our DataFrame.
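As a quick check of the behavior described above, the following sketch runs the same kind of function over a small sample column. The levels Series and its values are hypothetical and are not part of the recipe's data. It also shows pandas' built-in Series.str.upper accessor, which is one possible alternative to a custom function and is not the approach this recipe uses.

```python
import pandas as pd

def uppercase_string(s):
    """Return an uppercased copy of s, or s unchanged if it is not a string."""
    try:
        return s.upper()
    except AttributeError:
        return s

# Hypothetical sample column (an assumption for illustration),
# with mixed case and one missing value
levels = pd.Series(['not at all', 'Moderate', None, 'highly loyal'])

# Element-wise application of the custom function, as in the recipe
applied = levels.apply(uppercase_string)

# Alternative: the vectorized string accessor, which propagates missing values
vectorized = levels.str.upper()

# After standardizing the case, a simple equality comparison finds the match
matches = (applied == 'MODERATE').sum()

print(applied)
print(vectorized)
print(matches)
```

Either approach leaves the missing value as missing rather than raising an error. The custom function is useful when further cleaning steps need to be added later, while the accessor is shorter when uppercasing is all that is needed.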
Before we apply the function, the DataFrame looks as follows: