Removing whitespace in Pandas

It is very common to find whitespace at the beginning, at the end, or inside a string, whether the data comes from a CSV file or from another source. Use the following recipe to create a custom function that removes the whitespace from every row of a column in a Pandas DataFrame.

Getting ready

Continue using the customers DataFrame you created earlier, or import the file into a new DataFrame.

How to do it…

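def remove_whitespace(x):
    """
    Helper function to remove all whitespace from a string
    x: a string
    """
    try:
        # split() breaks the string on any run of whitespace (spaces, tabs,
        # newlines); joining the pieces with "" removes all of it
        x = "".join(x.split())
    except (AttributeError, TypeError):
        # x is not a string (for example, a missing value): leave it unchanged
        pass
    return x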

customers.last_name = customers.last_name.apply(remove_whitespace)
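
To see the effect without the customers file at hand, here is a minimal sketch that runs the same function on a small stand-in DataFrame. The sample names are made up for illustration and are not taken from the recipe's data; the function is repeated so that the snippet runs on its own.

```python
import pandas as pd

def remove_whitespace(x):
    """Remove all whitespace from a string; return non-strings unchanged."""
    try:
        x = "".join(x.split())
    except (AttributeError, TypeError):
        pass
    return x

# Hypothetical sample data standing in for the customers DataFrame
customers = pd.DataFrame({"last_name": [" Smith", "Mc Donald ", "O'Neil\t"]})

customers["last_name"] = customers["last_name"].apply(remove_whitespace)
print(customers["last_name"].tolist())
```

Note that the space inside "Mc Donald" is removed as well, so this function is only appropriate when no whitespace at all should remain in the column.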

How it works…

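We first create a custom function named remove_whitespace() that takes a string as an argument. The function splits the string on whitespace and joins the pieces back together with nothing between them, so it removes all whitespace (spaces, tabs, and newlines), including any between words, and returns the cleaned string. A value that is not a string is returned unchanged. Then, just as in the previous recipe, we apply the function to the last_name column of the customers DataFrame.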
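
As an alternative to a custom function, Pandas also offers vectorized string methods through the .str accessor. The sketch below is not part of the recipe itself; it assumes a small, made-up Series of names and shows two options: removing all whitespace with a regular expression, and trimming only the leading and trailing whitespace.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
names = pd.Series([" Smith", "Mc Donald ", "De\tLa Cruz"])

# Remove every run of whitespace, matching what remove_whitespace() does
no_whitespace = names.str.replace(r"\s+", "", regex=True)

# Trim only the leading and trailing whitespace, keeping inner spaces
trimmed = names.str.strip()

print(no_whitespace.tolist())
print(trimmed.tolist())
```

Which of the two fits depends on the column: str.strip() is usually the safer choice when the values may legitimately contain spaces, such as multi-word last names.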
