Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Previous Chapter

How it works...

How to do it...

Read in the college dataset, group by state, and display the total number of groups. This should equal the number of unique states retrieved from the nunique Series method:

>>> college = pd.read_csv('data/college.csv', index_col='INSTNM')
>>> grouped = college.groupby('STABBR')
>>> grouped.ngroups
59

>>> college['STABBR'].nunique() # verifying the same number
59

The grouped variable has a filter method, which accepts a custom function that determines whether a group is kept or not. The custom function gets implicitly passed a DataFrame of the current group and is required to return a boolean. Let's define a function that calculates the total percentage of minority students and returns True if this percentage is greater than a user-defined threshold:

>>> def check_minority(df, threshold):
        minority_pct = 1 - df['UGDS_WHITE']
        total_minority = (df['UGDS'] * minority_pct).sum()
        total_ugds = df['UGDS'].sum()
        total_minority_pct = total_minority / total_ugds
        return total_minority_pct > threshold

Use the filter method passed with the check_minority function and a threshold of 50% to find all states that have a minority majority:

>>> college_filtered = grouped.filter(check_minority, threshold=.5)
>>> college_filtered.head()

Just looking at the output may not be indicative of what actually happened. The DataFrame starts with state Arizona (AZ) and not Alaska (AK) so we can visually confirm that something changed. Let's compare the shape of this filtered DataFrame with the original. Looking at the results, about 60% of the rows have been filtered, and only 20 states remain that have a minority majority:

>>> college.shape
(7535, 26)

>>> college_filtered.shape
(3028, 26)

>>> college_filtered['STABBR'].nunique()
20

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.