
How to do it...

Read in the flights dataset, and use the pivot_table method to find the total number of canceled flights per origin airport for each airline:

>>> import pandas as pd
>>> flights = pd.read_csv('data/flights.csv')
>>> fp = flights.pivot_table(index='AIRLINE', 
                             columns='ORG_AIR', 
                             values='CANCELLED', 
                             aggfunc='sum',
                             fill_value=0).round(2)
>>> fp.head()

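A groupby aggregation alone cannot replicate this table, because it places every grouping column in the index rather than spreading one of them across the columns. The trick is to first group by all the columns passed to the index and columns parameters: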

>>> fg = flights.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED'].sum()
>>> fg.head()
AIRLINE  ORG_AIR
AA       ATL         3
         DEN         4
         DFW        86
         IAH         3
         LAS         3
Name: CANCELLED, dtype: int64
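
To make the shape of this intermediate result concrete, here is a minimal sketch on a small made-up DataFrame (the name toy and its values are assumptions for illustration, not taken from data/flights.csv). It shows that grouping by two columns yields a Series with a two-level index, and that airline/airport pairs with no rows are simply absent:

```python
import pandas as pd

# Hypothetical toy data standing in for the flights dataset;
# the UA/ATL and DL/DFW pairs are deliberately missing.
toy = pd.DataFrame({
    'AIRLINE':   ['AA', 'AA', 'AA', 'DL', 'DL', 'UA'],
    'ORG_AIR':   ['ATL', 'DFW', 'DFW', 'ATL', 'ATL', 'DFW'],
    'CANCELLED': [1, 0, 1, 0, 1, 1],
})

# Same pattern as the recipe: group by both columns, sum one column
toy_grouped = toy.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED'].sum()

print(type(toy_grouped).__name__)           # Series
print(list(toy_grouped.index.names))        # ['AIRLINE', 'ORG_AIR']
print(len(toy_grouped))                     # 4 observed pairs
print(('UA', 'ATL') in toy_grouped.index)   # False: no such rows exist
```

Those absent pairs are the reason a fill value is needed once the inner index level is moved to the columns.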

Use the unstack method to pivot the ORG_AIR index level to column names:

>>> fg_unstack = fg.unstack('ORG_AIR', fill_value=0)
>>> fp.equals(fg_unstack)
True
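
The equivalence can also be checked without the CSV file. The following is a sketch on a small made-up DataFrame (toy, toy_pivot, and toy_unstacked are hypothetical names and values, not part of the recipe's data). It compares labels and values separately, so the check does not depend on both results having exactly the same dtype:

```python
import pandas as pd

# Hypothetical toy data; UA has no ATL flights and DL has no DFW flights
toy = pd.DataFrame({
    'AIRLINE':   ['AA', 'AA', 'AA', 'DL', 'DL', 'UA'],
    'ORG_AIR':   ['ATL', 'DFW', 'DFW', 'ATL', 'ATL', 'DFW'],
    'CANCELLED': [1, 0, 1, 0, 1, 1],
})

# Route 1: pivot_table, as in the first step of the recipe
toy_pivot = toy.pivot_table(index='AIRLINE', columns='ORG_AIR',
                            values='CANCELLED', aggfunc='sum',
                            fill_value=0)

# Route 2: groupby on both columns, then unstack the ORG_AIR level
toy_unstacked = (toy.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED']
                    .sum()
                    .unstack('ORG_AIR', fill_value=0))

# Compare row labels, column labels, and cell values
same_rows = toy_pivot.index.equals(toy_unstacked.index)
same_cols = toy_pivot.columns.equals(toy_unstacked.columns)
same_values = bool((toy_pivot.to_numpy() == toy_unstacked.to_numpy()).all())

print(same_rows, same_cols, same_values)
print(toy_unstacked)
```

The missing UA/ATL and DL/DFW combinations appear as 0 in both tables because fill_value=0 is passed on each route.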
