- Read in the flights dataset, and use the pivot_table method to find the total number of canceled flights per origin airport for each airline:
>>> flights = pd.read_csv('data/flights.csv')
>>> fp = flights.pivot_table(index='AIRLINE',
                             columns='ORG_AIR',
                             values='CANCELLED',
                             aggfunc='sum',
                             fill_value=0).round(2)
>>> fp.head()
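- A groupby aggregation alone cannot replicate this table, because it returns the result in long form, with the grouping columns as levels of a MultiIndex. The trick is to first group by all the columns passed to the index and columns parameters: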
>>> fg = flights.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED'].sum()
>>> fg.head()
AIRLINE  ORG_AIR
AA       ATL         3
         DEN         4
         DFW        86
         IAH         3
         LAS         3
Name: CANCELLED, dtype: int64
- Use the unstack method to pivot the ORG_AIR index level to column names:
>>> fg_unstack = fg.unstack('ORG_AIR', fill_value=0)
>>> fp.equals(fg_unstack)
True