How to do it...

  1. Read in the flights dataset, and use the pivot_table method to find the total number of canceled flights per origin airport for each airline:
>>> import pandas as pd
>>> flights = pd.read_csv('data/flights.csv')
>>> fp = flights.pivot_table(index='AIRLINE',
...                          columns='ORG_AIR',
...                          values='CANCELLED',
...                          aggfunc='sum',
...                          fill_value=0).round(2)
>>> fp.head()
  2. A groupby aggregation alone cannot replicate this table. The trick is to first group by all the columns passed to the index and columns parameters:
>>> fg = flights.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED'].sum()
>>> fg.head()
AIRLINE  ORG_AIR
AA       ATL         3
         DEN         4
         DFW        86
         IAH         3
         LAS         3
Name: CANCELLED, dtype: int64
  3. Use the unstack method to pivot the ORG_AIR index level to column names:
>>> fg_unstack = fg.unstack('ORG_AIR', fill_value=0)
>>> fp.equals(fg_unstack)
True
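
To try the same idea without the flights file, the steps above can be sketched on a small frame. The toy data below is made up purely for illustration (it is not taken from flights.csv), and the names toy, toy_pivot, and toy_grouped are hypothetical. The comparison uses check_dtype=False as a precaution, on the assumption that the integer/float dtype of a filled pivot table may differ between pandas versions:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical toy data standing in for flights.csv (values are invented)
toy = pd.DataFrame({
    'AIRLINE': ['AA', 'AA', 'AA', 'DL', 'DL', 'UA'],
    'ORG_AIR': ['ATL', 'DFW', 'DFW', 'ATL', 'ATL', 'DFW'],
    'CANCELLED': [1, 0, 1, 0, 1, 1],
})

# Route 1: pivot_table sums CANCELLED for each airline/origin pair
toy_pivot = toy.pivot_table(index='AIRLINE',
                            columns='ORG_AIR',
                            values='CANCELLED',
                            aggfunc='sum',
                            fill_value=0)

# Route 2: group by both columns, then move ORG_AIR from the index to the columns
toy_grouped = (toy.groupby(['AIRLINE', 'ORG_AIR'])['CANCELLED']
                  .sum()
                  .unstack('ORG_AIR', fill_value=0))

# Raises AssertionError if labels or values differ; dtype differences are ignored
assert_frame_equal(toy_pivot, toy_grouped, check_dtype=False)
```

Airline/origin pairs that never occur in the data (here DL from DFW and UA from ATL) are filled with 0 in both results, which is the role fill_value plays in each route.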