There's more...

Unfortunately, pandas does not have a direct way to use these additional arguments when using multiple aggregation functions together. For example, if you wish to aggregate using the pct_between and mean functions, you will get the following exception:

>>> college.groupby(['STABBR', 'RELAFFIL'])['UGDS'] \
           .agg(['mean', pct_between], low=100, high=1000)
TypeError: pct_between() missing 2 required positional arguments: 'low' and 'high'

pandas has no way of knowing that the extra arguments need to be passed to pct_between. To use our custom function together with built-in functions, and even with other custom functions, we can define a special type of nested function called a closure. We can use a generic closure to build all of our customized functions:

>>> def make_agg_func(func, name, *args, **kwargs):
        def wrapper(x):
            return func(x, *args, **kwargs)
        wrapper.__name__ = name
        return wrapper

>>> my_agg1 = make_agg_func(pct_between, 'pct_1_3k', low=1000, high=3000)
>>> my_agg2 = make_agg_func(pct_between, 'pct_10_30k', 10000, 30000)

>>> college.groupby(['STABBR', 'RELAFFIL'])['UGDS'] \
           .agg(['mean', my_agg1, my_agg2]).head()

The make_agg_func function acts as a factory that creates customized aggregation functions. It accepts the customized aggregation function that you already built (pct_between in this case), a name argument, and an arbitrary number of extra arguments, and it returns a function with those extra arguments already set. For instance, my_agg1 is a customized aggregating function that finds the percentage of schools with an undergraduate population between one and three thousand. The extra arguments (*args and **kwargs) fix an exact set of parameters for your customized function. The name parameter is important and must be unique each time make_agg_func is called, because it is used to name the resulting aggregated column.
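The factory can be tried end to end on a small, made-up dataset. The sketch below is only illustrative: the toy DataFrame is invented for this example, and the body of pct_between is an assumption (this section uses the function but does not show its definition), written to match the way it is called in the text.

```python
import pandas as pd


def pct_between(s, low, high):
    # Assumed definition: share of values in the Series between low and high
    return s.between(low, high).mean()


def make_agg_func(func, name, *args, **kwargs):
    # Factory from the text: fixes the extra arguments and names the result
    def wrapper(x):
        return func(x, *args, **kwargs)
    wrapper.__name__ = name
    return wrapper


# Invented toy data standing in for the college dataset
toy = pd.DataFrame({
    'STABBR': ['AK', 'AK', 'AK', 'AL', 'AL'],
    'UGDS': [500, 1500, 2500, 2000, 12000],
})

pct_1_3k = make_agg_func(pct_between, 'pct_1_3k', low=1000, high=3000)
pct_10_30k = make_agg_func(pct_between, 'pct_10_30k', 10000, 30000)

# Each wrapper takes only the group Series, so it can sit next to 'mean'
result = toy.groupby('STABBR')['UGDS'].agg(['mean', pct_1_3k, pct_10_30k])
print(result)
```

The column labels of the result come from each wrapper's __name__, which is why the names passed to make_agg_func need to differ from one another.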

A closure is a nested function that refers to variables in the scope of its enclosing (outer) function and keeps access to them after the outer function has returned. The outer function typically creates the nested function and returns it. In this example, make_agg_func is the outer function; it returns the nested function wrapper, which accesses the variables func, args, and kwargs from the outer function.
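To see which outer variables the returned function actually holds on to, you can inspect it with plain Python. The following is a minimal sketch that needs no pandas: share_between is a hypothetical, list-based stand-in for pct_between, introduced only so the example runs on its own.

```python
def make_agg_func(func, name, *args, **kwargs):
    def wrapper(x):
        return func(x, *args, **kwargs)
    wrapper.__name__ = name
    return wrapper


def share_between(values, low, high):
    # Hypothetical stand-in for pct_between that works on a plain list
    return sum(low <= v <= high for v in values) / len(values)


agg = make_agg_func(share_between, 'pct_1_3k', low=1000, high=3000)

# Pair each free-variable name of wrapper with the value stored in its cell
captured = {
    var: cell.cell_contents
    for var, cell in zip(agg.__code__.co_freevars, agg.__closure__)
}

print(agg.__name__)
print(sorted(captured))
print(captured['kwargs'])
print(agg([500, 1500, 2500, 4000]))
```

Note that name does not appear among the captured variables: wrapper never refers to it, so it is only used once, to set wrapper.__name__ before the function is returned.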
