There's more...

This recipe shows how to add Series that have only a single index. It is also entirely possible to add DataFrames together. Adding DataFrames aligns both the index and the columns before computation and yields missing values wherever the labels do not match. Let's start by selecting a few of the columns from the 2014 baseball dataset:

>>> df_14 = baseball_14[['G', 'AB', 'R', 'H']]
>>> df_14.head()
Let's also select a few of the same and a few different columns from the 2015 baseball dataset:
>>> df_15 = baseball_15[['AB', 'R', 'H', 'HR']]
>>> df_15.head()

Adding the two DataFrames together creates missing values wherever row or column labels cannot align. Use the style attribute to access the highlight_null method to easily see where the missing values are:

>>> (df_14 + df_15).head(10).style.highlight_null('yellow')
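
To see the alignment on a smaller scale, here is a minimal sketch that uses two made-up DataFrames in place of the baseball data. The names left and right, their labels, and their values are assumptions chosen purely for illustration. Only the cell whose row label and column label both appear in each DataFrame receives a value:

```python
import pandas as pd

# Hypothetical toy data standing in for df_14 and df_15
left = pd.DataFrame({'G': [10, 20], 'AB': [30, 40]}, index=['a', 'b'])
right = pd.DataFrame({'AB': [1, 2], 'HR': [3, 4]}, index=['b', 'c'])

# The + operator aligns on both the index and the columns,
# producing the union of labels along each axis
total = left + right

# Only row 'b' and column 'AB' exist in both inputs,
# so that is the only non-missing cell in the result
print(total)
print(total.isna().sum().sum(), 'missing values')
```

The result has three rows and three columns, the union of the labels on each axis, even though each input has only two of each.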

Only the rows whose playerID appears in both DataFrames will be non-missing. Similarly, the columns AB, H, and R are the only ones that appear in both DataFrames. Even if we use the add method with the fill_value parameter specified, we still have missing values. This is because some combinations of rows and columns never existed in our input data, for example, the intersection of playerID congeha01 and column G. That player appeared only in the 2015 dataset, which did not have the G column, so there was no value to fill:

>>> df_14.add(df_15, fill_value=0).head(10) \
...     .style.highlight_null('yellow')
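
The same behavior can be reproduced with a small sketch. The DataFrames left and right below are hypothetical stand-ins for the baseball data, not part of the recipe. With fill_value=0, a cell that is present in only one input is treated as if the other input held 0, but a cell that is absent from both inputs stays missing:

```python
import pandas as pd

# Hypothetical toy data standing in for df_14 and df_15
left = pd.DataFrame({'G': [10, 20], 'AB': [30, 40]}, index=['a', 'b'])
right = pd.DataFrame({'AB': [1, 2], 'HR': [3, 4]}, index=['b', 'c'])

# fill_value replaces a missing operand only when the other operand exists
filled = left.add(right, fill_value=0)

# Row 'c' exists only in right, which has no 'G' column, and row 'a'
# exists only in left, which has no 'HR' column, so both cells stay NaN
print(filled)
print(filled.isna().sum().sum(), 'missing values remain')
```

Here row c plays the role of congeha01: it exists only in the DataFrame that lacks the G column, so there is nothing to add the fill value to.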