How it works...

The key to this recipe is recognizing that the columns all represent the same unit of information. We can compare these columns with each other, which is usually not the case. For instance, it wouldn't make sense to directly compare SAT verbal scores with the undergraduate population. As the data is structured in this manner, we can apply the idxmax method to each row of data to find the column with the largest value. We need to alter its default behavior with the axis parameter.

Step 2 completes this operation and returns a Series, to which we can now simply apply the value_counts method to return the distribution. We pass True to the normalize parameter as we are interested in the distribution (relative frequency) and not the raw counts.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset