How it works...

The indexing operator changes behavior based on the type of object passed to it. The following pseudocode outlines how the DataFrame indexing operator handles the object that it is passed:

>>> df[item]  # Where `df` is a DataFrame and item is some object

If item is a string then
    Find a column name that matches the item exactly
    Raise KeyError if there is no match
    Return the column as a Series

If item is a list of strings then
    Raise KeyError if one or more strings in item don't match columns
    Return a DataFrame with just the columns in the list

If item is a slice object then
    Works with either integer or string slices
    Raise KeyError if label from label slice is not in index
    Return all ROWS that are selected by the slice

If item is a list, Series or ndarray of booleans then
    Raise ValueError if length of item not equal to length of DataFrame
    Use the booleans to return only the rows with True in same location
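
The four branches above can be tried out on a small frame. The following is only a sketch: the DataFrame, its column names, and its index labels are made up for illustration and do not come from the recipe's dataset.

```python
import pandas as pd

# Hypothetical data, invented purely for this illustration
df = pd.DataFrame(
    {
        "city": ["Austin", "Boston", "Chicago"],
        "pop": [950, 650, 2700],
        "area": [320, 90, 230],
    },
    index=["a", "b", "c"],
)

col = df["city"]                  # string -> a single column as a Series
sub = df[["city", "pop"]]         # list of strings -> a DataFrame of those columns
rows_int = df[0:2]                # integer slice -> ROWS by position, end excluded
rows_lbl = df["a":"b"]            # label slice -> ROWS by label, end included
rows_bool = df[[True, False, True]]  # booleans -> rows where the value is True

# The error cases named in the pseudocode
try:
    df["missing"]                 # no column with this exact name
except KeyError:
    missing_column_error = True

try:
    df[[True, False]]             # two booleans for a three-row DataFrame
except ValueError:
    wrong_length_error = True
```

Note that the two slices select the same rows here even though one is positional and the other is label based, which is part of why this operator is easy to misread.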

The preceding logic covers the most common cases but is not exhaustive. The logic for a Series is slightly different and actually more complex than it is for a DataFrame. Because of this complexity, it is probably a good idea to avoid using the bare indexing operator on a Series and to use the explicit .iloc and .loc indexers instead.
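
As a small sketch of why the explicit indexers are easier to read on a Series, consider a Series whose index labels are themselves integers. The values and labels here are hypothetical, chosen so that label and position disagree.

```python
import pandas as pd

# Hypothetical Series: integer labels that do not match the positions
s = pd.Series([10, 20, 30], index=[2, 0, 1])

by_label = s.loc[0]        # the value whose LABEL is 0
by_position = s.iloc[0]    # the value at POSITION 0

label_slice = s.loc[0:1]   # labels 0 through 1, end label included
pos_slice = s.iloc[0:2]    # positions 0 and 1, end position excluded
```

With .loc and .iloc spelled out, a reader never has to guess whether 0 means a label or a position.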

One acceptable use case of the Series indexing operator is boolean indexing. See Chapter 12, Index Alignment, for more details.
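
A minimal sketch of boolean indexing on a Series, using made-up values and labels:

```python
import pandas as pd

# Hypothetical Series, invented for illustration
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

mask = s > 15          # a boolean Series with the same index as s
selected = s[mask]     # keeps only the entries where mask is True
```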

I call the row slicing in this section lazy because it does not use the more explicit .iloc or .loc. Personally, I always use these indexers when slicing rows, as there is never a question of exactly what I am doing.
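
To compare the lazy form with the explicit ones, here is a short sketch on a hypothetical four-row DataFrame. The column name and index labels are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

lazy = df[1:3]                   # lazy: the indexing operator alone
explicit_pos = df.iloc[1:3]      # explicit: positions 1 and 2
explicit_lbl = df.loc["b":"c"]   # explicit: labels "b" through "c", inclusive

# All three select rows "b" and "c"
same_rows = lazy.equals(explicit_pos) and lazy.equals(explicit_lbl)
```

The results are identical, but only the .iloc and .loc versions state whether the slice is by position or by label.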
