How it works...

The indexing operator changes behavior based on the type of object passed to it. The following pseudocode outlines how the DataFrame indexing operator handles the object that it is passed:

>>> df[item]  # Where `df` is a DataFrame and item is some object

If item is a string then
    Find a column name that matches the item exactly
    Raise KeyError if there is no match
    Return the column as a Series

If item is a list of strings then
    Raise KeyError if one or more strings in item don't match columns
    Return a DataFrame with just the columns in the list

If item is a slice object then
    Works with either integer or string slices
    Raise KeyError if label from label slice is not in index
    Return all ROWS that are selected by the slice

If item is a list, Series or ndarray of booleans then
    Raise ValueError if length of item not equal to length of DataFrame
    Use the booleans to return only the rows with True in same location
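
The pseudocode above can be tried out on a small DataFrame. The following is a minimal sketch, not a complete account of the operator: the DataFrame, its column names, and its index labels are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cat", "Dan"],
        "age": [31, 45, 27, 52],
        "city": ["Oslo", "Rome", "Lima", "Bern"],
    },
    index=["a", "b", "c", "d"],
)

# String: returns the matching column as a Series
ages = df["age"]

# List of strings: returns a DataFrame with just those columns
subset = df[["name", "city"]]

# Integer slice: selects ROWS by position (end excluded)
rows_by_position = df[1:3]

# String slice: selects ROWS by index label (end included)
rows_by_label = df["b":"c"]

# Boolean array: keeps only the rows where the value is True
mask = np.array([True, False, True, False])
filtered = df[mask]

# A string that matches no column name raises KeyError
missing_column = False
try:
    df["height"]
except KeyError:
    missing_column = True

# A boolean list whose length differs from the DataFrame raises ValueError
wrong_length = False
try:
    df[[True, False]]
except ValueError:
    wrong_length = True
```

Note that both slices select the rows labeled b and c: the integer slice excludes its end position, while the string slice includes its end label.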

The preceding logic covers the most common cases but is not exhaustive. The logic for a Series is slightly different and actually more complex than it is for a DataFrame. Because of this complexity, it is probably a good idea to avoid using the indexing operator by itself on a Series and instead use the explicit .iloc and .loc indexers.
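
As a small illustration of why the explicit indexers are clearer on a Series, consider a hypothetical Series whose integer labels do not match the positions of its values. With .loc and .iloc, the intent is stated outright:

```python
import pandas as pd

# Hypothetical Series whose integer labels differ from the positions
s = pd.Series([10, 20, 30], index=[2, 1, 0])

by_label = s.loc[0]      # the value whose index label is 0
by_position = s.iloc[0]  # the value in the first position
```

Here by_label is the last value in the Series and by_position is the first, even though both lookups use the integer 0.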

One acceptable use case of the Series indexing operator is boolean indexing. See Chapter 12, Index Alignment, for more details.
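
A minimal sketch of boolean indexing on a Series follows; the values and labels are made up for illustration.

```python
import pandas as pd

# Hypothetical Series, used only for illustration
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

mask = s > 15    # boolean Series with the same index as s
large = s[mask]  # keeps only the entries where mask is True
```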

I call this type of row slicing lazy because it does not use the more explicit .iloc or .loc indexers. Personally, I always use these indexers when slicing rows, as there is then never a question of exactly what I am doing.
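
To make the comparison concrete, the following sketch places each lazy row slice next to its explicit counterpart. The DataFrame is hypothetical, and the sketch assumes a unique string index.

```python
import pandas as pd

# Hypothetical DataFrame with a unique string index
df = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# Integer slice: the lazy form and its explicit .iloc equivalent
lazy_position = df[1:3]
explicit_position = df.iloc[1:3]

# Label slice: the lazy form and its explicit .loc equivalent
lazy_label = df["b":"c"]
explicit_label = df.loc["b":"c"]

# Each pair selects the same rows
same_position = lazy_position.equals(explicit_position)
same_label = lazy_label.equals(explicit_label)
```

The results are identical, but the explicit forms tell the reader at a glance whether the slice is by position or by label.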
