Boolean Indexing

Filtering a dataset is one of the most common and basic operations. There are numerous ways to filter (or subset) data in pandas with boolean indexing. Boolean indexing (also known as boolean selection) can be a confusing term, but in pandas it refers to selecting rows by providing a boolean value (True or False) for each row. These boolean values are usually stored in a Series or a NumPy ndarray and are typically created by applying a boolean condition to one or more columns of a DataFrame. We begin by creating boolean Series and calculating statistics on them, then move on to constructing more complex conditions, and finally use boolean indexing in a wide variety of ways to filter data.
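As a quick illustration of the idea, the following minimal sketch builds a boolean Series from a condition, summarizes it, and then uses it to select rows. The DataFrame, its column names, and the 120-minute threshold are made up for this example and are not taken from the chapter's datasets.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D"],
        "duration": [95, 142, 118, 171],
    }
)

# Applying a condition to a column yields a boolean Series, one value per row.
is_long = movies["duration"] > 120

# Booleans behave like 0 and 1, so sum counts the True values
# and mean gives the fraction of rows that satisfy the condition.
n_long = is_long.sum()
share_long = is_long.mean()

# Passing the boolean Series to the indexing operator keeps only the True rows.
long_movies = movies[is_long]
print(long_movies)
```

The boolean Series shares its index with the DataFrame, which is how pandas aligns each True or False value with the row it belongs to.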

In this chapter, we will cover the following topics:

  • Calculating boolean statistics
  • Constructing multiple boolean conditions
  • Filtering with boolean indexing
  • Replicating boolean indexing with index selection
  • Selecting with unique and sorted indexes
  • Gaining perspective on stock prices
  • Translating SQL WHERE clauses
  • Determining the normality of stock market returns
  • Improving readability of boolean indexing with the query method
  • Preserving Series with the where method
  • Masking DataFrame rows
  • Selecting with booleans, integer location, and labels