Introducing pandas

Alongside NumPy and SciPy, pandas is one of the most widely used scientific computing libraries for Python. Its authors aim to make pandas the most powerful and flexible open source data analysis and manipulation tool available in any language, and the library has made considerable progress toward that goal. Its efficient data structures and analysis tools make it a good fit for data science work. Like other Python packages, pandas can be installed from PyPI:

pip install pandas

Since version 1.5, Matplotlib has supported pandas DataFrames as input to various plotting functions. A pandas DataFrame is a powerful two-dimensional labeled data structure that supports indexing, querying, grouping, merging, and other common relational database operations. A DataFrame resembles a spreadsheet: each row holds the variables of one instance, while each column holds the values of one variable across all instances.
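As a rough illustration of the operations listed above, the following minimal sketch builds two small DataFrames and applies indexing, querying, grouping, and merging to them. The sample data and the variable names (people, cities, and so on) are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical sample data: each row is one instance, each column one variable.
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
    "age": [31, 45, 27, 52],
})
cities = pd.DataFrame({
    "city": ["Oslo", "Rome"],
    "country": ["Norway", "Italy"],
})

# Indexing: select one column (a Series) and one row by position.
ages = people["age"]
first_row = people.iloc[0]

# Querying: keep only the rows that satisfy a condition.
over_30 = people.query("age > 30")

# Grouping: compute the mean age per city.
mean_age = people.groupby("city")["age"].mean()

# Merging: a relational-style left join on the shared "city" column.
merged = people.merge(cities, on="city", how="left")

print(over_30)
print(mean_age)
print(merged)
```

The merge here plays the role of a SQL left join: every row of people is kept, and the matching country is attached from cities.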

A pandas DataFrame supports heterogeneous data types, such as strings, integers, and floats. By default, rows are indexed sequentially, and each column is a pandas Series. Optional row or column labels can be specified through the index and columns attributes.
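The points above can be sketched as follows. The column names and values are made up for illustration. The sketch shows the default sequential row index, a column retrieved as a Series, and labels supplied either when the DataFrame is constructed or later through the index and columns attributes.

```python
import pandas as pd

# Columns of different types can live side by side in one DataFrame.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara"],     # strings
    "age": [31, 45, 27],                # integers
    "height_m": [1.68, 1.80, 1.75],     # floats
})

# With no labels given, rows are indexed sequentially: 0, 1, 2.
print(list(df.index))

# Each column is a pandas Series with its own dtype.
age_column = df["age"]
print(type(age_column).__name__, age_column.dtype)

# Row and column labels can be passed at construction time...
labeled = pd.DataFrame(
    [[31, 1.68], [45, 1.80]],
    index=["ann", "bob"],
    columns=["age", "height_m"],
)

# ...or reassigned afterwards through the index and columns attributes.
relabeled = labeled.copy()
relabeled.index = ["row_a", "row_b"]
relabeled.columns = ["years", "metres"]
print(relabeled)
```

Label-based access then goes through loc, for example labeled.loc["bob", "age"].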
