Single and multivariate analysis

Statistics, from a certain perspective, is the practice of studying variables, and specifically the observation of those variables. Much of statistics is based upon doing this analysis for a single variable, which is referred to as univariate analysis. Univariate analysis is the simplest form of analyzing data. It does not deal with causes or relationships and is normally used to describe or summarize data, and to find patterns in it.

Multivariate analysis is a modeling technique where there exist two or more output variables that affect the outcome of an experiment. Multivariate analysis is often related to concepts such as correlation and regression, which help us understand the relationships between multiple variables, as well as how those relationships affect the outcome.

pandas primarily provides fundamental univariate analysis capabilities. And these capabilities are generally descriptive statistics, although there is inherent support for concepts such as correlations (as they are very common in finance and other domains).

Other more complex statistics can be performed with StatsModels. Again, this is not per se a weakness of pandas, but a specific design decision to let those concepts be handled by other dedicated Python libraries.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset