Introduction to Multivariate Analysis
Overview of Multivariate Techniques
This book describes the following techniques for analyzing several variables simultaneously:
• The Principal Components platform derives a small number of independent linear combinations (principal components) of a set of measured variables that capture as much of the variability in the original variables as possible. It is a useful exploratory technique and can help you to create predictive models. See
Chapter 4, “Principal Components”.
• The Discriminant platform seeks a way to predict a classification (X) variable (nominal or ordinal) based on known continuous responses (Y). It can be regarded as inverse prediction from a multivariate analysis of variance (MANOVA). See
Chapter 5, “Discriminant Analysis”.
• The Partial Least Squares platform fits linear models based on factors, namely, linear combinations of the explanatory variables (Xs). PLS exploits the correlations between the Xs and the Ys to reveal underlying latent structures. See
Chapter 6, “Partial Least Squares Models”.
• The Hierarchical Cluster platform groups rows that share similar values across a number of variables. It is a useful exploratory technique to help you understand the clumping structure of your data. See
Chapter 7, “Hierarchical Cluster”.
• The K Means Cluster platform groups observations that share similar values across a number of variables. See
Chapter 8, “K Means Cluster”.
• The Normal Mixtures platform enables you to cluster observations when your data come from overlapping normal distributions. See
Chapter 9, “Normal Mixtures”.
• The Latent Class Analysis platform finds clusters of observations for categorical response variables. The model takes the form of a multinomial mixture model. See
Chapter 10, “Latent Class Analysis”.
• The Cluster Variables platform groups similar variables into representative groups. You can use Cluster Variables as a dimension-reduction method. Instead of using a large set of variables in modeling, the cluster components or the most representative variable in the cluster can be used to explain most of the variation in the data. See
Chapter 11, “Cluster Variables”.
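
To make the idea of a principal component concrete, the following Python sketch finds the single linear combination of the variables that captures the most variance. It is an illustration only and is not how the Principal Components platform is implemented. The function name first_principal_component and the sample data are hypothetical, and power iteration on the sample covariance matrix is used here simply because it needs nothing beyond the standard library.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, share_of_variance) for the first principal component.

    rows is a list of equal-length numeric sequences, one per observation.
    Hypothetical helper for illustration; assumes at least two rows.
    """
    n = len(rows)
    p = len(rows[0])

    # Center each variable on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    total = sum(cov[j][j] for j in range(p))
    if total == 0:
        # No variability at all: any direction explains nothing.
        return [1.0 / math.sqrt(p)] * p, 0.0

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. This converges to the leading eigenvector unless the
    # starting vector happens to be orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    # Variance along v (the leading eigenvalue), as a share of total variance.
    eigenvalue = sum(v[i] * cov[i][j] * v[j]
                     for i in range(p) for j in range(p))
    return v, eigenvalue / total

# Made-up data in which the second variable is exactly twice the first.
direction, share = first_principal_component([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(direction, share)
```

Because the two made-up variables are perfectly correlated, the direction is proportional to (1, 2) and the first component accounts for essentially all of the variance. With real data the share is smaller, and further components are extracted from the variability that remains.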