Chapter 2
Introduction to Predictive and Specialized Modeling
Overview of Modeling Techniques
Predictive and Specialized Modeling provides details about more technical modeling techniques, such as Response Screening, Partitioning, and Neural Networks.
• The Modeling Utilities assist in the data cleaning and pre-processing stages of data analysis. Each utility has exploratory tools to give you a more thorough understanding of your data. See
Chapter 3, “Modeling Utilities”.
• The Neural platform implements a fully connected multi-layer perceptron with one or two hidden layers. Use neural networks to predict one or more response variables using a flexible function of the input variables. See
Chapter 4, “Neural Networks”.
• The Partition platform recursively partitions data according to a relationship between the X and Y values, creating a decision tree of partitions. See
Chapter 5, “Partition Models”.
• The Bootstrap Forest platform enables you to fit an ensemble model by averaging many decision trees, each of which is fit to a random subset of the training data. See
Chapter 6, “Bootstrap Forest”.
• The Boosted Tree platform produces an additive decision tree model that is composed of many smaller decision trees that are constructed in layers. The tree in each layer consists of a small number of splits, typically five or fewer. Each layer is fit using the recursive fitting methodology. See
Chapter 7, “Boosted Tree”.
• The K Nearest Neighbors platform predicts a response value for a given observation using the responses of the observations in that observation’s local neighborhood. It can be used with a categorical response for classification and with a continuous response for prediction. See
Chapter 8, “K Nearest Neighbors”.
• The Naive Bayes platform classifies observations into groups that are defined by the levels of a categorical response variable. The variables (or factors) that are used for classification are often called features in the data mining literature. See
Chapter 9, “Naive Bayes”.
• The Model Comparison platform lets you compare the predictive ability of different models. Measures of fit are provided for each model along with overlaid diagnostic plots. See
Chapter 10, “Model Comparison”.
• The Formula Depot platform enables you to organize, compare, profile, and score models for deployment. For model exploration work, you can use the Formula Depot to store candidate models outside of your JMP data table. See
Chapter 11, “Formula Depot”.
• The Fit Curve platform provides predefined models, such as polynomial, logistic, Gompertz, exponential, peak, and pharmacokinetic models. Compare different groups or subjects using a variety of analytical and graphical techniques. See
Chapter 12, “Fit Curve”.
• The Gaussian Process platform models the relationship between a continuous response and one or more continuous predictors. These models are commonly used to analyze computer simulation experiments, such as the output of finite element codes, and they often interpolate the data perfectly. See
Chapter 14, “Gaussian Process”.
• The Response Screening platform automates the process of conducting tests across a large number of responses. Your test results and summary statistics are presented in data tables, rather than reports, to enable data exploration. See
Chapter 17, “Response Screening”.
• The Process Screening platform enables you to explore a large number of processes across time. The platform calculates control-chart, process stability, and process capability metrics, and detects large process shifts. See
Chapter 18, “Process Screening”.
• The Association Analysis platform enables you to identify items that have an affinity for each other. It is frequently used to analyze transactional data (also called market baskets) to identify items that often appear together in transactions. See
Chapter 20, “Association Analysis”.
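As an illustration of one technique from the list above, the K Nearest Neighbors idea (predicting a response from the responses of nearby observations) can be sketched in a few lines of Python. This is a minimal sketch and not the JMP implementation: the Euclidean distance, the majority vote for a categorical response, the neighbor mean for a continuous response, and all function names and data values are assumptions made for illustration.

```python
import math
from collections import Counter


def k_nearest(train_x, query, k):
    """Return indices of the k training rows closest to query.

    Assumes Euclidean distance; other distance measures are possible.
    """
    distances = sorted((math.dist(row, query), i) for i, row in enumerate(train_x))
    return [i for _, i in distances[:k]]


def knn_classify(train_x, train_y, query, k=3):
    """Categorical response: majority vote among the k nearest neighbors."""
    neighbors = k_nearest(train_x, query, k)
    votes = Counter(train_y[i] for i in neighbors)
    return votes.most_common(1)[0][0]


def knn_predict(train_x, train_y, query, k=3):
    """Continuous response: mean response of the k nearest neighbors."""
    neighbors = k_nearest(train_x, query, k)
    return sum(train_y[i] for i in neighbors) / len(neighbors)


# Hypothetical data: two well-separated clusters of observations.
train_x = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["low", "low", "low", "high", "high", "high"]   # categorical response
values = [10.0, 12.0, 11.0, 50.0, 52.0, 54.0]            # continuous response

print(knn_classify(train_x, labels, (1.2, 1.1), k=3))
print(knn_predict(train_x, values, (1.2, 1.1), k=3))
```

The same neighbor search serves both cases; only the way the neighbors' responses are combined differs between classification and prediction.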