Example of Gaussian Process
This example uses data from a space filling design in two variables with a deterministic equation for Y (the response). You can use the Gaussian Process platform to find the explanatory power of X1 and X2 on Y. You can view the equation for Y in the column formula.
1. Select Help > Sample Data Library and open 2D Gaussian Process Example.jmp.
2. Select Analyze > Specialized Modeling > Gaussian Process.
3. Select X1 and X2 and click X.
4. Select Y and click Y.
5. Select Correlation Type > Cubic.
6. Deselect Fast GASP.
7. Click OK.
Figure 14.2 Gaussian Process Report
Note: The estimated parameters can be different due to different starting points in the minimization routine, the choice of correlation type, and the inclusion of a nugget parameter.
Now, visualize the fitted surface compared to the original surface.
8. Click the red triangle next to Gaussian Process Model of Y and select Save Prediction Formula.
9. Select Graph > Surface Plot.
10. Select X1 through Y Prediction Formula and click Columns.
11. Click OK.
12. In the Surface column, select Both sides for the Y Prediction Formula.
Figure 14.3 3D Surface Plot of the Actual and Predicted Ys
The two surfaces are similar. The impact of X1 and X2 on the response Y can be visualized. You can rotate the plot to view it from different angles. Marginal plots are another tool to use to understand the impact of the factors on the response.
Launch the Gaussian Process Platform
Launch the Gaussian Process platform by selecting Analyze > Specialized Modeling > Gaussian Process.
Figure 14.4 Gaussian Process Launch Window
Y
Assigns the continuous columns to analyze.
X
Assigns the columns to use as explanatory variables. Categorical variables are allowed in JMP Pro when the Fast GASP option is specified.
Estimate Nugget Parameter
Introduces a ridge parameter into the estimation procedure. A ridge parameter is useful if there is noise or randomness in the response, and you want the prediction model to smooth over the noise instead of perfectly interpolating.
Fast GASP
Option to use the Fast GASP algorithm. Fast GASP breaks the Gaussian process model into small pieces (called blocks) to speed computation time. Blocks allow for the use of multiple CPUs and parallel processing.
Note: When there are more than 2,500 observations, the Fast GASP algorithm is required.
For additional information about Fast GASP, see Parker (2015).
Correlation Type
Choose the correlation structure for the model. The platform fits a spatial correlation model to the data, where the correlation of the response between two observations decreases as the values of the independent variables become more distant.
Gaussian restricts the correlation between two points to always be nonzero, no matter the distance between the points.
Cubic allows the correlation between two points to be zero for points that are far enough apart. This method is a generalization of a cubic spline.
The Fast GASP algorithm does not support the cubic correlation function.
Minimum Theta Value
Sets the minimum theta value to use in the fitted model. The default is 0. The theta values are analogous to a slope parameter in regular regression models. Small theta values indicate that a variable has little influence on the predicted values.
Block Size
Number of observations in each computational block used by the Fast GASP algorithm. Each block must contain at least 25 observations. The largest allowed block size is the number of rows in the data table, up to a limit of 2,500.
The Gaussian Process Report
The initial Gaussian Process report shows the actual by predicted plot and a model report. The marginal plots for each factor are initially hidden.
Actual by Predicted Plot
The Actual by Predicted plot shows the actual Y values on the y-axis and the jackknife predicted values on the x-axis. One measure of goodness-of-fit is how well the points lie along the diagonal (Y = X) of the plot.
The jackknife values are not true jackknife values, because the model is not refit with the associated row excluded for each Y. Instead, the row is excluded from the prediction model for each associated Y, but the correlation parameters still include that row's contribution. For Gaussian processes that perfectly interpolate the data, this jackknife procedure provides predictions that are not equal to the input.
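The following Python sketch illustrates the idea of this pseudo-jackknife. It is not JMP's implementation. It assumes a Gaussian (product exponential, power 2) correlation, and the `theta` and `mu` values are hypothetical stand-ins for fitted parameters. Each row is dropped from the prediction step only, and the parameters are never re-estimated.

```python
import numpy as np

def gaussian_corr(X, theta):
    """Assumed correlation form: exp(-sum_k theta_k * (x_ik - x_jk)^2)."""
    diff = X[:, None, :] - X[None, :, :]        # pairwise differences, shape (n, n, p)
    return np.exp(-np.sum(theta * diff**2, axis=2))

def pseudo_jackknife(X, y, theta, mu):
    """Predict each y[i] from the other rows while keeping theta and mu fixed."""
    n = len(y)
    R = gaussian_corr(X, theta)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                # every row except row i
        r = R[i, keep]                          # correlations between row i and the rest
        R_rest = R[np.ix_(keep, keep)]
        preds[i] = mu + r @ np.linalg.solve(R_rest, y[keep] - mu)
    return preds

rng = np.random.default_rng(0)
X = rng.uniform(size=(12, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
theta = np.array([10.0, 10.0])                  # hypothetical values, not fitted
mu = y.mean()                                   # hypothetical stand-in for the fitted mean
jack = pseudo_jackknife(X, y, theta, mu)
```

Plotting `y` against `jack` gives the analogue of the Actual by Predicted plot. Because row i is left out of its own prediction, `jack` differs from `y` even though the full model would interpolate the data.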
Model Report
The Model Report shows a functional ANOVA table for the model parameter estimates. Specifically, it is an analysis of variance table where the variation is computed using a function-driven method.
Theta
Gaussian Process model parameter estimates.
Total Sensitivity
Sum of the main effect and all interaction terms for each factor. It is a measure of the amount of influence a factor and all its two-way interactions have on the response variable.
Total variation is the integrated variability over the entire experimental space.
Main Effect
The functional main effect of each factor is the integrated total variation due to that factor alone. The reported main effect is the ratio of this functional effect to the total variation, computed for each factor in the model.
Interactions
Functional interaction effects are computed in a similar way to main effects.
Mu and Sigma2
Mean and variance model parameters.
Nugget
Estimated nugget value. A nugget value is reported if you selected Estimate Nugget Parameter in the Gaussian Process launch window. A nugget value is also reported if JMP has added a nugget parameter in order to avoid a singular covariance matrix.
-2LogLikelihood
Value of the -2 log-likelihood function at its minimum.
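As a rough numerical illustration of the functional ANOVA quantities in the Model Report, the Python sketch below computes main effects, the interaction, and total sensitivities for a two-factor function on a grid. The function `f` is a hypothetical stand-in for a fitted prediction formula, and the factors are assumed to vary uniformly over the unit square. The integration method that JMP uses is not described here, so treat this only as a sketch of the definitions.

```python
import numpy as np

def f(x1, x2):
    # Hypothetical stand-in for a fitted prediction formula
    return x1 + 2.0 * x2 + x1 * x2

m = 201
g = (np.arange(m) + 0.5) / m                 # midpoint grid on [0, 1]
X1, X2 = np.meshgrid(g, g, indexing="ij")
F = f(X1, X2)

total_var = F.var()                          # total variation over the whole space
main1 = F.mean(axis=1).var() / total_var     # variation of the x1-only average, as a ratio
main2 = F.mean(axis=0).var() / total_var     # variation of the x2-only average, as a ratio
interaction = 1.0 - main1 - main2            # with two factors, the remainder is x1*x2
total1 = main1 + interaction                 # total sensitivity of x1
total2 = main2 + interaction                 # total sensitivity of x2

print(round(main1, 3), round(main2, 3), round(interaction, 3))
```

With only two factors there is a single interaction term, so it can be obtained as the remainder. With more factors, each two-way interaction would need its own integral.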
Marginal Model Plots
A marginal plot appears for each factor in the model. It shows the response across the levels of a factor where all other factors are set to their average value.
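The construction of a marginal curve can be sketched in Python as follows. The `predict` function is a hypothetical placeholder for a saved prediction formula, and the grid size of 25 points is an arbitrary choice for illustration.

```python
import numpy as np

def marginal_curve(predict, X, factor, num=25):
    """Vary one factor over its observed range; hold the others at their means."""
    base = X.mean(axis=0)                            # average value of every factor
    grid = np.linspace(X[:, factor].min(), X[:, factor].max(), num)
    pts = np.tile(base, (num, 1))                    # num copies of the average row
    pts[:, factor] = grid                            # sweep only the chosen factor
    return grid, predict(pts)

def predict(Z):
    # Hypothetical prediction function of two factors
    return np.sin(3 * Z[:, 0]) + Z[:, 1] ** 2

rng = np.random.default_rng(1)
X = rng.uniform(size=(30, 2))
grid, curve = marginal_curve(predict, X, factor=0)
```

Plotting `curve` against `grid` gives the marginal view for the first factor.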
Gaussian Process Platform Options
Use the options in the Gaussian Process red triangle menu to customize the report according to your individual needs.
Profiler
Opens the standard Profiler.
Contour Profiler
Opens the Contour Profiler.
Surface Profiler
Opens the Surface Profiler.
For more details about these profilers, see the Profilers book.
Save Prediction Formula
Creates a new column in the active data table containing the prediction formula.
Save Variance Formula
Creates a new column in the active data table containing the variance formula.
Publish Prediction Formula
Creates a prediction formula and saves it as a formula column script in the Formula Depot platform. If a Formula Depot report is not open, this option creates a Formula Depot report. See the “Formula Depot” chapter.
Publish Variance Formula
Creates a variance formula and saves it as a formula column script in the Formula Depot platform. If a Formula Depot report is not open, this option creates a Formula Depot report. See the “Formula Depot” chapter.
Save Jackknife Predicted Values
Saves the jackknife predicted values to the active data table. These are the x-axis values for the Actual by Predicted Plot.
See the JMP Reports chapter in the Using JMP book for more information about the following options:
Local Data Filter
Shows or hides the local data filter that enables you to filter the data used in a specific report.
Redo
Contains options that enable you to repeat or relaunch the analysis. In platforms that support the feature, the Automatic Recalc option immediately reflects the changes that you make to the data table in the corresponding report window.
Save Script
Contains options that enable you to save a script that reproduces the report to several destinations.
Save By-Group Script
Contains options that enable you to save a script that reproduces the platform report for all levels of a By variable to several destinations. Available only when a By variable is specified in the launch window.
Additional Example of the Gaussian Process Platform
This example uses data that demonstrate the flow of water through a borehole that is drilled from the ground surface through two aquifers. Given a specified engineering model, the Gaussian process lets you understand the impact of the factors included in the model on the response, Y.
1. Select Help > Sample Data Library > Design Experiment and open Borehole Latin Hypercube.jmp.
2. Select Analyze > Specialized Modeling > Gaussian Process.
3. Select log10 Rw through Kw and click X.
4. Select Y and click Y.
5. In JMP Pro, to run the analysis faster, leave Fast GASP selected.
6. Click OK.
Figure 14.5 Borehole Latin Hypercube Report
The points in the Actual by Predicted plot fall along the Y = X line, indicating that the Gaussian process prediction model is a good approximation of the true function. In the Model Report, you see that the first factor, log10 Rw, has the highest total sensitivity. Its estimated total sensitivity shows that log10 Rw, together with its interactions, explains more than 90% of the variation in the response. Factors with small theta values have little (or no) impact on the prediction formula.
Note: Your estimates can differ from those shown in Figure 14.5, which were found using the Fast GASP algorithm.
Statistical Details for the Gaussian Process Platform
The Gaussian correlation structure uses the product exponential correlation function with a power of 2 as the estimated model. This model assumes that Y is Normally distributed with mean μ and covariance matrix σ²R. The R matrix consists of the following elements:
$$r_{ij} = \exp\left(-\sum_{k} \theta_k \,(x_{ik} - x_{jk})^2\right)$$
The model parameters are fit via maximum likelihood. The fitted parameters are provided in the platform report. The parameters are as follows:
μ is the Normal distribution mean,
σ² is the Normal distribution variance,
Theta corresponds to the values of θk in the definition of R.
Note: If you see Nugget parameters set to avoid singular variance matrix, JMP has added a ridge parameter to the variance matrix so that it is invertible.
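The likelihood that is minimized can be sketched in Python. This is a minimal illustration under stated assumptions, not JMP's routine: the correlation is taken to be exp(-Σ θ_k (x_ik - x_jk)²), the nugget is assumed to be added to the diagonal of R, and μ and σ² are profiled out in closed form for a given theta. The function names and the `theta` and `nugget` values are hypothetical.

```python
import numpy as np

def gaussian_corr(X, theta):
    """Assumed form: r_ij = exp(-sum_k theta_k * (x_ik - x_jk)^2)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=2))

def neg2_loglik(theta, X, y, nugget=0.0):
    """Return (-2 log-likelihood, mu, sigma2) for a given theta."""
    n = len(y)
    # Assumption: the ridge (nugget) is added to the diagonal so R stays invertible
    R = gaussian_corr(X, theta) + nugget * np.eye(n)
    L = np.linalg.cholesky(R)

    def solve(b):
        # Solve R z = b using the Cholesky factor
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    mu = (ones @ solve(y)) / (ones @ solve(ones))   # generalized least squares mean
    resid = y - mu
    sigma2 = (resid @ solve(resid)) / n             # maximum likelihood variance
    logdet = 2.0 * np.sum(np.log(np.diag(L)))       # log determinant of R
    return n * np.log(2.0 * np.pi * sigma2) + logdet + n, mu, sigma2

rng = np.random.default_rng(2)
X = rng.uniform(size=(15, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
theta = np.array([5.0, 5.0])                        # hypothetical trial value
val, mu, sigma2 = neg2_loglik(theta, X, y, nugget=1e-6)
```

A fitting routine would search over theta (and optionally the nugget) for the smallest value of `val`. Different starting points can lead to different estimates.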
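The Cubic correlation structure also assumes that Y is Normally distributed with mean μ and covariance matrix σ²R. The R matrix consists of the following elements:
$$r_{ij} = \prod_{k} \rho(d;\, \theta_k), \qquad d = x_{ik} - x_{jk}$$
where
$$\rho(d;\, \theta) = \begin{cases} 1 - 6(d\theta)^2 + 6(|d|\theta)^3, & |d| \le \dfrac{1}{2\theta} \\[4pt] 2\,(1 - |d|\theta)^3, & \dfrac{1}{2\theta} < |d| \le \dfrac{1}{\theta} \\[4pt] 0, & \dfrac{1}{\theta} < |d| \end{cases}$$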
For more information, see Santer (2003). The theta parameter used in the cubic correlation is the reciprocal of the parameter often used in the literature. The reciprocal is used so that when theta has no effect on the model, then it has a value of zero, rather than infinity.
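To make the role of theta concrete, here is a Python sketch of a cubic correlation function. It assumes the piecewise cubic form commonly given for this correlation, written with theta as the reciprocal of the usual range parameter. The function names are hypothetical, and the exact form that JMP uses should be checked against the JMP documentation.

```python
import numpy as np

def cubic_rho(d, theta):
    """Piecewise cubic correlation in one dimension (assumed form)."""
    a = np.abs(np.asarray(d, dtype=float)) * theta   # scaled distance |d| * theta
    near = 1.0 - 6.0 * a**2 + 6.0 * a**3             # used when |d| <= 1 / (2 theta)
    mid = 2.0 * (1.0 - a) ** 3                       # used when 1/(2 theta) < |d| <= 1/theta
    return np.where(a <= 0.5, near, np.where(a <= 1.0, mid, 0.0))

def cubic_corr(X, theta):
    """Correlation matrix as a product of one-dimensional cubic correlations."""
    diff = X[:, None, :] - X[None, :, :]             # shape (n, n, p)
    return np.prod(cubic_rho(diff, theta), axis=2)

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 0.8]])
R = cubic_corr(X, np.array([2.0, 2.0]))              # hypothetical theta values
```

With theta equal to zero, `cubic_rho` returns 1 at every distance, so that factor drops out of the product. Points farther apart than 1/theta in any one factor have zero correlation.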