Overview of Space-Filling Designs
Space-filling designs are useful for modeling systems that are deterministic or near-deterministic. One example of a deterministic system is a computer simulation. Such simulations can be very complex, involving many variables with complicated interrelationships. A goal of designed experiments on these systems is to find a simpler empirical model that adequately predicts the behavior of the system over limited ranges of the factors.
In experiments on systems where there is substantial random noise, the goal is to minimize the variance of prediction. In experiments on deterministic systems, there is no variance but there is bias. Bias is the difference between the approximation model and the true mathematical function. The goal of space-filling designs is to bound the bias.
There are two schools of thought on how to bound the bias. One approach is to spread the design points out as far from each other as possible consistent with staying inside the experimental boundaries. The other approach is to space the points out evenly over the region of interest.
Note: If the number of runs is 500 or less, a Gaussian Process model is saved to the data table. If the number of runs exceeds 500, a Neural model is saved to the data table.
The Space Filling designer supports the following design methods:
Sphere Packing
maximizes the minimum distance between pairs of design points. See “Sphere-Packing Designs” and “Create the Sphere-Packing Design for the Borehole Data”.
Latin Hypercube
maximizes the minimum distance between design points but requires even spacing of the levels of each factor. This method produces designs that mimic the uniform distribution. The Latin Hypercube method is a compromise between the Sphere-Packing method and the Uniform design method. See “Latin Hypercube Designs”.
Uniform
minimizes the discrepancy between the design points (which have an empirical uniform distribution) and a theoretical uniform distribution. See “Uniform Designs”.
Minimum Potential
spreads points out inside a sphere around the center. See “Minimum Potential Designs”.
Maximum Entropy
maximizes the entropy of the design, a measure of the amount of information contained in the distribution of a set of data. See “Maximum Entropy Designs”.
Gaussian Process IMSE Optimal
creates a design that minimizes the integrated mean squared error of the Gaussian process over the experimental region. See “Gaussian Process IMSE Optimal Designs”.
Fast Flexible Filling
The Fast Flexible Filling method forms clusters from random points in the design space. These clusters are used to choose design points according to an optimization criterion. This is the only method that can accommodate categorical factors and constraints on the design space. You can specify linear constraints and disallowed combinations. See “Fast Flexible Filling Designs” and “Creating and Viewing a Constrained Fast Flexible Filling Design”.
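The clustering idea behind Fast Flexible Filling can be illustrated outside JMP. The following Python sketch is an assumption-laden toy, not JMP's algorithm: it uses k-means as a stand-in clustering step, takes each cluster centroid as the design point (loosely analogous to a centroid-based criterion), and uses a hypothetical constraint X1 + X2 <= 1. The function name fff_sketch and all parameters are invented for illustration.

```python
import random

def fff_sketch(n_runs, allowed, n_candidates=2000, n_iter=25, seed=1):
    """Toy illustration: cluster random feasible points, return cluster centroids.

    k-means is a stand-in clustering step; this is not JMP's algorithm.
    """
    rng = random.Random(seed)
    # 1. Draw random points in the unit square and keep only the feasible ones.
    pts = []
    while len(pts) < n_candidates:
        p = (rng.random(), rng.random())
        if allowed(p):
            pts.append(p)
    # 2. Start k-means from n_runs randomly chosen candidates.
    centers = rng.sample(pts, n_runs)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in pts:
            nearest = min(
                range(n_runs),
                key=lambda k: (p[0] - centers[k][0]) ** 2 + (p[1] - centers[k][1]) ** 2,
            )
            clusters[nearest].append(p)
        # 3. Move each center to its cluster's centroid (keep it if the cluster is empty).
        centers = [
            (sum(q[0] for q in c) / len(c), sum(q[1] for q in c) / len(c)) if c else old
            for c, old in zip(clusters, centers)
        ]
    return centers

# Hypothetical linear constraint: X1 + X2 <= 1 (lower-left triangle of the unit square).
design = fff_sketch(8, allowed=lambda p: p[0] + p[1] <= 1.0)
```

Because only feasible random points are clustered, the constraint is handled before any design point is chosen. In this sketch the centroids stay feasible because the constrained region is convex; for a non-convex region a centroid could fall outside it.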
Space Filling Design Window
The Space Filling Design window updates as you work through the design steps. The outlines that appear, separated by buttons that update the window, follow the flow in Figure 21.2.
Figure 21.2 Space Filling Design Flow
This section describes the outlines in the Space Filling Design window.
Responses
Use the Responses outline to specify one or more responses.
Tip: When you have completed the Responses outline, consider selecting Save Responses from the red triangle menu. This option saves the response names, goals, limits, and importance values in a data table that you can later reload in DOE platforms.
Figure 21.3 Responses Outline
Add Response
Enters a single response with a goal type of Maximize, Match Target, Minimize, or None. If you select Match Target, enter limits for your target value. If you select Maximize or Minimize, entering limits is not required but can be useful if you intend to use desirability functions.
Remove
Removes the selected responses.
Number of Responses
Enters additional responses so that the number that you enter is the total number of responses. If you have entered a response other than the default Y, the Goal for each of the additional responses is the Goal associated with the last response entered. Otherwise, the Goal defaults to Match Target. Click the Goal type in the table to change it.
The Responses outline contains the following columns:
Response Name
The name of the response. When added, a response is given a default name of Y, Y2, and so on. To change this name, double-click it and enter the desired name.
Goal, Lower Limit, Upper Limit
The Goal tells JMP whether you want to maximize your response, minimize your response, match a target, or that you have no response goal. JMP assigns a Response Limits column property, based on these specifications, to each response column in the design table. It uses this information to define a desirability function for each response. The Profiler and Contour Profiler use these desirability functions to find optimal factor settings. For further details, see the Profiler chapter in the Profilers book and “Response Limits” in the “Column Properties” appendix.
A Goal of Maximize indicates that the best value is the largest possible. If there are natural lower or upper bounds, you can specify these as the Lower Limit or Upper Limit.
A Goal of Minimize indicates that the best value is the smallest possible. If there are natural lower or upper bounds, you can specify these as the Lower Limit or Upper Limit.
A Goal of Match Target indicates that the best value is a specific target value. The default target value is assumed to be midway between the Lower Limit and Upper Limit.
A Goal of None indicates that there is no goal in terms of optimization. No desirability function is constructed.
Note: If your target response is not midway between the Lower Limit and the Upper Limit, you can change the target after you generate your design table. In the data table, open the Column Info window for the response column (Cols > Column Info) and enter the desired target value.
Importance
When you have several responses, the Importance values that you specify are used to compute an overall desirability function. These values are treated as weights for the responses. If there is only one response, then specifying the Importance is unnecessary because it is set to 1 by default.
Editing the Responses Outline
In the Responses outline, note the following:
Double-click a response to edit the response name.
Click the goal to change it.
Click on a limit or importance weight to change it.
For multiple responses, you might want to enter values for the importance weights.
Response Limits Column Property
The Goal, Lower Limit, Upper Limit, and Importance that you specify when you enter a response are used in finding optimal factor settings. For each response, the information is saved in the generated design data table as a Response Limits column property. JMP uses this information to define the desirability function. The desirability function is used in the Prediction Profiler to find optimal factor settings. For further details about the Response Limits column property and examples of its use, see “Response Limits” in the “Column Properties” appendix.
If you do not specify a Lower Limit and Upper Limit, JMP uses the range of the observed data for the response to define the limits for the desirability function. Specifying the Lower Limit and Upper Limit gives you control over the specification of the desirability function. For more details about the construction of the desirability function, see the Profiler chapter in the Profilers book.
Factors
Add factors in the Factors outline.
Figure 21.4 Factors Outline
The Factors outline contains these options:
Continuous
Enters the number of continuous factors specified in Add N Factors.
Categorical
Enters the number of nominal factors specified in Add N Factors.
Remove
Removes the selected factors.
Add N Factors
Adds multiple factors of a given type. Enter the number of factors to add and click Continuous or Categorical. Repeat Add N Factors to add multiple factors of different types.
Tip: When you have completed your Factors panel, select Save Factors from the red triangle menu. This saves the factor names and values in a data table that you can later reload. See “Space Filling Design Options”.
Factors Outline
The Factors outline contains the following columns:
Name
The name of the factor. When added, a factor is given a default name of X1, X2, and so on. To change this name, double-click it and enter the desired name.
Role
Specifies the Design Role of the factor. The Design Role column property for the factor is saved to the data table. This property ensures that the factor type is modeled appropriately.
Values
The experimental settings for the factors. To insert Values, click on the default values and enter the desired values.
Editing the Factors Outline
In the Factors outline, note the following:
To edit a factor name, double-click the factor name.
Categorical factors have a down arrow to the left of the factor name. Click the arrow to add a level.
To remove a factor level, click the value, click Delete, and click outside the text box.
To edit a value, click the value in the Values column.
Factor Types
Continuous
Numeric data types only. A continuous factor is a factor that you can conceptually set to any value between the lower and upper limits you supply, given the limitations of your process and measurement system.
Categorical
Either numeric or character data types. For a categorical factor, the value ordering is the order of the values as entered from left to right. This ordering is saved in a Value Ordering column property after the design data table is created.
Factor Column Properties
For each factor, various column properties are saved to the data table.
Design Role
Each factor is assigned the Design Role column property. The Role that you specify in defining the factor determines the value of its Design Role column property. The Design Role property reflects how the factor is intended to be used in modeling the experimental data. Design Role values are used in the Augment Design platform.
Factor Changes
Each factor is assigned the Factor Changes column property with a setting of Easy. In space-filling designs, it is assumed that factor levels can be changed for each experimental run. Factor Changes values are used in the Evaluate Design and Augment Design platforms.
Coding
If the Role is Continuous, the Coding column property for the factor is saved. This property transforms the factor values so that the low and high values correspond to –1 and +1, respectively. The estimates and tests in the Fit Least Squares report are based on the transformed values.
Value Ordering
If the Role is Categorical or Blocking, the Value Ordering column property for the factor is saved. This property determines the order in which levels of the factor appear.
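The Coding property described above maps the low and high settings of a continuous factor to –1 and +1. A minimal Python sketch of such a mapping follows, assuming the transformation is linear (the text states only the endpoints). The function names are hypothetical.

```python
def code_value(x, low, high):
    """Map x from [low, high] to [-1, +1] with a linear transformation."""
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (x - center) / half_range

def uncode_value(z, low, high):
    """Inverse mapping from [-1, +1] back to [low, high]."""
    return (low + high) / 2.0 + z * (high - low) / 2.0
```

For a factor that ranges from 1 to 8, code_value(1, 1, 8) gives –1, code_value(8, 1, 8) gives +1, and the midpoint 4.5 maps to 0.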
Define Factor Constraints
Note: Constraints can be specified only for designs constructed using the Fast Flexible Filling method.
Use Define Factor Constraints to restrict the design space. Unless you have loaded a constraint or included one as part of a script, the None option is selected. To specify constraints, select one of the other options:
Specify Linear Constraints
Specifies inequality constraints on linear combinations of factors. Only available for factors with a Role of Continuous or Mixture. See “Specify Linear Constraints”.
Note: When you save a script for a design that involves a linear constraint, the script expresses the linear constraint as a less than or equal to inequality (≤).
Use Disallowed Combinations Filter
Defines sets of constraints based on restricting values of individual factors. You can define both AND and OR constraints. See “Use Disallowed Combinations Filter”.
Use Disallowed Combinations Script
Defines disallowed combinations and other constraints as Boolean JSL expressions in a script editor box. See “Use Disallowed Combinations Script”.
Specify Linear Constraints
In cases where it is impossible to vary continuous factors independently over the design space, you can specify linear inequality constraints. Linear inequalities describe factor level settings that are allowed.
Click Add to enter one or more linear inequality constraints.
Add
Adds a template for a linear expression involving all the continuous factors in your design. Enter coefficient values for the factors and select the direction of the inequality to reflect your linear constraint. Specify the constraining value in the box to the right of the inequality. To add more constraints, click Add again.
Note: The Add option is disabled if you have already constrained the design region by specifying a Sphere Radius.
Remove Last Constraint
Removes the last constraint.
Check Constraints
Checks the constraints for consistency. This option removes redundant constraints and conducts feasibility checks. A JMP alert appears if there is a problem. If constraints are equivalent to bounds on the factors, a JMP alert indicates that the bounds in the Factors outline have been updated.
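To make the linear constraint options concrete, the following Python sketch shows how a constraint entered in either direction can be rewritten in less than or equal to form and then used to test a candidate point. This is only an illustration with hypothetical function names and example constraints. It does not reproduce JMP's consistency or redundancy checks.

```python
def to_less_equal(coeffs, direction, bound):
    """Rewrite a constraint as sum(coeffs * x) <= bound.

    direction is "<=" or ">="; a ">=" constraint is negated on both sides.
    """
    if direction == "<=":
        return list(coeffs), bound
    if direction == ">=":
        return [-c for c in coeffs], -bound
    raise ValueError("direction must be '<=' or '>='")

def satisfies(point, constraints, tol=1e-9):
    """True if the point meets every (coeffs, bound) constraint in <= form."""
    return all(
        sum(c * x for c, x in zip(coeffs, point)) <= bound + tol
        for coeffs, bound in constraints
    )

# Hypothetical constraints on two continuous factors X1 and X2.
constraints = [
    to_less_equal([1.0, 1.0], "<=", 1.0),     # X1 + X2 <= 1
    to_less_equal([1.0, -1.0], ">=", -0.5),   # X1 - X2 >= -0.5
]
```

Here the second constraint is stored as –X1 + X2 ≤ 0.5, so the point (0.2, 0.3) is allowed while (0.9, 0.9) and (0.0, 0.8) are not.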
Use Disallowed Combinations Filter
This option uses an adaptation of the Data Filter to facilitate specifying disallowed combinations. For detailed information about using the Data Filter, see the JMP Reports chapter in the Using JMP book.
Select factors from the Add Filter Factors list and click Add. Then specify the disallowed combinations by using the slider (for continuous factors) or by selecting levels (for categorical factors).
The red triangle options for the Add Filter Factors menu are those found in the Select Columns panel of many platform launch windows. See the Get Started chapter in the Using JMP book for additional details about the column selection menu.
When you click Add, the Disallowed Combinations control panel shows the selected factors and provides options for further control. Factors are represented as follows, based on their modeling types:
Continuous Factors
For a continuous factor, a double-arrow slider that spans the range of factor settings appears. An expression that describes the range using an inequality appears above the slider. You can specify disallowed settings by dragging the slider arrows or by clicking on the inequality bounds in the expression and entering your desired constraints. In the slider, a solid blue highlight represents the disallowed values.
Categorical Factors
For a categorical factor, the possible levels are displayed either as labeled blocks or, when the number of levels is large, as list entries. Select a level to disallow it. To select multiple levels, hold the Control key. The block or list entries are highlighted to indicate the levels that have been disallowed. When you add a categorical factor to the Disallowed Combinations panel, the number of levels of the categorical factor is given in parentheses following the factor name.
Disallowed Combinations Options
The control panel has the following controls:
Clear
Clears all disallowed factor level settings that you have specified. This does not clear the selected factors.
Start Over
Removes all selected factors and returns you to the initial list of factors.
AND
Opens the Add Filter Factors list. Selected factors become an AND group. Any combination of factor levels specified within an AND group is disallowed.
To add a factor to an AND group later on, click the group’s outline to see a highlighted rectangle. Select AND and add the factor.
To remove a single factor, select Delete from its red triangle menu.
OR
Opens the Add Filter Factors list. Selected factors become a separate AND group. For AND groups separated by OR, a combination is disallowed if it is specified in at least one AND group.
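The AND and OR logic described above can be summarized as follows: a run is disallowed if every condition in at least one AND group holds. The Python sketch below illustrates that rule with hypothetical factors (Temp and Catalyst) and a hypothetical function name. It models only the logic, not the filter interface.

```python
def is_disallowed(run, or_groups):
    """run: dict of factor name -> value. or_groups: list of AND groups.

    Each AND group is a dict of factor name -> predicate. A group matches
    only if every predicate in it is true for the run. The run is
    disallowed if at least one group matches.
    """
    return any(
        all(pred(run[name]) for name, pred in group.items())
        for group in or_groups
    )

# Hypothetical factors: continuous Temp and categorical Catalyst.
or_groups = [
    # AND group 1: Temp between 80 and 100 AND Catalyst is "A"
    {"Temp": lambda t: 80 <= t <= 100, "Catalyst": lambda c: c == "A"},
    # AND group 2 (joined to group 1 by OR): Catalyst is "C"
    {"Catalyst": lambda c: c == "C"},
]
```

With these groups, a run with Temp 90 and Catalyst A is disallowed, as is any run with Catalyst C. A run with Temp 50 and Catalyst A is allowed because only part of the first AND group holds.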
Red Triangle Options for Factors
A factor can appear in several OR groups. An occurrence of the factor in a specific OR group is referred to as an instance of the factor.
Delete
Removes the selected instance of the factor from the Disallowed Combinations panel.
Clear Selection
Clears any selection for that instance of the factor.
Invert Selection
Deselects the selected values and selects the values not previously selected for that instance of the factor.
Display Options
Available only for categorical factors. Changes the appearance of the display. Options include:
Blocks Display shows each level as a block.
List Display shows each level as a member of a list.
Single Category Display shows each level.
Check Box Display adds a check box next to each value.
Find
Available only for categorical factors. Provides a text box beneath the factor name where you can enter a search string for levels of the factor. Press the Enter key or click outside the text box to perform the search. Once Find is selected, the following Find options appear in the red triangle menu:
Clear Find clears the results of the Find operation and returns the panel to its original state.
Match Case uses the case of the search string to return the correct results.
Contains searches for values that include the search string.
Does not contain searches for values that do not include the search string.
Starts with searches for values that start with the search string.
Ends with searches for values that end with the search string.
Use Disallowed Combinations Script
Use this option to disallow particular combinations of factor levels using a JSL script. This option can be used with continuous factors or mixed continuous and categorical factors.
This option opens a script window where you insert a script that identifies the combinations that you want to disallow. The script must evaluate as a Boolean expression. When the expression evaluates as true, the specified combination is disallowed.
When forming the expression for a categorical factor, use the ordinal value of the level instead of the name of the level. If a factor’s levels are high, medium, and low, specified in that order in the Factors outline, their associated ordinal values are 1, 2, and 3. For example, suppose that you have two continuous factors, X1 and X2, and a categorical factor X3 with three levels: L1, L2, and L3, in order. You want to disallow levels where the following holds:
$e^{X_1} + 2X_2 < 0$ and $X_3 = \text{L2}$
Enter the expression (Exp(X1) + 2*X2 < 0) & (X3 == 2) into the script window.
Figure 21.5 Expression in Script Editor
(In the figure, unnecessary parentheses were removed by parsing.) Notice that functions can be entered as part of the Boolean expression.
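For readers who want to check which runs such an expression excludes, the following Python sketch mirrors the JSL expression (Exp(X1) + 2*X2 < 0) & (X3 == 2), including the conversion from level name to ordinal value. It is an analogue for illustration only; the helper names are hypothetical, and in JMP you enter the JSL expression itself.

```python
import math

levels_x3 = ["L1", "L2", "L3"]  # order as entered in the Factors outline

def ordinal(level):
    """1-based position of a level, matching the ordinal values described above."""
    return levels_x3.index(level) + 1

def disallowed(x1, x2, x3_level):
    """Python analogue of (Exp(X1) + 2*X2 < 0) & (X3 == 2)."""
    return (math.exp(x1) + 2 * x2 < 0) and (ordinal(x3_level) == 2)
```

For example, X1 = 0, X2 = –1, X3 = L2 is disallowed because exp(0) + 2(–1) = –1 is negative and L2 has ordinal value 2. The same X1 and X2 with X3 = L1 is allowed.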
Space Filling Design Methods
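The following methods for constructing space-filling designs are available: Sphere Packing, Latin Hypercube, Uniform, Minimum Potential, Maximum Entropy, Gaussian Process IMSE Optimal, and Fast Flexible Filling.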
Design
The Design outline shows the runs for the space-filling design.
Design Diagnostics
The Design Diagnostics outline shows the values for the factors scaled from zero to one. The Minimum Distance is based on these scaled values and is the distance from each point to its nearest neighbor. The row number for the nearest neighbor is given in the Nearest Point column. The discrepancy value shown below the table is the integrated difference between the design points and a uniform distribution.
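The scaled values, Minimum Distance, and Nearest Point columns can be reproduced by hand for a small design. The Python sketch below is an illustration with a hypothetical function name and a made-up four-run design. It assumes Euclidean distance on the scaled factors and omits the discrepancy value.

```python
import math

def design_diagnostics(design, bounds):
    """Scale each factor to [0, 1], then find each run's nearest neighbor.

    Returns (scaled, rows), where rows[i] = (min_distance, nearest_row) and
    nearest_row is 1-based, like a data table row number.
    """
    scaled = [
        [(x - lo) / (hi - lo) for x, (lo, hi) in zip(run, bounds)]
        for run in design
    ]
    rows = []
    for i, p in enumerate(scaled):
        dist, j = min(
            (math.dist(p, q), j) for j, q in enumerate(scaled) if j != i
        )
        rows.append((dist, j + 1))
    return scaled, rows

# Made-up design with two factors that each range from -1 to 1.
design = [(-1.0, -1.0), (1.0, -1.0), (0.0, 1.0), (-1.0, 0.0)]
scaled, rows = design_diagnostics(design, bounds=[(-1.0, 1.0), (-1.0, 1.0)])
```

In this example, run 1 scales to (0, 0) and its nearest neighbor is run 4 at (0, 0.5), so its row is (0.5, 4).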
Design Table
Make Table
Constructs the Space Filling Design data table.
Back
Takes you back to where you were before clicking Make Design. You can make changes to the previous outlines and regenerate the design.
Space Filling Design Options
The red triangle menu in the Space Filling Design platform contains these options:
Save Responses
Saves the information in the Responses panel to a new data table. You can then quickly load the responses and their associated information into most DOE windows. This option is helpful if you anticipate re-using the responses.
Load Responses
Loads responses that you saved using the Save Responses option.
Save Factors
Saves the information in the Factors panel to a new data table. Each factor’s column contains its levels. Other information is stored as column properties. You can then quickly load the factors and their associated information into most DOE windows.
Note: It is possible to create a factors table by entering data into an empty table, but remember to assign each column an appropriate Design Role. Do this by right-clicking on the column name in the data grid and selecting Column Properties > Design Role. In the Design Role area, select the appropriate role.
Load Factors
Loads factors that you saved using the Save Factors option.
Save Constraints
(Unavailable for some platforms) Saves factor constraints that you defined in the Define Factor Constraints or Linear Constraints outline into a data table, with a column for each constraint. You can then quickly load the constraints into most DOE windows.
In the constraint table, the first rows contain the coefficients for each factor. The last row contains the inequality bound. Each constraint’s column contains a column property called ConstraintState that identifies the constraint as a “less than” or a “greater than” constraint. See “ConstraintState” in the “Column Properties” appendix.
Load Constraints
(Unavailable for some platforms) Loads factor constraints that you saved using the Save Constraints option.
Set Random Seed
Sets the random seed that JMP uses to control certain actions that have a random component. These actions include the following:
simulating responses using the Simulate Responses option
randomizing Run Order for design construction
selecting a starting design for designs based on random starts
To reproduce a design or simulated responses, enter the random seed that generated them. For designs using random starts, set the seed before clicking Make Design. To control simulated responses or run order, set the seed before clicking Make Table.
Note: The random seed associated with a design is included in the DOE Dialog script that is saved to the design data table.
Simulate Responses
Adds response values and a column containing a simulation formula to the design table. Select this option before you click Make Table.
When you click Make Table, the following occur:
A set of simulated response values is added to each response column.
For each response, a new column that contains a simulation model formula is added to the design table. The formula and values are based on the model that is specified in the design window.
A Model window appears where you can set the values of coefficients for model effects and specify one of three distributions: Normal, Binomial, or Poisson.
A script called DOE Simulate is saved to the design table. This script re-opens the Model window, enabling you to re-simulate values or to make changes to the simulated response distribution.
Make selections in the Model window to control the distribution of simulated response values. When you click Apply, a formula for the simulated response values is saved in a new column called <Y> Simulated, where Y is the name of the response. Clicking Apply again updates the formula and values in <Y> Simulated.
Note: You can use Simulate Responses to conduct simulation analyses using the JMP Pro Simulate feature. For information about Simulate and some DOE examples, see the Simulate chapter in the Basic Analysis book.
FFF Optimality Criterion
For the Fast Flexible Filling design method, enables you to select between the MaxPro criterion (the default) and the Centroid criterion. See “FFF Optimality Criterion”.
Number of Starts
Specifies the number of times that the algorithm for the chosen design type initiates to construct a new design. The best design, based on the criterion for the given design type, is returned. Set to 1 by default for all design types. Not used for Fast Flexible Filling Designs.
Advanced Options > Set Average Cluster Size
For the Fast Flexible Filling design method, enables you to specify the average number of randomly generated points used to define each cluster or, equivalently, each design point.
Save Script to Script Window
Creates the script for the design that you specified in the Space Filling Design platform and saves it in an open script window.
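The Simulate Responses and Set Random Seed options described above can be pictured with a small Python sketch. It assumes a main-effects linear model with Normal noise. The actual formula in JMP depends on the model and distribution that you choose in the Model window, and Python's random number generator is not JMP's. The function name and coefficient values are hypothetical.

```python
import random

def simulate_response(design, intercept, coefs, sigma, seed=None):
    """Linear model in the factors plus Normal(0, sigma) noise for each run."""
    rng = random.Random(seed)
    return [
        intercept + sum(b * x for b, x in zip(coefs, run)) + rng.gauss(0.0, sigma)
        for run in design
    ]

design = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
# Fixing the seed makes the simulated values reproducible.
y = simulate_response(design, intercept=10.0, coefs=[2.0, -1.0], sigma=0.5, seed=42)
# With sigma = 0 the response is just the model prediction.
noise_free = simulate_response(design, intercept=10.0, coefs=[2.0, -1.0], sigma=0.0)
```

Calling the function again with the same seed returns the same simulated responses, which parallels entering the saved random seed to reproduce simulated values.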
Sphere-Packing Designs
The Sphere-Packing design method maximizes the minimum distance between pairs of design points. The effect of this maximization is to spread the points out as much as possible inside the design region.
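The maximin criterion can be explored with a crude random search. The Python sketch below is not JMP's optimizer: it assumes a unit-cube design region, perturbs one point at a time, keeps moves that do not reduce the minimum pairwise distance, and returns the best result over several random starts. All names and tuning values are hypothetical.

```python
import itertools
import math
import random

def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))

def maximin_random_search(n_runs, n_factors, n_starts=5, n_steps=2000, seed=0):
    """Crude maximin search: perturb one point at a time, keep improvements."""
    rng = random.Random(seed)
    best, best_d = None, -1.0
    for _ in range(n_starts):
        pts = [[rng.random() for _ in range(n_factors)] for _ in range(n_runs)]
        d = min_pairwise_distance(pts)
        for _ in range(n_steps):
            i = rng.randrange(n_runs)
            old = pts[i]
            # Propose a small move, clipped to the unit cube.
            pts[i] = [min(1.0, max(0.0, x + rng.gauss(0.0, 0.1))) for x in old]
            new_d = min_pairwise_distance(pts)
            if new_d >= d:
                d = new_d          # accept the move
            else:
                pts[i] = old       # reject the move and restore the point
        if d > best_d:
            best, best_d = [list(p) for p in pts], d
    return best, best_d

design, d_min = maximin_random_search(n_runs=8, n_factors=2)
```

A simple search like this is not expected to match the minimum distance that JMP reports, but it shows why different random starts can produce different arrangements of points.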
Creating a Sphere-Packing Design
1. Select DOE > Special Purpose > Space Filling Design.
2. Enter responses and factors.
3. Alter the factor level values, if necessary. For example, Figure 21.6 shows the two existing factors, X1 and X2, with values that range from 0 to 1 (instead of the default –1 to 1).
Figure 21.6 Space-Filling Dialog for Two Factors
4. Click Continue.
5. In the design specification dialog, specify a sample size (Number of Runs). Figure 21.7 shows a sample size of eight.
Figure 21.7 Space-Filling Design Dialog
6. Click Sphere Packing.
JMP creates the design and displays the design runs and the design diagnostics. Figure 21.8 shows the Design Diagnostics panel open with 0.518 as the Minimum Distance. Your results might differ slightly from the ones below, but the minimum distance is the same.
Figure 21.8 Sphere-Packing Design Diagnostics
7. Click Make Table. Use this table to complete the visualization example, described next.
Visualizing the Sphere-Packing Design
To visualize the nature of the Sphere-Packing technique:
Create an overlay plot.
Adjust the plot’s frame size.
Add circles using the minimum distance from the diagnostic report shown in Figure 21.8 as the diameter of the circles.
Example
Using the table you just created, proceed as follows:
1. Select Graph > Overlay Plot.
2. Specify X1 as X and X2 as Y, and then click OK.
3. Adjust the frame size so that the frame is square by right-clicking the plot and selecting Size/Scale > Size to Isometric.
4. Right-click the plot and select Customize. When the Customize panel appears, click the plus sign to see a text edit area and enter the following script:
For Each Row(Circle({:X1, :X2}, 0.518/2))
where 0.518 is the minimum distance number that you noted in the Design Diagnostics panel. This script draws a circle centered at each design point with radius 0.259 (half the diameter, 0.518), as shown on the left in Figure 21.9. This plot shows the efficient way JMP packs the design points.
5. Now repeat the procedure exactly as described in the previous section, but with a sample size of 10 instead of eight.
Remember to change 0.518 in the graphics script to the minimum distance produced by 10 runs. When the plot appears, again set the frame size and create a graphics script using the minimum distance from the diagnostic report as the diameter for the circle. You should see a graph similar to the one on the right in Figure 21.9. Note the irregular nature of the sphere packing. In fact, you can repeat the process a third time to get a slightly different picture because the arrangement is dependent on the random starting point.
Figure 21.9 Sphere-Packing Example with Eight Runs (left) and 10 Runs (right)
Latin Hypercube Designs
In a Latin Hypercube, each factor has as many levels as there are runs in the design. The levels are spaced evenly from the lower bound to the upper bound of the factor. Like the sphere-packing method, the Latin Hypercube method chooses points to maximize the minimum distance between design points, but with a constraint. The constraint maintains the even spacing between factor levels.
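The structure of a Latin Hypercube can be sketched in a few lines of Python. This sketch assumes a simple best-of-many random search rather than JMP's optimization: each factor gets as many evenly spaced levels as there are runs, each column is a random permutation of those levels, and the candidate with the largest minimum distance (on factors scaled to 0 and 1) is kept. The function name and the number of tries are hypothetical.

```python
import itertools
import math
import random

def latin_hypercube(n_runs, bounds, n_tries=200, seed=0):
    """Best-of-n_tries random Latin hypercube under the maximin criterion."""
    rng = random.Random(seed)
    # n_runs evenly spaced levels per factor, from the lower to the upper bound.
    levels = [
        [lo + k * (hi - lo) / (n_runs - 1) for k in range(n_runs)]
        for lo, hi in bounds
    ]
    best, best_d = None, -1.0
    for _ in range(n_tries):
        # Each column is an independent random permutation of that factor's levels.
        cols = [rng.sample(col, n_runs) for col in levels]
        design = list(zip(*cols))
        # Distances are measured on factors scaled to [0, 1].
        scaled = [
            [(x - lo) / (hi - lo) for x, (lo, hi) in zip(run, bounds)]
            for run in design
        ]
        d = min(math.dist(p, q) for p, q in itertools.combinations(scaled, 2))
        if d > best_d:
            best, best_d = design, d
    return best, best_d

# Four factors, each ranging from 1 to 8, with eight runs.
design, d_min = latin_hypercube(8, bounds=[(1, 8)] * 4)
```

With eight runs and bounds of 1 and 8, every factor takes the levels 1 through 8 exactly once, in a different order for each column.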
Creating a Latin Hypercube Design
To use the Latin Hypercube method:
1. Select DOE > Special Purpose > Space Filling Design.
2. Enter responses, if necessary, and factors.
3. Alter the factor level values, if necessary. Figure 21.10 shows adding two factors to the two existing factors and changing their values to 1 and 8 instead of the default –1 and 1.
Figure 21.10 Space-Filling Dialog for Four Factors
4. Click Continue.
5. In the design specification dialog, specify a sample size (Number of Runs). This example uses a sample size of eight.
6. Click Latin Hypercube (see Figure 21.7). Factor settings and design diagnostics results appear similar to those in Figure 21.11, which shows the Latin Hypercube design with four factors and eight runs.
Note: The purpose of this example is to show that each column (factor) is assigned each level only once, and each column is a different permutation of the levels.
Figure 21.11 Latin Hypercube Design for Four Factors and Eight Runs with Eight Levels
Visualizing the Latin Hypercube Design
To visualize the nature of the Latin Hypercube technique:
Create an overlay plot.
Adjust the plot’s frame size.
Add circles using the minimum distance from the diagnostic report as the diameter of the circles.
Example
1. Create another Latin Hypercube design using the default X1 and X2 factors.
2. Be sure to change the factor values so that they are 0 and 1 instead of the default –1 and 1.
3. Click Continue.
4. Specify a sample size of eight (Number of Runs).
5. Click Latin Hypercube. Factor settings and design diagnostics are shown in Figure 21.12.
Figure 21.12 Latin Hypercube Design with Two Factors and Eight Runs
6. Click Make Table.
7. Select Graph > Overlay Plot.
8. Specify X1 as X and X2 as Y, and then click OK.
9. Right-click the plot and select Size/Scale > Size to Isometric to adjust the frame size so that the frame is square.
10. Right-click the plot, select Customize from the menu. In the Customize panel, click the large plus sign to see a text edit area, and enter the following script:
For Each Row(Circle({:X1, :X2}, 0.404/2))
where 0.404 is the minimum distance number that you noted in the Design Diagnostics panel (Figure 21.12). This script draws a circle centered at each design point with radius 0.202 (half the diameter, 0.404), as shown on the left in Figure 21.13. This plot shows the efficient way JMP packs the design points.
11. Repeat the above procedure exactly, but with 10 runs instead of eight (step 5). Remember to change 0.404 in the graphics script to the minimum distance produced by 10 runs.
You should see a graph similar to the one on the right in Figure 21.13. Note the irregular arrangement of the circles. In fact, you can repeat the process to get a slightly different picture because the arrangement is dependent on the random starting point.
Figure 21.13 Comparison of Latin Hypercube Designs with Eight Runs (left) and 10 Runs (right)
Note that the minimum distance between pairs of points in the Latin Hypercube design is smaller than that for the Sphere-Packing design. This is because the Latin Hypercube design constrains the levels of each factor to be evenly spaced. The Sphere-Packing design maximizes the minimum distance without any constraints.
Uniform Designs
The Uniform design minimizes the discrepancy between the design points (empirical uniform distribution) and a theoretical uniform distribution.
Note: These designs are most useful for getting a simple and precise estimate of the integral of an unknown function. The estimate is the average of the observed responses from the experiment.
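The integral estimate mentioned in the note is simply the mean response over the design points. The Python sketch below illustrates this on the unit square. Because JMP's Uniform design algorithm is not reproduced here, it substitutes a Halton sequence, a different and well-known low-discrepancy point set, as a stand-in. The test function and all names are hypothetical.

```python
def radical_inverse(i, base):
    """Van der Corput radical inverse of the positive integer i in the given base."""
    value, frac = 0.0, 1.0 / base
    while i > 0:
        value += (i % base) * frac
        i //= base
        frac /= base
    return value

def halton_points(n):
    """First n points of the 2-D Halton sequence (bases 2 and 3) in the unit square."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]

def integral_estimate(f, points):
    """Average of the responses, which estimates the integral over the unit square."""
    return sum(f(*p) for p in points) / len(points)

points = halton_points(64)
# The true integral of x1 + x2 over the unit square is 1.
estimate = integral_estimate(lambda x1, x2: x1 + x2, points)
```

With 64 points the estimate lands close to the true value of 1. The more evenly the points cover the region, the better this simple average tends to approximate the integral.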
1. Select DOE > Special Purpose > Space Filling Design.
2. Enter responses, if necessary, and factors.
3. Alter the factor level values to 0 and 1.
4. Click Continue.
5. In the design specification dialog, specify a sample size. This example uses a sample size of eight (Number of Runs).
6. Click the Uniform button. JMP creates this design and displays the design runs and the design diagnostics as shown in Figure 21.14.
Note: The emphasis of the Uniform design method is not to spread out the points. The minimum distances in Figure 21.14 vary substantially.
Figure 21.14 Factor Settings and Diagnostics for Uniform Space-Filling Designs with Eight Runs
7. Click Make Table.
A Uniform design does not guarantee even spacing of the factor levels. However, increasing the number of runs and running a distribution on each factor (use Analyze > Distribution) shows flat histograms.
Figure 21.15 Histograms Are Flat for Each Factor When Number of Runs Is Increased to 20
Comparing Sphere-Packing, Latin Hypercube, and Uniform Methods
To compare space-filling design methods, create the Sphere Packing, Latin Hypercube, and Uniform designs, as shown in the previous examples. The Design Diagnostics tables show the values for the factors scaled from zero to one. The minimum distance is based on these scaled values and is the minimum distance from each point to its closest neighbor. The discrepancy value is the integrated difference between the design points and the uniform distribution.
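The minimum distance diagnostic can be reproduced outside JMP with a short Python sketch. The four-run design below is hypothetical, and the helper name `nearest_neighbor_distances` is an assumption made for the example.

```python
import math

def nearest_neighbor_distances(design):
    """For each point, the Euclidean distance to its closest neighbor."""
    return [
        min(math.dist(p, q) for j, q in enumerate(design) if j != i)
        for i, p in enumerate(design)
    ]

# Hypothetical four-run design on factors already scaled to [0, 1].
design = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.6, 0.8)]
distances = nearest_neighbor_distances(design)
d_min = min(distances)

print(distances)    # one value per design point
print(d_min)        # smallest value, set by the pair (0, 1) and (0.6, 0.8)
print(d_min / 2)    # radius of non-overlapping circles around each point
```

Half of the smallest distance is the radius to use when drawing circles around the design points, as in the sphere-packing plots.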
Figure 21.16 shows a comparison of the design diagnostics for three eight-run space-filling designs. Note that the discrepancy for the Uniform design is the smallest (best). The discrepancy for the Sphere-Packing design is the largest (worst). The discrepancy for the Latin Hypercube takes an intermediate value that is closer to the optimal value.
Also note that the minimum distance between pairs of points is largest (best) for the Sphere-Packing method. The Uniform design has pairs of points that are only about half as far apart. The Latin Hypercube design behaves more like the Sphere-Packing design in spreading the points out.
For both spread and discrepancy, the Latin Hypercube design represents a healthy compromise solution.
Figure 21.16 Comparison of Diagnostics for Three Eight-Run Space-Filling Methods
Another point of comparison is the time it takes to compute a design. The Uniform design method requires the most time to compute. Also, the time to compute the design increases rapidly with the number of runs. For comparable problems, all the space-filling design methods take longer to compute than the D-optimal designs in the Custom Designer.
Minimum Potential Designs
The Minimum Potential design spreads points out inside a sphere. To understand how this design is created, imagine the points as electrons with springs attached to every other point, as illustrated in Figure 21.17. The Coulomb force pushes the points apart, but the springs pull them together. The design is the spacing of points that minimizes the potential energy of the system.
Figure 21.17 Minimum Potential Design
Minimum Potential designs:
have spherical symmetry
are nearly orthogonal
have uniform spacing
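The electrons-and-springs picture can be sketched numerically in Python. The exact energy function that JMP minimizes is not given here, so the form below (a spring term d^2 plus a repulsion term 1/d for every pair of points) is an assumption made purely for illustration.

```python
import math
from itertools import combinations

def potential_energy(points):
    """Assumed energy: spring term d^2 plus repulsion term 1/d for every pair."""
    total = 0.0
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        total += d ** 2 + 1.0 / d
    return total

# Two points on a line: scan separations and keep the lowest-energy one.
separations = [0.01 * k for k in range(10, 201)]   # 0.10, 0.11, ..., 2.00
best = min(separations, key=lambda d: potential_energy([(0.0,), (d,)]))
print(best)
```

Under this assumed energy, points that are too close are penalized by the repulsion term and points that are too far apart are penalized by the spring term, so the minimum falls at an intermediate spacing.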
To see a Minimum Potential example:
1. Select DOE > Special Purpose > Space Filling Design.
2. Add 1 continuous factor.
3. Alter the factor level values to 0 and 1, if necessary.
4. Click Continue.
5. In the design specification dialog (shown on the left in Figure 21.18), enter a sample size (Number of Runs). This example uses a sample size of 12.
6. Click the Minimum Potential button. JMP creates this design and displays the design runs and the design diagnostics (shown on the right in Figure 21.18).
Figure 21.18 Space-Filling Methods and Design Diagnostics for Minimum Potential Design
7. Click Make Table.
You can see the spherical symmetry of the Minimum Potential design using the Scatterplot 3D graphics platform.
1. After you make the JMP design table, choose the Graph > Scatterplot 3D command.
2. In the Scatterplot 3D launch dialog, select X1, X2, and X3 as Y, Columns and click OK to see the initial three-dimensional scatterplot of the design points.
3. To see the results similar to those in Figure 21.19:
Select the Normal Contour Ellipsoids option from the menu in the Scatterplot 3D title bar.
Make the points larger. Right-click on the plot and select Settings, and then increase the Marker Size slider.
Now it is easy to see the points spread evenly on the surface of the ellipsoid.
Figure 21.19 Minimum Potential Design Points on Sphere
Maximum Entropy Designs
The Latin Hypercube design is currently the most popular design assuming you are going to analyze the data using a Gaussian-Process model. Computer simulation experts like to use the Latin Hypercube design because all projections onto the coordinate axes are uniform.
However, as the example in Figure 21.20 shows, the Latin Hypercube design does not necessarily do a great job of space filling. This is a two-factor Latin Hypercube with 16 runs and with the factor level settings set between -1 and 1. Note that this design seems to leave a hole in the bottom right of the overlay plot.
Figure 21.20 Two-factor Latin Hypercube Design
The Maximum Entropy design is a competitor to the Latin Hypercube design for computer experiments because it optimizes a measure of the amount of information contained in an experiment. See the technical note below. With the factor levels set between -1 and 1, the two-factor Maximum Entropy design shown in Figure 21.21 covers the region better than the Latin Hypercube design in Figure 21.20. The space-filling property generally improves as the number of runs increases without bound.
Figure 21.21 Two-Factor Maximum Entropy Design
Technical: Maximum Entropy designs maximize the Shannon information (Shewry and Wynn (1987)) of an experiment, assuming that the data come from a normal($\mu$, $\sigma^2 R$) distribution, where
$$R_{ij} = \exp\left(-\sum_{k} \theta_k (x_{ik} - x_{jk})^2\right)$$
is the correlation of response values at two different design points, $x_i$ and $x_j$. Computationally, these designs maximize $|R|$, the determinant of the correlation matrix of the sample. If $x_i$ and $x_j$ are far apart, then $R_{ij}$ approaches zero. If $x_i$ and $x_j$ are close together, then $R_{ij}$ is near one.
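To make the determinant criterion concrete, the following Python sketch compares |R| for a spread-out design and a tightly clustered one. It assumes a Gaussian correlation of the form exp(-theta * squared distance) with a single theta shared by all factors; the two four-point designs and the helper names are hypothetical.

```python
import math

def correlation_matrix(design, theta=1.0):
    """R[i][j] = exp(-theta * squared distance); one shared theta is assumed."""
    return [[math.exp(-theta * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in design] for p in design]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det          # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

spread = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]

det_spread = determinant(correlation_matrix(spread))
det_clustered = determinant(correlation_matrix(clustered))
print(det_spread, det_clustered)
```

The spread design gives a correlation matrix close to the identity, so its determinant is close to one. The clustered design gives nearly identical rows, so its determinant is close to zero.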
Gaussian Process IMSE Optimal Designs
The Gaussian process IMSE optimal design method constructs designs that are suitable for Gaussian process models. Gaussian process models fit a wide variety of surfaces. Gaussian process IMSE optimal designs minimize the integrated mean squared error of the Gaussian process model over the experimental region. The Gaussian process IMSE optimal design method uses a correlation structure similar to that of the kriging model. See Jones and Johnson (2009).
Covariance Parameter Vector
In a Gaussian Process IMSE Optimal Design formulation of the Gaussian process model, the covariance parameter vector determines the correlation structure. There is a Theta for each factor. A theta equal to 0 corresponds to a correlation of 1, causing the fitted surface to be flat in the corresponding factor’s direction. As theta increases, the correlation decreases, allowing the surface to be flexible in the factor’s direction.
In the Covariance Parameter Vector outline, in the list of values under Thetas, you can enter values that reflect your prior knowledge of the surface.
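The effect of theta can be seen in a small Python sketch. It assumes a Gaussian correlation of the form exp(-theta * d^2) in a single factor, where d is the distance between two settings of that factor; the function name `correlation` is hypothetical.

```python
import math

def correlation(distance, theta):
    # Assumed Gaussian correlation in one factor: exp(-theta * distance^2).
    return math.exp(-theta * distance ** 2)

# Correlation between two points half the factor range apart.
for theta in (0.0, 0.5, 2.0, 10.0):
    print(theta, correlation(0.5, theta))
```

A theta of 0 gives a correlation of exactly 1 at any distance, which corresponds to a flat surface in that factor's direction. Larger theta values make the correlation fall off faster with distance.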
Comparison of Gaussian Process IMSE Optimal Design with Latin Hypercube Design
Gaussian process IMSE optimal designs are competitors to the Latin Hypercube design. You can compare the IMSE optimal design to the Latin Hypercube (shown previously in Figure 21.20). The table and overlay plot in Figure 21.22 show a Gaussian IMSE optimal design. You can see that the design provides uniform coverage of the factor region.
Figure 21.22 Comparison of Two-factor Latin Hypercube and Gaussian IMSE Optimal Designs
Note: Both the Maximum Entropy design and the Gaussian Process IMSE Optimal design were created using 100 random starts.
Fast Flexible Filling Designs
Note: If you have Categorical factors or factor constraints, then Fast Flexible Filling is the only Method available.
FFF Optimality Criterion
The algorithms for Fast Flexible Filling designs begin by generating a large number of random points within the specified design region. These points are then clustered using a Fast Ward algorithm into a number of clusters that equals the Number of Runs that you specified.
The final design points can be obtained by using the default MaxPro (maximum projection) optimality criterion or by selecting the Centroid criterion. You can find these options under FFF Optimality Criterion in the report’s red triangle menu.
MaxPro
For p factors and n equal to the specified Number of Runs, the MaxPro criterion strives to find points in the clusters that minimize the following criterion:
$$C_{\mathrm{MaxPro}} = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{1}{\prod_{k=1}^{p} (x_{ik} - x_{jk})^2}$$
The MaxPro criterion maximizes the product of the distances between potential design points in a way that involves all factors. This supports the goal of providing good space-filling properties on projections of factors. See Joseph et al. (2015). The MaxPro option is the default.
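A minimal Python sketch of the criterion follows. It assumes the form given by Joseph et al. (2015): the sum, over all pairs of design points, of the reciprocal of the product over factors of the squared coordinate differences. The two three-run designs are hypothetical.

```python
import math
from itertools import combinations

def maxpro_criterion(design):
    """Sum over point pairs of 1 / product over factors of squared differences."""
    total = 0.0
    for p, q in combinations(design, 2):
        product = 1.0
        for a, b in zip(p, q):
            product *= (a - b) ** 2
        if product == 0.0:
            # Two points share a coordinate, so they coincide in a projection.
            return math.inf
        total += 1.0 / product
    return total

well_projected = [(0.0, 0.5), (0.5, 1.0), (1.0, 0.0)]
nearly_aligned = [(0.0, 0.5), (0.05, 1.0), (1.0, 0.0)]   # two points almost share X1

print(maxpro_criterion(well_projected))
print(maxpro_criterion(nearly_aligned))
```

Smaller values are better. A design in which two points nearly coincide in any single factor is penalized heavily, which is what favors good projections onto subsets of the factors.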
Centroid
This method places a design point at the centroid of each cluster. It has the property that the average distance from an arbitrary point in the design space to its closest neighboring design point is smaller than for other designs.
Note: You can set a preference to always use a given optimality criterion. Select File > Preferences > Platforms > DOE. Check FFF Optimality Criterion and select your preferred criterion.
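The generate, cluster, and pick-a-point sequence with the Centroid criterion can be sketched in Python. This is not JMP's algorithm: plain k-means is used here as a stand-in for the Fast Ward clustering, and the function name, candidate count, and iteration count are assumptions.

```python
import random

def fff_centroid_sketch(n_runs, n_factors, n_candidates=2000, n_iter=20, seed=1):
    """Cluster random candidates and return one centroid per cluster.

    Stand-in: plain k-means replaces the Fast Ward clustering named in the text.
    """
    rng = random.Random(seed)
    candidates = [tuple(rng.random() for _ in range(n_factors))
                  for _ in range(n_candidates)]
    centers = rng.sample(candidates, n_runs)
    for _ in range(n_iter):
        clusters = [[] for _ in range(n_runs)]
        for point in candidates:
            nearest = min(range(n_runs),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(point, centers[c])))
            clusters[nearest].append(point)
        # Move each center to its cluster centroid (keep it if the cluster is empty).
        centers = [tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                   if cluster else centers[i]
                   for i, cluster in enumerate(clusters)]
    return centers

design = fff_centroid_sketch(n_runs=8, n_factors=2)
for point in design:
    print(tuple(round(x, 3) for x in point))
```

Because each design point is the centroid of the candidates nearest to it, any location in the region tends to have a design point close by.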
Categorical Factors
When you have categorical factors, the algorithm proceeds as follows:
The total number of design points is balanced across the total number of combinations of levels of the categorical factors. Suppose that there are m combinations of levels and that k design points are allocated to each of these.
A large number of points within the design space defined by the continuous variables is generated. These are grouped into k primary clusters.
Each of the k primary clusters of points is further clustered into m sub-clusters.
Within each primary cluster, a design point is calculated for each of the m sub-clusters using the specified FFF optimality criterion.
For each of the k primary clusters, one of the m combinations of levels is randomly assigned to each of the m sub-cluster design points. This yields a total of km design points.
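The steps above can be sketched in Python under simplifying assumptions: two continuous factors on [0, 1], plain k-means in place of Fast Ward clustering, and the Centroid criterion for choosing each design point. All function names are hypothetical.

```python
import random
from itertools import product

def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, rng, n_iter=15):
    """Plain k-means (a stand-in for Fast Ward); returns (centers, clusters)."""
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        centers = [centroid(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

def fff_categorical_sketch(levels, k, n_candidates=3000, seed=2):
    """levels: one list of level names per categorical factor; k: runs per combination."""
    rng = random.Random(seed)
    combos = list(product(*levels))            # the m combinations of levels
    m = len(combos)
    candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]
    _, primary = kmeans(candidates, k, rng)    # k primary clusters
    design = []
    for cluster in primary:
        sub_centers, _ = kmeans(cluster, m, rng)   # one point per sub-cluster
        shuffled = combos[:]
        rng.shuffle(shuffled)                      # random assignment of combinations
        design.extend(pt + combo for pt, combo in zip(sub_centers, shuffled))
    return design

design = fff_categorical_sketch([["A", "B"], ["low", "high"]], k=3)
print(len(design))   # k * m = 3 * 4 = 12 runs
```

Each combination of categorical levels appears once in every primary cluster, so the combinations are balanced and each one is spread across the continuous region.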
Set Average Cluster Size
The Set Average Cluster Size option is found under Advanced Options in the Space Filling Design red triangle menu. This option enables you to specify the average number of randomly generated points used to define each cluster or, equivalently, each design point.
By default, if the Number of Runs is set to 200 or less, a total of 10,000 randomly generated points are used as the basis for the clustering algorithm. When the Number of Runs exceeds 200, a default average cluster size of 50 is used. Increasing this value can be particularly useful in designs with a large number of factors or where disallowed combinations restrict the distribution of points used in the clustering algorithm.
Note: Depending on the number of factors and the specified value for Number of Runs, you might want to increase the average number of initial points per design point by selecting Advanced Options > Set Average Cluster Size.
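The relationship between Number of Runs, average cluster size, and the total number of random points can be written as a short Python sketch. It assumes that the default of 50 for more than 200 runs is an average cluster size (points per design point); the function names are hypothetical.

```python
def default_candidate_count(n_runs):
    """Total random points used by default, following the rule described above."""
    if n_runs <= 200:
        return 10_000
    return 50 * n_runs   # assumed: 50 random points per design point

def candidate_count(n_runs, average_cluster_size):
    """Total random points when an average cluster size is specified."""
    return n_runs * average_cluster_size

print(default_candidate_count(100))    # 10,000 points, about 100 per design point
print(default_candidate_count(400))    # 50 points per design point
print(candidate_count(400, 150))       # after raising the average cluster size
```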
Constraints
Once you complete the Factors outline, click Continue. The Define Factor Constraints outline appears. Use this outline to restrict the design region. For details about the outline, see “Define Factor Constraints”.
You can use the Use Disallowed Combinations Filter and Use Disallowed Combinations Script options to specify disallowed factor level combinations. Or, you can use the Specify Linear Constraints option to specify bounds in terms of linear inequalities. However, the design is generated differently for these two methods.
Use Disallowed Combinations Filter and Use Disallowed Combinations Script
When disallowed combinations are specified, the random points that form the basis for the clustering algorithm are randomly distributed within the unconstrained design region. Then disallowed points are removed and clustering proceeds with the remaining points.
Note: Depending on the nature of the constraints and the specified Number of Runs, the default coverage of the unconstrained design space by the initial randomly generated points might not be sufficient to produce the required Number of Runs. In this case, you might obtain a JMP Alert indicating that the algorithm “Could not find sufficient number of points.” To increase the initial number of points that form the basis for the clustering algorithm, specify a larger average number of initial points per design point by selecting Advanced Options > Set Average Cluster Size. See “Set Average Cluster Size”.
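The following Python sketch shows why a restrictive set of disallowed combinations can leave too few points for clustering. The restriction used here (only a small corner of the unit square is allowed) and the function names are hypothetical.

```python
import random

def surviving_candidates(n_candidates, is_disallowed, seed=3):
    """Generate points in the unconstrained unit square, then drop disallowed ones."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_candidates)]
    return [p for p in points if not is_disallowed(p)]

# Hypothetical restriction: everything outside a small corner is disallowed.
def is_disallowed(point):
    x1, x2 = point
    return x1 > 0.1 or x2 > 0.1

n_runs = 50
for n_candidates in (1_000, 20_000):
    kept = surviving_candidates(n_candidates, is_disallowed)
    status = "enough" if len(kept) >= n_runs else "too few"
    print(n_candidates, len(kept), status)
```

Only about 1% of the unconstrained region is allowed in this example, so a small initial set leaves fewer surviving points than runs. A larger initial set, which is what a larger average cluster size provides, leaves enough points to cluster.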
Specify Linear Constraints
When you use the Specify Linear Constraints option, the random points that form the basis for the clustering algorithm are randomly distributed within the constrained design region. The clustering algorithm uses these points.
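For contrast with the filtering approach, the sketch below generates points directly inside a linearly constrained region, so none are discarded. The method JMP uses is not described here; this example assumes a hypothetical constraint x1 + x2 <= 0.8 on two factors in [0, 1], and uses a standard reflection trick for uniform sampling in a triangle.

```python
import random

def sample_constrained(n_points, bound=0.8, seed=4):
    """Uniform points in the triangle x1 >= 0, x2 >= 0, x1 + x2 <= bound."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        u, v = rng.random(), rng.random()
        if u + v > 1.0:
            # Reflect into the lower-left half of the unit square.
            u, v = 1.0 - u, 1.0 - v
        points.append((bound * u, bound * v))
    return points

candidates = sample_constrained(10_000)
print(len(candidates))
print(max(x1 + x2 for x1, x2 in candidates))   # never exceeds the bound
```

Every generated point is usable, so the number of points available to the clustering step equals the number requested.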
Creating and Viewing a Constrained Fast Flexible Filling Design
Constructing the Design
1. Select DOE > Special Purpose > Space Filling Design.
2. Enter Values of 0 and 1 for both X1 and X2.
3. Click Continue.
4. In the Define Factor Constraints outline, select Specify Linear Constraints.
Notice that Fast Flexible Filling is the only available Space Filling Design Method.
5. Select Add.
6. Enter the following coefficients and bound:
1 for X1
1 for X2
0.8 for the bound
Figure 21.23 Linear Constraint
7. Type 200 next to Number of Runs.
8. Select Fast Flexible Filling.
JMP creates a design that satisfies the constraints. Open the Design outline to view the design.
9. Select Make Table to construct the data table.
Constructing the Plot
1. With the data table active, select Graph > Graph Builder.
2. Drag X1 to the drop zone labeled X.
3. Drag X2 to the drop zone labeled Y.
4. Remove the Smoother by clicking the smoother icon.
5. In the Graph Builder red triangle menu, click Show Control Panel to deselect it.
You should see a graph similar to the one in Figure 21.24. Note that the points satisfy the linear constraint $X_1 + X_2 \le 0.8$.
Figure 21.24 Fast Flexible Filling Design with One Linear Constraint
Borehole Model: A Sphere-Packing Example
Worley (1987) presented a model of the flow of water through a borehole that is drilled from the ground surface through two aquifers. The response variable y is the flow rate through the borehole in m³/year and is determined by the following equation:
$$y = \frac{2\pi T_u (H_u - H_l)}{\ln(r/r_w)\left[1 + \dfrac{2 L T_u}{\ln(r/r_w)\, r_w^2 K_w} + \dfrac{T_u}{T_l}\right]}$$
There are eight inputs to this model:
rw = radius of borehole, 0.05 to 0.15 m
r = radius of influence, 100 to 50,000 m
Tu = transmissivity of upper aquifer, 63,070 to 115,600 m²/year
Hu = potentiometric head of upper aquifer, 990 to 1100 m
Tl = transmissivity of lower aquifer, 63.1 to 116 m²/year
Hl = potentiometric head of lower aquifer, 700 to 820 m
L = length of borehole, 1120 to 1680 m
Kw = hydraulic conductivity of borehole, 9855 to 12,045 m/year
This example is atypical of most computer experiments because the response can be expressed as a simple, explicit function of the input variables. However, this simplicity is useful for explaining the design methods.
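Because the response is an explicit function, it is easy to evaluate directly. The Python sketch below assumes the commonly cited form of the borehole function and evaluates it at the midpoints of the input ranges listed above; the function and argument names are chosen for the example.

```python
import math

def borehole(rw, r, Tu, Hu, Tl, Hl, L, Kw):
    """Flow rate through the borehole in m^3/year (commonly cited form, assumed)."""
    log_ratio = math.log(r / rw)
    numerator = 2.0 * math.pi * Tu * (Hu - Hl)
    denominator = log_ratio * (
        1.0 + 2.0 * L * Tu / (log_ratio * rw ** 2 * Kw) + Tu / Tl
    )
    return numerator / denominator

# Midpoints of the input ranges listed above.
y_mid = borehole(rw=0.10, r=25_050.0, Tu=89_335.0, Hu=1_045.0,
                 Tl=89.55, Hl=760.0, L=1_400.0, Kw=10_950.0)
print(y_mid)
```

In this form the flow rate is proportional to the head difference Hu - Hl, so doubling that difference with the other inputs fixed doubles the response.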
Create the Sphere-Packing Design for the Borehole Data
To create a Sphere-Packing design for the borehole problem:
1. Select DOE > Special Purpose > Space Filling Design.
2. Click the red triangle icon on the Space Filling Design title bar and select Load Factors.
3. Select Help > Sample Data Library and open Design Experiment/Borehole Factors.jmp (Figure 21.25).
Figure 21.25 Factors Panel with Factor Values Loaded for Borehole Example
Note: The logarithms of r and rw are used in the following discussion.
4. Click Continue.
5. Specify a sample size (Number of Runs) of 32 as shown in Figure 21.26.
Figure 21.26 Space-Filling Design Method Panel Showing 32 Runs
6. Click the Sphere Packing button to produce the design.
7. Click Make Table to make a table showing the design settings for the experiment.
To see a completed data table for this example, select Help > Sample Data Library and open Design Experiment/Borehole Sphere Packing.jmp. Because the designs are generated from a random seed, the settings that you obtain will differ from those shown in the completed table.
The Borehole Sphere Packing.jmp data table contains a Fit Model script that you can use to analyze the data. Columns containing the true model, the prediction formula, and the prediction bias are included in the data table.
Guidelines for the Analysis of Deterministic Data
It is important to remember that deterministic data have no random component. As a result, p-values from fitted statistical models do not have their usual meanings. A large F statistic (low p-value) is an indication of an effect due to a model term. However, you cannot make valid confidence intervals about the size of the effects or about predictions made using the model.
Residuals from any model fit to deterministic data are not a measure of noise. Instead, a residual shows the model bias for the current model at the current point. Distinct patterns in the residuals indicate new terms to add to the model to reduce model bias.
Results of the Borehole Experiment
The example described in the previous sections produced the following results:
A stepwise regression of the response, log y, versus the full quadratic model in the eight factors, led to the prediction formula column.
The prediction bias column is the difference between the true model column and the prediction formula column.
The prediction bias is relatively small for each of the experimental points. This indicates that the model fits the data well.
In real world examples, the true model is generally not available in a simple analytical form. As a result, it is impossible to know the prediction bias at points other than the observed data without doing additional runs.
In this case, the true model column contains a formula that allows profiling the prediction bias to find its value anywhere in the region of the data. To understand the prediction bias in this example:
1. Select Graph > Profiler.
2. Highlight the prediction bias column and click the Y, Prediction Formula button.
3. Check the Expand Intermediate Formulas box, shown at the bottom of the Profiler dialog in Figure 21.27.
The prediction bias formula is a function of columns that are also created by formulas.
Figure 21.27 Profiler Dialog for Borehole Sphere-Packing Data
4. Click OK.
The profile plots in Figure 21.28 show the prediction bias at the center of the design region. If there were no bias, the profile traces would be constant across the range of each factor. In this example, the variables Hu and Hl show nonlinear effects.
Figure 21.28 Profiler for Prediction Bias for Borehole Sphere-Packing Data
The range of the prediction bias on the data is smaller than the range of the prediction bias over the entire domain of interest. To see this, look at the distribution analysis (Analyze > Distribution) of the prediction bias in Figure 21.29. Note that the maximum bias is 1.826 and the minimum is –0.684 (the range is 2.51).
Figure 21.29 Distribution of the Prediction Bias
The top plot in Figure 21.30 shows the maximum bias (2.91) over the entire domain of the factors. The plot at the bottom shows the comparable minimum bias (–4.84). This gives a range of 7.75. This is more than three times the size of the range over the observed data.
Figure 21.30 Prediction Plots Showing Maximum and Minimum Bias over Factor Domains
Keep in mind that, in this example, the true model is known. In any meaningful application, the response at any factor setting is unknown. The prediction bias over the experimental data underestimates the bias throughout the design domain.
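The same effect can be reproduced on a toy problem. The Python sketch below uses a hypothetical one-factor deterministic response and a straight-line empirical model, neither of which comes from the borehole example, and compares the range of the bias at the design points with its range over the whole factor domain.

```python
import math

def true_model(x):
    return math.exp(2.0 * x)          # hypothetical deterministic response

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def bias_range(xs, b0, b1):
    """Range (max minus min) of true response minus prediction over xs."""
    bias = [true_model(x) - (b0 + b1 * x) for x in xs]
    return max(bias) - min(bias)

design = [0.1, 0.3, 0.5, 0.7, 0.9]
b0, b1 = fit_line(design, [true_model(x) for x in design])
grid = [i / 200 for i in range(201)]          # dense grid over the domain [0, 1]

range_on_design = bias_range(design, b0, b1)
range_on_domain = bias_range(grid, b0, b1)
print(range_on_design, range_on_domain)
```

The bias range over the dense grid is larger than the range at the design points, so judging the model only at the observed runs understates its bias.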
There are two ways to assess the extent of this underestimation:
Cross validation refits the model to the data while holding back a subset of the points, and then looks at the error in predicting the held-back points.
Verification runs are new runs performed at different settings to assess the lack of fit of the empirical model.
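Cross validation can be sketched in Python with a leave-one-out scheme, in which each run is held back in turn. The one-factor response and the straight-line model are hypothetical and stand in for the real simulation and empirical model.

```python
import math

def true_model(x):
    return math.exp(2.0 * x)   # hypothetical deterministic response

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def leave_one_out_errors(xs, ys):
    """Prediction error at each run when that run is excluded from the fit."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b0, b1 = fit_line(train_x, train_y)
        errors.append(ys[i] - (b0 + b1 * xs[i]))
    return errors

design = [0.1, 0.3, 0.5, 0.7, 0.9]
responses = [true_model(x) for x in design]

b0, b1 = fit_line(design, responses)
residuals = [y - (b0 + b1 * x) for x, y in zip(design, responses)]
loo_errors = leave_one_out_errors(design, responses)

print(max(abs(e) for e in residuals))    # bias seen on the fitted runs
print(max(abs(e) for e in loo_errors))   # bias seen on held-back runs
```

The held-back errors are larger than the ordinary residuals, which gives a more realistic picture of the bias at settings that were not part of the fit.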