Mark J. Anderson, Patrick J. Whitcomb, Stat-Ease, Minneapolis, MN USA
Response surface methods (RSM) provide statistically validated predictive models that can be manipulated for finding optimal process configurations that exhibit minimal variability.
Response surface methods (RSM) provide statistical tools for design and analysis of experiments aimed at process optimization. At the final stages of process development, RSM illuminates the sweet spot where high yield of in-specification products can be achieved at lowest possible cost. It produces statistically-validated predictive models and, with the aid of specialized software, response surface maps that point the way to pinnacles of process performance. This article introduces two RSM enhancements that focus on achieving robust operating conditions.
Response surface methods (RSM) are powerful optimization tools in the arsenal of statistical design of experiments (DOE). Before employing RSM, process engineers should take full advantage of a far simpler tool for DOE — two-level factorials, which can be very effective for screening the vital few factors (including interactions) from the trivial many that have no significant impact. (For details, see DOE Simplified .) Assuming the potential for further financial gain, it’s best to follow up screening studies by doing an in-depth investigation of the surviving factors via RSM, then generate a response surface map and move the process to the optimum location. This article provides a brief introduction to RSM. (For a complete primer, read RSM Simplified .)
Elementary RSM: one process factor
To illustrate the elements of response surface methods, we present a very simple study that involves only one factor — cure temperature — and its effect on the ultimate shear strength of a polymer. The data are loosely derived from a problem presented in a standard textbook on RSM . Table 1 shows the experimental design in a convenient layout that sorts the “X” variable (input) by level. The actual run order for experiments like this should always be randomized to counteract any time-related effects due to ambient conditions, etc.
Table 1: One-factor RSM design on a curing process.
This RSM design on one factor, generated with the aid of statistical software developed for this purpose  provides seven levels of temperature, with three of them replicated — the two extremes (#1-2 and #11-12) — twice each, and the center point (5-8) — four times over. This provides a total of 5 measures, or degrees of freedom, for pure error. Note that repeated measures or re-sampling from a given run will provide more stable averaged results, but only a complete re-run, for example — recharging a vessel, bringing it up to temperature and so forth, will suit for measuring overall process/sample/test variation. In general, the minimum requirement for an RSM design is that each factor be tested at three levels over a continuous scale. Additional levels provide for a statistical test on lack of fit measured against the pure error obtained via replications of one or more design points.
There is no significant lack of fit in this case as one can infer by inspection of Fig. 1 — the response surface for ultimate shear strength of material cured at varying temperatures. The dotted lines represent the 95% confidence band on the mean prediction for any given factor level.
Fig. 1: Response surface of ultimate shear versus cure temperature.
This curve was created from the following second-order polynomial model, called a “quadratic,” via least squares regression:
Ŷ = 808.77 – 250.45 X – 328.58 X2
The experiment design (Table 1) provides sufficient input levels to fit a third-order (cubic) term — X3. However, statistics show no significant improvement to the model’s predictive capability, thus there will be no advantage to making it cubic — only complication. When modeling data, it is best to keep things as simple as possible by a statistical principle called “parsimony.”
The “hat” over the response (output) variable “Y” indicates that this is a predicted value. The coefficients are based on coded values of X (the input variable) scaled from –1 to +1 over the range tested (280°F–315°F). Coded models, a standard practice for RSM, facilitate comparison of coefficients, which becomes more useful with multiple factors, as will be seen in the next example. It pays immediate dividends for predicting the ultimate shear strength at the center point value for cure temperature of 297.5°F: Simply plug in 0 for X, which leaves the model intercept of 808.77 as the expected outcome for ultimate shear in units of pounds/sq. in (psi).
Of much greater interest for predictive purposes is the location of the maximum shear strength. For a single response-measure, the polynomial model lends itself to simple calculus. However, numerical search algorithms, such as simplex hill-climbing, work better in general and they can be done quickly with the aid of computers. In this case, the optimal cure temperature is found at 290.8°F (-0.381 coded) at which the ultimate shear strength reaches its peak at 856.5psi. To convey the uncertainty of a point estimate derived by modeling sample data from a particular experiment, it helps to provide its associated prediction interval (PI), in this case: 799–914psi (p=0.05 or 95%). Note that the PI will always be wider than the confidence interval (CI) on the mean prediction. As a practical matter for the engineer or scientist, reporting the PI will lessen any unrealistic expectations of confirming precisely the value predicted in a one-shot follow-up test.
Calculating propagation of error
Propagation of error (POE) measures the variation transmitted from input factors to the response as a function of the shape of the surface. It facilitates finding the flats — stable spots to locate your process, for example a high plateau of yield. For example, in Fig. 2 you can see how a constant 5° variation in cure temperature creates a very small response (ultimate shear) variation at the mid-range area “A,” but when the set point is at the high end of the scale (“B”), the variation in ultimate shear becomes very large.
Fig. 2: Variation transmitted via the response surface.
The formula for POE, which involves the application of partial derivatives of the function (δf) with respect to the individual factors (Xi), is:
As a convenience to process engineers, this calculation produces an estimate of standard deviation in original units of their response measure. However, for statistical purposes, it’s best to work in terms of variance (σ2), where the σ represents the standard deviation of the predicted response Y-hat, the input factors X and the unexplained residual (error); respectively in the equation.
As noted already, calculus comes into play in the POE equation with the partial derivative of the model function taken with respect to each of the individual inputs (Xi) expressed in actual units, which can be derived by reversing the –1/+1 coding. (Keep in mind that the standard deviations of the input factors (X) are expressed in actual units.)
These calculations become clearer by example. In this case, the actual equation for predicting ultimate shear strength is:
Ŷ = -89892 + 624.06 X – 1.0729 X2
(Actual units of experimental temperatures were in the hundreds of degrees, which become quite large when squared, hence the small coefficient for this term. This exemplifies why, as was discussed earlier, the coded equation serves better for interpretation.)
Assume for the curing process that temperature can be controlled only to within 2.5°F of standard deviation. The residual standard deviation comes from an analysis of variance (ANOVA) done in conjunction with the fitting of the model — it is 23.72psi.
With some further number-crunching, this equation now serves to produce the picture, shown in Fig. 3, of the error transmitted via the surface from the variation in the model input — the temperature of this curing process. The minimum POE occurs at ~290°F — where the shear strength peaks, which is fortuitous.
This simple example provides the basics of RSM enhanced by application of POE. The next case adds another element helpful for robust process design — a second, dual, response: A measure of variation at each experimental setup (run).
Fig. 3: POE surface for cure-process.
RSM on several key factors
Semiconductor manufacturing engineers  desired a more robust result for resistivity (the response output “Y”) as a function of three key factors (the input “X”s) known to affect their single-wafer etching process: a) gas flow rate, b) temperature, and c) pressure.
Other variables, for example, radio frequency (RF) power, could not be controlled very precisely. To measure the resulting variation over time, batches of wafers were collected over 11 different days from each of 17 runs in a central composite design (CCD). The process engineers hoped to hit a target resistivity of 350Ω-cm with minimal variation.
The CCD is a popular template for RSM because it requires only a fraction of all the possible combinations from a full three-level factorial (for details on CCD see [2, 3]). Figure 4 shows the CCD structure for three factors.
Fig. 4: Central composite design on three factors.
The star points project from the center point of the cubical two-level factorial. They are located a prescribed distance along the three main factor axes as shown in Table 2, which list factor levels in coded units (the experimenters kept the actual levels proprietary). For example, the star point projecting out to the right on Fig. 4, identified by number 9 in Table 2, is located 1.68 units from the center (coded 0). To clarify the implications of this design geometry for the experiment, let’s say that the current setting of a factor is 100 and the factorial range will be ±10. Then the upper star point for the three-factor CCD would be set at 116.8 (and the lower star an equal interval below the center point at 100). These statistically-desirable distances increase as the number of factors goes up. However, the model-fit will be reliable only within the factorial ‘box.’
Table 2: Design matrix for RSM on single-wafer etching process.
The CCD template calls for replication of the center point a number of times, ideally six for the best predictive properties in the middle region of experimentation. However, these experimenters ran only four center points — still not bad. The actual run order, including center points, should always be done at random. Otherwise, the effects will become biased by time-related lurking variables such as the RF, thus confounding true cause-and-effect relationships.
Modeling the mean and process variance
By collecting repeated samples for each run, experimenters can model both the mean (average) and variance (or standard deviation). This enables the following tactics for process optimization:
– From the mean response, find factor settings that meet the targeted response; and
– Use the statistics on variation to achieve operating conditions that are robust to uncontrolled (noise) variables.
Ideally, the responses measured during the course of any given run will be representative of the long-term process variability. For example, the values for mean and standard deviation in Table 2 are derived from nearly a dozen daily batches over several weeks on the line. However, as few as three samples per experimental run can suffice for this dual response approach. However, no matter what the sample size (n), if the study conditions are not representative of true manufacturing conditions, this method may underestimate the overall variation.
To re-set the stage for this case, here is the experimenters’ purpose statement “…Wafers produced on any given day (i.e., within the same batch) may be different than wafers produced on another day… Variation due to time is designed into the experimentation process by using test wafers chosen at random across several days… It may be possible to minimize… [this] …variation… by manipulating the… control variables.” An engineer from a major chipmaker told one of the authors (Mark) that variations from batch-to-batch can be a “huge” problem in semiconductor manufacturing .
Least-squares regression of the Table 2 data produced these coded predictive models, for mean resistivity and standard deviation:
– Mean = 255.71 + 23.69A – 49.06B – 35.14C – 25.54AC – 16.57B2 + 27.75C2
(p<0.0001, Adjusted R2 0.84)
– Log10 Std Dev = 1.82 – 0.077B + 0.012C + 0.18C2
(p<0.0001, Adjusted R2 0.76)
Both models are quadratic, i.e., second-order polynomials, and they are highly-significant statistically as indicated by their low “p” values and high adjusted R-squared values. The standard deviation has been transformed via a logarithm, which is standard practice for statistical reasons. To keep them simple, these models were reduced by backward regression at p of 0.10. Keep in mind that these predictive models are strictly empirical — constructed only to provide an adequate approximation of the true response surface.
Notice in the model for the mean that it includes squared terms for B and C, but not factor A. Thus, one can infer that the response surface will be less ‘curvy’ along the A dimension. The “perturbation” plot shown in Fig. 5 illustrates this by the straight line for A. This plot originates from the center point of the experimental region and from there it measures the response in each of the three dimensional axes.
Fig. 5: Perturbation plot.
Figure 6 shows the contour plot for factors B and C with A set at its +1 (high factorial) level. (Recall that it’s best to stay within the ‘box’ of factorial settings in the CCD — do not extrapolate to the axial levels — 1.68 coded units in this case.)
Fig. 6: Contour plot of temperature vs pressure with gas flow (A) at +1 level.
The contour for the targeted resistance of 350 cuts through a region where pressure is relatively low, but the range of possible temperatures is fairly broad. Before choosing a specific setup, the POE can be taken into account to minimize manufacturing variation caused by variability in the control-factor settings. However, at this stage the actual factor levels, –1 to +1, and their standard deviations (in parentheses) must be detailed. For illustrative purposes, assume these are:
— 30 to 40 (1.0) sccm *[26.591 to 43.409]
— 30 to 50 (1.0) °C *[23.1821 to 56.8179]
— 80 to 120 (3.0) mTorr *(66.3641 to 133.636)
[The ranges shown in asterisked brackets represent the axial star points that protrude outside the factorial box of the central composite design.] These factor settings produce the following actual predictive equation (rounded):
Mean Resistivity = -3.71 + 30.3 A + 8.35 B – 6.69 C – 0.255 AC – 0.166 B2 + 0.0694 C2
Specialized DOE software  performed the necessary calculations to produce the POE surface displayed by Fig. 7b. For comparison’s sake, the resistivity mean model graph is shown in Fig. 7a. (Both graphs were generated with factor A fixed at its +1 level.) Now you can see how the POE finds the flats — the regions where process response remains most robust to factor variations.
Fig. 7: Surfaces of resistivity mean (a) and POE (b).
The last puzzle piece for determining where to set up the single-wafer etching process is the view of measured variation caused by batch-to-batch differences (Fig. 8).
The least variation occurs at relatively high temperature and mid-pressure. The gas flow causes little or no difference in standard deviation, which may be helpful for making the required tradeoffs — a compromise of meeting product specifications, while maintaining them from batch-to-batch in spite of control-factor variations. For example, if setting temperature and pressure for reduction of variation causes the resistivity to go off target, perhaps the gas flow can be adjusted to get the process outback back in specification.
Fig. 8: Response surface of resistivity standard deviation.
Trade-off between performance/robustness
To determine the most desirable combination of responses, RSM practitioners [7,8] typically establish this objective function:
In this equation the overall desirability, D, is computed by multiplying the individual desirabilities for each response, all of which are scaled from 0 to 1. Figure 9(a) shows how this is done for a targeted response such as resistance in this case. The goal of minimize, desired for POE and standard deviation, is pictured in Fig. 9b.
Fig. 9: Desirability scales for target (a) and minimization (b).
Figure 10 shows the results of a computer  search of the factorial region of the modeled process space for the most desirable setup based on goals of meeting the product specification of 350 (±10), while simultaneously minimizing POE and batch-to-batch deviations.
Fig. 10: Most desirable process settings.
The top row depicts the recommended factor settings that produce the predicted responses in the second row. (Notice, for example, that the resistivity hits the targeted spot for maximum desirability.) Figure 11 presents a view of the desirability surface.
Fig. 11: 3D view of desirable combinations of temperature vs pressure (gas flow set at +1 level).
The ideal setup coordinate (A,B,C) for meeting the specification with the least variation is (1,-1,-0.5). The authors of this original case study  recommended coordinate (1.18, -0.80, -0.57), which extrapolates factor A (gas flow) beyond the factorial region. We were more conservative. Nevertheless, the results do not differ appreciably. Follow-up runs are always recommended to put predictions to the test.
Response surface methods (RSM) provide statistically-validated predictive models that can then be manipulated for finding optimal process configurations. Variation transmitted to responses from poorly-controlled process factors can be accounted for by the mathematical technique of propagation of error (POE), which facilitates ‘finding the flats’ on the surfaces generated by RSM. The dual response approach to RSM captures the standard deviation of the output(s) as well as the average. It accounts for unknown sources of variation. Dual response plus POE provides a more useful model of overall response variation. The end-result of applying these statistical tools for design and analysis of experiments will be in-specification products that exhibit minimal variability — the ultimate objective of robust design.
Design-Expert is a registered trademark of Stat-Ease.
Mark J. Anderson received his BS in chemical engineering and MBA from the U. of Minnesota and is a Principal at Stat-Ease, Inc., 2021 East Hennepin Avenue, Suite 480, Minneapolis, MN 55082, USA; ph.: 612-746-2032; email firstname.lastname@example.org.
Patrick J. Whitcomb received his BS and MS in chemical engineering from the U. of Minnesota and is a Principal at Stat-Ease, Inc.
1. M. J. Anderson, P.J. Whitcomb, DOE Simplified — Practical Tools for Effective Experimentation, 2 nd Edition, Productivity, Inc., New York NY, 2007.
2. M. J. Anderson, P.J. Whitcomb, RSM Simplified — Optimizing Processes Using Response Surface Methods for Design of Experiments, Productivity, Inc., New York NY, 2005.
3. R. H. Myers, D. C. Montgomery, Response Surface Methodology, problem 2.21, John Wiley and Sons, Inc., New York, NY, 2002.
4. T. J. Helseth, et al., “Design-Expert Version 7.1″ software for Windows, Stat-Ease, Inc., Minneapolis.
5. T. J. Robinson, S. S. Wulff, D. C. Montgomery, “Robust Parameter Design Using Generalized Linear Mixed Models,” Jour. of Quality Tech., Vol. 38, No. 1, p. 70, Table 1, Jan. 2006.
6. Private conversation, ISMI (International SEMATECH Manufacturing Initiative) Symposium on Manufacturing Effectiveness, Austin, TX, October, 2007.
7. G. C. Derringer, “A Balancing Act: Optimizing a Product’s Properties,” Quality Progress, pp. 51-58 (posted with publishers permission at www.statease.com/pubs/derringer.pdf), June 1994.
8. “Multiple responses: The desirability approach,” section 188.8.131.52.2., “NIST/SEMATECH e-Handbook of Statistical Methods,” http://www.itl.nist.gov/div898/handbook/.