FUZZY LOGIC BASED REGRESSION MODELS FOR ELECTRONICS MANUFACTURING APPLICATIONS

Brian Schaible and Y. C. Lee Department of Mechanical Engineering University of Colorado Boulder, CO 80309-0427

ABSTRACT In recent years, much attention has been focused on the ability of fuzzy logic to solve problems that were previously assumed difficult to solve. These include, for example, the use of fuzzy logic for control and pattern recognition. However, the application of fuzzy logic to the establishment of correlations between independent and dependent process variables, i.e. regression modeling, has received less attention. Fuzzy logic based regression models have proven capable of accurately representing a wide variety of processes, from the type usually modeled with traditional methods such as polynomial regression, to very complex multi-dimensional relationships often modeled with neural networks. The fuzzy models can be used for a variety of tasks such as gaining an increased understanding of the process itself or determining the input settings for optimum process operation. Often, the nature of the fuzzy model results in some saving or benefit when compared with other modeling techniques. In this paper, we will describe a fuzzy logic regression modeling technique (often termed “TSK fuzzy reasoning”). We will then detail some particular electronics manufacturing applications for which the fuzzy logic based regression models have proven useful. These applications include the optimization of a simulated vertical CVD process and the use of specialized fuzzy logic based regression models for ranking process effects (ANOVA).

Keywords: Fuzzy Logic, Fuzzy Systems, Nonlinear Systems, Optimization, Modeling.

INTRODUCTION

In recent years, much attention has been focused on the ability of fuzzy logic to solve problems that were previously assumed difficult to solve. These include, for example, the use of fuzzy logic for control and pattern recognition. However, the application of fuzzy logic to the establishment of correlations between independent and dependent process variables, i.e. regression modeling, has received less attention. In this paper we first outline the fuzzy logic based regression modeling technique that has proven to be very capable of modeling highly nonlinear, multi-dimensional processes. This technique is sometimes referred to as “TSK fuzzy reasoning,” the “TSK” representing the initials of the three people who pioneered its use. Unlike many traditional modeling methods that typically represent an input-output relationship with a single equation that applies globally, the fuzzy modeling technique represents the correlation with numerous “local” equations that are combined to globally represent the process (Figure 1). In some ways, these models are similar to piecewise polynomial regression models with one important exception - the transition from one “piece” of the model to the next is not crisp. Instead, the “pieces,” or local equations, overlap and the transitions between them are defined by the membership functions which are common to all fuzzy systems. Fuzzy rules of the type proposed by Takagi and Sugeno (1985) allow the local equations to be introduced into the fuzzy system. An efficient structure identification algorithm (Tan, et al., 1995) ensures that the fuzzy model contains the appropriate fuzzy rules so that the predictive capability of the model is maximized.

Figure 1. Local/global nature of the fuzzy logic regression models.

These fuzzy models can accurately represent a wide variety of processes, from the type usually modeled with traditional methods such as polynomial regression, to very complex multi-dimensional relationships often modeled with neural networks. For example, Figure 2 shows a fuzzy model identified with data obtained from a mathematical function, $y(x_1, x_2) = \cos(6x_1 + 7)\cos(6x_2 + 6)$, and illustrates the ability of this technique to model complex interactions.

Because fuzzy models have the ability to closely approximate a wide variety of processes, they can potentially be used for many engineering tasks. Two such tasks will be briefly addressed in this paper. The first of these is the problem of improving knowledge of a process through an increased understanding of the relative importance of the process inputs. This problem is usually solved with ANOVA techniques. However, traditional ANOVA methods may not give accurate results due to the limited correlation capability of polynomial regression models. As will be shown, in cases where polynomial models fail, the approach based on fuzzy models can be used successfully. The second application to be discussed involves the use of fuzzy models and the response surface method to optimize a simulated manufacturing process. In the classical response surface method approach to optimization, simple linear models are built sequentially from different sets of data, with previous sets of data being omitted from later models. With the fuzzy method, sequential models are also built but each model makes use of all previous data so that model accuracy improves during the optimization. As a result, the optimum point can be reached sooner and with less cost with the fuzzy approach than with the polynomial approach.

Figure 2. Example of a fuzzy model of a complex process.

Strengths and Weaknesses of Fuzzy Logic Regression Models

To encourage an understanding of the role that fuzzy logic regression models can play in regression modeling tasks, Table I compares several regression modeling techniques and summarizes their key characteristics. It is important to note that spline methods have limited applications but have been included in the table because their purely local nature contrasts with the global correlations obtained with polynomial regression. From this point of view, the fuzzy technique can be thought of as lying somewhere between the spline and polynomial regression techniques. From the table, it can be seen that the strengths of fuzzy models include their ability to approximate very complex, multi-dimensional processes and their insensitivity to noisy data. Like neural network models, their identification is computationally intensive but, once established, they provide quick responses. While the fuzzy models themselves are quite complex, they are not the “black boxes” that neural network models are. As will be demonstrated in a later section, this property allows us to manipulate the fuzzy models to achieve specific goals. The ability to incorporate physical understanding and prior knowledge is an important aspect of fuzzy models that is just beginning to be explored (Schaible, et al., 1997), (Johansen and Foss, 1997).

Table I. Comparison of several regression modeling techniques

| | Polynomial Regression | Spline | Neural Network | Fuzzy Logic |
|---|---|---|---|---|
| Nature of Correlation | Global | Local | Combined Local and Global | Combined Local and Global |
| Approximation of Very Complex Processes | Poor | Good | Excellent | Excellent |
| Extension to Multiple Dimensions | Simple | Difficult | Simple | Simple |
| Sensitivity To Noisy Example Data | Insensitive | Very Sensitive | Insensitive | Insensitive |
| Incorporate and Extract Physical Understanding | Possible | Difficult | Difficult | Possible |
| Computation Time to Identify Model | Fast | Fast | Slow | Slow |
| Computation Time to Obtain Model Output | Fast | Fast | Fast | Fast |
| Multi-Output From Single Model | Difficult (Use Multiple Models) | Difficult (Use Multiple Models) | Simple | Difficult (Use Multiple Models) |

THE FUZZY MODELING TECHNIQUE

The fuzzy modeling technique employed here makes use of the method introduced by Takagi and Sugeno (1985), where the premise, or “if” part, of each implication contains the fuzzy variables, and the consequent, or “then” part, of each fuzzy implication is a parameterized equation of the model’s input variables. For the i-th implication, $R^i$:

$$R^i:\ \text{If } f^i(x_1 \text{ is } \mu_1^i,\ x_2 \text{ is } \mu_2^i,\ \ldots,\ x_k \text{ is } \mu_k^i),\ \text{then } y^i = P^i(x_1, x_2, \ldots, x_k) \qquad (1)$$

where $x_l$ is the l-th input variable, $l = 1, 2, \ldots, k$, $\mu_l^i$ is a fuzzy variable, $f^i$ is a logical function of the propositions in the premise, $y^i$ is the output from the i-th implication, and $P^i$ is a function of the k inputs, $(x_1, x_2, \ldots, x_k) = \mathbf{x}$, that implies the value of $y^i$ when $f^i$ is satisfied. From here on, we will refer to the function, $P^i$, as an internal function. Internal functions may be linear or non-linear functions of the input variables. The internal functions used for the models discussed here will be linear in the input variables, and of the form

$$P^i(\mathbf{x}) = p_0^i + p_1^i x_1 + \cdots + p_l^i x_l + \cdots + p_k^i x_k \qquad (2)$$

The $p_l^i$’s are model parameters to be determined from example data obtained from the process being modeled. The number of internal functions associated with a given fuzzy model is equal to the number of fuzzy implications associated with the model, and is dependent on the model structure. The model structure is determined by the number of input variables, $k$, and the number of fuzzy variables or membership functions associated with each input variable. The structure of the model may be prespecified based on prior knowledge of the process to be modeled, or it may be determined by some structure identification scheme. A method of determining the appropriate number of membership functions to assign to each input variable will be discussed below.

We will use $\Phi$ to denote the total number of implications, or internal functions, for a given fuzzy model such that $i = 1, 2, \ldots, \Phi$, and $N_l$ to denote the total number of membership functions associated with the input variable $x_l$ such that $n_l = 1, 2, \ldots, N_l$. In other words, each input, $x_l$, has associated with it a “family” of membership functions consisting of $N_l$ distinct functions. $\Phi$ is equal to the total number of distinct combinations of $k$ membership functions that can be constructed using one membership function from the family associated with each input variable. Thus, $\Phi$ is equal to the product of all of the $N_l$’s, $l = 1, 2, \ldots, k$. Furthermore, the following relationship exists between the indices, $i$, $k$, $l$, $n_l$, and $N_l$.

$$i = \sum_{l=1}^{k} \sum_{n_l=1}^{N_l} 1 \qquad (3)$$

The output of the fuzzy model, $\hat{y}$, is the weighted average of the $\Phi$ internal functions where the weight of the i-th internal function is determined by an operation performed on the membership functions associated with that internal function.

$$\hat{y} = \frac{\sum_{i=1}^{\Phi} \left\{ \mathrm{op}\left[\mu_1^i(x_1), \ldots, \mu_l^i(x_l), \ldots, \mu_k^i(x_k)\right] P^i(\mathbf{x}) \right\}}{\sum_{i=1}^{\Phi} \mathrm{op}\left[\mu_1^i(x_1), \ldots, \mu_l^i(x_l), \ldots, \mu_k^i(x_k)\right]} \qquad (4)$$

In Eq. (4), “op” denotes an operation performed on the membership functions associated with the implication in question. For the models being discussed here, we focus on the use of the minimum operator and the product operator. With the structure and general form of the output specified for a given fuzzy model, the model is typically optimized to minimize the error between the actual and desired model output. This minimization is accomplished by either optimizing the coefficients of the internal functions, by optimizing the shapes of the membership functions, or both.

The membership functions used for the fuzzy modeling technique discussed here are defined such that their range spans the interval [0,1] within the domain of interest. In addition, the forms used are generally of a type that do not exhibit multiple extrema within the domain of interest. Often, the mathematical functions chosen as membership functions include one or more parameters that can be changed discretely so that families of functions with different shapes can be generated. Examples of piecewise linear and piecewise quadratic functions of this type are shown in Figures 3 and 4, with families of one through seven functions being represented for each.

Figure 3. Families of piecewise linear membership functions.

Figure 4. Families of linear and piecewise quadratic membership functions.

For the models used in this paper, the parameters, $p_l^i$, of the internal functions are adjusted using a gradient descent algorithm to minimize a certain cost function over a set of training data obtained from the process in question. We choose SSE as our cost function, defined as

$$SSE = \sum_{j=1}^{m} (\hat{y}_j - y_j)^2 \qquad (5)$$

where $\hat{y}_j$ is the output of the fuzzy model for the j-th input vector, $y_j$ is the actual value at the j-th input, and $m$ is the number of data points over which the error is computed.
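To make Eqs. (2), (4), and (5) concrete, the following is a minimal Python sketch of the model output, the SSE, and one gradient descent update of the internal function parameters. It is an illustration under stated assumptions, not the authors' implementation: the inputs are assumed to be scaled to [0, 1], the membership functions are evenly spaced piecewise linear functions chosen for convenience, and all function names and the step size `lr` are hypothetical.

```python
import numpy as np
from itertools import product

def membership_family(x, n):
    """n evenly spaced piecewise linear membership functions on [0, 1] (assumed form).
    Returns an array of shape (n, m) for m samples."""
    x = np.asarray(x, dtype=float)
    if n == 1:
        return np.ones((1, x.size))
    centers = np.linspace(0.0, 1.0, n)
    return np.clip(1.0 - np.abs(x[None, :] - centers[:, None]) * (n - 1), 0.0, 1.0)

def rule_weights(X, counts, op=np.prod):
    """op[...] of Eq. (4) for every implication; returns shape (Phi, m).
    Pass op=np.min for the minimum operator."""
    families = [membership_family(X[:, l], n) for l, n in enumerate(counts)]
    combos = product(*[range(n) for n in counts])
    return np.array([op([families[l][nl] for l, nl in enumerate(c)], axis=0)
                     for c in combos])

def tsk_output(X, counts, P, op=np.prod):
    """Weighted average of the Phi linear internal functions, Eq. (4).
    P has shape (Phi, k + 1); row i holds p_0^i, ..., p_k^i of Eq. (2)."""
    W = rule_weights(X, counts, op)
    A = np.hstack([np.ones((X.shape[0], 1)), X])   # columns 1, x_1, ..., x_k
    return (W * (P @ A.T)).sum(axis=0) / W.sum(axis=0)

def sse(y_hat, y):
    """Sum of squared errors, Eq. (5)."""
    return float(np.sum((y_hat - y) ** 2))

def gradient_step(X, y, counts, P, lr=0.005, op=np.prod):
    """One gradient descent update of P that reduces the SSE (lr is an assumed value)."""
    W = rule_weights(X, counts, op)
    Wn = W / W.sum(axis=0)                          # normalized rule weights
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    resid = (Wn * (P @ A.T)).sum(axis=0) - y
    grad = 2.0 * (Wn * resid) @ A                   # dSSE/dP, shape (Phi, k + 1)
    return P - lr * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((50, 2))
    y = np.cos(6 * X[:, 0] + 7) * np.cos(6 * X[:, 1] + 6)
    counts = [3, 3]                                 # three membership functions per input
    P = np.zeros((9, 3))
    print("initial SSE:", sse(tsk_output(X, counts, P), y))
    for _ in range(2000):
        P = gradient_step(X, y, counts, P)
    print("final SSE:", sse(tsk_output(X, counts, P), y))
```

Because the model output is linear in the parameters of the internal functions once the membership functions are fixed, the gradient has the simple closed form used in `gradient_step`; the step size must be small relative to the number of data points for the iteration to remain stable.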

A measure of the model’s accuracy is found by computing the multiple correlation coefficient, $R^2$,

$$R^2 = 1 - \frac{SSE}{\sum_{j=1}^{m} (y_j - \bar{y})^2} \qquad (6)$$

where $\bar{y}$ is the mean of the $m$ data points. $R^2 = 1.0$ indicates an exact fit of the model to the data, although $R^2$ between 0.9 and 1.0 may be considered sufficiently accurate for many applications. The interested reader may wish to refer to Takagi and Sugeno (1985), Tan, et al. (1995), or Schaible and Lee (1996) for additional explanation of this fuzzy modeling methodology.

Model structure is a term used to describe the number and arrangement of membership functions, and thus the number of fuzzy cells, associated with a fuzzy model. Identification of the appropriate model structure for a given problem is critical if the model is to accurately predict the phenomenon being modeled. Several structure identification schemes have been proposed for fuzzy models, including the ones by Sugeno and Kang (1988), Sugeno and Yasukawa (1993), Yager and Filev (1993), and Tan, et al. (1995). The approach proposed by Tan, et al. (1995) will be used for the models developed in this paper.

The structure identification scheme proposed by Tan, et al. (1995) splits the available data into two sets, one used to fit or “train” various models and one used to test their performance. A model with the simplest structure is established from the training data set. The simplest possible model structure uses one membership function per input variable. $R^2$ is computed for the testing data set, and then the model complexity is incrementally increased in a “mother-daughter” scheme by adding membership functions. The testing $R^2$ is computed for each model, and model complexity continues to be increased until the testing $R^2$ values begin to decrease. The model structure having the highest test $R^2$ is assumed to have the optimum structure, as this is the structure that most naturally fits the process being modeled. A final fuzzy model is obtained with the optimum structure, with all data from both the training and testing sets used for training.

RANKING PROCESS EFFECTS USING FUZZY LOGIC BASED REGRESSION MODELS

Motivation

When attempting to introduce or improve a manufacturing process, it is often helpful to know the relative importance of the process inputs and their various interactions. Typically, the ranking of process effects is accomplished through some analysis of variance (ANOVA) scheme. However, if the modeling technique used in the analysis cannot accurately model the process that is to be analyzed, the results of the analysis will likely be incorrect. Because many effect ranking techniques are based, either explicitly or implicitly, on polynomial regression models, they are well suited to processes that can successfully be modeled using polynomial regression models. However, if the process to be studied cannot be readily modeled with polynomial regression techniques, another approach must be used if accurate results are to be obtained. One alternate approach is based on the fuzzy logic regression models discussed in this paper. Because of the ability of these models to fit a wide range of processes, and to appropriately (and automatically) partition the input space based only on information contained in the data, they can accurately model a wider range of complex processes. As a result, they are capable of giving accurate results for processes that cannot be handled using traditional methods.

Effect Ranking Procedure

Several methods that make use of fuzzy logic to determine the importance of process inputs have been proposed recently. For example, Sugeno and Yasukawa (1993) consider this to be a system structure identification problem. They proposed a method of obtaining models with successively more and more inputs, choosing the input that most improves the predictive capability of the model at each step. As a result, inputs are added to the model in descending order of importance. Lin and Cunningham (1994) proposed the use of “fuzzy curves” to determine the importance of system inputs but do not employ the fuzzy logic modeling method used here. Both of these methods are capable of dealing with nonlinear, multi-dimensional processes, and both are capable of ranking the importance of the process inputs. However, neither method is useful in ranking the importance of the various interactions between the inputs of the process.

Here, we review the method developed by Xie and Lee (1994) and Schaible (1996) which makes use of fuzzy logic regression models with specialized membership functions. These membership functions ensure that the model output is continuous throughout the input space and allow the final model to be decomposed into a polynomial form. The contribution of each input and of the interactions between the inputs to the overall accuracy of the model can then be determined using methods similar to the traditional hypothesis testing used for polynomial model based ANOVA. The models used for the effect ranking procedure are based on the model output given by Eq. (4).
In general, this equation can be expanded to reveal that the fuzzy model represents the process being modeled with numerous rational functions. However, by appropriately defining the membership functions and the operator that are used, we can cause the denominator of Eq. (4) to be identically one throughout the input space of interest. When this is the case, expansion of Eq. (4) reveals that the output of the model is actually comprised of numerous overlapping polynomial equations. In order to ensure that the denominator of Eq. (4) is identically one everywhere in the input space we use the product operator and piecewise linear membership functions of a form that, for each input variable, sum to one everywhere, i.e.

$$\sum_{n_l=1}^{N_l} \mu_{l,n_l}(x_l) = 1 \qquad (7)$$

for any $l$ and any $x$ in the domain of $x_l$. These membership functions are shown in Figure 5, and are fully defined by Xie and Lee (1994) and Schaible (1996).
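The consequence of Eq. (7) for the denominator of Eq. (4) can be checked numerically. The Python sketch below uses evenly spaced piecewise linear membership functions on [0, 1] as a stand-in; these are an assumed form chosen for illustration, not the functions defined by Xie and Lee (1994) and Schaible (1996), and the function names are hypothetical.

```python
import numpy as np

def piecewise_linear_family(x, n):
    """n evenly spaced piecewise linear membership functions on [0, 1] (assumed form)."""
    x = np.asarray(x, dtype=float)
    if n == 1:
        return np.ones((1, x.size))
    centers = np.linspace(0.0, 1.0, n)
    return np.clip(1.0 - np.abs(x[None, :] - centers[:, None]) * (n - 1), 0.0, 1.0)

def product_denominator(x1, x2, n1, n2):
    """Denominator of Eq. (4) for two inputs with the product operator:
    the sum over all n1 * n2 implications of mu_1 * mu_2."""
    mu1 = piecewise_linear_family(x1, n1)            # (n1, m)
    mu2 = piecewise_linear_family(x2, n2)            # (n2, m)
    weights = mu1[:, None, :] * mu2[None, :, :]      # (n1, n2, m)
    return weights.sum(axis=(0, 1))

rng = np.random.default_rng(0)
x1, x2 = rng.random(200), rng.random(200)

# Eq. (7): each family sums to one at every sampled point.
for n in range(1, 8):
    print(n, np.allclose(piecewise_linear_family(x1, n).sum(axis=0), 1.0))

# With the product operator the denominator factors into a product of such sums.
print(np.allclose(product_denominator(x1, x2, 3, 3), 1.0))
```

The denominator equals one because the sum over all implications of a product of memberships factors into the product of the per-input sums, each of which is one by Eq. (7). This factorization does not hold for the minimum operator, which is why the product operator is required here.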

Figure 5. Membership functions used for ranking.

Using the product operator and membership functions like those shown in Figure 5, the general fuzzy model defined by Eq. (4) becomes

$$\hat{y} = \sum_{i=1}^{\Phi} (b_{i,1} + c_{i,1} x_1)(b_{i,2} + c_{i,2} x_2) \cdots (b_{i,k} + c_{i,k} x_k)(p_0^i + p_1^i x_1 + \cdots + p_k^i x_k) \qquad (8)$$

The coefficients, $b_{i,l}$ and $c_{i,l}$, are dependent on the values of $x_l$ and arise from the membership functions shown in Figure 5. The dependence of these coefficients on $x_l$ is due to the piecewise nature of the membership functions. The model given by Eq. (8) is a fuzzy logic model, and we can use previously proven techniques of fuzzy model structure and parameter identification to obtain a model that represents the process well throughout the input space of interest. However, because of the constraints we have put on the membership functions we are capable of decomposing the fuzzy model into one or more polynomial models. We can then apply concepts employed with polynomial models to rank the contributions of the various effects to the overall accuracy of the model. For a polynomial regression model of the form

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \qquad (9)$$

where the regression coefficients, $\beta_0, \ldots, \beta_p$, are determined, for example, by a least squares method from the data, it is a simple matter to determine the order of importance of the model inputs, $x_1, \ldots, x_p$. This is accomplished by investigating the effect of sequentially setting each of the regression coefficients to zero, which is equivalent to sequentially removing each of the polynomial model inputs, one at a time. Note that the $x$’s in Eq. (9) need not be linear with respect to the natural inputs of the process; $x_3$ could be defined as a function of $x_1$ and $x_2$, $x_3 = x_1^2 x_2$, for example. We will refer to the full form of Eq. (9) with none of the regression coefficients artificially set to zero as the full model (FM). We will denote by RM($x_l$) the reduced model with the effect of $x_l$ eliminated. If we compare the SSE of the full model and the SSE of each RM($x_l$), $l = 1, 2, \ldots, p$, we can rank the effects present in the model in the order of their importance to the process model. Furthermore, by computing certain ratios based on the various SSE values, we can statistically determine the significance of each effect. Readers interested in a more detailed description of the general ranking procedure are encouraged to refer to Chatterjee and Price (1991), Montgomery (1991), or other suitable texts that address hypothesis testing and uses of regression models.

We can apply the same hypothesis testing technique used with polynomial models to a fuzzy model whose output is given by Eq. (8). A fuzzy model based on piecewise linear membership functions like those shown in Figure 5 and internal functions of the form given by Eq. (2) will consist of one or more polynomial equations. For such models, expansion of the argument of the summation in Eq. (8) will reveal that each term of the summation is actually of the form given by Eq. (9), where $p$ is determined by the number of inputs in the fuzzy model, $k$, and is equal to $2^k(k+1) - 2^{(k-1)}k$. The coefficients for the model (the $\beta$’s) are actually functions of the $b_{i,l}$’s, $c_{i,l}$’s, and $p_l^i$’s given in Eq. (8). This is discussed in much more detail by Xie and Lee (1994) and Schaible (1996).

Because a fuzzy model, subject to the conditions discussed above, can be represented by a sum whose terms are of the form given by Eq. (9), it is possible to define a “reduced” fuzzy model by subtracting the contribution of an effect of interest from the full fuzzy model. This is the equivalent of setting one of the regression coefficients to zero in a polynomial model. It is important to note that this approach is not the same as removing an input from the fuzzy model, because interactions can be removed without removing the contributions of the inputs that create the interaction.

Case Study

In order to illustrate the advantages of the fuzzy model based effect ranking method, we present the results of a case study for which the method is well suited. Consider a process represented by the two input function:

$$f(x_1, x_2) = \frac{1}{1 + 9x_1^2} \cdot \frac{1}{1 + 9x_2^2} \qquad (10)$$

over the input range of $-1 \le x_1 \le 1$ and $-1 \le x_2 \le 1$. A plot of this function is shown in Figure 6. The effect ranking process was performed for three sets of data obtained from Eq. (10). Each data set consisted of 36 points uniformly distributed throughout the input space of interest. One of the three sets of data had no noise added ($\sigma = 0$), while the remaining two had random noise fitting a normal distribution added to the response variable. For the two sets, the added noise was proportional to five and ten percent of the response ($\sigma = 0.05$ and $\sigma = 0.10$, respectively). At each noise level, the process effects were ranked using the fuzzy model method discussed in the previous section as well as a polynomial regression model based method. The polynomial model analyses were performed using the commercial statistical software package, SAS.
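As an illustration of the polynomial side of this comparison, the Python sketch below generates a data set from the case study function and ranks the seven effects by the SSE of each reduced model. It makes several assumptions that the paper does not state: the 36 points are taken on a 6 by 6 grid, the noise is multiplicative, each reduced model is formed by zeroing one coefficient of the fitted full model without refitting, and NumPy least squares stands in for the SAS analysis. It is not expected to reproduce the tabulated values, and all names are hypothetical.

```python
import numpy as np

EFFECTS = ["x1", "x2", "x1*x2", "x1^2", "x2^2", "x1^2*x2", "x1*x2^2"]

def f(x1, x2):
    """Case study function, written as a product of two one-dimensional factors."""
    return 1.0 / (1.0 + 9.0 * x1 ** 2) / (1.0 + 9.0 * x2 ** 2)

def make_data(sigma, seed=0):
    """36 points on an assumed 6 x 6 grid over [-1, 1]^2 with assumed multiplicative noise."""
    g = np.linspace(-1.0, 1.0, 6)
    x1, x2 = [a.ravel() for a in np.meshgrid(g, g)]
    rng = np.random.default_rng(seed)
    y = f(x1, x2) * (1.0 + sigma * rng.standard_normal(x1.size))
    return x1, x2, y

def terms(x1, x2):
    """Columns of the full polynomial model: a constant followed by the seven effects."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2,
                            x1 ** 2, x2 ** 2, x1 ** 2 * x2, x1 * x2 ** 2])

def rank_effects(x1, x2, y):
    """Fit the full model (FM), then zero one coefficient at a time to form each RM."""
    T = terms(x1, x2)
    beta, *_ = np.linalg.lstsq(T, y, rcond=None)
    sse_full = float(np.sum((T @ beta - y) ** 2))
    sse_reduced = {}
    for j, name in enumerate(EFFECTS, start=1):
        b = beta.copy()
        b[j] = 0.0                                    # eliminate this effect only
        sse_reduced[name] = float(np.sum((T @ b - y) ** 2))
    order = sorted(EFFECTS, key=lambda n: -sse_reduced[n])   # most important first
    return sse_full, sse_reduced, order

if __name__ == "__main__":
    for sigma in (0.0, 0.05, 0.10):
        x1, x2, y = make_data(sigma)
        sse_full, sse_reduced, order = rank_effects(x1, x2, y)
        print(f"sigma={sigma}: FM SSE={sse_full:.4f}, ranking={order}")
```

An effect whose removal raises the SSE the most is ranked first. The same comparison applies to a fuzzy model once it has been decomposed into polynomial terms.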

Figure 6. Function used for the case study.

For the fuzzy models, each data set was split into two subsets of 18 points each for the structure identification step. The structure identification algorithm (Tan, et al. 1995) yielded the same model structure for each noise level; three membership functions assigned to $x_1$ and three assigned to $x_2$. The results of the effect ranking are given in Table II while the reduced model SSE data on which the ranks are based is shown in Table III.

Table II. Ranking results for the case study.

| Effect | Fuzzy (σ=0.0) | SAS (σ=0.0) | Fuzzy (σ=0.05) | SAS (σ=0.05) | Fuzzy (σ=0.10) | SAS (σ=0.10) |
|---|---|---|---|---|---|---|
| $x_1$ | 1 | 2 | 2 | 4 | 2 | 4 |
| $x_2$ | 1 | 2 | 1 | 3 | 1 | 3 |
| $x_1 x_2$ | 2 | 2 | 3 | 7 | 3 | 7 |
| $x_1^2$ | 3 | 1 | 5 | 1 | 5 | 1 |
| $x_2^2$ | 3 | 1 | 4 | 2 | 4 | 2 |
| $x_1^2 x_2$ | 4 | 2 | 7 | 5 | 7 | 5 |
| $x_1 x_2^2$ | 4 | 2 | 6 | 6 | 6 | 6 |

Table III. SSE results for the ranking.

| Effect | Fuzzy (σ=0.0) | SAS (σ=0.0) | Fuzzy (σ=0.05) | SAS (σ=0.05) | Fuzzy (σ=0.10) | SAS (σ=0.10) |
|---|---|---|---|---|---|---|
| FM | 0.0020 | 0.3633 | 0.0041 | 0.3588 | 0.0101 | 0.3582 |
| $x_1$ | 0.1201 | 0.0000 | 0.1195 | 0.0000 | 0.1227 | 0.0001 |
| $x_2$ | 0.1209 | 0.0000 | 0.1229 | 0.0001 | 0.1290 | 0.0002 |
| $x_1 x_2$ | 0.0607 | 0.0000 | 0.0626 | 0.0000 | 0.0685 | 0.0000 |
| $x_1^2$ | 0.0440 | 0.2610 | 0.0440 | 0.2686 | 0.0479 | 0.2763 |
| $x_2^2$ | 0.0441 | 0.2610 | 0.0462 | 0.2592 | 0.0522 | 0.2576 |
| $x_1^2 x_2$ | 0.0198 | 0.0000 | 0.0213 | 0.0000 | 0.0267 | 0.0001 |
| $x_1 x_2^2$ | 0.0198 | 0.0000 | 0.0221 | 0.0000 | 0.0283 | 0.0000 |

The data in Table II and Table III clearly indicate a significant difference between the ranking obtained from the fuzzy models and the ranking obtained from the polynomial models. The reason for this lies in the inability of a single polynomial model to accurately model the function. For the polynomial models generated for this case, the multiple correlation coefficient, $R^2$, computed over the 36 points, ranged between 0.590 and 0.599. This inability to model the function is also reflected in the relatively high SSE values for the full models given in Table III. The fuzzy models, on the other hand, closely fit the data, and, more importantly, closely approximated the function. SSE values for the fuzzy models were relatively low for all three noise levels, and $R^2$ for the three fuzzy models ranged between 0.989 and 0.998.

To demonstrate that the ranking obtained from the accurate fuzzy models is the correct ranking we note that the function given by Eq. (10) is symmetric about the $x_1$ and $x_2$ axes and consider one-fourth of the original domain, namely $0 \le x_1 \le 1$ and $0 \le x_2 \le 1$. Since the function is symmetric about two edges of the reduced domain, the ranking obtained in the reduced domain will be the same as the ranking in the full domain. In addition, a single polynomial model is capable of closely modeling the function in the reduced domain. Data consisting of nine points evenly spaced in the reduced domain were obtained from Eq. (10) and the ranking was performed again. The fuzzy model structure identified for these data corresponded to two membership functions for each input variable. Table IV and Table V summarize the results for the reduced domain. $R^2$ for all six of the models summarized in Table IV and Table V was between 0.9980 and 0.9985, indicating all models fit the data well. Note that the ordering of the data shown in Table IV and Table V for all six models is essentially the same as the ordering of the data for the fuzzy models in Table II and Table III. Due to the symmetry of the function being modeled, pairs of effects theoretically have equal importance, thus the SSE of the reduced models without these effects should be equal. However, numerical accuracy and noise caused the calculated SSE of the reduced models to be slightly different, resulting in differences in ranking. Inspection of the data in the tables showing SSE will reveal the minor differences in SSE for these pairs of effects.

Table IV. Ranking results for the reduced domain.

| Effect | Fuzzy (σ=0.0) | SAS (σ=0.0) | Fuzzy (σ=0.05) | SAS (σ=0.05) | Fuzzy (σ=0.10) | SAS (σ=0.10) |
|---|---|---|---|---|---|---|
| $x_1$ | 1 | 1 | 1 | 1 | 1 | 1 |
| $x_2$ | 1 | 1 | 2 | 2 | 2 | 2 |
| $x_1 x_2$ | 2 | 2 | 3 | 3 | 3 | 3 |
| $x_1^2$ | 3 | 3 | 4 | 4 | 4 | 4 |
| $x_2^2$ | 3 | 3 | 5 | 5 | 5 | 5 |
| $x_1^2 x_2$ | 4 | 4 | 6 | 6 | 6 | 6 |
| $x_1 x_2^2$ | 4 | 4 | 7 | 7 | 7 | 7 |

Table V. SSE results for the reduced domain.

| Effect | Fuzzy (σ=0.0) | SAS (σ=0.0) | Fuzzy (σ=0.05) | SAS (σ=0.05) | Fuzzy (σ=0.10) | SAS (σ=0.10) |
|---|---|---|---|---|---|---|
| FM | 0.0016 | 0.0015 | 0.0014 | 0.0014 | 0.0012 | 0.0012 |
| $x_1$ | 0.1475 | 0.1458 | 0.1704 | 0.1688 | 0.1950 | 0.1936 |
| $x_2$ | 0.1475 | 0.1458 | 0.1596 | 0.1581 | 0.1723 | 0.1709 |
| $x_1 x_2$ | 0.0675 | 0.0652 | 0.7657 | 0.0746 | 0.0867 | 0.0845 |
| $x_1^2$ | 0.0413 | 0.0397 | 0.0488 | 0.0474 | 0.0571 | 0.0558 |
| $x_2^2$ | 0.0413 | 0.0397 | 0.0428 | 0.0414 | 0.0444 | 0.0432 |
| $x_1^2 x_2$ | 0.0174 | 0.0159 | 0.0207 | 0.0193 | 0.0244 | 0.0231 |
| $x_1 x_2^2$ | 0.0174 | 0.0159 | 0.0182 | 0.0168 | 0.0191 | 0.0179 |

Based on the data in Table II through Table V and our knowledge of the symmetry of the function that was modeled, we conclude that the fuzzy method ranked the effects of the “process” essentially correctly using data from the original, complete domain. In order to properly rank the effects using polynomial models we first needed some knowledge about the symmetry of the process. Ranking with the fuzzy models required no such knowledge as the fuzzy model structure identification algorithm identified a model structure that was capable of modeling the function over the entire domain. Further discussions of this technique along with additional case studies can be found in Xie and Lee (1994) and Schaible (1996).

PROCESS OPTIMIZATION USING FUZZY LOGIC RSM

Motivation

The traditional response surface method (RSM) involves the development of multiple polynomial regression models. At the initial starting point, which is likely distant from the optimum point, a first order model is typically constructed in a local region in an attempt to move towards the optimum point. Several intermediate first order models may be obtained in other local regions subsequent to the initial model before the optimum point is approached. Finally, a local regression model of higher order is developed, hopefully around the optimum point, so that the optimum point can be identified. In the process of obtaining these numerous models, many data are obtained. Unfortunately, the early data may not be useful in identifying the later models. When the cost of obtaining each empirical point is significant it is undesirable and inefficient to use a point for only one or two models out of the several that must be obtained. For a more detailed discussion of the traditional RSM approach the interested reader may wish to refer to Montgomery (1991) or Box and Draper (1987).

The excellent data correlation capabilities of fuzzy models can be taken advantage of in this situation. With a fuzzy model we can begin with a global experiment design, obtain the necessary initial points, build a globally accurate model, and proceed towards the optimum point of the process. As additional confirmation points are obtained during the optimization process these points can be added to the data set to incrementally improve the accuracy of the global model. As a result, an economy of experiments and associated cost saving is realized while a final globally accurate model of the process is developed.

The Fuzzy Logic Response Surface Method

Xie, et al. (1994) initially proposed the fuzzy logic response surface method (FL-RSM).
Aside from some minor differences related to whether the process is to be minimized or maximized, the procedure follows the same steps for any optimization problem. We will consider process minimization here.

To accomplish the minimization procedure using the FL-RSM technique we begin by obtaining $m$ initial DOE (Design of Experiments) points from the process. Using these points, an optimum fuzzy model structure should be identified. A starting point from which the optimum search will begin is specified. This starting point may be the point at which the minimum is expected to occur, based on knowledge of the process, or it may be a point from the DOE data. With the model structure, the DOE data, and the starting point specified, an initial fuzzy model is identified. From this fuzzy model, first derivatives are obtained at the starting point. To compute the first derivatives the central difference approximation or other numerical derivative calculation methods are used. The initial working point, which is also the starting point, is updated based on a specified step size. The direction of the movement is based on the numerically computed derivatives. A confirming experiment is conducted at each step. The search continues in the same direction until confirming experiments no longer show improvement. A new fuzzy model is then identified from the original DOE data and the data obtained from the confirming experiments. Derivatives are computed from the new model at the most recent working point and the procedure is repeated.

When only marginal improvement is realized using this technique, it is assumed the working point is near the optimum point. At this time, the search strategy changes to one based on the now mature fuzzy model. The optimum point within the model is given as the next working point and a confirming experiment is conducted. This procedure is also repeated until the exit criterion is met. The exit criterion is based on small improvement between successive confirmation runs. A full description of the procedure is given by Xie, et al. (1994). The FL-RSM technique allows for an expansion of the original domain of interest defined by the DOE data while avoiding uncontrolled extrapolation from the model.

The key difference between the fuzzy model based optimization scheme and traditional polynomial based RSM methods lies in the nature of the models during the final stages of optimization. For traditional polynomial based RSM, many local models are constructed during the optimization process. Model-1 covers region 1 while Model-2 covers region 2, etc. Once the search moves to a new region the old data are discarded, and new experiments must be conducted to construct the model for the new region. Models in the latter stages of the optimization do not necessarily represent the process any better than the initial model. Instead, they simply hold in a region closer to the optimum point. In contrast to this is the FL-RSM approach. With this method, the model covers the entire input domain throughout the optimization. All available data is used to establish the model. This is possible because fuzzy models have the ability to model very non-linear relations over a wide input range.
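One stage of the gradient-guided search described above can be sketched in Python as follows. This is an illustration under assumptions, not the procedure of Xie, et al. (1994): `model` stands for any fitted model that can be evaluated at a point, `experiment` stands for running the real process, and the step size, the finite difference increment, and the function names are all hypothetical. Refitting the model between stages and the final model-based stage are omitted.

```python
import numpy as np

def central_difference_gradient(model, x, h=1e-3):
    """First derivatives of the model at x by the central difference approximation."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for l in range(x.size):
        e = np.zeros_like(x)
        e[l] = h
        grad[l] = (model(x + e) - model(x - e)) / (2.0 * h)
    return grad

def search_stage(model, experiment, x0, step=0.1, max_steps=50):
    """Move from x0 along the model's descent direction, running one confirming
    experiment per step, until an experiment no longer shows improvement."""
    x = np.asarray(x0, dtype=float)
    y = experiment(x)
    g = central_difference_gradient(model, x)
    norm = np.linalg.norm(g)
    new_data = []                          # confirming points to add to the data set
    if norm == 0.0:
        return x, y, new_data
    direction = -g / norm                  # minimization: step against the gradient
    for _ in range(max_steps):
        x_try = x + step * direction
        y_try = experiment(x_try)
        new_data.append((x_try, y_try))
        if y_try >= y:                     # no improvement: stop this stage
            break
        x, y = x_try, y_try
    return x, y, new_data

if __name__ == "__main__":
    # Stand-in process with a known minimum; here the model is taken to be exact.
    target = np.array([0.3, 0.7])
    process = lambda x: float(np.sum((np.asarray(x) - target) ** 2))
    x_best, y_best, data = search_stage(process, process, [0.0, 0.0])
    print("working point:", x_best, "response:", y_best, "experiments:", len(data))
```

In the full procedure the points in `new_data` would be appended to the DOE data, a new fuzzy model would be identified from the enlarged data set, and the stage would be repeated from the latest working point.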
The initial model may not represent the process very accurately, but it is accurate enough to provide qualitative directions for the optimization. As more data is obtained and new models are identified, the error between the model and the process decreases. Compared with typical RSM optimization, FL-RSM has the potential to be more effective and to identify the optimum point with fewer experiments. As a bonus, after the FL-RSM optimization is complete, an accurate global model of the process exists for other uses.

Case Study

Xie, et al. (1994) demonstrated their proposed FL-RSM on a simulated six-input chemical vapor deposition (CVD) process model developed by Lord (1987). Their results will be reviewed here. The CVD model simulates deposition of silicon on silicon wafers. The output represents silicon deposit thickness. The model is a fourth-order polynomial of two variables, x and y, which represent the (x,y) coordinates of a point on a wafer. The coefficients of the polynomial are functions of the six process inputs for which optimum settings are to be found. The response to be optimized is the thickness variance, TVAR, computed over 15 points:

$$ TVAR = \frac{1}{15} \sum_{i=1}^{15} \left( \frac{\bar{T} - T_i}{\bar{T}} \right)^{2} \tag{11} $$

where $T_i$ represents each of the 15 individual thicknesses and $\bar{T}$ represents the mean of the 15 thicknesses. The process inputs and their initial ranges are given in Table VI. The process inputs and response represent dimensionless, normalized quantities.

Table VI. CVD process inputs and ranges.

Variable   Lower Limit   Upper Limit
LBV        58            74
RBV        68            84
JetX       3             7
JetY       3             7
H2M        34            56
H2R        44            66

Xie et al. (1994) considered four cases: one with no noise and one each with five, ten, and twenty percent random Gaussian noise. The DOE data were obtained from a Taguchi Orthogonal Array (OA), L16, plus one center point. Note that L16 is actually a fractional factorial design, $2^{6-2}$. For the noiseless process, the exit criterion was satisfied after 41 total experiments (DOE plus confirming), indicating that an optimum point had been reached. The inputs for the found optimum were LBV=69.02, RBV=77.34, JetX=4.85, JetY=2.42, H2M=46.09, and H2R=58.01, with a resulting TVAR=1.59. These values are very close to the actual optimum values of LBV=68.73, RBV=77.60, JetX=5.03, JetY=2.18, H2M=46.00, and H2R=58.00, with a resulting TVAR=1.52. The “actual optimum” was determined directly from the CVD simulation model.

Figure 7 shows the FL-RSM results for the noiseless process along with the results of an equivalent polynomial based optimization for comparison. From the figure, it is apparent that both methods proceeded similarly at the beginning of the optimization process. However, once additional data became available, the differences between the process and the fuzzy model decreased and the fuzzy approach was able to proceed to an optimum. In contrast, the polynomial approach was not able to capitalize on the new data and stalled at a sub-optimal solution.

Figure 7. CVD Optimization with fuzzy and polynomial models.

Figure 8 shows the performance of the FL-RSM algorithm when it is presented with noisy data. For comparison purposes, the corresponding “clean” values of TVAR are shown in the figure rather than the noisy data used during the optimization process. From this figure, it is apparent that the FL-RSM algorithm is capable of performing well under various noise conditions. The total number of experiments does not increase significantly when noise is included. However, as the noise level increases, the located optimum moves farther away from the real optimum due to errors in the model induced by the noise. One potential solution to this problem is to conduct replicate experiments at each point in order to filter some of the noise.
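The effect of replicate experiments can be illustrated with a short sketch. The Python fragment below evaluates the thickness variance of Equation (11) from a list of thicknesses and averages several noisy observations of a response. The function names (tvar, noisy_observation, replicated_observation) are hypothetical, and the noise model, a multiplicative Gaussian disturbance whose standard deviation is a fixed fraction of the clean response, is an assumption made here for illustration; the source does not specify how the noise was applied.

```python
import random
from statistics import mean
from typing import Sequence


def tvar(thicknesses: Sequence[float]) -> float:
    """Thickness variance of Eq. (11): mean squared relative deviation from the mean."""
    t_bar = mean(thicknesses)
    return sum(((t_bar - t) / t_bar) ** 2 for t in thicknesses) / len(thicknesses)


def noisy_observation(clean: float, noise_fraction: float,
                      rng: random.Random) -> float:
    """One simulated measurement (assumed multiplicative Gaussian noise)."""
    return clean * (1.0 + rng.gauss(0.0, noise_fraction))


def replicated_observation(clean: float, noise_fraction: float,
                           n_replicates: int, rng: random.Random) -> float:
    """Average of n replicate measurements taken at the same input settings."""
    return mean(noisy_observation(clean, noise_fraction, rng)
                for _ in range(n_replicates))


if __name__ == "__main__":
    rng = random.Random(0)
    clean_response = 1.52  # the actual optimum TVAR quoted in the text
    print(noisy_observation(clean_response, 0.20, rng))
    print(replicated_observation(clean_response, 0.20, 4, rng))
```

For independent noise, averaging n replicates reduces the standard deviation of the observed response by a factor of the square root of n, at the cost of n times as many experiments at each point.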

Figure 8. Optimization of the noisy CVD process.

CONCLUSIONS

The fuzzy logic based regression models discussed here differ substantially from other common regression modeling techniques in that their input-output relationships are determined by numerous local models which are blended together to achieve a continuous global model. Because of this, they can approximate a wide range of processes, from very simple to very complex, and in some cases this local/global nature has distinct advantages. For the ranking scheme discussed here, one advantage of the fuzzy models is their ability to model complex problems automatically, without the need for prior knowledge of the process such as how the input space should be partitioned. In addition, the model output can be manipulated to achieve specific goals. For example, it is quite simple to force the fuzzy model to be composed of numerous polynomial models in order to perform the effect ranking. In the case of the optimization problem, the global correlation capabilities of the fuzzy models resulted in a significant improvement over traditional methods.

While the cases presented here are somewhat specialized, they do represent potentially real problems. The fuzzy logic regression modeling technique is a tool with which some of these difficult problems can be solved.

REFERENCES

Box, G. E. P., and Draper, N. R., 1987, Empirical Model Building and Response Surfaces, John Wiley and Sons.

Chatterjee, S., and Price, B., 1991, Regression Analysis by Example, John Wiley and Sons, second ed.

Johansen, T. A., and Foss, B. A., 1997, “Operating Regime Based Process Modeling and Identification,” Computers and Chemical Engineering, Vol. 21, No. 2, pp. 159-176.

Lin, Y. and Cunningham, G. A. III, 1994, “A Fuzzy Approach to Input Variable Identification,” IEEE International Conference on Fuzzy Systems, Vol. III, pp. 1231-1236.

Lord, H. A., 1987, “Convective Transport in Silicon Epitaxial Deposition in a Barrel Reactor,” Journal of the Electrochemical Society, Vol. 134, No. 5, pp. 1227-1235.

Montgomery, D. C., 1991, Design and Analysis of Experiments, John Wiley and Sons, third ed.

Schaible, B., 1996, “Modeling Nonlinear Processes Using Fuzzy Logic,” M.S. Thesis, University of Colorado, Boulder, CO.

Schaible, B. and Lee, Y. C., 1996, “Fuzzy Logic Models with Improved Accuracy and Continuous Differentiability,” IEEE Trans. on Components, Packaging, and Manufacturing Technology, Vol. 19, No. 1, pp. 37-47.

Schaible, B., Lee, Y. C., and Xie, H., 1997, “Efficient Design Using Fuzzy Logic Based Regression Models,” Proceedings, 47th Electronic Components and Technology Conference.

Sugeno, M., and Kang, G. T., 1988, “Structure Identification of Fuzzy Model,” Fuzzy Sets and Systems, Vol. 28, No. 15, pp. 15-33.

Sugeno, M. and Yasukawa, T., 1993, “A Fuzzy Logic Based Approach to Qualitative Modeling,” IEEE Trans. on Fuzzy Systems, Vol. 1, No. 1, pp. 7-31.

Takagi, T. and Sugeno, M., 1985, “Fuzzy Identification of Systems and Its Applications to Modeling and Control,” IEEE Trans. on Systems, Man, and Cybernetics, Vol. 15, No. 1, pp. 116-132.

Tan, J., Xie, H., and Lee, Y. C., 1995, “Efficient Establishment of a Fuzzy Logic Model for Process Modeling and Control,” IEEE Trans. on Semiconductor Manufacturing, Vol. 8, No. 1, pp. 50-61.

Yager, R. R., and Filev, D. P., 1993, “Unified Structure and Parameter Identification of Fuzzy Models,” IEEE Trans. on Systems, Man, and Cybernetics, Vol. 23, No. 4, pp. 1198-1205.

Xie, H. and Lee, Y. C., 1994, “Analysis of Variance Using Fuzzy Logic Models,” IEEE International Conference on Fuzzy Systems, Vol. II, pp. 1235-1239.

Xie, H., Lee, Y. C., Mahajan, R. L., and Su, R., 1994, “Process Optimization Using a Fuzzy Logic Response Surface Method,” IEEE Trans. on Components, Packaging, and Manufacturing Technology, Part A, Vol. 17, No. 2, pp. 202-211.