UNIVERSITÀ DEGLI STUDI DI PADOVA
Sede Amministrativa: Università degli Studi di Padova
Dipartimento di Principi e Impianti di Ingegneria Chimica “I. Sorgato”

SCUOLA DI DOTTORATO DI RICERCA IN INGEGNERIA INDUSTRIALE
INDIRIZZO: INGEGNERIA CHIMICA
CICLO XXI

DEVELOPMENT OF MULTIVARIATE STATISTICAL TECHNIQUES FOR QUALITY MONITORING IN THE BATCH MANUFACTURING OF HIGH VALUE ADDED PRODUCTS

Direttore della Scuola: Prof. Paolo Bariani
Supervisore: Prof. Massimiliano Barolo

Dottorando: Pierantonio Facco

To Claudia
To my parents, Luciana and Adriano

Foreword

The realization of this work has involved the intellectual and financial support of many people and institutions, to whom the author is most grateful. Most of the research activity that led to the results summarized in this Thesis has been carried out at DIPIC, the Department of Chemical Engineering Principles and Practice of the University of Padova, under the supervision of Prof. Massimiliano Barolo and Dr. Fabrizio Bezzo. Part of the work has been conducted under the supervision of Prof. José A. Romagnoli, at the “Gordon A. and Mary Cain” Department of Chemical Engineering of Louisiana State University, Baton Rouge (LA, U.S.A.). The realization of this study has also been made possible through the financial support of SIRCA S.p.A. (Massanzago, Padova, Italy; www.sirca.it) and the scholarship of the “Fondazione Ing. Aldo Gini” (Padova, Italy).

All the material reported in this Thesis is original, unless explicit references to the authors are provided. A list of the publications stemming from this project follows.

PUBLICATIONS IN INTERNATIONAL JOURNALS

Facco, P., F. Doplicher, F. Bezzo and M. Barolo (2009). Moving-average PLS soft sensor for online product quality estimation in an industrial batch polymerization process. J. Process Control, in press. doi:10.1016/j.jprocont.2008.05.002

Faggian, A., P. Facco, F. Doplicher, F. Bezzo and M. Barolo (2009). Multivariate statistical real-time monitoring of an industrial fed-batch process for the production of specialty chemical. Chem. Eng. Res. Des., in press. doi:10.1016/j.cherd.2008.08.019

Facco, P., R. Mukherjee, F. Bezzo, M. Barolo and J. A. Romagnoli (2009). Monitoring roughness and edge shape on semiconductors through multiresolution and multivariate image analysis. AIChE J., in press.

PUBLICATIONS IN CONFERENCE PROCEEDINGS

Facco, P., M. Olivi, C. Rebuscini, F. Bezzo and M. Barolo (2007). Multivariate statistical estimation of product quality in the industrial batch production of a resin. In: Proc. DYCOPS 2007 - 8th IFAC Symposium on Dynamics and Control of Process Systems (B. Foss and J. Alvarez, Eds.), Cancun (Mexico), June 6-8, vol. 2, 93-98.

Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008). Using digital images for fast, reliable, and automatic characterization of surface quality: a case study on the manufacturing of semiconductors. In: Workshop on nanomaterials production, characterization and industrial applications, December 3, Milan (Italy).

Facco, P., A. Faggian, F. Doplicher, F. Bezzo and M. Barolo (2008). Virtual sensors can reduce lab analysis requirements in the industrial production of specialty chemicals. In: Proc. EMCC5 - 5th Chemical Engineering Conference for Collaborative Research in Eastern Mediterranean Countries, Cetraro (CS, Italy), May 24-29, 178-181.

Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008). Monitoraggio multivariato e multiscala di processi di fotolitografia per la produzione di semiconduttori. In: Proc. Congresso Gr.I.C.U. 2008: Ingegneria Chimica, le nuove sfide, Le Castella (KR, Italy), September 14-17, 1383-1388.

PUBLICATION IN TECHNICAL JOURNALS

Barolo, M., F. Bezzo and P. Facco (2008). Sensori virtuali per monitorare la qualità di prodotti e processi. ICP, 36 (4), 82-84.

Abstract

Although batch processes are “simple” in terms of equipment and operation design, it is often difficult to ensure consistently high product quality. The aim of this PhD project is the development of multivariate statistical methodologies for the realtime monitoring of quality in batch processes for the production of high value added products. Two classes of products are considered: those whose quality is determined by chemical/physical characteristics, and those where surface properties define quality. In particular, the challenges related to the instantaneous estimation of product quality and the realtime prediction of the time required to manufacture a product in batch processes are addressed using multivariate statistical techniques. Furthermore, novel techniques are proposed to characterize the surface quality of a product using multiresolution and multivariate image analysis. For the first class of products, multivariate statistical soft sensors are proposed for the realtime estimation of product quality and for the online prediction of the length of batch processes. It is shown that, for the purpose of realtime quality estimation, the complex series of operating steps of a batch can be simplified to a sequence of estimation phases in which linear PLS models can be applied to regress the quality on the process data available online. The resulting estimation accuracy is satisfactory, but can be substantially improved if dynamic information is included in the models. Dynamic information is provided either by augmenting the process data matrix with lagged measurements, or by averaging the process measurement values over a moving window of fixed length. The process data progressively collected from the plant can also be exploited by designing time-evolving PLS models to predict the batch length.
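The two ways of injecting dynamic information mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the thesis implementation: the hypothetical `lag_augment` appends the previous samples of every process variable to the data matrix (the lagged-measurement idea), while `moving_average` averages each variable over a moving window of fixed width.

```python
def lag_augment(X, lag):
    """Augment each row of the process data matrix X (samples x variables)
    with the `lag` previous samples of every variable."""
    rows = []
    for k in range(lag, len(X)):
        row = []
        for d in range(lag + 1):          # current sample plus `lag` past ones
            row.extend(X[k - d])
        rows.append(row)
    return rows

def moving_average(X, width):
    """Average each process variable over a moving window of fixed `width`:
    smooths measurement noise and flattens isolated outliers."""
    rows = []
    for k in range(width - 1, len(X)):
        window = X[k - width + 1 : k + 1]
        rows.append([sum(col) / width for col in zip(*window)])
    return rows

# Toy process data: 5 samples of 2 process variables
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
print(lag_augment(X, 1))    # each row: current sample followed by the previous one
print(moving_average(X, 2)) # 2-sample moving average of each variable
```

Either transformed matrix would then be fed to the PLS regression in place of the raw process data; note that lagging multiplies the number of columns, while averaging keeps the dimensionality unchanged.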
These monitoring strategies are tested on a real-world industrial batch polymerization process for the production of resins, and prototypes of the soft sensor are implemented online. For products where surface properties define the overall quality, novel multiresolution and multivariate techniques are proposed to characterize the surface of a product through image analysis. After an image of the product surface is analyzed at different levels of resolution via wavelet decomposition, the application of multivariate statistical monitoring tools allows an in-depth examination of the product features. A two-level “nested” principal component analysis (PCA) model is used for surface roughness monitoring, while a new strategy based on “spatial moving window” PCA is proposed to analyze the shape of the surface pattern. The proposed approach identifies abnormalities on the surface and localizes defects in a sensitive fashion. Its effectiveness is tested on scanning electron microscope images of semiconductor surfaces after the photolithography step in the production of integrated circuits.
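One level of the 2D wavelet decomposition underlying this multiresolution analysis can be sketched as follows. This is a purely illustrative plain-Python Haar transform using a mean-based normalization; the thesis does not prescribe this wavelet family, and a practical implementation would iterate the decomposition over several scales.

```python
def haar2d(img):
    """One level of a 2D Haar wavelet transform of an image with even
    dimensions. Returns the approximation (local means) and the
    horizontal/vertical/diagonal detail sub-images."""
    A, H, V, D = [], [], [], []
    for i in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, len(img[0]), 2):
            p, q = img[i][j], img[i][j + 1]
            r, s = img[i + 1][j], img[i + 1][j + 1]
            a_row.append((p + q + r + s) / 4)   # approximation (local mean)
            h_row.append((p + q - r - s) / 4)   # horizontal detail
            v_row.append((p - q + r - s) / 4)   # vertical detail
            d_row.append((p - q - r + s) / 4)   # diagonal detail
        A.append(a_row); H.append(h_row); V.append(v_row); D.append(d_row)
    return A, H, V, D

img = [[2, 2, 4, 0],
       [2, 2, 0, 0],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]
A, H, V, D = haar2d(img)
print(A)  # half-resolution approximation of the image
```

Denoising then amounts to thresholding the small detail coefficients before reconstruction, so that only the surface features that persist across scales reach the monitoring models.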

Riassunto (Summary)

Although batch processes are relatively simple to set up and to operate, even with a limited level of automation and little knowledge of the underlying mechanisms, it is often difficult to ensure a reproducible and high quality of the final product. The instrumentation commonly used in industrial practice is only rarely able to provide real-time measurements of product quality. Furthermore, many complications arise from the multivariate nature of quality, which depends on a series of physical, operating, or even subjective parameters. Although the information on product quality is not easily accessible, it is embedded in the process variables routinely recorded by the process computers and stored in historical databases. Multivariate statistical methods reduce the dimensionality of the problem by projecting the process variables onto a reduced-dimension space of latent variables that retain the whole informative content on quality, overcoming the problems of measurement noise, redundancy, and the high degree of correlation among the variables. Moreover, these methods can handle outliers and missing data. The aim of this Doctoral Thesis is to develop innovative systems for the quality monitoring of high value added products by means of multivariate statistical techniques.
In particular, the scientific contributions of this Doctoral project are:
• the design of techniques for the development of soft sensors for the realtime estimation of product quality in batch production systems;
• the non-conventional application of latent-variable projection techniques to predict the length of a batch or of its operating phases;
• the development of novel methodologies for the multiresolution and multivariate monitoring of quality through image analysis of a high value added product.
First of all, soft sensors for the online estimation of product quality are proposed in this Thesis. They have been developed and implemented considering as a case study a real industrial process for the production of resins by batch polymerization. The proposed soft sensors are based on the multivariate statistical technique of projection on latent structures (PLS), which regresses product quality on the process measurements usually available online in real time. This system guarantees an accuracy of the quality estimates that is of the same order of magnitude as that of the laboratory quality measurements, with the advantage that the online estimates are available at a very high frequency (on the order of s⁻¹), i.e. hundreds of times higher than that of the measurements that can be carried out in the laboratory (on the order of h⁻¹). Moreover, the estimates are accessible in real time and without the delay that is typical of laboratory measurements.
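The projection-and-regression idea at the core of a PLS soft sensor can be sketched with a single latent variable. This is a hedged illustration, not the thesis code: the industrial soft sensors use multi-component PLS on mean-centered and scaled data, while the hypothetical `pls1_one_lv` below extracts only the first latent variable (the first NIPALS component) and regresses the quality on its score.

```python
def pls1_one_lv(X, y):
    """Fit a single-latent-variable PLS regression of y on X (both assumed
    mean-centered). Returns the weight vector w and inner coefficient b."""
    J = len(X[0])
    # weight vector: direction of maximum covariance between X columns and y
    w = [sum(X[i][j] * y[i] for i in range(len(X))) for j in range(J)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # scores: projection of each observation onto the latent direction w
    t = [sum(X[i][j] * w[j] for j in range(J)) for i in range(len(X))]
    tt = sum(v * v for v in t)
    b = sum(t[i] * y[i] for i in range(len(t))) / tt   # inner regression y ~ t
    return w, b

def pls1_predict(x, w, b):
    """Estimate the (mean-centered) quality of a new observation x."""
    return b * sum(xj * wj for xj, wj in zip(x, w))

# Toy data with two perfectly collinear process variables (x2 = 2*x1)
# and a quality y = 5*x1; PLS handles the collinearity by construction.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [-1.0, -2.0], [-2.0, -4.0], [-3.0, -6.0]]
y = [5.0, 10.0, 15.0, -5.0, -10.0, -15.0]
w, b = pls1_one_lv(X, y)
print(pls1_predict([1.0, 2.0], w, b))  # ≈ 5.0
```

Ordinary least squares would fail outright on this rank-deficient X, which is precisely why latent-variable projection is the natural regression engine for highly correlated process measurements.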


To compensate for data nonlinearities and for the changes in the correlation structure among the variables, the adopted procedure splits the batch into a sequence of a limited number of estimation phases, within which the soft sensor can provide very accurate estimates by means of linear PLS models. The transition from one phase to the next occurs at a few “events” that are easily recognizable in the process variables themselves. The main feature of the proposed soft sensor is that it accounts for information on the process dynamics by means of “lagged-variable” models (which add information on the process dynamics from past values of the process variables) or moving-average models. The moving-average filter adds a “temporal memory” to the soft sensor that improves the estimation accuracy and, by averaging the process variables within a time window of fixed length, removes the measurement noise, attenuates the process noise, flattens outliers, and compensates for the effect of temporary lack of data. The window width must nonetheless be chosen with care, since a too-wide time window may delay the alarms on the reliability of the estimates. From an operational standpoint, the proposed system helps the plant personnel to detect product quality drifts, promptly suggests the corrections to be applied to the process recipe, and helps minimizing off-spec final products. Furthermore, the number of samples for laboratory quality analysis can be reduced drastically, which determines a gain both on the total batch time and on the costs related to the laboratory, to the labor, and to its organization.
A second type of soft sensor has also been developed to assist the online monitoring of product quality and to provide useful information for effective production scheduling: a soft sensor for the realtime prediction of the batch length. This monitoring strategy relies on evolving PLS models that exploit the information progressively collected during the batch to predict the length of the batch or of each of its operating stages. The accuracy of the predictions obtained with this soft sensor is also fully satisfactory, since the prediction error is much smaller both than the variability of the batch length and than the duration of the operators’ work shifts. Moreover, the initial part of the batch proves to be of fundamental importance for the batch length, since the initial conditions of the equipment, the state of the raw materials, and the initial heating phase of the reactor exert a very strong influence on the performance of the batch itself. The information on the batch length, obtained well in advance of the end of the batch, allows a better organization of the interventions on the plant, of the plant operators, and of the equipment utilization. The effectiveness of the sensors for quality estimation and for batch length prediction has been verified by applying and implementing them online in the production of resins by batch polymerization.
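The evolving-model idea can be sketched independently of the regression engine: at each sampling instant, a model is refit on the historical data observed up to that instant and used to predict the total length of the running batch. In the hedged Python sketch below, a simple least-squares line on one summary variable stands in for the thesis’s evolving PLS models; `fit_line` and `evolving_length_prediction` are hypothetical names.

```python
def fit_line(x, y):
    """Least-squares fit y = a + b*x (stand-in for a PLS model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def evolving_length_prediction(hist_profiles, hist_lengths, new_profile):
    """At each instant k, refit on what the historical batches showed up to k
    and predict the final length of the batch currently running."""
    preds = []
    for k in range(len(new_profile)):
        xk = [profile[k] for profile in hist_profiles]  # same instant, past batches
        a, b = fit_line(xk, hist_lengths)
        preds.append(a + b * new_profile[k])
    return preds

# Toy history: one process variable sampled at 2 instants in 3 past batches,
# with known final lengths; then a new batch observed online.
hist_profiles = [[10.0, 20.0], [12.0, 24.0], [14.0, 28.0]]
hist_lengths = [100.0, 120.0, 140.0]
preds = evolving_length_prediction(hist_profiles, hist_lengths, [11.0, 22.0])
print(preds)  # one length prediction per sampling instant
```

As the batch progresses, each refit uses strictly more information, which is why the prediction typically tightens over time; in the thesis the early instants already carry most of the predictive power.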


Finally, multivariate statistical methods have also been applied to image analysis. In industrial practice, product inspections via image analysis are usually carried out through simple measurements of the most important physical parameters, suitably highlighted by means of filtering techniques; moreover, these measurements are obtained in a non-systematic way. Much useful information, however, remains “hidden” in the images, and it is this information that allows identifying the complex nature of the final product quality. For this reason, a fully automated system has been developed for the realtime monitoring of a high value added product from its images. This monitoring system, based on multiresolution and multivariate techniques, has been applied to the characterization of a semiconductor surface after photolithography, one of the most important operations in the fabrication of integrated circuits. Advanced multivariate image analysis techniques extract the signatures that the process leaves on the product, assisting both the detection of critical process situations and the intervention with corrective actions to neutralize possible problems. The approach proposed in this Thesis relies on a preliminary multiresolution filtering of the image via wavelets, followed by a monitoring scheme that carries out in parallel an analysis of the surface roughness and of the surface shape of a product. For instance, the surface roughness can be examined with a “nested” principal component analysis, a strategy articulated on two different levels: the outer level discriminates different parts of the surface by means of PCA-based clustering, while the inner level monitors the surface roughness with PCA.
The surface shape is analyzed by means of a “spatial moving window” PCA approach, which captures the image information according to its spatial order and can also account both for the nonlinearities and for the structural differences of the surface. This system can detect product quality features that are usually not accessible without human intervention. Moreover, the monitoring proves to be fast, reliable and unambiguous: it scans a product image, localizes defects and anomalies precisely, and detects possible process drifts. In conclusion, although the proposed methodologies have been tested on specific case studies, they have proved to be general and show great potential. It is therefore believed that they can be extended to different research fields and industrial applications (e.g., food engineering, pharmaceutical industry, biotechnology), as well as to different scales of investigation, from the macroscopic to the microscopic or nanoscopic scale.
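The spatial moving-window idea can be illustrated in one dimension: overlapping windows are extracted in spatial order from an in-control surface profile, a PCA model is built on them, and windows of a new profile are flagged when their squared prediction error (SPE) against that model is large. The sketch below is a heavily simplified, assumption-laden illustration (one principal component obtained by power iteration, a 1D profile instead of a 2D image); it is not the thesis implementation.

```python
def spatial_windows(profile, width):
    """Extract overlapping windows from a surface profile in spatial order."""
    return [profile[s:s + width] for s in range(len(profile) - width + 1)]

def first_pc(X, n_iter=200):
    """Mean and first principal component of the rows of X, via power
    iteration on the (implicit) covariance matrix."""
    J = len(X[0])
    mean = [sum(row[j] for row in X) / len(X) for j in range(J)]
    Xc = [[row[j] - mean[j] for j in range(J)] for row in X]
    p = [1.0] + [0.0] * (J - 1)
    for _ in range(n_iter):
        t = [sum(r[j] * p[j] for j in range(J)) for r in Xc]        # scores
        p = [sum(t[i] * Xc[i][j] for i in range(len(Xc))) for j in range(J)]
        norm = sum(v * v for v in p) ** 0.5
        p = [v / norm for v in p]                                    # renormalize
    return mean, p

def spe(x, mean, p):
    """Squared prediction error of window x against the one-PC model."""
    xc = [xi - mi for xi, mi in zip(x, mean)]
    t = sum(xi * pi for xi, pi in zip(xc, p))
    return sum((xi - t * pi) ** 2 for xi, pi in zip(xc, p))

# Reference profile: a regular alternating pattern (the "in-control" surface)
ref = [0.0, 1.0] * 8
mean, p = first_pc(spatial_windows(ref, 4))
good = spe([0.0, 1.0, 0.0, 1.0], mean, p)  # window matching the pattern
bad = spe([0.0, 1.0, 1.0, 1.0], mean, p)   # window containing a bump
print(good, bad)  # the defective window shows a much larger SPE
```

Scanning the windows in spatial order means the SPE sequence itself localizes where on the surface the model stops explaining the pattern, which is exactly the defect-localization behavior described above.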

Table of contents

LIST OF SYMBOLS
  General symbols, vectors and matrices
  Greek symbols
  Acronyms

CHAPTER 1 - THESIS OVERVIEW AND LITERATURE SURVEY
  1.1 AIM OF THE PROJECT
  1.2 INTRODUCTION TO QUALITY AND STATISTICAL QUALITY MONITORING
  1.3 MULTIVARIATE STATISTICAL TECHNIQUES FOR PROCESS MONITORING
    1.3.1 Multivariate statistical process control for batch processes
      1.3.1.1 Nonlinear multivariate models
      1.3.1.2 Multiway multivariate models
      1.3.1.3 Multiple multivariate models
      1.3.1.4 Preliminary data treatment for multivariate statistical methods
    1.3.2 Multivariate image analysis
  1.4 THESIS OVERVIEW
    1.4.1 Realtime quality estimation and length prediction in batch processes
    1.4.2 Multivariate statistical quality monitoring through image analysis
    1.4.3 Thesis roadmap

CHAPTER 2 - MATHEMATICAL AND STATISTICAL BACKGROUND
  2.1 MULTIVARIATE STATISTICAL TECHNIQUES
    2.1.1 Principal component analysis (PCA)
      2.1.1.1 PCA algorithm
      2.1.1.2 Data collection, variable selection and data pre-treatment
      2.1.1.3 Selection of the principal component subspace dimension
    2.1.2 Projection on latent structures (PLS; partial least squares regression)
      2.1.2.1 Non-iterative partial least squares algorithm
      2.1.2.2 Variable selection in PLS models
    2.1.3 Monitoring charts
      2.1.3.1 Contribution plots, limits on the contribution plots, and relative contributions
    2.1.4 Enhancement for multivariate statistical methods
      2.1.4.1 Multi-way methods, data unfolding and data synchronization/alignment
  2.2 MULTIRESOLUTION DECOMPOSITION METHODS
    2.2.1 Continuous and discrete wavelet transform
      2.2.1.1 Bi-dimensional wavelet transform

CHAPTER 3 - INDUSTRIAL PROCESS FOR THE PRODUCTION OF RESINS BY BATCH POLYMERIZATION
  3.1 THE INDUSTRIAL PRODUCTION PLANT AND THE OPERATING RECIPE
    3.1.1 Resin A
    3.1.2 Resin B
    3.1.3 P&ID of the production facility
  3.2 DATA ACQUISITION
    3.2.1 Monitoring of the process variables
  3.3 EMPIRICAL MONITORING OF THE PRODUCT QUALITY
  3.4 CHALLENGES FOR THE STATISTICAL MONITORING OF PRODUCT QUALITY
  3.5 AUTOMATED QUALITY MONITORING THROUGH SOFT-SENSORS

CHAPTER 4 - SOFT SENSORS FOR THE REALTIME QUALITY ESTIMATION IN BATCH PROCESSES
  4.1 QUALITY ESTIMATION IN RESIN A USING PLS MODELS
    4.1.1 Single-phase PLS model
    4.1.2 Multi-phase PLS model
  4.2 INCLUDING TIME INFORMATION TO IMPROVE THE ESTIMATION PERFORMANCE
    4.2.1 Improving soft sensor performance through lagged process variables
    4.2.2 Improving soft sensor performance through moving-average process data
  4.3 COMPARISON OF THE ESTIMATION PERFORMANCES
    4.3.1 Reliability of the estimations
    4.3.2 Diagnosis of the soft sensors faults
  4.4 SOFT SENSOR FOR ESTIMATION OF QUALITY IN RESIN B
    4.4.1 Estimation of the quality indicators
  4.5 CONCLUDING REMARKS

CHAPTER 5 - REALTIME PREDICTION OF BATCH LENGTH
  5.1 DESIGN OF AN EVOLVING PLS MODEL FOR THE PREDICTION OF BATCH LENGTH
  5.2 PREDICTION OF BATCH LENGTH IN THE PRODUCTION OF RESIN B
    5.2.1 Prediction of Stage 1 length
    5.2.2 Prediction of Stage 2 length
  5.3 PREDICTION OF BATCH LENGTH IN THE PRODUCTION OF RESIN A
  5.4 CONCLUDING REMARKS

CHAPTER 6 - INDUSTRIAL IMPLEMENTATION OF A SOFT SENSOR PROTOTYPE
  6.1 INDUSTRIAL SUPERVISION SYSTEM
  6.2 IMPLEMENTATION OF THE SOFT SENSOR
    6.2.1 Architecture of the soft sensor
    6.2.2 Matlab™ codes of the soft sensors
      6.2.2.1 Prototype A
      6.2.2.2 Prototypes B1 and B2

CHAPTER 7 - SURFACE CHARACTERIZATION THROUGH MULTIRESOLUTION AND MULTIVARIATE IMAGE ANALYSIS
  7.1 PHOTOLITHOGRAPHY PROCESS AND INSPECTION TOOLS
  7.2 IMAGE ANALYSIS THROUGH MULTIRESOLUTION AND MULTIVARIATE STATISTICAL TECHNIQUES
    7.2.1 Image multiresolution denoising
    7.2.2 Multivariate statistical surface monitoring methods
      7.2.2.1 LER monitoring
      7.2.2.2 Surface roughness monitoring
      7.2.2.3 Edge shape monitoring
  7.3 CASE STUDY: MONITORING RESULTS
    7.3.1 LER monitoring system
    7.3.2 Surface roughness monitoring system
    7.3.3 Edge shape monitoring system
  7.4 THE EDGE3 MONITORING INTERFACE
  7.5 CONCLUDING REMARKS

CONCLUSIONS AND PERSPECTIVES
REFERENCES
  Web sites
ACKNOWLEDGEMENTS

List of symbols General symbols, vectors and matrices a

= wavelet scale

A

= total number of latent variables

_______

AAE

= overall average absolute error

b

= wavelet location

br

= regression coefficient of the rth latent variable

B

= matrix of regression coefficients 2

2

ciT, j

= contribution ciT, j of the variable j to the Ti 2 of the ith observation

cit, j

= contribution of the variable j to the scores that compose the Ti 2 of the ith observation = contribution of the variable j to the square predicting error SPEi of the ith

ciE, j

observation = average contributions of variable j over all the I observations of the

c jE

reference for the SPE statistics

c Tj

2

= average contributions of variable j over all the I observations of the reference for the Hotelling statistics

c Ej ,lim (α )

E = the 100(1-α)% confidence intervals for the contributions ci , j

c Tj,lim (α )

= the 100(1-α)% confidence intervals for the contributions ciT, j

CE

= matrix of the contributions to SPE of all the J variables for all the I

2

2

observations in of X matrix

CT

2

= matrix of the contributions to T2 of all the J variables for all the I observations in of X matrix

dm

= detail of the signal x at the mth wavelet decomposition scale

D hm

= reconstruction of the horizontal detail Tmh

D mv

= reconstruction of the vertical detail Tmv

D dm

= reconstruction of the diagonal detail Tmd

2

List of symbols

ΔK

= lag on the process variables in the TP-PLS models

ΔK’

= length of the moving window in the MATP-PLS models

Δnsegm

= edge segment width (pixel)

Δnmw

= size of the moving window (pixel)

ei,j

= element of row i and column j of the residual matrix E

e

= residual of a test sample

eI+1

= error of reconstruction for the projection of xI+1 onto the latent variable space

e TI +1

= transpose of eI+1

E(x I +1 )

= expected value of xI+1

E

= 2D residual matrix of X

E

= 2D residual matrix of X

FJ , I − J ,α

= upper 100αth percentile of the F-distribution with I and I-J degree of freedom

F

= residual matrix of Y

h

= sampling instant of the quality variables in the 3D data matrix of regular shape

hi

= sampling instant of the quality variables in the 3D data matrix of irregular shape

h0

= parameter of the Jackson-Mudholkar equation

Hi

= total number of quality samples for the observation i

H0

= null hypothesis

H1

= alternative hypothesis

i

= observation of the reference dataset

i0

= image (pixel)

iM

= filtered image at the Mth scale of wavelet decomposition (pixel)

I

= total number of observations in the reference dataset

I

= identity matrix

L2 (ℜ )

= Hilbert space of square integrable functions in ℜ

j

= variable of the reference dataset

J

= total number of variables of the reference dataset

k

= sample for the process variables in the 3D data matrix of regular shape

3

List of symbols

ki

= sample for the process variables in the 3D data matrix of irregular shape

Ki

= total number of samples of observation i

m

= decomposition scale

M

= selected decomposition level

M1

= decomposition level selected for image denoising

Mr

= matrix of rank 1 of the rth latent variable

MRPEi,q

= mean relative prediction error for quality variable q in batch i during a single estimation phase

n

= counter

nel

= size of the edge length (pixel)

niwE

= size of the image width for the edges

niwV

= size of the image width for the valleys

nlevels

= number of selected topological levels

nsample

= total number of quality samples in an estimation phase

ntsw

= size of the trans-section width (pixel)

Nx

= length of the signal x

NA

= acidity number (mg KOH / g resin)

Nimage

= number of images of edge segments

pi,j

= element of the ith row and jth column of the matrix P

pj

= row vector referring to the jth variable of the loading matrix P

pr

= loading of the rth latent variable of X

pTr

= transpose of the loading of the rth latent variable of X

P

= probability function

P

= loading matrix of X

PT

= transpose of P

Pr

= matrix of the loadings for all the J variables and all the K samples

PRESS

= prediction error sum of squares

q

= quality variable

qr

= loading of the rth latent variable of Y

Q

= total number of quality variables


Q

= loading matrix of Y

QT

= transpose of Q

r

= generic counter

R

= rank of the X matrix

ℜ

= space of the real numbers

RMSECV

= root-mean-square error of cross-validation

R(X)

= 100(1−α)% confidence region containing the likely value of a population X

s

= generic counter and spatial coordinate

scjT2

= standard deviation of the contributions of variable j over all the I observations of the reference for the Hotelling statistics

scjE

= standard deviation of the contributions of variable j over all the I observations of the reference for the SPE statistics

sr

= semi-axis of the confidence ellipse for the rth latent variable

s

= coordinate in the domain of pixel space (squared pixel)

Sm,n

= approximation coefficient at the mth scale of wavelet decomposition

Sm+1,(n1,n2)

= approximation coefficients of the multiresolution decomposition of an image

SPEi

= squared predicting error for the observation i

SPEI+1

= squared predicting error for the validation observation xI+1

SPElim (α )

= upper limit of SPEi at a confidence level 100(1-α)%

S

= estimated value of the covariance matrix Σ

Sm

= approximation matrix of an image at the mth wavelet decomposition scale

t*

= maximum time horizon for the prediction of the batch length (h)

tI−1,α/2

= Student t-distribution with I−1 degrees of freedom at the α/2 significance level

tlim(r,α)

= univariate limit at 100(1−α)% confidence level for score tr

t1

= score vector of the first principal component

ti

= row vector referring to the ith observation of the score matrix T

t̂I+1

= projection of the validation observation xI+1 onto the latent subspace

tr

= score vector of the rth principal component of X

tTr

= transpose of tr


T2

= Hotelling statistics

TAAEi

= time-averaged absolute error of the batch i (h)

T(a,b)

= continuous approximation of the signal x for the wavelet decomposition by means of ψ at location b and scale a

Ti2

= value of the Hotelling statistics for the ith observation

T2I+1

= value of the Hotelling statistics for a validation observation xI+1

T2lim(A,I,α)

= confidence limit of the Hotelling statistics at the 100(1-α) % level of confidence for a system of A latent variables and I samples

Tm,n

= detail coefficient at the mth scale of wavelet decomposition

Tm,ndenoised

= denoised approximation of the wavelet decomposition of i0 at M1 scale

Thm+1,(n1,n2)

= horizontal detail coefficient of an image

Tvm+1,(n1,n2)

= vertical detail coefficient of an image

Tdm+1,(n1,n2)

= diagonal detail coefficient of an image

T

= score matrix of X

Tmh

= horizontal detail matrix of an image at the mth wavelet decomposition scale

Tmv

= vertical detail matrix of an image at the mth wavelet decomposition scale

Tmd

= diagonal detail matrix of an image at the mth wavelet decomposition scale

ur

= score vector of the rth latent variable of Y

uTr

= transpose of ur

U

= score matrix of Y

VIPj

= importance of the variable j in the projection methods

wr

= weight of the rth latent variable

wTr

= transpose of wr

W

= matrix of the weights

x

= generic signal

xi,j

= element of row i and column j of the X matrix

xi,j,k

= element of the X matrix

x̄i,j,k

= moving average of the variable j on batch i in the k-th time instant, element of the Xi matrix


xm

= approximation of the signal x at the mth wavelet decomposition scale

xi

= row vector of the ith observation of the X matrix

xI+1

= vector of a validation observation

x̂I+1

= projection of xI+1 onto a latent space

xi,j

= jth variable time profile in batch i in form of column array of the Xi matrix

xj

= jth variable column vector of the X matrix

x̄j

= average value of the jth variable (column) of X

xi,j−ΔK

= vector of the jth variable time trajectory for the ith batch, lagged by −ΔK time instants

X

= reference data matrix of the process

X̄

= array of the mean values of the variables of X

X̂

= projection of the X matrix onto the space of the latent variables

X

= 3D reference data matrix of the process variables

X0

= 2D data matrix at the zero decomposition scale

XBWU

= 2D data matrix derived from X by batch-wise unfolding

X̄BWU

= input matrix of the moving averages for the MATP-PLS model

XD

= matrix of lagged variables

Xi

= ith horizontal slice of X, i.e. matrix of the trajectories of all the J variables in all the Ki samples in time or space for the observation i

Xi

= matrix of the moving average data of the ith batch

XiD

= matrix of lagged variables for the ith batch

Xj

= jth vertical slice of X, i.e. matrix of the time/space evolution of the variable j for all the samples K and all the observations I

Xk

= kth vertical slice of X, i.e. matrix of the time/space sample k for all the J variables and all the I observations

XL

= augmented matrix with lagged variables for the LTP-PLS model

XM

= 2D data matrix at the Mth decomposition scale

XT

= transpose of X

XVWU

= bi-dimensional data matrix derived by variable-wise unfolding X

yi,q,h

= element of the Y matrix

ŷi,q,h

= estimated value of yi,q,h


ŷI+1

= estimated value of a quality index for the (I+1)th observation

Y

= matrix of the quality variables

Y

= three dimensional reference matrix of the quality variables

Yi

= ith horizontal slice of Y, i.e. matrix of the trajectories of all the Q quality variables in all the Hi samples in time or space for the observation i

zα

= normal standard deviate corresponding to the upper 100(1-α)% percentile

Greek symbols

α

= percentile of the confidence limits

δr,s

= Kronecker delta

εi

= instantaneous error of estimation of stage length in batch i

θ

= generic parameter

θn

= parameter of the Jackson-Mudholkar equation

Θ

= space of all the possible parameters θ

λ

= forgetting factor

Λ

= diagonal matrix of the eigenvalues λr

λr

= eigenvalue of the rth latent variable

μ

= viscosity

μ0

= vector of the expected values of the J variables of the matrix X

φm,n

= discretized father wavelet

φ(s )

= bidimensional wavelet function

Σ

= covariance matrix

τ

= batch length (h)

τi

= actual length of the stage in batch i

τ∗

= number of samples corresponding to the time horizon t*

τ̂i(t)

= prediction at time t of the stage length in batch i

ψ

= mother wavelet function

ψa,b

= mother wavelet function for a dilation parameter a and a location parameter b

ψ*a,b

= complex conjugate of a “mother” wavelet function ψ a,b

ψm,n

= discretization of the mother wavelet function

ψ h (s )

= bidimensional horizontal wavelet

ψ v (s )

= bidimensional vertical wavelet

ψd(s)

= bidimensional diagonal wavelet

χ2v,α

= χ2-distribution with v degrees of freedom at significance level α

Acronyms

2D

= bi-dimensional

3D

= three-dimensional

AR

= autoregressive

ARMA

= auto-regressive moving average

BWU

= batch-wise unfolding

CA1

= carboxylic acid 1

CA2

= carboxylic acid 2

CD

= critical dimension

CD-SEM

= tool for the measurement of the CD through a SEM

D1

= diol 1

D2

= diol 2

DA1

= dioic acid

DPCA

= dynamic PCA

DPLS

= dynamic PLS

IC

= integrated circuit

IID

= independent identically distributed

LAN

= local area network

LER

= line edge roughness

LTP-PLS

= lagged three-phase PLS


LV

= latent variable

LV1

= first latent variable

LV2

= second latent variable

MATP-PLS

= moving-average three-phase PLS

MIA

= multivariate image analysis

MPCA

= multiway PCA

MPLS

= multiway PLS

NIPALS

= nonlinear iterative partial least squares algorithm

NOC

= normal operating conditions

OLE

= object linking and embedding

OPC

= OLE for process control

PC

= principal component

PC1

= first principal component

PC2

= second principal component

PCA

= principal component analysis

PLC

= programmable logic controllers

PLS

= partial least squares method (projection on latent structures)

PV

= process value

P&ID

= piping and instrumentation diagram

RGB

= red, green, blue

RTU

= remote terminal units

SCADA

= supervisory control and data acquisition

SEM

= scanning electron microscopy (or microscope)

SIMPLS

= straightforward implementation of a statistically inspired modification of PLS

SP

= setpoint

SPC

= statistical process control

SQC

= statistical quality control

SQL

= structured query language

SWA

= side wall angle

TP-PLS

= three-phase PLS method

UV

= ultraviolet

VIP

= variable importance in the projection methods


VO

= valve opening

VWU

= variable-wise unfolding

Chapter 1

Thesis overview and literature survey

This Thesis is concerned with the development of technologies for product quality monitoring in the batch manufacturing of high value added goods. Two kinds of products are considered: those whose “quality” is determined by chemical/physical characteristics (e.g., viscosity, concentration, …), and those where surface properties (e.g. texture, roughness, …) define “quality”. Two main issues are investigated: i) the development of a strategy to design soft sensors for the online estimation of product quality and the realtime prediction of batch length in batch chemical processes; and ii) the development of a strategy to design automatic systems for surface characterization in the manufacturing of hardware devices. Tools from multivariate statistical analysis (namely, projection to latent subspaces) are used to develop the proposed technologies. In this Chapter, after an outline of the aims of the Thesis, the concepts of quality and statistical quality monitoring are briefly reviewed. Then, a survey follows on the use of multivariate statistical tools for statistical process control, with particular reference to batch processes, for which several challenges are still open for investigation. A roadmap to the reading of the Thesis concludes the Chapter.

1.1 Aim of the project

Ensuring the conformance of the final product to a predetermined standard is of vital importance in high value added manufacturing, in order to succeed in today’s increasingly competitive global market. However, satisfying the requirements of the customers and achieving reproducibility and high quality of the final product is particularly difficult in most processes. Furthermore, most manufacturing processes are inherently multivariate, and quality itself is the multivariate expression of a plurality of indices that are related to the process, possibly to visual features, and sometimes to personal judgement as well. The aim of this project is the development of multivariate statistical tools that enable systematic monitoring of the product quality in batch manufacturing systems, in such a way as to analyze quality through the information embedded in process data or in images of the product. The proposed techniques are applied to different case studies:


• the development of a strategy to design multivariate statistical soft sensors for the estimation of the product quality and for the prediction of the batch length in batch processes;
• the development of a strategy to design an automatic method for the monitoring of the surface quality of a product through multiresolution and multivariate image analysis.

The systems for the realtime estimation of product quality and for the realtime prediction of the batch length are applied to the case of a real-world industrial process for the production of resins by batch polymerization. This case study demonstrates that the proposed techniques are effective strategies to help the online adjustment of the process recipe when the quality deviates from the nominal conditions, before the final product is affected. Furthermore, they provide valid support for the organization of production, for the scheduling of equipment use, and for the coordination of labour resources. The novel methodologies developed for the automatic characterization of the surface quality by image analysis are applied to the case of surface monitoring in the after-photolithography inspections that are carried out in the manufacturing of integrated circuits. In detail, a fully automatic system for the assessment of the surface characteristics of a semiconductor is developed to monitor both the surface roughness and the surface patterns. To sum up, the main contributions of the PhD project are:
• the development of innovative technologies for the online estimation of the product quality in batch processes;
• the non-conventional application of latent variable subspace methods for the prediction of the length of batch processes;
• the development of new methodologies for the multiresolution and multivariate systematic monitoring of the product quality from images of manufactured products.

1.2 Introduction to quality and statistical quality monitoring

The quality movement traces its roots back to the late 13th century, when European craftsmen began organizing into “guilds”, responsible for suggesting strict rules on product and service quality, for adopting inspection committees, and for promoting special marks for flawless goods. Later, the industrial revolution followed this example. However, it was only after World War II that the idea of “total quality” was introduced, and the notion of “inspection” extended to process technology improvement. Nowadays, “quality” embraces the entire organization of a company and, in the increasing competition of the global market, it is of critical importance that every process can manufacture high quality products with maximum yield. Meeting quality requirements is especially difficult when products consist of large numbers of components, or when processes consist of dozens, even hundreds, of


individual steps (Seborg et al., 2004). For example, batch processes for chemical manufacturing and microelectronic fabrication are carried out through a series of operating steps, where the quality of each stage is strictly related to the quality of the other stages and heavily influences the final product quality. This results in the need for quality-oriented technologies. On October 1st, 2008, during the meeting on the “Future of quality” of the American Society for Quality (Milwaukee, WI, USA), it was pinpointed that 21st century technologies are one of the key forces that will shape the future of quality (http://www.asq.org/index.html). This PhD Thesis fits into this scenario, developing automatic techniques for realtime quality assessment in high value added productions. The concept of quality is still not completely defined. In the common sense, quality is the degree of excellence of a product, a process, or a service. From the engineering point of view, quality is assumed to be a measurement of the conformance to a required standard, to guarantee high performances in terms of reliability, serviceability, durability, etc… (Montgomery, 2005). Namely, the purpose of quality is not only to force a product or a process to respond to predetermined features in order to reach a target or a nominal value in terms of physical, sensory, or time-oriented characteristics (quality of design), but also to improve the product and the process performances in order to reduce the defectiveness, the scraps, the customer complaints, and the rates of waste and of rework (quality of conformance). Therefore, the aim of quality monitoring is not only to monitor the quality of design, but also the quality of conformance (Montgomery and Runger, 2003). In summary, quality is inversely proportional to variability.
Since variability is an inconsistency that introduces unevenness and is among the major sources of poor quality, quality can be improved by decreasing the variability in products and processes. To reduce the variability, one of the most effective tools is the systematic use of statistics. In his pioneering work, Shewhart (1931) showed how the fundamental steps of engineering quality control (i.e.: specification of the process goals; fabrication of in-spec products; and tests on the fabricated devices) can be traced by statistical quality control (SQC). SQC fixes (statistical) limits on the state of the production, and improves the uniformity of the quality by assessing the agreement of the product/process with an optimal reference. SQC has gained increasing interest from both the research community and the industrial one (Hare, 2003). It should be acknowledged that quality is a synopsis of multiform attributes, depending on a composite combination of related parameters, which are often not accessible by common instrumentation hardware, and sometimes not even measurable or quantifiable. Stated otherwise, quality is an inherently multivariable attribute. Furthermore, quality is often related to the values of all the process variables that can be measured during the product manufacturing. On this basis, classical SQC has moved a step forward to statistical process control (SPC) (Geladi and Kowalski, 1986; Wold et al., 1987; MacGregor et al., 1991; Jackson, 1991). SPC unveils


the multivariate nature of a system and, furthermore, it can relate the quality parameters to the conditions in which the production process is carried out (Kresta et al., 1991; MacGregor et al., 1991).
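As a minimal illustration of the statistical limits that SQC fixes on the state of the production, the following Python sketch builds a Shewhart-style control chart; the quality index, its in-control mean and spread, and the simulated shift are all synthetic assumptions for illustration, not data from the Thesis’ case studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic in-control reference data for a single quality index
# (assumed mean 10.0 and standard deviation 0.5; illustrative only)
x_ref = rng.normal(loc=10.0, scale=0.5, size=200)

# Shewhart chart: centre line at the reference mean,
# control limits at three estimated standard deviations
center = x_ref.mean()
sigma = x_ref.std(ddof=1)
ucl, lcl = center + 3.0 * sigma, center - 3.0 * sigma

def out_of_control(samples):
    """Flag samples falling outside the Shewhart control limits."""
    samples = np.asarray(samples)
    return (samples > ucl) | (samples < lcl)

# New production data with a simulated mean shift after sample 20
new = np.concatenate([rng.normal(10.0, 0.5, 20), rng.normal(14.0, 0.5, 5)])
alarms = out_of_control(new)
print(int(alarms.sum()))
```

The 3σ placement of the limits embodies the trade-off discussed below: wider limits give fewer false alarms but slower detection of assignable causes.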

1.3 Multivariate statistical techniques for process monitoring

Generally speaking, SPC is an expanding field of technology, whose philosophy is to supervise the process performances over time in order to emphasize the anomalous events leading to the degradation of the quality specifications (Kresta et al., 1991; Romagnoli and Palazoglu, 2006). Therefore, the goal of SPC is the quick and reliable detection of the existence, the amplitude and the time of occurrence of the changes that cause a process or a quality feature to deviate from a prescribed standard in the manufacturing of a product. SPC supports this task (MacGregor et al., 1991; Kourti and MacGregor, 1995; Seborg et al., 2004) and helps quantify the probability of observing a process behaviour that does not conform to the expected one (Nomikos and MacGregor, 1994; Flores-Cerrillo and MacGregor, 2002 and 2003; García-Muñoz et al., 2003). Consequently, SPC not only provides underlying information on the state of a plant or of a product, but also assists the operators and the process engineers in remedying a process abnormality (fault¹). The results are safer operations, downtime minimization, yield maximization, quality improvement, and reduced manufacturing costs (Chiang et al., 2001; Edgar, 2004). Since in the industrial practice every process exhibits some variability regardless of how well it is designed, operated, and instrumented, it is important to discriminate between the common cause (natural and random) variability, which is a cumulative outcome of a series of unavoidable phenomena, and the abnormal (non-random) variability triggered by assignable causes, such as process changes, faulty conditions, errors, etc… The common cause variability is a sort of “background noise”: a process that operates with only “chance causes of variation” (Montgomery, 2005) stays in a state of statistical control.
Unfortunately, other kinds of variability may occasionally be present in the output of a process, arising from improperly maintained (or controlled) machinery, operator errors, defective raw materials, unavoidable events, etc… The assignable causes lead to unacceptable levels of process performances or product defectiveness, and determine an out-of-control state. SPC helps in investigating what does not work in a process and assists in undertaking the corrective actions before non-conforming products are manufactured. Therefore, monitoring means not only understanding the status of the process, but also gaining the possibility of controlling the product quality. Direct inspection of the quality is usually impractical or, at least, delays the discovery of the abnormal process conditions, because the appearance of the defects in the

¹ A fault is an unpermitted deviation in a system (i.e.: process changes, disturbances, problems to sensors or actuators), which is often not handled adequately by process controllers.


final product takes time. However, information about the quality is encoded in the process variables, which are often measured online, frequently and in an automatic fashion, thus enabling the refinement of the measurement information and the inference of the product quality (Kresta et al., 1994; Çinar et al., 2003). In this way one can examine both the process performance and the product quality, ensuring repeatability, stability and the capability of the process to operate with little variability around an assigned target (i.e., the nominal conditions). Accordingly, SPC is a powerful tool to achieve process stability and improve process capability (Montgomery and Runger, 2003). Traditional monitoring methods consist of limit sensing and discrepancy detection (Chiang et al., 2001). Limit sensing raises an alarm if the state of the observed system crosses predetermined thresholds, while discrepancy detection raises an alarm depending on model accuracy. Limit sensing imposes some limits on the observations of every process variable, but ignores the relation of each variable with the other ones (i.e., it is univariate). To detect the departures from a prescribed state of statistical control, control charts can be used. Their use is well established because they are proven techniques for improving productivity, are effective in defect avoidance, prevent unnecessary process adjustments, and provide diagnostic and process capability information. In statistical terms, control charts are hypothesis testing techniques² that verify whether a process/product is in a state of statistical control. The in-statistical-control condition is the null hypothesis³ to be proved. The null hypothesis is verified, with a certain degree of uncertainty (level of confidence or significance), when the status of the observed phenomenon stays in proximity of the nominal conditions.
Since the nominal conditions are identified by the process average conditions, and the amplitude of the confidence limits by the common cause variability, moving the limits farther from the average conditions (raising the degree of uncertainty) decreases the risk of type I errors⁴ (false alarms) and increases the chance of type II errors⁵ (scarce sensitivity). The procedure suggested by Kourti (2003) for statistical process control develops through:
• selection of the most representative observations (process data) from a historical database for the purpose of model building; the selected observations should identify the so-called normal operating conditions (NOC);
• pre-treatment of the input data to facilitate the statistical analysis;

² Statistical hypothesis testing is a methodology to make statistical decisions based on experimental data, almost always made by rejecting, or failing to reject, a null hypothesis.
³ The null hypothesis is a statement about a plausible scenario which may explain a given set of data, and is presumed to hold unless contradicted by statistical evidence. The null hypothesis is tested to determine whether the data provide sufficient reasons to pursue some alternative hypothesis.
⁴ The type I error (or α-error, or false positive) is rejecting a correct null hypothesis, i.e. a false alarm. It occurs every time an out-of-control state is called by the monitoring charts when there is no assignable cause.
⁵ The type II error (or β-error, or false negative) is failing to reject a null hypothesis when it is false, i.e. an inadequate sensitivity. This is the risk that a point may still fall within the confidence limits of the monitoring charts when the status is really out of control.


• model calibration;
• checking the “observability” of the model, to test the efficiency of the monitoring model through a validatory procedure;
• checking the performances of the monitoring model in the diagnosis of the special causes that affect a process or a product and determine a detriment of the quality or a loss of process performances.

In typical industrial scenarios, hundreds, if not thousands, of process data are available every few seconds, being collected online from process computers and stored in the supervision systems (Nomikos and MacGregor, 1995a; Nomikos, 1996). These data are characterized by spatial correlation (i.e. relations among variables) and serial correlation (i.e. relations among measurements of the same variable taken at different times or locations). Spatial correlation is due to the fact that several process variables are usually sampled throughout the process, and the response to a certain assignable cause affects several process variables. This means that the process variability is usually restricted to a much lower dimension than the one related to the number of variables collected in a process. The process data are serially correlated as well, because of the relatively small sampling intervals. Furthermore, missing data and noise are often present. The need to handle correlation, noise, and missing data, and the requirement to keep the dimensionality of highly correlated data to a reasonably low level, call for the calibration of multivariate statistical models, such as principal component analysis (PCA) and projection to latent structures (PLS, or partial least squares regression). PCA and PLS are data-driven methodologies with computationally non-expensive input-output model structures (Kresta et al., 1994; Çinar et al., 2003), whose framework is a typical black-box representation derived from the historical data collected during experiments or industrial practice.
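The projection idea can be made concrete with a small numerical sketch. The following Python fragment (synthetic data; the reference dataset, the number of retained components and the fault direction are all assumptions for illustration) builds a PCA model by singular value decomposition and evaluates the Hotelling T² and SPE statistics for a new observation. The corresponding control limits (an F-distribution limit for T², the Jackson-Mudholkar equation for SPE) are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic reference dataset: I observations of J correlated variables,
# generated from A latent directions plus measurement noise
I, J, A = 100, 5, 2
T_lat = rng.normal(size=(I, A))
P_true = rng.normal(size=(A, J))
X = T_lat @ P_true + 0.1 * rng.normal(size=(I, J))

# Autoscale each variable to zero mean and unit variance
mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
Xs = (X - mu) / sd

# PCA via SVD: loadings P and variances of the retained components
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
P = Vt[:A].T                    # J x A loading matrix
lam = S[:A] ** 2 / (I - 1)      # eigenvalues (score variances)

def t2_spe(x_new):
    """Hotelling T2 and SPE for a single new observation."""
    xs = (x_new - mu) / sd
    t = xs @ P                  # projection onto the latent subspace
    resid = xs - t @ P.T        # part of xs not explained by the model
    return float(np.sum(t**2 / lam)), float(resid @ resid)

# An in-control observation vs. one with a broken correlation structure
t2_ok, spe_ok = t2_spe(X[0])
x_fault = X[0] + 3.0 * sd * np.array([1, -1, 1, -1, 1])
t2_bad, spe_bad = t2_spe(x_fault)
```

A fault that breaks the correlation pattern mostly inflates the SPE, while an unusually large but correlation-consistent excursion inflates T²; in practice the limits for both statistics are estimated from the NOC reference distribution, as discussed above.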
For the purpose of SPC, PCA and PLS can be used to analyze process data, and to develop inferential models or statistical process control schemes (MacGregor et al., 1991). Both PCA and PLS extract the most important, systematic information hidden in process data, usually assembled in bidimensional (2D) matrices (observations×variables), and compress it through algebraic concepts, in such a way that the information is found in the correlation pattern rather than in the individual variables’ signals (Eriksson et al., 2001). Hence, massive volumes of highly collinear and noisy variables can be examined by projecting them onto a subspace made of few fictitious variables, called principal components (PCs) or latent variables (LVs), which explain the directions of maximum variability of the data and contain the greatest part of the relevant information embedded in the data. Therefore, both methods are concerned with explaining the variance and covariance structure of a dataset through linear combinations (i.e.: PCs and LVs) of the original variables. This is the reason why PCA and PLS models are linear correlative representations, but not causal models. Note that PCA and PLS have slightly different purposes. In particular, if the goal is interpreting and modelling one block of data (e.g., process data), PCA is the proper solution (Jackson, 1991; MacGregor et al., 1991;


Kourti and MacGregor, 1995). If it is necessary to investigate the relationship between two groups of data (e.g., process variables and quality variables) to solve a regression problem, the proper method is PLS, which can estimate or predict some response variables from a collection of predictor variables (Geladi and Kowalski, 1986; Höskuldsson, 1988; Kresta et al., 1991; Burnham et al., 1999; Wold et al., 2001). In summary, the former method maximizes the variance captured from the input data, while the latter maximizes the covariance between the predictor variables and the predicted ones. Although in this Thesis the main interest is in process engineering applications of multivariate statistical methods, several applications of these techniques are reported in the most diverse fields. An incomplete excerpt of some recent applications outside the process engineering community is reported in Table 1.1.

Table 1.1 Topics of recent papers on applications of multivariate statistical methods in non-process engineering areas.

Reference | Area | Topic
Dokker and Devis (2007) | biology | sunflower and maize root cell structure study
Giri et al. (2006) | biology | examination of the metabolism of nut alkaloids in mice
Škrbić and Onjia (2007) | biology | detection of microelement content of wheat
Viñasa et al. (2007) | geology | volcano surveillance
Harrison et al. (2006) | medicine | texture analysis of non-Hodgkin lymphoma
Lee et al. (2008b) | medicine | cytotoxicity of substances for cancer treatment
Tan et al. (2005) | medicine | persistence of pollutants in adipose tissue
Whelehan (2006) | medicine | detection of ovarian cancer by proteomic profiles
Übeyli (2007) | medicine | automated diagnostic system for breast cancer
Giordani et al. (2008) | energy | electronic-nose for bio-diesel sources identification
Kirdar et al. (2008) | bioprocessing | supporting key activities for bioprocessing
Trendafilova (2008) | mechanics | vibration based damage detection in aircraft wings
Durante et al. (2006) | food processing | fragrance sensing and taste estimation
Apetrei et al. (2007) | food processing | fragrance sensing and taste estimation
Marín et al. (2007) | food processing | fragrance sensing and taste estimation
Arvisenet et al. (2008) | food processing | fragrance sensing and taste estimation
Clément et al. (2008) | food processing | fragrance sensing and taste estimation
ElMasry et al. (2008) | food processing | defects detection
Quevedo et al. (2002) | food processing | food classification and characterization
Doneski et al. (2008) | food processing | food classification and characterization
Schievano et al. (2008) | food processing | food classification and characterization
Viggiani et al. (2008) | food processing | food classification and characterization
Liu et al. (2008) | food processing | evaluation of aging or maturity
Qiao et al. (2007) | food processing | quality survey
Kim and Choi (2007) | image analysis | face recognition
Liu et al. (2007b) | image analysis | mineral processing
Liu et al. (2005) | image analysis | wood manufacturing
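The variance-versus-covariance distinction between PCA and PLS drawn above can be sketched numerically. In the toy example below (synthetic data, a single response, one latent variable extracted by a NIPALS-style iteration; all sizes and coefficients are illustrative assumptions), the first PLS weight vector is driven by the predictor-response covariance, whereas the first principal component depends only on the variance of the X block:

```python
import numpy as np

rng = np.random.default_rng(2)

# Predictor block X (e.g. process variables) and response block Y (quality)
I = 50
X = rng.normal(size=(I, 4))
b = np.array([[1.0], [0.5], [0.0], [0.0]])   # only two variables matter for Y
Y = X @ b + 0.05 * rng.normal(size=(I, 1))

Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

def first_pls_lv(X, Y, n_iter=50):
    """First PLS latent variable by NIPALS-style iteration:
    the weight w maximizes the covariance between t = Xw and Y."""
    u = Y[:, [0]]
    for _ in range(n_iter):
        w = X.T @ u
        w /= np.linalg.norm(w)
        t = X @ w
        q = Y.T @ t / (t.T @ t)
        u = Y @ q / (q.T @ q)
    return w.ravel(), t.ravel()

def first_pc(X):
    """First principal component: direction of maximum X variance."""
    return np.linalg.svd(X, full_matrices=False)[2][0]

w_pls, t_pls = first_pls_lv(Xc, Yc)
p1 = first_pc(Xc)

# The first PLS score is strongly correlated with the quality variable
r = np.corrcoef(t_pls, Yc.ravel())[0, 1]
```

Since the predictors here are uncorrelated by construction, the first PC points in an essentially arbitrary direction, while the PLS weight aligns with the variables that actually predict Y; this is the practical reason for preferring PLS in regression problems.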

Finally, multivariate statistical techniques can be extremely useful in the analysis of data from non-conventional sensors (e.g., cameras) and are applied to the field of image analysis as multivariate image analysis (MIA; Geladi, 1995), either in some of the classic fields of chemical engineering, such as plastic material processing (Liu and MacGregor, 2005), the steel industry (Bharati et al., 2004; Liu et al., 2007a), and furnace flame control (Szatvanyi et al.,


2006), or in other applications for high value added productions, namely wood manufacturing (Bharati et al., 2003), snack-food statistical quality monitoring and control (Yu and MacGregor, 2003; Yu et al., 2003), and food processing and packaging (Du and Sun, 2004; Brosnan and Sun, 2004; Du and Sun, 2008). Because batch manufacturing is the main focus of this project, in the following subsections a survey on how SPC is applied to batch processes is presented.

1.3.1 Multivariate statistical process control for batch processes

Batch and semi-batch processes are used to manufacture high value added goods, such as specialty chemicals and biochemicals, polymers, composites, pharmaceuticals, and materials for food, agriculture or microelectronics. With respect to their continuous counterparts, batch processes can accommodate multiple products in the same production facility, are flexible, easy to set up, and relatively simple to carry out, because the processing recipe usually evolves through a sequence of elementary steps performed in an assigned order to yield relatively small volumes of product with specified quality. Furthermore, setting up a batch process often requires only limited fundamental knowledge of the underlying process mechanisms.
Although the batch manufacturing of a product is performed according to a given recipe, the product quality may show great variability if no corrective actions are taken, and it is often difficult to manufacture consistent products in accordance with strict requirements. In many instances, only the batch duration is adjusted to meet the quality specification; sometimes, the operating recipe is also corrected in real time. Several reasons make batch monitoring and control a hard task (Seborg et al., 2004): the time-varying characteristics of batch processes; their nonlinear and irreversible behaviour; the lack of adequate mechanistic and fundamental models; the lack of online sensors, sensor inaccuracy and infrequent sampling of quality indices; the existence of constraints; and unmeasured disturbances (e.g., operator errors, fouling, impurities in the raw materials).
The data routinely obtained online from batch processes are not only multivariate in nature, but also nonlinear, highly auto-correlated and cross-correlated6, and time varying. The time variation implies that a new dimension, i.e. time, should be taken into account in the data. Namely, the data from batch processes can be collected in three-dimensional (3D) matrices (observations×variables×time) that hold both the variation between batches and the variation in time within a batch. PCA and PLS are linear correlative models, which are valid when the correlation structure of the data remains unchanged in time. However, the correlation structure of the data usually changes during a batch run (Kourti, 2003). Moreover,

6 Auto-correlation identifies repeating patterns over time (or space) in a periodic signal; cross-correlation is a measure of similarity between two signals.


it changes not only within a batch, but also between batches, due to process changes, plant maintenance, sensor drifts, seasonal effects, and so on. For this reason, multivariate statistical techniques have evolved to embody not only the multivariable and correlative structure of the data, but also the nonlinearity and the time-varying nature of batch data. Several methods have been suggested to face the problem of time variation and of changes in the correlation structure of the data. Basically, four classes of approaches are highlighted in the literature:
• nonlinear multivariate statistical methods, i.e. traditional multivariate statistical techniques modified in a nonlinear manner and tailored to the nonlinear nature of the input data and the nonlinear correlation structure of the data;
• multiway models, in which time is considered as an additional dimension of the data, so that the variability during the time evolution can be assessed;
• multiphase models, which split the data into a series of segments within which a steady correlation structure of the data is preserved;
• preliminary treatment of the data, in such a way as to rectify the inputs to a multivariate statistical method, either by decomposing the data signals into different resolution scales (e.g., through the wavelet transform), or by de-correlating the dataset through auto-regressive moving average (ARMA) models or state-space modelling.
In the following sub-sections, the main characteristics and the limits of the abovementioned four classes of multivariate statistical methodologies are overviewed.

1.3.1.1 Nonlinear multivariate models

Nonlinear multivariate statistical techniques were developed to overcome the problems of nonlinearity of the input data and of the nonlinear correlation structure of the data.
The key strategy is to alter the PCA and PLS algorithms so as to include the nonlinearity in the model, either by imposing nonlinear relations between variables (Wold et al., 1989; Baffi et al., 1999a), or through a neural network framework (Baffi et al., 1999b; Doymaz et al., 2003; Zhao et al., 2006b). The search for the right nonlinear structure of the model can be very demanding.

1.3.1.2 Multiway multivariate models

When batch processes have to be examined and a third dimension (time) is present in the data, the most popular multivariate statistical strategy is multiway SPC (Nomikos and MacGregor, 1994). Multiway PCA (MPCA) and multiway PLS (MPLS) are statistically and algorithmically consistent with PCA and PLS, respectively. In fact, MPCA and MPLS are equivalent to performing PCA and PLS, respectively, on an augmented 2D matrix derived by unfolding the 3D matrix.


In the so-called batch-wise unfolding (BWU) method, the data are spread out into a 2D matrix that preserves the data time order (Wise and Gallagher, 1996), by putting side by side the time slices of the original 3D matrix. A simple pre-treatment of the input data (i.e., mean-centring7) can remove the major nonlinearity of the variables (Nomikos and MacGregor, 1995b). As a result, BWU-MPCA and BWU-MPLS summarize the variability of the data with respect to both the variables and their time evolution (Kourti and MacGregor, 1995). Accordingly, the cross-correlation between variables is explained together with the auto-correlation within each variable. Namely, the entire history of the batch is taken into account, and the batch dynamics is properly represented in the model. This is an effective approach for batch-to-batch monitoring, but some problems arise in the realtime monitoring during a batch run. Not only does the BWU approach start to work well only once at least 10% of the batch history is available (Nomikos and MacGregor, 1995b), but it also has two main drawbacks: i) the batch processes to be monitored must all have the same length, and ii) the entire history of a batch should be available during the batch evolution in order to complete the 2D process data matrix. To solve the latter problem, Nomikos and MacGregor (1995a) suggested filling the incomplete matrix under the hypothesis that either the future unknown observations conform to the mean reference conditions, or the current deviation from the mean variables' trajectory remains unchanged for the rest of the batch duration. The problem of uneven batch duration is very demanding: using BWU-MPCA or BWU-MPLS requires effective methods for the alignment and synchronization of the variables' time trajectories, by stretching or shrinking each batch run to the length of a reference one.
The most popular methods for the synchronization of the variables' profiles are dynamic time warping (Kassidas et al., 1999) and the indicator variable (Westerhuis et al., 1999). The latter method uses a monotonic variable as a batch maturity index, so that the batches can be aligned, the indicator variable being an index of the percentage of batch completion. However, an indicator variable is not always available among the data. Dynamic time warping, on the other hand, is a signal synchronization technique based on a pattern-matching scheme between couples of trajectories, which expands or compresses a variable profile to match a reference one. Despite some attempts to streamline the computational burden (Kaistha and Moore, 2001; Ündey et al., 2002), warping requires a very expensive algorithmic structure, and only few online applications of synchronization strategies have been reported (Fransson and Folestad, 2006; Srinivasan and Qian, 2005 and 2007). Additionally, synchronization is not always practicable, because it often entails the interpolation of the existing data at fictitious time points, which can alter the auto- and cross-correlation structure of the data (Kourti, 2003).
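The batch-wise unfolding and the two matrix-filling hypotheses described above can be sketched in a few lines of numpy (a minimal illustration on synthetic data; all sizes and values are arbitrary):

```python
import numpy as np

# Synthetic batch data: I batches, J variables, K time points
I, J, K = 20, 4, 50
rng = np.random.default_rng(0)
X = rng.normal(size=(I, J, K))

# Batch-wise unfolding (BWU): one row per batch, the K time slices put
# side by side, so the columns are (all J variables at t=0, at t=1, ...)
X_bwu = X.transpose(0, 2, 1).reshape(I, K * J)

# Mean-centring over the batches removes the mean trajectories,
# i.e. the major nonlinearity of the variables
mean_traj = X_bwu.mean(axis=0)

# A new batch monitored online is known only up to time k, so its
# unfolded row is incomplete and must be filled
k = 10
x_new = rng.normal(size=(J, k))                    # data observed so far
obs = x_new.T.reshape(k * J) - mean_traj[:k * J]   # observed deviations

# Hypothesis 1: future observations conform to the mean reference
# conditions (future deviations from the mean trajectory are zero)
row_mean_fill = np.concatenate([obs, np.zeros((K - k) * J)])

# Hypothesis 2: the current deviation from the mean trajectory
# persists for the rest of the batch
row_hold_fill = np.concatenate([obs, np.tile(obs[-J:], K - k)])
```

An MPCA model is then simply PCA applied to the centred `X_bwu` matrix, onto which the filled rows can be projected during the batch run.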

7 Mean-centring is a pre-treatment procedure performed by subtracting the mean of each variable from its actual value.


Alternative MPCA and MPLS strategies have been developed. One such approach relies on a different unfolding of the 3D data structure, the so-called variable-wise unfolding (VWU). VWU (Wold et al., 1987) spreads out the batch data into 2D matrices that preserve the direction of the variables, but do not consider the data time order. Variable-wise unfolded matrices are built by stacking the horizontal slices of the original 3D matrix (i.e., the observations) vertically one underneath the other. With this procedure, neither estimating the future unknown part of the batch nor synchronizing the batches is necessary. This makes online application easier than with the BWU approach, because filling the incomplete matrix with fictitious observations and aligning variable profiles of uneven length would introduce a certain degree of arbitrariness. On the other hand, VWU has the disadvantages that: i) it does not consider the time order, so the dynamics of the batch is lost and the auto-correlation of the variables' signals is not accounted for; and ii) the correlation structure is forced to be constant during the entire batch (Kourti, 2003). Accordingly, the issue in the VWU scheme is how to take into account the dynamics of the process, the data auto-correlation, and the change of cross-correlation over time. The dynamics of a process can be included in a VWU framework by assuming an autoregressive (AR) structure. An AR model regresses the present (or future) values of a variable on a linear combination of the values of the same variable at previous time instants. This is fully consistent with the fact that in dynamic processes the current state depends on the past time points (Ku et al., 1995). This idea can be easily integrated into the VWU scheme by putting side by side the VWU data and the lagged versions of the variables' time signals, in the so-called dynamic PCA (DPCA) and dynamic PLS (DPLS) procedures. Lu et al.
(2005b) introduced a dynamic structure to compute the dynamic effect both within a batch and between consecutive batches. In general, DPCA and DPLS are straightforward methods to take the process dynamics into account, and the result is a much more limited correlation in the system (Chen and Liu, 2002). However, the issues of data nonlinearity and of the change in the correlation structure are still present in the VWU approach.

1.3.1.3 Multiple multivariate models

Multiple model approaches based on a BWU strategy are: i) the local models (one model per sampling instant; Rännar et al., 1998); ii) the evolving models (one model for every sampling instant and all the past sampling instants; Louwerse et al., 2000; Ramaker et al., 2005); and iii) the moving window models (models for a limited part of the batch, i.e. the current sampling instant and a few past observations; Lennox et al., 2001; Lee et al., 2004). These multiple model approaches do not require filling the incomplete data matrix with future observations. However, they require the synchronization of the batches, and involve a very large number of models, which is not always feasible.
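Variable-wise unfolding and the lagged-variable augmentation behind DPCA/DPLS can be sketched as follows (a minimal numpy illustration on synthetic data; the lag order L and all sizes are arbitrary illustration values):

```python
import numpy as np

# Synthetic batch data: I batches, J variables, K time points
I, J, K = 20, 4, 50
rng = np.random.default_rng(1)
X = rng.normal(size=(I, J, K))

# Variable-wise unfolding (VWU): the time slices (observations) of all
# the batches are stacked vertically; the time order is not used
X_vwu = X.transpose(0, 2, 1).reshape(I * K, J)

# DPCA-style augmentation: append lagged copies of the variables, so
# that an autoregressive structure of order L is embedded in the data
L = 2
blocks = []
for i in range(I):
    batch = X[i].T                                        # (K, J), rows in time order
    lagged = [batch[L - l: K - l] for l in range(L + 1)]  # lags 0, 1, ..., L
    blocks.append(np.hstack(lagged))                      # (K - L, J * (L + 1))
X_dpca = np.vstack(blocks)                                # (I * (K - L), J * (L + 1))
```

PCA (or PLS) applied to `X_dpca` instead of `X_vwu` then captures part of the process dynamics through the lagged columns.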


The alternative is to split the process into a sequence of approximately linear segments (Kourti, 2003), following a multi-model structure based on VWU-MPCA (or VWU-MPLS) analysis. The need for phase division was first introduced for the monitoring of multiple operating modes in continuous processes (Hwang and Han, 1999), but it proved to be a viable and efficient solution for batch processes, too (Ündey and Çinar, 2002; Ündey et al., 2003a; Camacho and Picò, 2008a and 2008b). Accordingly, more than one model is derived for a batch, one for each phase within the batch (Zhao et al., 2006a). Multiphase modeling attenuates the problems related to nonlinearity, and tracks the changes of correlation between the variables during the batch. Camacho and Picò (2006a and 2006b), Lu et al. (2004a, 2004b and 2004c), Lu and Gao (2005a and 2006), Zhao et al. (2007a and 2007b) and Yao and Gao (2009) have designed different strategies for automatic phase detection and switching.

1.3.1.4 Preliminary data treatment for multivariate statistical methods

The preliminary treatment of the multivariate input data can be performed through: i) multiresolution methodologies, which decompose the input signals on different frequency scales; or ii) ARMA models and state-space modelling, which remove the correlation between the data. The latter methods aim at erasing any correlation in the latent space of the PCs (or LVs). Indeed, multivariate statistical representations usually show a high degree of auto-correlation in the PCs and LVs, which determines a high rate of false alarms in SPC systems. Filtering the PCs (or LVs) with ARMA models can remove the auto-correlation; however, the univariate ARMA approach may not be sufficient to clear the correlation, as demonstrated by Xie et al. (2006).
Furthermore, the fault magnitudes and time signatures of a process may be distorted by the ARMA filtering action (Lieftucht et al., 2006), so Kalman innovation or state-space models turn out to be preferable (Table 1.2) to better represent the multivariate case (Ljung, 1999).

Table 1.2 Some papers on methods for data linearization based on Kalman innovations and state-space models.

Paper                       Topic
Xie et al. (2006)           Kalman innovation
Lieftucht et al. (2006)     Kalman innovation
Shi and MacGregor (2000)    state-space models
Li and Qin (2001)           state-space models
Treasure et al. (2004)      state-space models
Lee and Dorsey (2004)       state-space models
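As a minimal illustration of this de-correlation idea, the sketch below whitens a synthetic autocorrelated score series with a univariate AR(1) filter (a deliberate simplification of the ARMA and state-space approaches cited above; all values are illustrative):

```python
import numpy as np

# A synthetic autocorrelated latent-variable score series,
# generated as an AR(1) process
rng = np.random.default_rng(6)
n, phi_true = 2000, 0.8
e = rng.normal(size=n)
t_score = np.empty(n)
t_score[0] = e[0]
for j in range(1, n):
    t_score[j] = phi_true * t_score[j - 1] + e[j]

# Least-squares estimate of the AR(1) coefficient; the residuals of the
# one-step prediction are nearly uncorrelated, which keeps the false
# alarm rate of an SPC chart built on them under control
phi_hat = (t_score[1:] @ t_score[:-1]) / (t_score[:-1] @ t_score[:-1])
resid = t_score[1:] - phi_hat * t_score[:-1]

def lag1_corr(x):
    """Lag-1 autocorrelation of a series."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)
```

For truly multivariate latent spaces, the state-space models of Table 1.2 play the role that this scalar AR(1) filter plays here.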

In order to de-correlate the variables and to extract the deterministic features of a signal, the wavelet transform can be used. The wavelet transform produces a rectification


for any aperiodic, noisy, intermittent and transient signal, examining it in both the time and the frequency domain (Addison, 2002). Mathematically speaking, the wavelet transform is the convolution of a wavelet function with a signal, which converts the signal into a more amenable form (Addison, 2002). In fact, the transformed version of the signal is filtered in such a way as to be more easily manageable (linear, with stable white noise) by multivariate statistical techniques, making the approach suitable for data that are typically non-stationary and represent the cumulative effect of many underlying phenomena, each operating at a different scale, as in batch processes (Kosanovich and Piovoso, 1997). In this way, the contributions of the different scales of resolution are detected for all the events whose behaviour changes over time and frequency. Once the signal is decomposed into different scales of resolution, the multivariate statistical model can be built both in the frequency domain (through the approximations and the details of the signal) and in the time domain (by reconstructing the filtered version of the signal). Usually, one model is built for each decomposition scale (Bakshi, 1998; Bakshi et al., 2001; Yoon and MacGregor, 2004; Lee et al., 2005b; Maulud et al., 2006; Chang et al., 2006), considering only the scales that are most interesting for the purpose of the monitoring, either by denoising the signal (Shao et al., 1999) or by removing the higher frequencies to avoid the effects of process drifts or seasonal fluctuations (Teppola and Minkkinen, 2000). Moreover, these techniques are very useful for unambiguous fault detection (Misra et al., 2002) and isolation (Reis et al., 2008).
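A multiscale decomposition of this kind can be sketched with the simplest (Haar) wavelet, implemented from scratch on a synthetic transient signal (the signal and the number of levels are arbitrary illustration choices):

```python
import numpy as np

def haar_step(x):
    """One level of the Haar wavelet transform: returns the approximation
    (low-frequency) and detail (high-frequency) coefficients."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail
    return a, d

def haar_decompose(x, levels):
    """Multilevel decomposition: the coarsest approximation plus the
    detail vectors of each scale (finest first)."""
    details, a = [], x
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(d)
    return a, details

# A noisy transient signal: the coarse approximation captures the trend,
# while the finer details capture the noise and the fast events
t = np.linspace(0.0, 1.0, 64)
signal = np.exp(-3.0 * t) + 0.05 * np.random.default_rng(3).normal(size=64)
approx, details = haar_decompose(signal, levels=3)
```

Being orthonormal, the transform preserves the signal energy across scales; a multivariate model can then be built on the coefficients of the scales of interest, or on a signal reconstructed after discarding the unwanted scales.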

1.3.2 Multivariate image analysis

In recent years, some attractive industrial applications have involved the use of non-conventional and non-invasive sensors, such as cameras, for product quality characterization. Images are 2D light intensity mappings of a 3D scene, and are characterized by several challenging issues:
• high dimensionality of the space, because images may not only be monochromatic representations on gray levels, but may also have several transmission channels (e.g., RGB8 images, hyperspectral images, and so on);
• multivariate nature, because an image is an aggregation of a wide plurality of pixels9;
• different characteristics at different scales of resolution;
• high spatial correlation, because of the neighbourhood effect between pixels;
• nonlinearity, because of the physical structure of the object represented in the image;
• combination of spatial and spectral information;
• presence of noise, i.e. a random fluctuation of the light intensity that is an artefact of the signal.

8 RGB is a representation of colours in an additive model based on the primary colours red, green and blue.
9 In digital imaging, the pixel is the smallest piece of information of an image, arranged in a 2D grid.


Multivariate statistical methods are ideal techniques to deal with the high dimensionality of images and with their inherent multivariate nature. Accordingly, multivariate image analysis (MIA) has gained increasing interest (Geladi and Grahn, 1996) for both inferential modeling and statistical process control. MIA is a set of multivariate statistical techniques that allow images to be analyzed in a reduced-dimension space rather than in the image space (Kourti, 2005). The aim of this approach is to extract subtle multivariate information from the image, differently from the usual digital image processing, where the image is enhanced in such a way that its features become visible. Note that the problems of spatial correlation (correlation between pixels), neighbourhood, nonlinearity and noise can be faced analogously to what was suggested in Section 1.3.1. Indeed, nonlinear models, as well as multiway, multi-model and multiresolution approaches, can be extremely useful and well tailored to the purpose of image inspection. In fact, to a certain extent, it is possible to associate the concepts of neighbourhood and spatial correlation with those of process dynamics and auto- and cross-correlation, and the concept of spatial nonlinearity with that of temporal nonlinearity. Moreover, images combine spectral (in terms of both light intensity and colour) and spatial information. In the literature, the use of multiresolution MIA is often suggested (Liu and MacGregor, 2007; Bortolacci et al., 2006), where the spectral information is properly studied by the classical MIA approach, while the wavelet transform (Mallat, 1989; Ruttimann et al., 1998) is adopted to grasp the spatial information. Furthermore, the spatial information can be assessed by including the study of the textural features of the inspected image (Salari and Ling, 1995; Tessier et al., 2007).
In this way, effective image-analysis frameworks have been developed for the tasks of quality monitoring and control (Yu and MacGregor, 2003; Yu et al., 2003; Borah et al., 2007), quality classification (Bharati et al., 2004), and quality prediction (Tessier et al., 2006).
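As a minimal sketch of the MIA idea, the example below unfolds a synthetic RGB image into a pixels×channels matrix, applies PCA through the SVD, and refolds the scores into "score images" (all sizes are arbitrary; a real application would work on actual product images):

```python
import numpy as np

# A synthetic RGB image: every pixel is an observation, every colour
# channel a variable, so the (H, W, 3) image unfolds into (H*W, 3)
H, W = 32, 32
rng = np.random.default_rng(4)
img = rng.uniform(0.0, 255.0, size=(H, W, 3))

pixels = img.reshape(-1, 3)
pixels_c = pixels - pixels.mean(axis=0)      # mean-centred channels

# PCA via the SVD: the rows of Vt are the loadings; the pixel scores,
# refolded to the image geometry, give the "score images" in which
# multivariate features can be inspected
U, s, Vt = np.linalg.svd(pixels_c, full_matrices=False)
scores = pixels_c @ Vt.T                     # (H*W, 3)
score_images = scores.T.reshape(3, H, W)
```

The score images are mutually uncorrelated projections of the original channels, which is what makes masking and monitoring in the score space feasible.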

1.4 Thesis overview

As mentioned earlier, the two main topics of this Thesis are the design of multivariate statistical techniques for: i) the realtime product quality estimation and length prediction in batch chemical processes, and ii) product quality monitoring through image analysis in batch manufacturing. The challenges of both topics are presented and discussed in the following.

1.4.1 Realtime quality estimation and length prediction in batch processes

In principle, the operation of a batch process is easy, because the processing usually evolves through a "recipe", i.e. a series of elementary steps (e.g., charge, mix, heat-up/cool, react, discharge) that can be carried out even without supervision if the production facility is outfitted with a fairly large degree of automation. However, it is often the case that batch


plants are poorly instrumented and automated, and may require intervention by the operating personnel to provide online adjustments of the operating recipe through midcourse corrections, so as to avoid the production of off-specification products. In fact, if the instantaneous product quality is not found to track a specified trajectory, the processing recipe must be adjusted in real time (possibly several times during a batch), and the batch is kept running until the end-point quality meets the specification. Unfortunately, most batch processes are run in an open-loop fashion with respect to product quality control, because information about the product quality is not available online, but is obtained offline from laboratory assays of a few product samples. To contain the laboratory-related expenses (in terms of need for dedicated personnel, consumption of chemicals, use of analysis equipment, and so on), only a few product samples are taken during the course of a batch and sent to the lab for analysis. Even so, in a typical industrial scenario where several productions are run in parallel, 15,000-20,000 samples may need to be taken and analyzed each year, which adds up to an important fraction of the total product cost. Because of the lack of real time information on the product quality, it may be difficult to promptly detect quality shifts and to counteract them by adjusting the operating recipe accordingly. Therefore, significant drifts in the quality profiles may be experienced before any intervention can be done on the batch. The net result is that the recipe adjustments are delayed, the total length of the batch is increased, and the economic performance of the process is further penalized.
In this context, two typical challenges need to be addressed by a monitoring system in the production of specialty chemicals: the real time estimation of the instantaneous quality of the product, and the real time estimation of the length of the batch (or of any production stage within the batch). In fact, the performance of a batch process could be greatly improved if accurate and frequent information on the product quality were available. Software sensors (also called virtual sensors or inferential estimators) are powerful tools for this task. They reconstruct online an estimate of the "primary" quality variables from the measurements of some "secondary" process variables (typically, temperatures, flow rates, pressures, valve openings), by using a model that relates the secondary variables to the primary ones. These issues are faced in this Thesis with reference to a real-world industrial case study, i.e. a batch process for the production of resins by polymerization. It is well known that developing a first-principles model to accurately describe the chemistry, mixing, and heat, mass and energy transfer phenomena occurring in a batch process (e.g., polymerization, crystallization) requires a very significant effort. Several designed experiments may be needed to identify the most representative set of equations and all the related parameters. Furthermore, if the plant is a multi-purpose one, this effort must be replicated for all the products obtained in the same facility. Finally, the resulting first-principles soft sensor may be computationally very demanding for online use.


Multivariate statistical soft sensors may overcome these difficulties (Kresta et al., 1994; Chen et al., 1998; Neogi and Schlags, 1998; Chen and Wang, 2000; Kano et al., 2003; Kamohara et al., 2004; Zamprogna et al., 2004; Lin et al., 2007; Kano and Nakakagawa, 2008; Gunther et al., 2009). This class of inferential estimators does not require developing extra information on the process in terms of mechanistic equations or values assigned to physical parameters. Rather, they extract and exploit the information already embedded in the data, as these data become available in real time from the measurement sensors. Very often, a multivariate statistical method such as PLS can be exploited to design a soft sensor for the online estimation of quality properties. Several studies on the online estimation of product quality through multivariate statistical techniques are available for continuous polymerization processes. Most of the literature on the application of multivariate statistical methods to batch polymerization processes is related to the prediction of the end-point product quality only, or to batch classification, or is limited to simulation studies, as can be seen in Table 1.3.

Table 1.3 Literature review on the estimation of the product quality in polymerization processes: papers and topics.

Reference                               Processing    Problem               Data
Russel et al. (1998)                    continuous    realtime estimation   industrial
Komulainen et al. (2004)                continuous    realtime estimation   industrial
Lee et al. (2004)                       continuous    realtime estimation   industrial
Lu et al. (2004b)                       continuous    realtime estimation   industrial
Warne et al. (2004)                     continuous    realtime estimation   industrial
Kim et al. (2005)                       continuous    realtime estimation   industrial
Aguado et al. (2006)                    continuous    realtime estimation   industrial
Sharmin et al. (2006)                   continuous    realtime estimation   industrial
Zhang and Dudzic (2006)                 continuous    realtime estimation   industrial
Zhao et al. (2006a)                     continuous    realtime estimation   industrial
Yabuki and MacGregor (1997)             batch         end-point estimation  industrial
Kaitsha and Moore (2001)                batch         end-point estimation  industrial
Flores-Cerrillo and MacGregor (2004)    batch         end-point estimation  industrial
Ündey et al. (2004)                     batch         end-point estimation  industrial
Zhao et al. (2008b)                     batch         end-point estimation  industrial
Nomikos and MacGregor (1995)            batch         realtime estimation   simulation
Rännar et al. (1998)                    batch         realtime estimation   simulation
Chen and Liu (2002)                     batch         realtime estimation   simulation
Ündey et al. (2003a)                    batch         realtime estimation   simulation
Ündey et al. (2003b)                    batch         realtime estimation   simulation
Zhang and Lennox (2004)                 batch         realtime estimation   simulation
Lu and Gao (2005)                       batch         realtime estimation   simulation
Camacho and Picò (2006)                 batch         realtime estimation   simulation
Doan and Scrinivasan (2008)             batch         realtime estimation   simulation
Zhao et al. (2008a)                     batch         realtime estimation   simulation
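As a minimal illustration of such a PLS-based soft sensor, the sketch below implements PLS1 with the NIPALS algorithm and estimates a synthetic "primary" quality variable from "secondary" process measurements (all data, dimensions and the number of latent variables are illustrative):

```python
import numpy as np

def pls1_nipals(X, y, n_lv):
    """Minimal PLS1 (NIPALS): returns the regression vector b such that
    y_hat = X @ b, for mean-centred X (n x m) and y (n,)."""
    Xk, yk = X.copy(), y.copy()
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)            # weight vector
        t = Xk @ w                        # scores
        p = Xk.T @ t / (t @ t)            # X loadings
        q = (yk @ t) / (t @ t)            # y loading
        Xk = Xk - np.outer(t, p)          # deflation
        yk = yk - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.inv(P.T @ W) @ np.array(Q)

# Synthetic "secondary" measurements (e.g. temperatures, flow rates)
# and a "primary" quality variable linearly related to them, plus noise
rng = np.random.default_rng(5)
n, m = 100, 6
X = rng.normal(size=(n, m))
b_true = np.array([1.0, -2.0, 0.5, 0.0, 0.0, 0.0])
y = X @ b_true + 0.01 * rng.normal(size=n)

Xc, yc = X - X.mean(axis=0), y - y.mean()
b = pls1_nipals(Xc, yc, n_lv=3)
y_hat = Xc @ b + y.mean()                 # online quality estimate
```

In an industrial application the rows of `X` would come from the online process measurements and `y` from the (scarce, delayed) laboratory assays, which is exactly what makes the soft sensor calibration challenging.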

Very few papers present industrial applications of multivariate statistical soft sensors for the realtime estimation of the product quality in industrial batch processes (Marjanovic et al., 2006; Chiang and Colegrove, 2007). In this PhD Thesis, multivariate statistical techniques are


proposed to provide the online estimation of the product quality in industrial batch polymerization processes. For several specialty productions, the total batch length is not known a priori, nor are the length and the number of the processing stages within the batch. Knowing the processing time in advance is useful for several reasons. In fed-batch processes, for example, fresh raw materials and catalysts should be loaded into the process vessels at a convenient time instant to adjust the batch run in real time. The ability to estimate this instant in real time (it may change from batch to batch) can result in savings both in the number of quality measurements to be processed by the laboratory and in the required total processing time (Marjanovic et al., 2006). From a different perspective, the realtime estimation of the total length of the batch can be very useful for production planning and for the scheduling of equipment use, as well as to coordinate the operating labor resources. For these reasons, the non-conventional use of multivariate statistical techniques for the realtime prediction of the batch length is suggested and discussed in this Thesis. The abovementioned multivariate statistical techniques are applied to an industrial case study of batch polymerization for the production of resins. This process is monitored online through a fairly large number of process measurements.
Several challenging features are present in this case study:
• the process measurements are noisy, auto-correlated and cross-correlated;
• the quality measurements are available offline from lab assays, but are scarce, delayed with respect to the sampling instant, and unevenly spaced in time (a case which is rarely considered in the literature);
• the batches evolve through a nominal recipe, which is subject to several online adjustments made by the plant personnel depending on the actual evolution of the batch (as monitored by the offline quality measurements) and on their personal judgment;
• the process is poorly automated;
• the batch length exhibits a large variability.
All of these features make each batch hardly reproducible, and the online quality estimation a challenge.

1.4.2 Multivariate statistical quality monitoring through image analysis

There is a class of products whose quality is related not to chemical or physical properties, but to surface properties (such as roughness, pattern, colour, texture, and the like). For these products, quality is assessed by analyzing an image of the manufactured device. In semiconductor manufacturing, for example, image analysis is used for quality monitoring, but only for the task of measuring the most important physical parameters of the manufactured device, while several other key features of the semiconductor that determine the device quality are hidden and remain unmeasured. In particular, image inspections are used in


photolithography. Photolithography is a process that selectively removes parts of a thin film using light, so that a geometric pattern can be transferred (often from a mask) to a light-sensitive chemical (the resist) deposited on a substrate. This process is used in the fabrication of integrated circuits (ICs) as well as in many other micro-fabrication processes (e.g., micro-compressors in mechanics: Waits et al., 2005; biotechnology applications: Lee et al., 2008a). In particular, a microelectronics manufacturing process comprises an extensive sequence of complex semi-batch processes (Helbert and Daou, 2001), among which photolithography is referred to as one of the most important (Blais et al., 2001). In fact, photolithography: i) recurs up to 35 times for a given device; ii) defines the wafer critical dimension (CD) and the other most influential parameters; and iii) affects all the subsequent processing phases (e.g., the doping) and the interconnection between different segments of the device. From an economic point of view, lithography is responsible for about 60% of the processing time and 35-40% of the total cost of IC fabrication (Blais et al., 2001). As a consequence, it is quite clear that monitoring the product quality during photolithography through a fast, sensitive, and reliable system is highly advocated. Although considerable effort has been dedicated to defining technologies and procedures to meet the requirements on the product quality (Guldi, 2004; Yaakobovitz et al., 2007), automatic process control has not yet been implemented on a large scale in semiconductor manufacturing, and the industrial practice is often carried out empirically, with relatively little understanding of the underlying physics and chemistry (Edgar et al., 2000), or through run-to-run control strategies (Zhang et al., 2007 and 2008).
Statistical process control techniques, too, are sometimes adopted (Edgar et al., 2000; Yue et al., 2000; Waldo, 2001) in order to monitor the variability of the process, to detect abnormal conditions, and to identify the cause of a perceived anomaly. Currently, the most advanced monitoring strategies exploit hardware and software devices for both signal filtering and image processing (Rao, 1996; Lee, 2001). For instance, the use of scanning electron microscopy (SEM) images is common for the measurement of the physical parameters of a device (Knight et al., 2006), such as the CD (Constantoudis et al., 2003; Patsis et al., 2003). However, the typical inspection tools focus on inline optical metrology systems measuring the CD of the pattern and its variability; only the most sophisticated instruments also determine the edge height and the side-wall angle (SWA; El Chemali et al., 2004). Several important quality features, like the line edge roughness (LER), the edge surface smoothness, and the actual shape of an edge (and its variability), are still rather resilient to effective, fast and low-cost monitoring technologies. Only recently have some researchers (e.g., Zhang et al., 2007; Yaakobovitz et al., 2007; Khan et al., 2008) suggested procedures to start tackling some of the above issues. Thus, the demand of satisfying the multiple requirements of wafer fabrication and the dynamics of a quickly changing microelectronics market call for new and more powerful


monitoring tools. The quality of the manufacturing could be greatly improved if faster and more meaningful information were retrieved in a reliable fashion. For this reason, an innovative methodology is presented to inspect the surface of a product. In particular, the main components of the proposed quality monitoring strategy are: • a sensitive filtering pre-treatment, to denoise the image signal by removing the artifacts (i.e., the non-systematic fluctuations of the image light intensity) without affecting the featured parts and their peculiar characteristics (i.e., the real surface roughness); • tailored multivariate statistical monitoring models, based on a principal component analysis approach, which extract the information content on surface roughness and patterned shape. In particular, the analysis is performed by PCA at different scales of resolution. Innovative modifications of the PCA model are proposed to analyze both the surface roughness and the shape of the patterned surface. The effectiveness of the proposed approach is tested on SEM images of semiconductor surfaces after the photolithography process, but the approach is general and can also be applied to inspect a product through different types of images, in different phases of the same production system, or in different types of processes.

1.4.3 Thesis roadmap

Chapter 2 overviews the mathematical and statistical background of the methods adopted in this Thesis, i.e. multivariate statistical models and multiresolution techniques. In particular, PCA and PLS are presented, and the issues of both data pre-treatment and model enhancement are discussed. Finally, multiresolution methodologies are recalled. Chapter 3 describes the industrial process under study (i.e., the production of resins by batch polymerization). Details on the plant and on the production recipe are provided, and the industrial supervision system is briefly presented. Chapter 4 shows how to design multivariate statistical estimators of the product quality for the process under study. Different architectures of the soft sensor are presented, and improvements of the estimation performance are proposed by including a multiphase structure and dynamic information on the process. The prediction of the batch length is the topic of Chapter 5, in which the effectiveness of time-evolving methods is demonstrated. In Chapter 6, the industrial implementation of prototypes of the abovementioned soft sensors is briefly described. Chapter 7 deals with the development of a fully automatic monitoring system for the characterization of the surface of high value added products by means of multiresolution and multivariate image analysis. Reference is made to the manufacturing of integrated circuits. A prototype interface for photolithography monitoring is also presented.


Final remarks conclude the Thesis.


Chapter 2

Mathematical and statistical background

This Chapter overviews the mathematical and statistical techniques that are adopted in the development of the multivariate and multiresolution quality monitoring strategies. Details about the multivariate statistical techniques and the multiresolution wavelet transformation are presented and discussed. In particular, the theoretical formulation of PCA and PLS is recalled; after that, it is shown how these techniques can be integrated into monitoring frameworks for batch processes. Finally, the wavelet transform techniques are reviewed, describing their ability to extract the properties of a signal through a multiscale decomposition.

2.1 Multivariate statistical techniques

In the following sections, the mathematical and statistical background of the multivariate statistical techniques used in this Thesis is overviewed. In particular, details are given on both principal component analysis and projection on latent structures, from both the theoretical and the algorithmic points of view.

2.1.1 Principal component analysis (PCA)

PCA is a multivariate statistical method that summarizes the information of a large set of correlated data by projecting them onto a few fictitious orthogonal variables, which capture the variability of, and the correlation between, the original data. Suppose that a set of data (i.e., I observations of J variables) is collected in an (I×J) matrix X from an in-control reference, after being conveniently pre-treated (see §2.1.1.2). PCA performs a decomposition of the original variables onto the system of eigenvectors of the covariance matrix of X through a principal axis transformation (Jackson, 1991). In this way, PCA finds the combinations of the J original variables that describe the most meaningful trends of the dataset. From a mathematical point of view, PCA relies on an eigenvector decomposition of the covariance matrix of X:

Σ = cov(X) .  (2.1)

This method splits the X matrix of rank R into a sum of R rank-1 matrices M_r:


X = M_1 + M_2 + … + M_r + … + M_R ,  (2.2)

in which every matrix M_r can be represented by the outer product of two vectors, the scores t_r and the loadings p_r:

X = t_1 p_1^T + t_2 p_2^T + … + t_r p_r^T + … + t_R p_R^T ,  (2.3)

where p_r^T is the transpose of p_r. This operation is a principal axis transformation that maps the data into a set of uncorrelated scores t_r described by orthogonal loading vectors p_r. In fact, the simplest way to reduce the dimensionality of the original dataset is to find a standardized linear combination p_r^T X of the original variables (Härdle and Simar, 2007) that maximizes the covariance of the system, so as to deal with the correlation between the original J variables:

argmax_{p_r : ‖p_r‖=1} [cov(p_r^T X)] = argmax_{p_r : ‖p_r‖=1} [p_r^T cov(X) p_r] = argmax_{p_r : ‖p_r‖=1} (p_r^T Σ p_r) ,  with r = 1, …, R .  (2.4)

The solution of this optimization problem corresponds to the maximization of a quadratic form for points on a unit sphere, which is the following eigenvector problem (Johnson and Wichern, 2007):

argmax_{p_r : ‖p_r‖=1} (p_r^T Σ p_r) = λ_r ,  with r = 1, …, R ,  (2.5)

where the loadings p_r are eigenvectors of Σ, and the λ_r are the eigenvalues associated to the p_r:

Σ p_r − λ_r I p_r = 0 ,  (2.6)

I being the identity matrix, and the p_r the direction cosines of the new coordinate system onto which the original data are projected. As a result, λ_r is a measure of the variance explained by the product t_r p_r^T, where variance assumes the meaning of the amount of information embedded in the model. Geometrically, the scores are orthogonal:

var(t_r) = p_r^T Σ p_r = λ_r ,
cov(t_r, t_s) = p_r^T Σ p_s = 0  for r ≠ s ,  (2.7)

while the loadings are orthonormal:

p_r^T p_s = 0  for r ≠ s ,
p_r^T p_s = 1  for r = s .  (2.8)

Furthermore, if the PCs have zero mean and zero covariance, it follows that:


Σ_{r=1}^{R} var(t_r) = tr(Σ) ,
Π_{r=1}^{R} var(t_r) = |Σ| ,  (2.9)

where tr(Σ) is the trace of the covariance matrix and |Σ| is the determinant of Σ. As underlined by Jackson (1991), the new variables t_r are the principal components of X, and the terms of equation (2.3) are usually presented in descending order of the eigenvalues (explained variance). When the data contain a large number of highly correlated variables, X is not a full-rank matrix and it can be represented through a small number of PCs, in such a way that the greatest part of the variance is captured by a limited number of latent variables, defining A PCs with A ≪ R. A new observation x_{I+1} is flagged as abnormal when its Hotelling statistic exceeds the model confidence limit (T²_{I+1} > T²_lim) or when its squared prediction error exceeds the respective limit (SPE_{I+1} > SPE_lim(α)). This monitoring procedure is equivalent to testing the hypothesis of conformance of x_{I+1} to the reference set through both T²_{I+1} and SPE_{I+1}. However, a word of caution regarding the use of confidence limits on scores and Hotelling statistics is advised by Wise and Gallagher (1996), since the confidence limits hold only under specified conditions. When the input data are assumed normal and uncorrelated (IID, independent and identically distributed variables), the central limit theorem¹ can be invoked. The assumption that a sample is drawn from an IID population is necessary to obtain the distribution of the test statistics, to build the confidence limits, and to estimate the proportion of a population that falls within certain limits (Jackson, 1991). Indeed, the central limit theorem, which ensures that the scores (linear combinations of the original variables) derived from a sufficiently large X dataset are normally distributed, can be invoked only if the J variables of X are IID random variables. On the contrary, if the original variables are not IID, this fundamental assumption is violated and the scores are not normally distributed; as a result, the confidence limits are not valid in the aforesaid form.
Therefore, PCA and PLS models can be adequate representations of a phenomenon only if there is no autocorrelation in the data, the cross-correlation among variables is constant across the available samples, and the original variables are normally distributed (Kourti, 2003). In other words, multivariate statistical techniques are successful only when common-cause variability affects the process, and when the process variables are normally distributed and independent over time or space. These conditions are rarely satisfied: processes and products often show clearly non-linear behaviour and changes in the correlation structure between variables, in space and in time. Finally, a geometrical interpretation of PCA and PLS is shown in Figure 2.2. The samples of an optimal reference are projected from the original space onto a space of reduced dimension made of latent variables, which are the directions of maximum variability of the data. Within this sub-space, new observations can be analyzed through the T² and SPE indices: the T² index indicates the distance of a new observation from the average conditions of the reference, while the SPE indicates the distance of the new observation from the hyper-plane of the latent variables.
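The PCA identification and the T²/SPE monitoring logic described above can be sketched numerically. The following is a minimal illustration on synthetic data (the dataset and the function names are hypothetical, not from the thesis); the confidence limits themselves are omitted, since their computation depends on the distributional assumptions just discussed.

```python
import numpy as np

def fit_pca(X, A):
    """Fit a PCA model on a reference matrix X (I x J) after autoscaling.
    Returns loadings P (J x A), eigenvalues lam (A,), column means and stds."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mu) / sd                      # pre-treatment (autoscaling)
    S = np.cov(Xs, rowvar=False)            # covariance matrix Sigma, eq. (2.1)
    lam, P = np.linalg.eigh(S)              # eigenvector decomposition, eq. (2.6)
    order = np.argsort(lam)[::-1]           # descending explained variance
    return P[:, order][:, :A], lam[order][:A], mu, sd

def monitor(x_new, P, lam, mu, sd):
    """Hotelling T2 and SPE for a new observation (row vector)."""
    xs = (x_new - mu) / sd
    t = xs @ P                              # scores of the new observation
    T2 = np.sum(t**2 / lam)                 # distance within the latent sub-space
    e = xs - t @ P.T                        # residual: distance from the hyper-plane
    SPE = np.sum(e**2)
    return T2, SPE

# --- usage on synthetic, highly correlated reference data ---
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
P, lam, mu, sd = fit_pca(X, A=2)
T2, SPE = monitor(X[0], P, lam, mu, sd)
```

Note that `eigh` returns orthonormal eigenvectors, so the loading orthonormality of equation (2.8) holds by construction.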

¹ The central limit theorem states that the sum of a sufficiently large number of IID random variables will tend to be normally distributed.


[Figure: data in the (x1, x2, x3) space projected onto the PC1-PC2 plane; a high T_i² flags anomalies within the model, a high SPE_i flags a lack of model representativity.]

Figure 2.2 Geometrical interpretation of the confidence limits in multivariate SPC and SQM.

A large value of the Hotelling statistic (i.e., the new observation falls outside the elliptical limits) is an indicator of an unusual variation within the model, while a large SPE value (i.e., the new observation exceeds the limit perpendicular distance from the hyper-plane of the latent variables) identifies anomalies outside the model.

2.1.3.1 Contribution plots, limits on the contribution plots, and relative contributions

When a new observation x_{I+1} does not meet the NOC and an abnormal variation is detected by the monitoring charts, further analyses are needed to find which variable (or set of variables) causes the current state of the process (product) to be out of control (out of spec). The contribution of the J variables to the observed value of T²_{I+1} or SPE_{I+1} helps to make a sound guess about the assignable causes of the abnormality (Nomikos, 1996). The use of contribution plots is the most common approach to detect the root cause of the problem. Contribution plots evaluate the contribution of each original variable j to the relevant monitoring statistic, either T² or SPE. When an anomaly is detected by the Hotelling statistic or by the residuals, it is helpful to compare the contribution of every original variable to the relevant statistic with the usual value of that contribution in the NOC identified by the reference dataset. For this reason, the use of confidence bounds for the contributions to the Hotelling statistic and to the residuals was proposed (Conlin et al., 2000).


The contribution c^{T²}_{i,j} of every variable j to T_i² for an observation x_i is determined from the square root of the Hotelling statistic of equation (2.44):

c^{T²}_{i,j} = t_i Λ^{-1/2} p_j^T .  (2.61)

This is derived from the contribution c^t_{i,j} of every variable j to the scores that compose T_i²:

c^t_{i,j} = x_i p_{i,j} ,  (2.62)

where x_i is the row vector of the X data matrix referring to the i-th observation, t_i is the row vector referring to the i-th observation of the score matrix T, p_j is the row vector referring to the j-th variable of the loading matrix P, and p_{i,j} is an element of P. Similarly, the contribution c^E_{i,j} of every variable j to the squared prediction error SPE_i of the i-th observation is a single element e_{i,j} of the residual matrix E:

c^E_{i,j} = e_{i,j} .  (2.63)

The values of c^{T²}_{i,j}, c^t_{i,j}, and c^E_{i,j} describe how each variable contributes to the Hotelling statistic, to the scores, and to the residuals, respectively, and can be positive or negative (Westerhuis et al., 2000). In summary, it is possible to collect the contributions of all the J variables for all the I observations into two (I×J) matrices of the contributions to T² and SPE:

C^{T²} = { c^{T²}_{i,j} } ,  with i = 1, …, I and j = 1, …, J ,  (2.64)

and:

C^E = { c^E_{i,j} } ,  with i = 1, …, I and j = 1, …, J ,  (2.65)

respectively. Based on the assumption that both the contributions c^{T²}_{i,j} to the T² statistic and the contributions c^E_{i,j} to the residuals are IID, the 100(1−α)% confidence intervals for the contributions can be found as:

c^{T²}_{j,lim}(α) = c̄^{T²}_j ± z_{α/2} s_{c^{T²}_j} ,  (2.66)

c^E_{j,lim}(α) = c̄^E_j ± z_{α/2} s_{c^E_j} ,  (2.67)


which are the upper (when the + sign is retained) and lower (when the − sign is retained) confidence bounds. The confidence limits of the contribution plots of equations (2.66) and (2.67) are calculated from the average contributions that a given variable j assumes over all the I observations of the reference, c̄^{T²}_j and c̄^E_j:

c̄^{T²}_j = (1/I) Σ_{i=1}^{I} c^{T²}_{i,j} ,  (2.68)

c̄^E_j = (1/I) Σ_{i=1}^{I} c^E_{i,j} ,  (2.69)

and the respective standard deviations s_{c^{T²}_j} and s_{c^E_j}:

s_{c^{T²}_j} = sqrt[ (1/I) Σ_{i=1}^{I} ( c^{T²}_{i,j} − c̄^{T²}_j )² ] ,  (2.70)

s_{c^E_j} = sqrt[ (1/I) Σ_{i=1}^{I} ( c^E_{i,j} − c̄^E_j )² ] ,  (2.71)

where the mean values c̄^{T²}_j and c̄^E_j should be zero, because they should derive from standard normal distributions. Therefore, when the values of T²_{I+1} or SPE_{I+1} exceed the respective confidence limits during the monitoring of a new observation x_{I+1}, instead of considering the absolute value of the contributions, the relative size of the contributions has to be inspected (Choi and Lee, 2005); this can be done by comparing the contribution of a single variable to the average contribution of the same variable in the reference NOC. The cause of the T²_{I+1} or SPE_{I+1} alarm can be diagnosed by comparing the current values of the contributions c^{T²}_{i,j} or c^E_{i,j} to the respective limits for the entire set of the original variables j = 1, …, J. In particular, if T²_{I+1} > T²_lim(A, I, α) and a variable j* (with j* = 1, …, J) is found to satisfy:

c^{T²}_{I+1,j*} > c^{T²}_{j*,lim}(α) ,  (2.72)

then j* is the variable that "feels" the effect of the fault on T²_{I+1}. In the same way, if SPE_{I+1} > SPE_lim(α) and:

c^E_{I+1,j*} > c^E_{j*,lim}(α) ,  (2.73)

for a given j*, then j* is suspected to be the variable mainly affected by the root cause of the anomaly. When (2.72) or (2.73) is satisfied for more than one value of j*, the effect of the fault is distributed over several variables. This situation


can arise when the anomaly impacts more than one variable. Otherwise, if the anomaly distributes over all the J variables, the variable with the highest contribution-to-contribution-limit ratio, c^E_{I+1,j}/c^E_{j,lim}(α) or c^{T²}_{I+1,j}/c^{T²}_{j,lim}(α), is the one most responsible for the perturbation of the system. Therefore, interrogating the relative contributions proves to be one of the most powerful methods to obtain a diagnosis whenever a fault is detected (Facco, 2005; Choi and Lee, 2005).
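The diagnosis logic based on relative contributions can be sketched as follows. This is a minimal illustration on synthetic data (the dataset, the simulated fault, and the choice z = 1.96, i.e. α = 0.05, are hypothetical), using the SPE contributions of equation (2.63) together with z-based upper bounds in the spirit of equations (2.67)-(2.71).

```python
import numpy as np

def spe_contributions(Xs, P):
    # c^E_{i,j} = e_{i,j}: elements of the residual matrix E (eq. 2.63)
    return Xs - (Xs @ P) @ P.T

def t2_contributions(Xs, P, lam):
    # c^{T2}_{i,j} = t_i Lambda^(-1/2) p_j^T (eq. 2.61)
    return ((Xs @ P) / np.sqrt(lam)) @ P.T

def upper_limits(C, z=1.96):
    # average reference contribution (eqs. 2.68-2.69) plus z_{alpha/2}
    # times its standard deviation (eqs. 2.70-2.71)
    return C.mean(axis=0) + z * C.std(axis=0)

def diagnose(c_new, c_lim):
    # variables j* whose contribution exceeds its limit (eqs. 2.72-2.73),
    # ranked by the contribution-to-limit ratio
    flagged = np.where(c_new > c_lim)[0]
    return flagged[np.argsort(c_new[flagged] / c_lim[flagged])[::-1]]

# --- usage: a reference set plus one faulty observation ---
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
lam, P = np.linalg.eigh(np.cov(Xs, rowvar=False))
lam, P = lam[::-1][:2], P[:, ::-1][:, :2]           # retain A = 2 PCs
x_f = Xs[0].copy(); x_f[3] += 8.0                    # simulated fault on variable 3
CE = np.abs(spe_contributions(Xs, P))                # reference SPE contributions
c_lim = upper_limits(CE)
c_new = np.abs(spe_contributions(x_f[None, :], P))[0]
faulty = diagnose(c_new, c_lim)
```

Here absolute values of the contributions are monitored for simplicity; with signed contributions, both the upper and the lower bound of equations (2.66)-(2.67) would be checked.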

2.1.4 Enhancements of multivariate statistical methods

In Chapter 1 it was mentioned that some of the main complications that may arise when handling data through multivariate statistical methods are: i) the varying nature of the data along time or space, and ii) the changeable correlation structure between data. In addition to being multivariate in nature, process data are often highly auto- and cross-correlated, and often non-linear. As a consequence, the data are far from being normally distributed or independent, both of the other variables and of neighbouring observations in time/space. For these reasons, the time/space-varying nature and the changes in the correlation structure of the data have to be taken into account. Furthermore, non-normally distributed input data make the application of the abovementioned control limits in the monitoring charts impossible. In particular, the methods described in the previous sections can be applied only to two-dimensional matrices X (I×J) and Y (I×Q) of IID data. These methodologies are good mathematical representations of the relationships between variables only when the correlation between the J variables remains the same throughout the evolution of a batch. It is often the case that data have a given order in time or space. This adds a third dimension to the data array, and the variability in the third dimension should be considered in addition to the correlation between variables. For example, in chemical batch processes the variables are not at steady state, but follow time trajectories. Another example is that of images, which can be represented by matrices of light intensities (also in different spectral channels), where neighbouring data (i.e., pixels) are correlated in space.
In these examples, process/product data can be collected in 3D matrices X (I×J×K_i) or Y (I×Q×H_i), where the I different batches (or different images) are treated as different observations, while time, space, or the different spectral channels represent the third dimension. K_i and H_i are the numbers of samples collected along time (space) for observation i in the matrices X and Y, respectively. Correlation is present both along the direction of the variables j (cross-correlation) and along the direction of the time/space samples k_i or h_i (auto-correlation). Further complications are added when the 3D matrices X and Y have an irregular shape, due to differences in the number of samples (k_i or h_i) taken in time (or space). Moreover, the time trajectories of the different process or quality variables are sometimes not


synchronized among the I batches, or the spatial characteristics of an image are not aligned between the I observations, so that K_r ≠ K_s and H_r ≠ H_s for r ≠ s, with r = 1, …, I and s = 1, …, I. To deal with the changeable nature of the correlation structure and with the varying nature of the data, multi-way multivariate statistical techniques are commonly used.

2.1.4.1 Multi-way methods, data unfolding and data synchronization/alignment

When a third dimension (i.e., time or space) is present in the data, and when the data collected in an ordered manner are assembled into regular 3D matrices X (I observations × J variables × K samples), multi-way SPC (Nomikos and MacGregor, 1994) has proved to be a very effective strategy. Multi-way PCA (MPCA) and multi-way PLS (MPLS) are consistent with PCA and PLS, respectively, from both the mathematical and the algorithmic points of view. In fact, MPCA/MPLS have the same aims and benefits as PCA/PLS, because they are equivalent to performing PCA/PLS on enlarged 2D matrices derived by unfolding the 3D data matrices:

X = Σ_{r=1}^{A} t_r P_r + E ,  (2.74)

where the t_r's are the score vectors, the P_r's are the loading matrices for all the J variables and all the K samples (the directions of maximum variability for every variable at every sample in time or space), and E is the residual matrix. "Unfolding" is a technique to derive 2D matrices by spreading out the original 3D matrices in a meaningful way, so as to highlight the relevant variability to be inspected. Different unfolding methods were developed (Kourti, 2003), corresponding to different ways of unfolding the 3D matrix, but two of them are the most significant (Figure 2.3): • batch-wise unfolding (BWU); • variable-wise unfolding (VWU). BWU spreads out the 3D data into 2D matrices X_BWU that consider the time/space order of the data (Wise and Gallagher, 1996), putting the time slices of the original 3D matrix side by side along the direction of the batches. Consider a 3D matrix X = {x_{i,j,k}} with i = 1, …, I, j = 1, …, J, k = 1, …, K, where: • the i-th horizontal slice X_i is the matrix of the trajectories of all the J variables at all the K samples in time or space for observation (i.e., batch or image) i; • the j-th vertical slice X_j is the matrix of the time/space evolution of variable j for all the K samples and all the I observations; • the k-th vertical slice X_k is the matrix of the time/space sample k for all the J variables and all the I observations.


Figure 2.3 Unfolding of the three-dimensional data matrix in both the variable-wise direction and the batch-wise direction.

MPCA and MPLS can be performed using PCA and PLS, respectively, on the batch-wise unfolded 2D matrix:

X_BWU = [ X_{k=1}  X_{k=2}  …  X_{k=K} ] ,  (2.75)

which is an (I×JK) matrix. Mean-centring the batch-wise unfolded data matrix (i.e., subtracting the mean trajectory of each variable) removes the major non-linearity of the input variables (Nomikos and MacGregor, 1995b), summarizing the variability with respect to both the variables and their time/space evolution (Kourti and MacGregor, 1995). Accordingly, the


cross-correlation between variables is analyzed together with the auto-correlation in time/space within each variable. This means that, in the example of batch processes, the entire history of the batch is taken into account and the batch dynamics are properly represented in the model. In the example of images, the spatial structure (or the different spectral channels) is considered throughout the entire image. However, some difficulties arise in the real-time application of BWU to batch processes, because data are collected sequentially and are available for the entire batch only after the completion of the batch itself. BWU is therefore successfully applied to run-to-run monitoring and control strategies, but when online applications are required some issues have to be faced: before batch completion, BWU works well only if at least 10% of the batch history is already available (Nomikos and MacGregor, 1995b). Furthermore, BWU presents two main limitations for data collected in real time: • data are often not synchronized/aligned; • data are not available for the entire batch, which prevents a sequential test during a batch run. The latter problem can be solved by filling the incomplete matrix for the future unknown samples under three alternative hypotheses (Nomikos and MacGregor, 1995a): • the future samples conform to the mean reference conditions; • the current deviation from the mean variable trajectories remains unchanged for the rest of the batch duration; • the ability of PCA and PLS to handle missing data is exploited. The abovementioned methods to treat missing data can be used for this purpose. The synchronization of batches of uneven duration can be a very demanding issue. Using MPCA or MPLS on batch-wise unfolded data requires effective methods for the alignment/synchronization of the variable time trajectories or image features, stretching or shrinking the actual observation to the length of a reference one.
The most popular synchronization methods are: • dynamic time warping (Kassidas et al., 1999); • the indicator variable (Westerhuis et al., 1999). VWU (Wold et al., 1987) represents the data in 2D matrices X_VWU (IK×J) that preserve the direction of the variables (Eriksson et al., 2001) and do not consider the time or space order of the data, because they are built by stacking the observation slices X_i of the original 3D matrix vertically, one underneath the other:

X_VWU = [ X_{i=1} ; X_{i=2} ; … ; X_{i=I} ] .  (2.76)


Using this procedure, it is neither necessary to estimate the future unknown part of the batch, nor to synchronize or align the signals; the VWU approach is thus easier to implement online. However, since VWU does not consider the time/space order, the dynamics of the data are lost in batch processing, and the effect of the neighbourhood is lost in images: in summary, the auto-correlations are not considered in VWU. Furthermore, VWU forces the correlation structure between data to be constant within the entire batch or image (Kourti, 2003), but assuming a fixed and unchangeable correlation structure is too restrictive a condition. Accounting for the auto-correlation and for the change of the cross-correlation over time is the main difficulty of the VWU scheme.
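The two unfolding schemes can be sketched with simple array operations; the 3D array and its dimensions below are hypothetical.

```python
import numpy as np

# Hypothetical regular 3D batch dataset: I batches x J variables x K time samples
I, J, K = 30, 4, 50
X3 = np.random.default_rng(2).normal(size=(I, J, K))

# Batch-wise unfolding (BWU): time slices X_k placed side by side -> (I x JK),
# as in eq. (2.75). Each row is the entire history of one batch.
X_bwu = np.concatenate([X3[:, :, k] for k in range(K)], axis=1)

# Variable-wise unfolding (VWU): observation slices X_i stacked -> (IK x J),
# as in eq. (2.76). Each row is one time sample; the variable direction is preserved.
X_vwu = np.concatenate([X3[i].T for i in range(I)], axis=0)

# Mean-centring the BWU matrix subtracts the mean trajectory of each variable,
# which removes the main non-linearity of the batch trajectories.
X_bwu_c = X_bwu - X_bwu.mean(axis=0)
```

Note how the same element x_{i,j,k} ends up in row i, column kJ+j of X_bwu, but in row iK+k, column j of X_vwu: BWU keeps the time order inside each row, whereas VWU discards it.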

2.2 Multiresolution decomposition methods

Multiresolution decomposition methods are techniques that transform a signal into a representation that is more useful and easily manageable (Addison, 2002). To perform this decomposition, a transformation process is needed: the wavelet transform. This procedure entails the use of wavelet functions, i.e. localized waveforms that spread the signal from its original domain to the frequency domain. In this way the signal can be converted into a series of profiles that are more linear and more normally distributed.

Figure 2.4 Example of the "Mexican hat" wavelet: (a) translation of the location in the domain and (b) dilation of the scale.


The wavelet transform mechanism entails the comparison between a wavelet of scale a and location b and an arbitrary signal x. To carry out a proper decomposition, the waveform can be translated by varying its location b (i.e., moved along the domain in which it is defined, time or space) or dilated by varying its scale a (i.e., shrinking or widening the wavelet) (Figure 2.4). The transform gives a positive contribution when the signal and the wavelet are both positive or both negative, while it is negative when the signal and the wavelet have opposite signs (Figure 2.5). The higher the correlation between signal and wavelet, the higher the absolute value of the transform.

Figure 2.5 Mechanism of transformation of a signal through wavelet transform.

This means that, when the signal trajectory has approximately the same shape and size as the wavelet profile at a given location, the transform produces a large positive value, and vice versa when signal and wavelet are out of phase. As a consequence, the smallest wavelet scales correlate with the highest frequencies of the signal (i.e., noise), while the widest scales relate to the long-term fluctuations of the signal, such as drifts or seasonal effects. Note that an advantage of the multiresolution techniques is that the signatures of a signal in its domain (i.e., time or space) are maintained when the signal is transformed from the original domain to the frequency domain: the transformed signal can be rebuilt preserving the time/space information (which is unfeasible, for example, with the Fourier transform). In the following sections the mathematical and algorithmic aspects of the wavelet transform are presented, together with their main applications.
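This matching mechanism can be illustrated numerically by discretizing the transform integral introduced in the next section; the Mexican-hat wavelet, the grid, and the test signal below are illustrative choices.

```python
import numpy as np

def mexican_hat(s):
    # Mexican hat: proportional to the second derivative of a Gaussian
    return (1.0 - s**2) * np.exp(-s**2 / 2.0)

def cwt_point(x, s, a, b):
    """Discretized wavelet transform T(a,b) = int x(s) psi((s-b)/a)/sqrt(a) ds,
    evaluated by the trapezoidal rule on the grid s."""
    psi = mexican_hat((s - b) / a) / np.sqrt(a)
    return np.trapz(x * psi, s)

s = np.linspace(-10.0, 10.0, 2001)
x = mexican_hat(s - 2.0)          # a signal shaped like the wavelet, centred at b = 2

# The transform is large and positive where wavelet and signal match in
# scale and location, and near zero where they do not overlap.
T_match = cwt_point(x, s, a=1.0, b=2.0)
T_far = cwt_point(x, s, a=1.0, b=-6.0)
```

For this in-phase match, T_match equals the squared norm of the wavelet (about 3√π/4), while T_far is essentially zero because the two waveforms do not overlap.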

2.2.1 Continuous and discrete wavelet transform

In mathematical terms, the wavelet transform is the convolution of a signal x with ψ*_{a,b}(s), the complex conjugate of a "mother" wavelet function ψ_{a,b}, integrated over the signal range:

T(a, b) = ∫_{−∞}^{+∞} x(s) ψ*_{a,b}(s) ds ,  (2.77)

where s ∈ ℝ identifies the domain. The localized and normalized waveform of the mother wavelet is:

ψ_{a,b}(s) = (1/√a) ψ((s − b)/a) ,  (2.78)

where a is a dilation parameter and b a location parameter. This family of translations and dilations is a basis of the Hilbert space of square-integrable functions L²(ℝ). The transformation procedure compares the signal to the mother wavelet, shifting its location b and shrinking or stretching its scale a: if the signal and the wavelet are both positive or both negative in the original domain, the transform is positive; otherwise it is negative. The practical implementation of the wavelet transform entails the discretization of the scales a and of the step size between different locations b. The discretization can be given by:

ψ_{m,n}(s) = (1/√(a₀^m)) ψ((s − n b₀ a₀^m)/a₀^m) ,  (2.79)

where the integer parameters m and n control the wavelet dilation and translation, respectively. The size of the translation step is Δb = b₀ a₀^m, and the transform becomes:

T_{m,n} = ∫_{−∞}^{+∞} x(s) (1/a₀^{m/2}) ψ(a₀^{−m} s − n b₀) ds = ⟨x, ψ_{m,n}⟩ .  (2.80)

The inner products of x and ψ_{m,n} are called the detail coefficients T_{m,n}. The simplest and most efficient discretization is the so-called dyadic grid, which generates orthonormal wavelets with a₀ = 2 and b₀ = 1:

ψ_{m,n}(s) = 2^{−m/2} ψ(2^{−m} s − n) .  (2.81)

In a discretized wavelet transform there is a finite number of wavelet coefficients, each of which requires the evaluation of an integral. After the signal has been passed through the abovementioned high-pass filter ψ_{m,n}, another function φ_{m,n} (the so-called "father" wavelet) is needed to avoid this numerical complication. The father wavelet has the same form as the mother wavelet:

φ_{m,n}(s) = 2^{−m/2} φ(2^{−m} s − n) ,  (2.82)


which is orthogonal to its own translations, but not to its dilations, and performs a low-pass filtering; this scaling function establishes the multiresolution features of the wavelet decomposition. The convolution of the scaling function with the signal produces the approximation coefficients:

S_{m,n} = ∫_{−∞}^{+∞} x(s) (1/a₀^{m/2}) φ(a₀^{−m} s − n b₀) ds ,  (2.83)

so that the continuous approximation of the signal at scale m can be generated by summing the sequence of scaling functions at that scale, weighted by the approximation coefficients:

x_m(s) = Σ_{n=−∞}^{+∞} S_{m,n} φ_{m,n}(s) .  (2.84)

This is an approximate, smoothed version of the original signal. The original signal can also be rebuilt following the reconstruction representation of the inverse wavelet transform:

x_{m−1}(s) = x_m(s) + d_m(s) .  (2.85)

The reconstruction has no redundancy, because of the orthonormality of the wavelets. The term d_m(s) is constituted by the detail coefficients at scale m:

d_m(s) = Σ_{n=−∞}^{+∞} T_{m,n} ψ_{m,n}(s) .  (2.86)

The result is that a signal can be represented by combining the approximation coefficients at the coarsest scale with the series expansion of the details:

x(s) = Σ_{n=−∞}^{+∞} S_{M,n} φ_{M,n}(s) + Σ_{m=−∞}^{M} Σ_{n=−∞}^{+∞} T_{m,n} ψ_{m,n}(s) ,  (2.87)

where M is the index of the chosen scale. For a signal of finite length N_x, M = log₂ N_x is the maximum number of scales that can be investigated with the dyadic grid discretization. In summary, the wavelet transform is a band-pass filter, which lets the components within a predefined and finite range of frequencies fall into the detail coefficients at each scale (Addison, 2002). Namely, at each scale the original signal is increasingly cleansed of the higher-frequency components by means of two complementary filters: a low-pass filter and a high-pass one. From the numerical point of view, the wavelet pyramidal algorithm (Mallat, 1989) decomposes a signal trajectory sequentially into a series of approximated versions of the


Chapter 2

profile (lower frequency scales) and details (higher frequency scales), iterating the procedure at every decomposition level.

Figure 2.6 Schematic diagram of the algorithm for the wavelet transform filtering.

Passing through the filters, the original signal is split into two parts: the approximation (which retains the high scale and low frequency part of the signal), and the detail (which summarizes the high frequency, low scale part). In this way the original signal can be studied at different resolution scales, or denoised and detrended in a meaningful manner.
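The filtering step just described can be sketched numerically with the orthonormal Haar filter pair. The following is a minimal illustration (the function names are ours, not a standard library API): one level of decomposition into approximation and detail, followed by the exact reconstruction of Eq. (2.85).

```python
import numpy as np

def haar_step(x):
    """One level of the pyramidal algorithm with Haar filters:
    convolve with the low-pass/high-pass pair and down-sample by 2."""
    x = np.asarray(x, dtype=float)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation coefficients
    hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail coefficients
    return lo, hi

def haar_inverse(lo, hi):
    """Up-sample and merge: inverse one-level Haar transform."""
    x = np.empty(2 * len(lo))
    x[0::2] = (lo + hi) / np.sqrt(2.0)
    x[1::2] = (lo - hi) / np.sqrt(2.0)
    return x

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, detail = haar_step(signal)            # each has half the samples
rebuilt = haar_inverse(approx, detail)
print(np.allclose(rebuilt, signal))           # exact, non-redundant reconstruction
```

Iterating `haar_step` on the approximation reproduces the full pyramidal decomposition of Figure 2.6.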

Figure 2.7 Wavelet signal filtering: (a) down-sampling associated to the signal wavelet filtering and (b) up-sampling associated to the signal reconstruction from approximations and details.

Mathematical and statistical background


Note that, when the signal is convolved with a low-pass filter (moving the filter along the signal step-by-step over the discretized domain), a dyadic down-sampling is applied: the signal is down-sampled by a factor of 2, generating the approximation, which contains the odd-indexed elements of the filtered signal. The signal is also convolved with a high-pass filter and down-sampled to form the detail, which contains the even-indexed elements (Figure 2.7). This means that the approximation and the detail at scale m+1 are half the dimension of the signal at scale m. In the signal reconstruction from scale m+1 to scale m, the filtering process is reversed: the larger-scale components (approximation and detail) are fed back through the filters, which up-sample them and assemble them, rebuilding the original signal. In mathematical terms, this is an inverse wavelet transform. Different types of wavelets are available for signal transformation: the Mexican hat wavelet (the second derivative of a Gaussian distribution function); the Haar wavelet (the most effective for the representation of discontinuities); the Daubechies wavelets (the most frequently used in texture analysis; Salari and Ling, 1997); etc. What can be inferred from the literature is that the selection of the most appropriate wavelet is case-specific, but it is suggested (Ruttiman et al., 1998) to use wavelets that:
• cause limited phase distortion;
• maintain a faithful localization on the domain;
• de-correlate the signal in a manner sensitive to both the smooth features and the discontinuities.
The choice of the proper decomposition scale is also case-specific.
However, some general methodologies to select the most relevant scales are available in the literature, such as the comparison of statistical indices (moments, entropy, skewness, kurtosis, etc.; Addison, 2002) computed at the different resolution scales from the signals and the respective approximations.

2.2.1.1 Bi-dimensional wavelet transform

In many applications (e.g. image analysis) the dataset is a 2D matrix in the domain of the variables s ∈ ℜ² (Mallat, 1989). The wavelet transform can be used either to compress the data in a meaningful manner, or to perform a multiresolution characterization of the matrix. In both cases, two-dimensional wavelet transforms are required. They can be generated by the tensor product of their mono-dimensional orthonormal counterparts (Addison, 2002), applying the same scaling procedure as in the one-dimensional case to both the rows and the columns of the data matrix.


Figure 2.8 Schematic diagram of the matrix manipulation to decompose the 2D array on a bi-dimensional grid through wavelet transform.

Two-dimensional scaling and wavelet functions can be defined as:
• 2D scaling function: \phi(s) = \phi(s_1)\phi(s_2) ;   (2.88)
• 2D horizontal wavelet (in the sense of the rows): \psi^{h}(s) = \phi(s_1)\psi(s_2) ;   (2.89)
• 2D vertical wavelet (in the sense of the columns): \psi^{v}(s) = \psi(s_1)\phi(s_2) ;   (2.90)
• 2D diagonal wavelet: \psi^{d}(s) = \psi(s_1)\psi(s_2) ;   (2.91)


where s_1 and s_2 are elements of the 2D domain defined by all the s ∈ ℜ² (e.g., spatial coordinates of images). Accordingly, the multiresolution decomposition can be expressed as:

\left\{
\begin{aligned}
S_{m+1,(n_1,n_2)} &= \tfrac{1}{2}\sum_{k_1}\sum_{k_2} c_{k_1} c_{k_2}\, S_{m,(2n_1+k_1,\,2n_2+k_2)} \\
T^{h}_{m+1,(n_1,n_2)} &= \tfrac{1}{2}\sum_{k_1}\sum_{k_2} b_{k_1} c_{k_2}\, S_{m,(2n_1+k_1,\,2n_2+k_2)} \\
T^{v}_{m+1,(n_1,n_2)} &= \tfrac{1}{2}\sum_{k_1}\sum_{k_2} c_{k_1} b_{k_2}\, S_{m,(2n_1+k_1,\,2n_2+k_2)} \\
T^{d}_{m+1,(n_1,n_2)} &= \tfrac{1}{2}\sum_{k_1}\sum_{k_2} b_{k_1} b_{k_2}\, S_{m,(2n_1+k_1,\,2n_2+k_2)}
\end{aligned}
\right.   (2.92)

where c_k and b_k are the low-pass and high-pass filter coefficients, k_1 and k_2 the corresponding summation indices, and n_1 and n_2 location indices. The general idea of a 2D wavelet decomposition is shown in Figure 2.8. After the first decomposition, the original data matrix X_0 is split into four distinct sub-matrices: an approximation S_1; a horizontal detail T_1^h; a vertical detail T_1^v; and a diagonal detail T_1^d. In the next decomposition scale, the details are left untouched, and the next iteration decomposes only the approximation S_1. The transformation at scale m = 2 decomposes S_1 into a new approximation S_2 and the details T_2^h, T_2^v and T_2^d. This procedure can be iterated M times for a (2^M × 2^M) matrix, where the dimension of the matrices S_m, T_m^h, T_m^v and T_m^d is down-sampled to (2^{M−m} × 2^{M−m}). Once more, the original matrix can be reconstructed as:

X_0 = X_M + \sum_{m=1}^{M}\left( D_m^{h} + D_m^{v} + D_m^{d} \right) ,   (2.93)

where the matrix X_M is the smoothed version of the original matrix at the largest scale index M, while D_m^h, D_m^v and D_m^d are the reconstructions of the details from the coefficients in T_m^h, T_m^v and T_m^d, respectively.
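As a minimal numerical illustration of the tensor-product construction, the sketch below (assuming the orthonormal Haar filter pair, and one common labelling convention for the horizontal/vertical details; function names are ours) performs one level of 2D decomposition and checks that the orthonormal transform preserves the energy of the matrix.

```python
import numpy as np

def haar_1d(a, axis):
    """One-level Haar low-pass/high-pass filtering with dyadic
    down-sampling along the given axis."""
    a = np.moveaxis(np.asarray(a, dtype=float), axis, 0)
    lo = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    hi = (a[0::2] - a[1::2]) / np.sqrt(2.0)
    return np.moveaxis(lo, 0, axis), np.moveaxis(hi, 0, axis)

def haar_2d(X):
    """Tensor-product 2D decomposition: filter along the rows, then along
    the columns, yielding the approximation S and the horizontal,
    vertical and diagonal details."""
    lo_r, hi_r = haar_1d(X, axis=1)   # along the rows
    S,  Tv = haar_1d(lo_r, axis=0)    # then along the columns
    Th, Td = haar_1d(hi_r, axis=0)
    return S, Th, Tv, Td

X0 = np.arange(16.0).reshape(4, 4)
S, Th, Tv, Td = haar_2d(X0)           # four (2x2) sub-matrices
# the orthonormal transform preserves the total energy of X0
print(np.isclose((X0**2).sum(),
                 (S**2).sum() + (Th**2).sum() + (Tv**2).sum() + (Td**2).sum()))
```

Iterating `haar_2d` on the approximation `S` reproduces the scheme of Figure 2.8.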

Chapter 3
Industrial process for the production of resins by batch polymerization

In this Chapter, an industrial batch process for the production of resins is presented as a typical example of a batch polymerization process for the production of high value added goods. The process is discussed by describing both the methodology for the production of resins (i.e. the recipe) and the plant in which the production is carried out. After that, the main challenges of batch polymerization processes are briefly analyzed from the operational and the organizational points of view. Finally, it is shown how process and quality data are collected, how quality and process are usually monitored in the industrial practice, and how the recipe can be adjusted to pursue a given target. Accordingly, the benefits of implementing soft sensors for the online estimation of quality and for the real-time prediction of the batch duration are explained.

3.1 The industrial production plant and the operating recipe

Batch processing is used to manufacture high value added goods, such as specialty chemicals and biochemicals, products for microelectronics, and pharmaceuticals. With respect to their continuous counterparts, batch processes are easier to set up and often require only limited fundamental knowledge of the underlying process mechanisms. In principle, the operation of a batch process is easy, because the processing recipe usually evolves through a series of elementary steps (e.g.: charge; heat-up/cool; mix; react; discharge) that can be easily carried out even without supervision, if the production facility is outfitted with a fairly large degree of automation. Important features of batch processes are: i) their flexibility; ii) the possibility of manufacturing several different products in the same production facility; and iii) the possibility of achieving a consistently high and reproducible quality of the final product by properly adjusting the operating recipe, in spite of changes in the raw materials and in the state of the equipment or of the utilities, with a limited degree of automation. To ensure consistency and productivity, batch plants often need the manual intervention of the operating personnel to correct the operating recipe or to adjust the conditions of the reaction environment. For the purpose of quality control, information about product quality is required, but it is usually obtained from a


small number of laboratory analyses of product samples taken from the reactor. The lack of online measurements of product quality delays the detection of quality shifts and makes it difficult to counteract them by adjusting the operating recipe. Therefore, the quality control strategy for a batch process often reduces to the online control of some key process variables that are available online, and to a midcourse correction policy to compensate for the shifts detected in the product quality measured offline. All these characteristics of batch processes can be found in the case study considered in this Thesis: an industrial batch polymerization process for the production of resins. These products are high molecular weight polyester resins, produced by batch catalytic polycondensation of carboxylic acids and alcohols. Several different resins are produced in the same reactor in different production campaigns. In the following sections, the case of two resins is considered. To protect the confidentiality of the data, the resins under study are called resin A and resin B. The production of resins A and B is carried out by running batches through prescribed sequences of operating steps, most of which are triggered manually by the operators. The switching from one operating step to the subsequent one is determined by the current quality of the resin, which is determined by the combined values of two indicators, namely the resin acidity number (NA) and the resin viscosity (μ). The process recipe evolves through an operating policy that accounts for:
• the initial load of raw materials, catalyst and additives into the reactor;
• the mixing and heating-up to a preset temperature;
• the reaction with simultaneous separation of water;
• the vacuum phases;
• the discharge of the final product.
Since midcourse corrections of the recipe are always required, fresh raw materials and catalyst are added to the reaction environment to adjust the batch and to counteract the deviation from nominal conditions.

3.1.1 Resin A

Resin A is a polyester resin manufactured via batch polycondensation between a diol (D1) and a dioic acid (DA1). Besides the desired product, the polycondensation reaction leads to the formation of water, which must be removed from the reaction environment to promote the forward reaction. The typical sequence of the operating steps for the production of resin A runs as follows. Cleaning of the equipment and lines is done when a different resin has been produced in the preceding batch. Then, the reactants, additives and catalyst are loaded into the reactor. The charge of liquid D1 is automated, while DA1 is charged manually as a solid. Since DA1 is a product of fermentation, its quality may vary markedly; minor changes may be experienced in


the quality of fresh D1. None of these quality changes in the raw materials can be detected in advance. During the reactor loading, the mixing and heating systems are switched on, and heat-up continues until the reactor reaches the set-point temperature (202 °C).

[Figure 3.1 flow diagram: load to the reactor (diol D1, dioic acid DA1, catalyst, anti-oxidizing promoter) → mixing and heating → water separation → vacuum phase, with corrections (D1 or DA1, catalyst) until the quality is in spec → final raise of reactor temperature → stop batch → reactor discharge → final product]

Figure 3.1 Schematic representation of the single stage production process of the resin A (the broken arrows point to the fresh materials which are fed to the process or the products discharged by the reactor, while the full arrows indicate temporal transitions between different operations throughout the production recipe).

The rising temperature in the reactor activates the polycondensation reaction; hence, water is produced and must be removed to promote the forward reaction. Water is generated as a vapor phase that leaves the reactor. In the early stages of the batch, this vapor phase contains significant amounts of D1, which must therefore be recovered and recycled for further processing. Therefore, the vapor phase leaving the reactor is sequentially processed in the following ways: i) by differential condensation in the packed column, in such a way as to


recover liquid D1 and recycle it back to the reactor; ii) by total condensation in the condenser; iii) by washing and contact condensation in the scrubber. Vacuum needs to be applied during the course of a batch to adjust the viscosity and the molecular weight distribution of the resin. Furthermore, to ensure that the final product quality is on specification, the operating recipe always requires at least two additions of fresh raw materials and catalyst during the course of a batch. The first addition is made before vacuum is applied for the first time. Therefore, when fresh materials and catalyst are charged into the reactor again, vacuum must be broken and then resumed. When the resin quality fails to approach the target values in the expected amount of time, further amounts of fresh material are charged. Following the operators’ jargon, these supplementary additions are known as “corrections” to a batch. Corrections are the way the operators act online to compensate for any unmeasured disturbance affecting a batch, and more than one third of the batches undergo corrections. When the end of the batch approaches, the reactor temperature is increased to 220–230 °C. The batch is stopped and the product is discharged when the resin reaches the desired quality targets in terms of quality indices. The duration of the batches for the production of resin A varies between 30 and 65 h.

3.1.2 Resin B

Resin B is a high molecular weight polyester resin produced by catalytic polycondensation of carboxylic acids CA1 and CA2 and alcohols D1 and D2 with additives and catalysts. Resin B is produced in the same production facility as resin A. A scheme of the production process is shown in Figure 3.2.

[Figure 3.2 flow diagram: first load to the reactor (carboxylic acids CA1+CA2, poly-alcohols D1+D2, catalyst Ct1, anti-oxidizing agent AO) → Stage 1 → second load to the reactor (carboxylic alcohol C3, poly-alcohol D3, catalyst Ct2) → Stage 2 → reactor discharge → final product]

Figure 3.2 Schematic representation of the production process of the resin B.


The production process of resin B is similar to that of resin A. It differs in that two distinct production stages are present. Raw materials, catalyst and additives are initially loaded into the reactor. When the charge is completed, production Stage 1 is started, and mixing, heating-up and separation via the distillation column are performed. The objective of Stage 1 is to partially react the fresh materials until a pre-polymer with loosely specified characteristics is obtained. As soon as Stage 1 is completed, the pre-polymer is cooled down, and new ingredients are loaded into the reactor. Then, production Stage 2 begins and the pre-polymer is further processed (through heating, water separation, and vacuum) using a fresh catalyst. In this way, the material is processed until it reaches the quality specifications. At that point Stage 2 also terminates, the final product is discharged, and the processing equipment is ready for a new batch (cleaning of the equipment may be necessary).

3.1.3 P&ID of the production facility

Resins A and B are produced in a plant whose process and instrumentation diagram (P&ID) is shown in Figure 3.3.

Figure 3.3 Process and instrumentation diagram of the batch polymerization facility.

The most important pieces of equipment of the plant are:
• a stirred tank reactor (volume 12 m³) provided with external and internal coils;
• a packed distillation column (packing height 3 m);
• a water-cooled condenser;


• a vacuum pump;
• a scrubber;
• thermal conditioning systems with utilities such as steam, cold water and dowtherm oil.
The polymerization reaction is carried out in the reactor, which is heated through an external coil. To promote the forward reaction, the water of polycondensation has to be removed from the reaction environment. The packed distillation column (which is run in dry mode for the production of the resins under study) separates the water by partially condensing the heavier compounds and recycling them to the reactor. Downstream of the column, the water separation is sequentially performed in ancillary equipment: a water-cooled condenser and a scrubber. A vacuum pump allows the final part of the batch to be operated under vacuum, which is needed to obtain a narrower molecular weight distribution of the product and to guarantee safer operation.

3.2 Data acquisition

Data from the plant are managed by a supervision system, Movicon 9.1®, a supervisory control and data acquisition (SCADA) system realized by Progea S.r.l. (www.progea.com). The supervision system communicates with three Siemens S5 programmable logic controllers (PLC; DK3964 communication driver), two Siemens S7 PLCs (via OPC, object linking and embedding for process control), and 30 Eurotherm regulators (via Modbus remote terminal unit, RTU, drivers). The information from the plant is registered in a structured query language (SQL) database management system that is easily consultable by the plant operators. This system allows both the execution of the fundamental operations (e.g. activating the mixing system, switching on the vacuum pump, etc.) and the regulatory actions, such as manipulating the set-points of the controlled variables.

3.2.1 Monitoring of the process variables

Numerous hardware sensors and control loops are present throughout the plant. In particular, the measurements of 34 variables (Table 3.1) are routinely collected by online sensors and recorded by process computers every 30 s. Typically, these measurements (such as temperatures and pressures) include process values (PV), setpoints (SP) and valve openings (VO) in different sections of the plant. Different sensors may be present for the measurement of the same variable. Some process variables have been discarded because they proved to be irrelevant from a statistical point of view, while some others are inappropriate from an engineering point of view. The selection of the most appropriate subset of process variables for the model calibration is case-specific and is discussed in the following Chapters.


Table 3.1 Process variables measured online by the monitoring system, and numbering of the respective sensors in the plant flow-sheet.

sensor number | process variable monitored online
1  | mixing rate (%)
2  | mixing rate (rpm)
3  | mixing rate SP
4  | vacuum line temperature (°C)
5  | inlet dowtherm temperature (sensor 1) (°C)
6  | outlet dowtherm temperature (°C)
7  | reactor temperature (sensor 1) (°C)
8  | dummy
9  | column head temperature (sensor 1) (°C)
10 | valve V25 temperature (°C)
11 | scrubber top temperature (°C)
12 | inlet water temperature (°C)
13 | column bottom temperature (°C)
14 | scrubber bottom temperature (°C)
15 | reactor temperature (sensor 2) (°C)
16 | condenser inlet temperature (°C)
17 | valve V14 temperature (°C)
18 | valve V15 temperature (°C)
19 | reactor differential pressure (mbar)
20 | dummy
21 | column top temperature PV (sensor 2) (°C)
22 | column top temperature SP (°C)
23 | V42 way-1 VO (%)
24 | inlet dowtherm temperature PV (sensor 2) (°C)
25 | inlet dowtherm temperature SP (°C)
26 | V42 way-2 VO (%)
27 | reactor temperature PV (sensor 2) (°C)
28 | reactor temperature SP (°C)
29 | dummy
30 | valve V25 temperature PV (°C)
31 | valve V25 temperature SP (°C)
32 | valve V42 VO (%)
33 | reactor vacuum PV (mbar)
34 | reactor vacuum SP (mbar)

However, all the signals of the process variables are affected by noise, missing values and outliers, possibly caused by maintenance operations, by unintended interruptions of the sensor connections, or by sensor faults. The setpoints are manipulated manually by the operators, and this determines different time profiles of the controlled variables (Figures 3.4a and 3.4c). Moreover, the complex sequence of operations and the unpredictable number of corrections cause uneven batch lengths, as well as shifts and variations in the time trajectories of the process variables (Figures 3.4a and 3.4b).



Figure 3.4 Process variables time trajectories of: (a) reactor temperature (different batches); (b) reactor pressure (different batches); (c) dowtherm oil temperature setpoint (different batches); (d) correlation between the time profiles of the reactor and the dowtherm oil temperatures of batch #4.

Finally, the high sampling frequency determines strong auto-correlation in the data, and some variables are strongly cross-correlated (e.g. the dowtherm oil temperature and the reactor temperature profiles, Figure 3.4d). All these features make it impossible to monitor the process by inspecting the time profiles of single process variables.
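The auto- and cross-correlation just mentioned can be quantified directly from the trajectories. The sketch below uses synthetic, purely illustrative data (not plant data), assuming a first-order thermal lag between the oil temperature and the reactor temperature:

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative trajectories: the reactor temperature follows the oil
# temperature through first-order (lagged) dynamics
oil = 180.0 + np.cumsum(rng.normal(0.0, 0.5, 2000))     # slowly drifting input
reactor = np.empty_like(oil)
reactor[0] = oil[0]
for k in range(1, len(oil)):
    reactor[k] = 0.95 * reactor[k - 1] + 0.05 * oil[k]  # thermal lag

def lag1_autocorr(x):
    """Lag-1 autocorrelation coefficient of a sampled trajectory."""
    x = x - x.mean()
    return (x[:-1] @ x[1:]) / (x @ x)

print(round(lag1_autocorr(reactor), 3))            # close to 1: strong auto-correlation
print(round(np.corrcoef(oil, reactor)[0, 1], 3))   # close to 1: strong cross-correlation
```

Both coefficients come out close to 1, which is precisely the correlation structure that latent-variable methods such as PCA and PLS are designed to exploit.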

3.3 Empirical monitoring of the product quality

The product quality is defined in terms of two indices: the resin viscosity (μ) and the resin acidity number (NA). However, realtime measurements of product quality are not available. Rather, product samples are taken manually, quite infrequently and irregularly (i.e., one sample every 1.5–2 h, depending on the operators’ availability and on the actual evolution of the batch) and sent to the laboratory for analysis; the full analysis takes about 20 min. Furthermore, the quality measurements are not available for the entire duration of the batch, because the product sampling is initiated 8–10 h after a batch starts. Typically, 15–25 quality



measurements per batch are available; the accuracy of the laboratory assay is about 10% of the reading. From an operational perspective, offline quality measurements suffer from two drawbacks. First, they are expensive, because, when quality assessment is needed, a sample is taken (manually) and sent to the laboratory for analysis. Therefore, specific personnel must be dedicated to sample collection and sample analysis (the number of samplings/measurements for a single product may easily exceed 10,000/year in a typical industrial scenario). Typically, a company would try to save on the personnel-related expenses by reducing the number of samples to be analyzed. Secondly, quality measurements are delayed: the analysis results are available at best 20 min after a sample has been taken.

Figure 3.5 A typical quality monitoring chart used industrially for resin B production during Stage 2. Acidity number is reported as the abscissa (decreasing values from left to right), and viscosity as the ordinate. Non-standard units are used for both quality indicators. Circles indicate a sample for which quality measurements are available from the lab. The measured values of acidity number and viscosity should always fall within the broken bounds. Realtime recipe adjustments are needed when a sample falls outside the bounds. Time increases (nonlinearly) from the lower-left corner to the upper-right one.

The evolution of product quality is monitored by using empirically-derived monitoring charts. For example, a typical chart for the quality monitoring of Stage 2 in the production of resin B looks like the one shown in Figure 3.5. If the sample quality is found to lie outside the broken bounds, the operators must adjust the production recipe according to a given procedure. Note that only a few samples are taken to monitor Stage 2 in the case of Figure 3.5, despite the fact that this stage lasted as long as 32.5 h.


More timely information on product quality evolution would be highly desirable, because the production recipe could be adjusted more promptly.

3.4 Challenges for the statistical monitoring of product quality

The net result of the quite complex and mostly manually driven series of batch stages is that, although the end-point quality of a resin usually falls within a very narrow range, the “internal” variability of the batches is very large. Indeed, there are several sources of variability within a batch, most of which cannot be eliminated:
• the different state of the pieces of equipment (hot/cold);
• the optional cleaning of the lines and of the pieces of equipment;
• the variable state of the utilities (e.g. the heating system serves several reactors in the same production facility at the same time, and the duty of the heating furnace is fixed; this results in different durations of the heat-up period from one batch to another);
• only the charge of liquid raw materials is automated, while the solid raw materials are loaded manually by the operators; consequently, weighing errors and contaminations can be experienced;
• the quality of the raw materials may vary from batch to batch, because of different levels of impurities, different suppliers, loss of catalyst activity, etc.;
• the midcourse corrections, which are subject to the laboratory measurement delay and to the operators’ judgment;
• the switching from one operating step to the subsequent one, which is triggered manually by the operators, depending on the actual values of the quality indices;
• most of the operations are performed manually by the operators, depending on their availability, experience and judgment (e.g. the set-points of the controlled process variables are manipulated manually).
Most of this variability reflects itself in the trajectories of the process measurements, and eventually in the product quality and in the total batch duration. Therefore, it is hard for the management to appropriately schedule both the production policy and the use of the equipment when several batches are to be processed in series or in parallel.
All these situations make the quality estimation and the batch duration prediction a challenge.

3.5 Automated quality monitoring through soft-sensors

The monitoring strategy based on the offline laboratory assays is not an efficient approach to guarantee high quality products and reproducible operations. Because quality measurements are available quite infrequently, the switching from an operating stage to the following one may be substantially delayed, with the result of a poor monitoring of the product quality and an increase of the duration of a batch. More timely information on product quality evolution


would be highly desirable, because the production recipe could be adjusted more promptly. Delays in recipe adjustments may result in a significant increase of the processing time and in a potential loss of end-point quality. The performance of a batch process could be improved if accurate and frequent information on the product quality and the batch duration were available. Software sensors are powerful tools for this task. They are able to reconstruct online the estimates of “primary” quality variables from the measurements of some “secondary” process variables (typically, temperatures, flow rates, pressures, valve openings), by using a model that relates the secondary variables to the primary ones. Therefore, the design of a soft sensor for the online estimation of μ and NA is considered, with the objective of making frequent and accurate estimates of the product quality indicators available online, so as to avoid off-spec products and to obtain well-timed adjustments of the recipe. Moreover, the monitoring scheme can be endowed with a realtime system for the prediction of the duration of the batches, with the purpose of assisting the production planning and organization, the scheduling of the equipment use, and the coordination of the operating labour resources. Summing up, to improve the process operation and to reduce the measurement-related costs, a realtime monitoring system is sought that allows:
• to estimate online the instantaneous values of the quality indicators, in such a way as to promptly counteract any deviation from the desired quality profile by adjusting the processing recipe in real time;
• to predict the duration of the batches and of the respective operating stages, in such a way as to reduce the number of quality measurements that are required to assess online the termination of a stage, and to allow the scheduling of the use of the different pieces of equipment for different productions in the same facility.
The first part of the Thesis considers the design and implementation of the abovementioned soft sensors in a real-world industrial batch polymerization process for the production of resins.

Chapter 4
Soft sensors for the realtime quality estimation in batch processes

This Chapter¹ describes how PLS models can be used for the realtime estimation of the product quality in a batch polymerization process. In particular, PLS-based soft sensors are designed, and their performance is evaluated and enhanced using multi-phase PLS modelling. Furthermore, information about the time evolution of the process is included into the model using both a lagged-variables technique and a moving-average technique. These strategies are compared, and the benefits on the accuracy and the precision of the estimation are shown. Finally, it is explained how the reliability of the model is checked through statistical indices, and how the causes of the soft sensor faults can be diagnosed in real time.

4.1 Quality estimation in resin A using PLS models

The quality monitoring approach that has been developed relies on the PLS regression technique (Geladi and Kowalski, 1986; Wise and Gallagher, 1996). For the production of resin A, the available dataset includes measurements of the process variables and of the quality variables from 33 batches (16 months of operation of the plant facility described in Chapter 3). This dataset was split into two subsets: 27 batches constitute the reference (i.e., calibration) dataset, while the remaining 6 batches represent the validation dataset. The reference process data are collected into a three-way matrix X (I×J×Ki). Each of the J = 23 columns of this matrix contains one measured process variable, while each row corresponds to one of the I = 27 reference batches; time occupies the third dimension, and Ki is the total number of recordings taken for each of the J process measurements during batch i. As was already mentioned, the duration of the generic batch i is not fixed, and this makes Ki change from batch to batch (typically Ki = 4000–8000 samples). The process variables included into the model (Table 4.1) were selected by discarding some variables that proved not to be relevant either from the engineering point of view (e.g. valve V43 temperature), or from the statistical

¹ Portions of this Chapter have been published in Facco et al. (2007), Facco et al. (2008a), Facco et al. (2009a), and Faggian et al. (2009).


point of view (constant setpoints). The generic element of matrix X is denoted with the symbol x_{i,j,ki}.

Table 4.1 Subset of the measured process variables included into the PLS models, and their relevant position j in the process data matrix X.

matrix column | online monitored variable
1  | mixing rate (%)
2  | mixing rate
3  | vacuum line temperature (°C)
4  | inlet dowtherm temperature (sensor 1) (°C)
5  | outlet dowtherm temperature (°C)
6  | reactor temperature (sensor 1) (°C)
7  | column top temperature PV (sensor 1) (°C)
8  | scrubber top temperature (°C)
9  | inlet water temperature (°C)
10 | column bottom temperature (°C)
11 | scrubber bottom temperature (°C)
12 | reactor temperature (sensor 2) (°C)
13 | condenser inlet temperature (°C)
14 | valve V14 temperature (°C)
15 | valve V15 temperature (°C)
16 | reactor differential pressure (mbar)
17 | column top temperature PV (sensor 2) (°C)
18 | V42 way-1 VO (%)
19 | inlet dowtherm temperature PV (sensor 2) (°C)
20 | V42 way-2 VO (%)
21 | reactor temperature PV (sensor 2) (°C)
22 | valve V42 VO (%)
23 | reactor vacuum PV (mbar)

The arrangement of the three-way Y (I×Q×Hi) matrix is similar; however, only Q = 2 columns are present, which correspond to the two quality variables to be estimated (NA and μ). The third dimension of Y is scanned unevenly and with a much lower frequency than that of the X matrix (i.e., Hi << Ki). Therefore, when static estimators were designed, the Xi matrices were pruned in such a way as to eliminate all the rows that do not correspond to a time instant where a quality measurement is available. Note, however, that this pruning is needed only during the PLS model calibration phase. Once the model has been designed, it can be interrogated any time a process measurement is available, regardless of whether a quality measurement is available or not, thus obtaining NA and μ estimates at the same frequency as the process measurements (i.e., every 30 s). Finally, note that two distinct Y matrices were considered, one for NA and one for μ.
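As an illustration of the variable-wise unfolding and of the row pruning described above, the following sketch uses NumPy with small synthetic batches; all sizes and sampling instants are invented for the example, not taken from the plant dataset:

```python
import numpy as np

# Variable-wise unfolding: each batch contributes K_i rows of J process
# variables, so batches of unequal duration stack without alignment.
# Sizes here are invented (the real dataset: I = 27 batches, J = 23 variables).
rng = np.random.default_rng(0)
batches = [rng.normal(size=(K_i, 3)) for K_i in (50, 80, 65)]   # J = 3

X_vwu = np.vstack(batches)              # two-way (sum of K_i) x J matrix
assert X_vwu.shape == (195, 3)

# For the static estimators, only the rows at which a laboratory quality
# sample is available are kept (pruning used during calibration only).
quality_rows = [np.array([10, 30, 49]),   # hypothetical sampling instants
                np.array([20, 55, 79]),
                np.array([15, 40, 64])]
X_pruned = np.vstack([b[r] for b, r in zip(batches, quality_rows)])
assert X_pruned.shape == (9, 3)          # one row per quality measurement
```

Once calibrated on the pruned rows, the model can still be interrogated at every process sampling instant, since each row of the unfolded matrix is a valid input on its own.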

4.1.1 Single-phase PLS model

As discussed previously, the operating recipe for the production of the resin results in a complex series of operations, most of which are subject to the operators’ manual intervention. Therefore, also owing to the intrinsic nonlinear nature of the process, it is quite unlikely that the correlation structure between the variables remains the same during the whole duration of a batch. In turn, this means that a single linear PLS model may not be able to provide an accurate prediction of the quality variables along the whole duration of a batch. To check this


conjecture and provide a term for comparison, a single PLS model on 5 LVs for the estimation of μ and NA was built from the reference dataset as a first attempt. The number of latent variables was chosen in such a way as to minimize the estimation error in the validation dataset. Typical validation results are reported in Figure 4.1, where the acidity number and the viscosity predicted by this model are compared to the measured values.

Figure 4.1 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using a single PLS model. The vertical bars represent the laboratory assay accuracy.

Although the frequency at which the quality estimations are made available is much higher than the frequency of the lab measurements (which can improve the monitoring of the process), the estimation accuracy is not satisfactory. The use of nonlinear transformations on the process variables or of a nonlinear inner relationship in the PLS algorithm did not improve the results significantly.

4.1.2 Multi-phase PLS model

One approach to overcome the nonlinearity problems (i.e., a changing correlation structure among the variables) is to divide a batch into different phases, and to develop a linear PLS submodel for each of these phases (Kourti, 2003; Zhao et al., 2006). In this case, a criterion also needs to be found to detect online a phase change so as to dictate the switching between one submodel and the subsequent one. In the process under study, designing a different PLS model for each operating step (multistage PLS model, as referred to in the literature: Ündey et al., 2002; Lu and Gao, 2005a; Camacho and Picò, 2008a) is not a viable solution, not only because the number of operating steps is large, but also because too few quality measurements are usually available within a single operating step to design the relevant PLS submodel. Furthermore, the actual number of operating steps in a batch is not known a priori, being dependent on the number of corrections that the batch will be subject to.


Soft sensors for the realtime quality estimation in batch processes

An alternative approach is to check whether different operating steps in a batch share the same correlation structure among the measurements. If this is the case, the same PLS submodel can be used to represent these operating steps. These “shared” operating steps constitute the same estimation phase. Following this approach, we may end up with a number of estimation phases that is (possibly much) lower than the number of the operating steps. To detect how many estimation phases can be recognized in the reference batches, the scores on the first two LVs can be plotted one against the other when a single PLS model is built from the reference dataset. In Figure 4.2a, each point represents the batch state in a certain instant of time when quality samples are available, and this is repeated for each of the reference batches.

Figure 4.2 Scores plot on the first and second latent variables for a single PLS model: (a) whole reference dataset and (b) validation batch #3. The boxes indicate the approximate boundaries of the estimation phase regions.

It can be seen that the score points are mainly clustered into three distinct regions of the score plane. A closer inspection of the score points related to each single batch (Figure 4.2b) revealed that all batches are characterized by a similar pattern in the “movement” of the score points: at the beginning of the operation, a score point is located at the left of the score plane (“Phase 1” cluster), then it moves to the center of the plane (“Phase 2” cluster) as time progresses, and finally it shifts to the right of the plane (“Phase 3” cluster) towards the end of the batch. The correlation structure between variables is more similar for points within a cluster than for points between clusters. Otherwise stated, each cluster represents an estimation phase, and can be envisioned as a series of operating steps that maintain the same correlation structure among the variables. Therefore, one distinct PLS submodel can be developed for each estimation phase to predict the quality variables from the process ones within that phase. The resulting quality estimator is called a three-phase PLS (TP-PLS) estimator. Note that clusters could also be identified without using process knowledge. To this purpose, clustering


techniques based on PCA and PLS can be an effective way to obtain automatic cluster detection (Lu and Gao, 2005). In the presence of auto-correlated and cyclic process data, like the ones encountered in the process under study, the clustering algorithm proposed by Beaver et al. (2007) can also prove useful. Note, however, that the number of clusters must be kept as small as possible, because if too few quality measurements are available within a cluster, it may be impossible to design the relevant PLS submodel.

Figure 4.3 TP-PLS estimator: reliability of the Phase 2 submodel in the estimation of acidity number for validation batch #3. (a) Scores plot for the first two latent variables; (b) squared prediction error plot; and (c) comparison of the estimated and measured values. The dashed lines in (a) and (b) indicate 95 % confidence limits. To improve the readability, most of the samples have not been plotted in (a) and (b).
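As a sketch of automatic cluster detection on the score plane, a generic clustering algorithm can be applied to the (LV1, LV2) scores; k-means is used below purely for illustration (it is not the method of Lu and Gao, 2005, or Beaver et al., 2007), and the three well-separated synthetic clusters mimic the three estimation phases of Figure 4.2:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Synthetic score points forming three well-separated clusters, mimicking
# the three estimation phases visible in the LV1-LV2 plane of Figure 4.2.
scores = np.vstack([rng.normal(loc=c, scale=0.4, size=(60, 2))
                    for c in ([-4.0, -2.0], [0.0, 3.0], [5.0, -1.0])])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
assert len(set(labels)) == 3        # three phases recovered automatically
```

In practice the number of clusters would be kept small for the reason given above: each cluster must contain enough quality samples to calibrate its own PLS submodel.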

A key issue in the development of such a multi-phase estimator is finding a proper criterion to switch from one PLS submodel to the subsequent one (Camacho and Picò, 2006a and 2006b; Lu and Gao, 2004a, 2004b, 2004c, 2005a, 2005b and 2006). Switching from one submodel to the subsequent one means being able to recognize in real time that the correlation structure of


the data is changing. It was observed that, due to the large inter-batch variability, “time” is not a good indicator to assess phase switching for the process under study. Therefore, submodel switching was linked not to time, but to events: there are certain events that do occur in all batches and mark a change in the correlation structure, although they may occur at a different time from batch to batch. Analysis of the process and quality data for all the reference batches revealed that the switching from Phase 1 to Phase 2 occurs the first time vacuum is applied to the reactor, while Phase 3 begins as soon as the final rise of temperature takes place. Following this approach, not only does the number of submodels to be developed remain sufficiently low, but clearly detectable events can also be recognized during a batch to trigger the switching between submodels. It should be stressed that each submodel is representative only of the phase it refers to. Figure 4.3 clarifies this issue with respect to the Phase 2 submodel. The scores plot of Figure 4.3a refers to a typical validation batch, and shows the similarity of each sample to the Phase 2 samples of the reference set of batches. Only during Phase 2 do the validation scores fall within the 95 % confidence ellipse of the Phase 2 submodel. When the process measurements start to be recorded (Phase 1), the score points all lie well outside the confidence ellipse; they enter into, and stay within, the ellipse during Phase 2, while during Phase 3 they tend to move again outside the ellipse to a different region of the score plot. The squared prediction error plot in Figure 4.3b shows that the Phase 2 submodel is not reliable as a quality estimator during Phase 1 or Phase 3. In fact, the estimation of the acidity number is accurate only during Phase 2, while during Phases 1 and 3 the estimations are affected by severe errors (Figure 4.3c).
Therefore, care must be taken to identify the proper switching criterion and to detect it online; early or delayed detection may lead the soft sensor to provide unreliable quality estimates.
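The event-driven switching logic can be sketched as a small state machine; the event flags below are illustrative placeholders, since the actual detection conditions depend on the plant recipe and instrumentation:

```python
# Event-driven submodel switching: the estimation phase advances when a
# characteristic event is detected online, never at a fixed time. The event
# flags are illustrative placeholders for the actual plant detection logic.
def detect_phase(phase, vacuum_applied, final_temp_rise):
    """Return the current estimation phase (1, 2 or 3)."""
    if phase == 1 and vacuum_applied:     # first vacuum application: 1 -> 2
        return 2
    if phase == 2 and final_temp_rise:    # final temperature rise: 2 -> 3
        return 3
    return phase                          # phases never switch backwards

phase = 1
events = [(False, False), (True, False), (True, False),
          (True, True), (False, True)]
for vacuum_applied, final_temp_rise in events:
    phase = detect_phase(phase, vacuum_applied, final_temp_rise)
assert phase == 3
```

The one-way structure (phases never revert) mirrors the fact that each event marks a permanent change in the correlation structure of that batch.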

Figure 4.4 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the TP-PLS model. The vertical bars represent the laboratory assay accuracy.


The typical estimation performance of the TP-PLS soft sensor in a validation batch is shown in Figure 4.4. It can be seen that the performance is greatly improved with respect to that of the single PLS model. The estimation accuracy is generally within the accuracy of the laboratory analysis. Yet, some noise in the estimation is present (for example, in the estimation of the acidity number during Phase 1, and in the estimation of viscosity during Phase 3). Furthermore, the viscosity estimation displays a somewhat erratic behavior during Phase 2. In the next section, two different approaches will be considered to further improve the estimation performance.

4.2 Including time information to improve the estimation performance

The variable-wise unfolding of the three-way X and Y matrices (Wold et al., 1998) has the advantage of being very simple to carry out, because it can be applied in a straightforward way to sets of batches that have different time duration, without the need to synchronize the batch length. The price to pay for this simplicity is that the “time footprint” of the data is lost, because the order in which the rows of the two-way X and Y matrices are assembled following a variable-wise unfolding is unimportant for the design of a PLS estimator. Otherwise stated, the TP-PLS model is inherently static, and this may affect the estimation performance, given the fact that a batch process is inherently dynamic. To account for the process dynamics, two different techniques were evaluated, namely the augmentation of the process data matrix with lagged values, and the use of averaged values instead of point values in the X matrix.

4.2.1 Improving soft sensor performance through lagged process variables

A PLS model on variable-wise unfolded data is inherently static. To take into account the dynamic behavior of a batch process, the use of dynamic PLS models has been suggested (Ku et al., 1995; Chen and Liu, 2002; Sharmin et al., 2006). By following this approach, the process data matrix is augmented with lagged values of the process variables at the past sampling instants. To keep the column dimension of the XVWU process data matrix reasonably small, lagged values of only the three most important variables, as identified by the VIP method (Chong and Jun, 2005), were considered in three past time instants (Figure 4.5). These variables are the column top temperature (variables #7 and 17), the column bottom temperature (variable #10), and the reactor temperature (variables #6, 12, and 21). To the


purpose of minimizing the number of variables retained in the model, only variables #7, #10, and #21 were considered in the lagged models.

Figure 4.5 The most important variables in the PLS projection method according to the VIP index.

By trial and error, it was found that a good performance of the estimator could be obtained by considering the current measurement value plus the values lagged by 1 h (120 time instants), 3 h (360 time instants) and 5 h (600 time instants) for the selected process variables. Information on the most proper values for the lags can be obtained also by studying the autocorrelation and cross-correlation structure of the process and quality measurements. By using this approach, it was found that, depending on the process variable and on the estimation phase, the most appropriate lags range from 300 to 900 time instants, which is consistent with the values we considered in our simplified approach. Note that the variety of lags existing for different process variables is an indication of the variety of time scales that may exist in the process dynamics. In order to account for the effect of different time scales, a more rigorous approach could be taken using a multiresolution PLS approach (Bakshi, 1998). However, this was found to be unnecessary in the present application. The reference process data matrix was therefore augmented by including 3 ⋅ 3 = 9 additional columns:

$$\mathbf{X}_{\rm L} = \left[ \mathbf{X}_{\rm VWU} \;\; \mathbf{X}^{\rm D} \right] \, , \qquad (4.3)$$

where:

$$\mathbf{X}^{\rm D} = \begin{bmatrix} \mathbf{X}_1^{\rm D} \\ \mathbf{X}_2^{\rm D} \\ \vdots \\ \mathbf{X}_i^{\rm D} \\ \vdots \\ \mathbf{X}_I^{\rm D} \end{bmatrix} \, , \qquad (4.4)$$

and

$$\mathbf{X}_i^{\rm D} = \left[ \mathbf{x}_{i,7}^{-120} \;\; \mathbf{x}_{i,7}^{-360} \;\; \mathbf{x}_{i,7}^{-600} \;\; \mathbf{x}_{i,10}^{-120} \;\; \mathbf{x}_{i,10}^{-360} \;\; \mathbf{x}_{i,10}^{-600} \;\; \mathbf{x}_{i,21}^{-120} \;\; \mathbf{x}_{i,21}^{-360} \;\; \mathbf{x}_{i,21}^{-600} \right] \, , \qquad (4.5)$$

where $\mathbf{x}_{i,j}^{-\Delta K}$ is the time trajectory of the jth variable for the ith batch lagged by ΔK time instants. This approach introduces process variables that are more collinear with the quality ones. As a result, the variability in XL is more representative of the variability in Y, and the estimation capability is improved. It was verified that, also in this case, each batch can be segmented into three estimation phases. The resulting soft sensor is called a lagged three-phase PLS (LTP-PLS) estimator, in which 5, 4 and 3 LVs were chosen for Phases 1, 2 and 3, respectively. The reduction of the number of LVs in Phases 2 and 3 prevents overfitting problems, particularly when the signal-to-noise ratio is low.
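The augmentation of Eqs. (4.3)–(4.5) can be sketched as follows; sizes, column indices and lags are scaled down for illustration (the thesis lags variables #7, #10 and #21 by 120, 360 and 600 samples), and padding the first samples of each batch with the initial value is an assumption of this sketch:

```python
import numpy as np

def lag_augment(X, cols, lags):
    """Append, for each column index in cols, copies of that column shifted
    back by each positive lag; the first samples are padded with the
    batch's initial value (an assumption of this sketch)."""
    lagged = []
    for j in cols:
        for dk in lags:
            col = np.empty(len(X))
            col[:dk] = X[0, j]          # pad the start of the batch
            col[dk:] = X[:-dk, j]       # value observed dk samples earlier
            lagged.append(col)
    return np.column_stack([X] + lagged)

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 5))
XL = lag_augment(X, cols=[1, 3], lags=[2, 5, 9])   # 2 variables x 3 lags
assert XL.shape == (50, 5 + 2 * 3)                 # 9 extra columns in the thesis
```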

Figure 4.6 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the LTP-PLS model. The vertical bars represent the laboratory assay accuracy.

Figure 4.6 shows the estimation results for this model on a typical validation batch. It can be seen that including information about the batch dynamics, through the use of lagged measurements, suppresses most of the noise that was apparent in the TP-PLS estimations. A slight improvement is also obtained in the accuracy of the viscosity estimation during Phase


2, although the estimated values of this quality indicator in this phase still seem to suffer from some inaccuracy. It should be noted, however, that quality estimation during Phase 2 is inherently difficult because all the corrections to the operating recipe take place during Phase 2, which is therefore subject to a much larger inter-batch variability than the other phases.

4.2.2 Improving soft sensor performance through moving-average process data

An alternative way to account, although indirectly, for dynamics in the process data is to build the process data matrix with averaged values of the measurements, instead of current process measurement values. Modifications to the standard PLS algorithm that consider moving windows on weighted past process measurement values have already been proposed for use within recursive and adaptive process control strategies (Wold, 1994; Dayal and MacGregor, 1997b; Rännar et al., 1998; Qin, 1998; Wang et al., 2003). However, a different approach was taken here. The PLS algorithm itself was not altered; what was altered instead is the process measurement matrix: each entry in a column of the process data matrix X_BWU represents the average value of the relevant process measurement j, in a certain batch i, within a window including the previous ΔK′ time samples. Namely, the value included in the process data matrix at any time instant is the average of the last ΔK′ samples:

$$\mathbf{X}_{\rm BWU} = \begin{bmatrix} \overline{\mathbf{X}}_1 \\ \overline{\mathbf{X}}_2 \\ \vdots \\ \overline{\mathbf{X}}_i \\ \vdots \\ \overline{\mathbf{X}}_I \end{bmatrix} \, , \qquad (4.6)$$

where:

$$\overline{\mathbf{X}}_i = \begin{bmatrix} \bar{x}_{i,1,1000} & \bar{x}_{i,2,1000} & \cdots & \bar{x}_{i,J,1000} \\ \bar{x}_{i,1,1001} & \bar{x}_{i,2,1001} & \cdots & \bar{x}_{i,J,1001} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{x}_{i,1,K_i} & \bar{x}_{i,2,K_i} & \cdots & \bar{x}_{i,J,K_i} \end{bmatrix} \, , \qquad (4.7)$$

and

$$\bar{x}_{i,j,k} = \frac{\sum_{r=k-\Delta K'}^{k} x_{i,j,r}}{\Delta K'} \, . \qquad (4.8)$$


The moving averages are used not only to dampen measurement noise (as done for example by Kamohara et al., 2004), but also to smooth out short-term fluctuations (process noise) while, at the same time, preserving the capability to highlight long-term trends. Smoothing process noise was necessary because when a correction takes place in a batch most process variables change abruptly (e.g., when vacuum is broken, most process variables undergo a step-wise change), while bulk properties (like μ and NA) are practically insensitive to such abrupt changes. Therefore, the performance of a purely static estimator may be disrupted when these events occur. A similar effect is found when, due to poor controller tuning, some process variables tend to cycle (typically, the reactor temperature), while at the same time this cycling does not affect the product quality properties. The length of the moving window was set by trial and error to 900 time samples (7.5 h). The wide time window also makes it possible to incorporate the variability within most of the first part of the batch, when no quality measurements are taken. Furthermore, the size of the time window (about 10–15% of the entire batch) underlines the importance of including long-term past history information in the data. In fact, other moving-average strategies were explored:
• a weighted moving average approach, in which the input data are calculated as:

$$\bar{x}_{i,j,k} = \frac{x_{i,j,k} + \lambda \bar{x}_{i,j,k-1}}{1 + \lambda} \, ; \qquad (4.9)$$

• an exponentially weighted moving average technique, where the input data are:

$$\bar{x}_{i,j,k} = \lambda x_{i,j,k} + (1 - \lambda)\bar{x}_{i,j,k-1} \, , \qquad (4.10)$$

and λ is a forgetting factor that gives less weight to the past time instants. These strategies, which give less weight to the past measurements, proved less effective from the estimation point of view than the plain moving average, confirming the importance of incorporating long-term “memory” into the soft sensor. The resulting estimator is a moving-average three-phase PLS (MATP-PLS) soft sensor built on 5 LVs for each phase, and its performance is illustrated in Figure 4.7. As expected, the estimated profiles of the quality variables are smoother than with the other models. An improvement is apparent in the accuracy of the estimated viscosity profile during Phase 2.
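The three averaging schemes of Eqs. (4.8)–(4.10) can be sketched for a single variable trajectory as follows; the recursive forms assumed here for Eqs. (4.9) and (4.10) operate on the previously averaged value:

```python
import numpy as np

def moving_average(x, dk):
    # Eq. (4.8): plain average over a window of the last dk samples
    return np.array([x[max(0, k - dk):k + 1].mean() for k in range(len(x))])

def weighted_ma(x, lam):
    # Eq. (4.9): recursive two-term weighted average (assumed recursive form)
    out = np.empty(len(x))
    out[0] = x[0]
    for k in range(1, len(x)):
        out[k] = (x[k] + lam * out[k - 1]) / (1.0 + lam)
    return out

def ewma(x, lam):
    # Eq. (4.10): exponentially weighted moving average, lam = forgetting factor
    out = np.empty(len(x))
    out[0] = x[0]
    for k in range(1, len(x)):
        out[k] = lam * x[k] + (1.0 - lam) * out[k - 1]
    return out

x = np.array([0.0, 0.0, 10.0, 10.0, 10.0])      # a step-wise change
assert moving_average(x, 2)[-1] == 10.0         # window fully past the step
assert ewma(x, 0.5)[-1] < 10.0                  # EWMA still remembers the step
```

The contrast in the two assertions illustrates the point made above: the windowed average forgets the past completely once the window has moved on, whereas the forgetting-factor schemes keep an exponentially decaying memory of it.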


Figure 4.7 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the MATP-PLS model. The vertical bars represent the laboratory assay accuracy.

Finally, it should be stressed that more sophisticated nonlinear methods (e.g., the multiresolution methods proposed by Kosanovic and Piovoso, 1997, and by Maulud et al., 2006) were investigated, but they provided results similar to those of the moving-average strategy, though with a heavier computational burden.

4.3 Comparison of the estimation performances

Table 4.2 allows for a quantitative comparison of the three three-phase estimators that were designed: the static one, the “lagged” dynamic one, and the “averaged” dynamic one. It is clear that including some form of time information into the X matrix greatly increases the amount of variance that can be explained on the quality data, and using averaged measurements (MATP-PLS estimator) appears to be better than using lagged measurements (LTP-PLS estimator). Note that the amount of variance captured in the Y matrix during Phase 3 is relatively small for all estimators and both quality variables. This is due to the fact that, when the end of the batch is approaching, the process measurement profiles flatten considerably and the signal-to-noise ratio decreases, making the process variables much less effective predictors of the quality variables. It is also interesting to note that in the LTP-PLS estimator the variance captured in the Y matrix increases considerably in Phases 2 and 3 with respect to the TP-PLS model, despite a smaller number of retained LVs and a larger number of process variables, which causes the captured variance of the X matrix to decrease. This indicates that the lagged measurements do bring “new” valuable information for the prediction of quality, information that was not present in the original X matrix; although this new information contained in the “lagged” XL matrix cannot be captured to a large extent, it is nevertheless much more predictive of the quality matrix.


Table 4.2 Explained variance of the TP-PLS, LTP-PLS and MATP-PLS estimators on the process and quality variables for both acidity number and viscosity (calibration dataset).

           TP-PLS model                LTP-PLS model               MATP-PLS model
       NA estimation  μ estimation  NA estimation  μ estimation  NA estimation  μ estimation
Phase  on X   on Y    on X   on Y   on X   on Y    on X   on Y   on X   on Y    on X   on Y
       (%)    (%)     (%)    (%)    (%)    (%)     (%)    (%)    (%)    (%)     (%)    (%)
1      62.00  88.50   63.74  86.57  66.12  96.47   67.47  93.14  70.77  95.56   71.74  94.52
2      67.38  82.97   67.47  78.46  57.11  88.92   57.14  83.66  67.42  91.13   68.63  85.04
3      73.78  59.84   72.56  52.93  61.68  67.47   61.01  55.59  74.21  72.26   75.57  61.90

During each phase of the generic validation batch i, the estimation accuracy on the quality variable q can be evaluated in terms of the mean relative prediction error MRPE_{i,q}:

$$\mathrm{MRPE}_{i,q} = \frac{\sum_{h=1}^{n_{\rm sample}} \left[ \dfrac{\sqrt{\left( y_{i,q,h} - \hat{y}_{i,q,h} \right)^2}}{y_{i,q,h}} \right]}{n_{\rm sample}} \cdot 100 \, , \qquad (4.11)$$

where y_{i,q,h} is the (measured) value of quality variable q at the hth sampling instant of that phase, ^ indicates an estimated value, and n_sample is the total number of quality samples in the phase. This error can be averaged over all the validation batches, to get an MRPE value for each estimated quality variable during any of the estimation phases.
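Eq. (4.11) amounts to the mean absolute relative error expressed as a percentage; a minimal implementation:

```python
import numpy as np

def mrpe(y_meas, y_est):
    """Mean relative prediction error of Eq. (4.11), in percent, for the
    quality samples of one phase of one validation batch."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_est = np.asarray(y_est, dtype=float)
    return np.mean(np.abs(y_meas - y_est) / y_meas) * 100.0

# Two hypothetical quality samples, each off by 10% of the measured value
assert abs(mrpe([10.0, 20.0], [9.0, 22.0]) - 10.0) < 1e-9
```

Averaging the per-batch values over the validation batches gives the phase-wise MRPE reported in Figure 4.8.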

Figure 4.8 Comparison between the estimation accuracy of the three-phase PLS models in terms of the average mean relative prediction errors (MRPE) on the validation datasets for (a) the acidity number and (b) the viscosity. The dashed lines represent the laboratory analysis accuracy.

In Figure 4.8, the MRPE on the acidity number and on the viscosity for the three soft sensors is shown. Although the explained variance on the viscosity during Phase 3 is lower than in the other phases, the predictions are very accurate. It can be seen that, although all three soft


sensors provide an estimation accuracy generally within that of the laboratory analysis, the MATP-PLS model shows a superior overall performance.

4.3.1 Reliability of the estimations

The estimations of the quality indices cannot be trusted blindly, because occasionally the soft sensor may provide wrong estimations. Using diagrams similar to those reported in Figures 4.3a and 4.3b, the reliability of an estimate provided by any of the estimators can be assessed online during each estimation phase. Indeed, the reliability of the estimation can be evaluated by comparing the instantaneous values of the scores, of T2 and of SPE with the respective confidence limits. If all the observed statistical indices are within the respective 95% limits, the estimation is considered to be reliable (Figure 4.9) at the 5% significance level.
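The online reliability check can be sketched as follows; for simplicity the sketch uses a PCA model and empirical 95th-percentile limits instead of the theoretical confidence limits used in the thesis, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
X_ref = rng.normal(size=(200, 6))
X_ref -= X_ref.mean(axis=0)                 # mean-centered reference data
U, s, Vt = np.linalg.svd(X_ref, full_matrices=False)
P = Vt[:2].T                                # loadings of the first 2 LVs
lam = (s[:2] ** 2) / (len(X_ref) - 1)       # score variances

def t2_spe(x):
    t = x @ P                               # scores of the new sample
    resid = x - t @ P.T                     # part not explained by the model
    return np.sum(t ** 2 / lam), np.sum(resid ** 2)

stats = np.array([t2_spe(x) for x in X_ref])
t2_lim, spe_lim = np.percentile(stats, 95, axis=0)   # empirical 95% limits

def reliable(x):
    """Trust the estimate only if both statistics are within their limits."""
    t2, spe = t2_spe(x)
    return bool(t2 <= t2_lim and spe <= spe_lim)

frac = np.mean([reliable(x) for x in X_ref])
assert frac > 0.85                          # most reference samples pass
assert not reliable(10 * np.ones(6))        # a gross outlier raises an alarm
```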

Figure 4.9 SPE residuals control chart with 95% confidence limits for the online assessment of the reliability of the viscosity estimation using the MATP-PLS model in validation batch #4.

On the contrary, if one of the statistical indices exceeds the respective limit, the estimation loses reliability; the farther the statistics are from the limits, the lower the reliability. For example, the estimated viscosity profile during validation batch #5 is shown in Figure 4.10a. While the batch is being run, the quality measurements are of course not available to assess the reliability of the estimation. However, by complementing the estimation results (Figure 4.10a, full line) with an SPE-residuals plot (Figure 4.10b), one can detect online when the estimated quality values are not reliable (a T2 plot can be used additionally, providing similar results).


Chapter 4


Figure 4.10 Online assessment of the reliability of viscosity prediction using the MATP-PLS estimator in validation batch #5: (a) viscosity profile and (b) SPE residuals control chart with 95% confidence limits.

This example shows that, at the beginning of batch #5 and for the entire Phase 1 and Phase 2, the estimations are reliable. Instead, during Phase 3 the estimator starts to fail around time instant #5450, providing unreliable estimations. However, the alarm on the SPE residuals, which exceed the confidence limit (dashed line in Figure 4.10b), points out the lack of reliability of the estimation.

4.3.2 Diagnosis of the soft sensor faults

When an alarm on the Hotelling statistic T2 or on the SPE residual statistic indicates an unreliable estimation, an in-depth analysis can be done to understand why the soft sensor fails to give reliable estimations. The causes responsible for the mistaken estimations can be found by inspecting the contributions of each variable to the relevant statistic. Since the T2 statistic and the SPE residuals are cumulative values to which each process variable contributes, the variables that give the highest contribution to the alarms are considered to be the main candidates to disclose the root cause of the soft sensor malfunctioning. Accordingly, to get information about a soft sensor fault, the contribution plots can be inspected. In this way, the variables that are most affected by the root cause of the malfunctioning can be identified. First, the contributions to the statistic exceeding its limit have to be studied, to identify which variable contributes most to it. Secondly, the time trajectory of the variable contribution should be compared with the respective 95% confidence limits, fixed by the study of the variance of the reference dataset, to identify the time instant when the cause of the soft sensor fault appears.
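A hedged sketch of such a contribution analysis (function and variable names are hypothetical; the actual limits come from the variance of the reference dataset and are not reproduced here): since the SPE is the sum of the squared residuals of the individual variables, the per-variable contributions can be computed and ranked as follows:

```python
import numpy as np

def spe_contributions(x_new, P):
    """Per-variable contributions to the SPE residual.

    The contribution of variable j is its squared model residual,
    so the J contributions sum exactly to the SPE value.
    """
    residual = x_new - (x_new @ P) @ P.T
    return residual ** 2              # one contribution per process variable

def rank_contributions(x_new, P):
    """Variable indices sorted by decreasing contribution to the SPE."""
    c = spe_contributions(x_new, P)
    return np.argsort(c)[::-1]
```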



Figure 4.11 Online identification of the causes of unreliability of the viscosity prediction using the MATP-PLS estimator in validation batch #5: (a) instantaneous contribution of all the variables to the SPE residual, with the respective 95% confidence intervals, and (b) contribution of variable #14 to the SPE residual over time.

For instance, in the case of batch #5, where the soft sensor gives reliable and very accurate estimations during Phases 1 and 2 while the estimation loses reliability during Phase 3 (around time instant #5450), variable #14 shows a pronounced contribution cE#5,14 to the residual compared to its limit and to the other variables (Figure 4.11a). On a relative basis, variable #14 is the one with the highest relative contribution when compared with the respective limit cE14(5%), which is a function of the current estimation phase and is kept constant within it. The same result is shown in Figure 4.11b, where this single variable is monitored in time, and where the time instant at which the cause of the malfunctioning appears is detectable.

4.4 Soft sensor for estimation of quality in resin B

In Chapter 2 it is highlighted that the production of resin B is carried out through two different stages:
• Stage 1, which produces a pre-polymer with loosely specified characteristics;
• Stage 2, which completes resin B, giving the desired end-point quality to the product.
Stage 1 has a short duration and only a few quality samples are available. For this reason, it is important to determine when the pre-polymer is in-spec and when to stop Stage 1. However, it is not possible to build a relevant model for the estimation of the quality in Stage 1, due to the scarce number of available quality lab assays. During Stage 2 the pre-polymer is transformed into the end product. Interventions on the recipe are carried out by the operators during this stage in response to the quality measurements coming from the lab. A soft sensor estimating in real time the product quality from process measurements was designed.


In the next section, the quality estimator for resin B is presented, and its performance is discussed.

4.4.1 Estimation of the quality indicators

As also done for resin A, two moving-average PLS regression models were designed, one for the estimation of the acidity number and one for the estimation of the viscosity of the reacting mass. The total number of batches available for this study was 36 (19 months of operating effort in the plant facility). Of these batches, 27 were designated as the calibration dataset; the remaining 9 batches constituted the validation dataset. Reference data were collected in a 3D matrix X (I×J×Ki), where J = 19 is the number of process measurements that were eventually retained. Also in this case batch alignment proved unsuccessful. Therefore, the process variable trajectories being quite dissimilar in this stage, the three-way process data matrices were variable-wise unfolded. Moreover, VWU preserves the nonlinearity between the predictor space and the response space (Kourti, 2003), and is preferred when the correlation structure of a process is roughly constant (Camacho et al., 2008a).
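For illustration only (names are hypothetical), variable-wise unfolding can be sketched as stacking the (Ki×J) matrices of the individual batches on top of one another, which is why batches of uneven length need no alignment:

```python
import numpy as np

def variable_wise_unfold(batches):
    """Variable-wise unfolding (VWU) of batch data.

    batches : list of I arrays, each of shape (K_i, J) -- one matrix of
              the J process measurements per batch; the batch lengths
              K_i may differ.

    Returns a (sum_i K_i, J) matrix in which every row is one time
    instant of one batch, stacked batch after batch.
    """
    return np.vstack(batches)
```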


Figure 4.12 Stage 2 scores plane in the first latent variable for the calibration dataset (reference is made to the time instants when viscosity measurements are available). The squares indicate how a single batch within the dataset projects onto this plane as time progresses from the beginning [1] to the end [15] of the stage. Dashed lines indicate the approximate locations of the clusters.

To compensate for these drawbacks, the same approach used in the quality estimation of resin A was adopted, i.e. the production stage was split into different estimation phases, and distinct PLS submodels were designed for each estimation phase. To determine the number of estimation phases, a simple approach proved satisfactory: plotting the X-scores vs. the Y-scores on the first latent variable for the whole calibration dataset (Figure 4.12) clearly shows that two clusters are present in the score plane, each cluster representing an estimation phase. Therefore, two submodels were


built to estimate NA (or μ) within Stage 2. Note that cluster analysis (Lu and Gao, 2005a; Beaver et al., 2007) could have been used for an automatic detection of the clusters in the scores space. However, it should also be noted that the number of clusters must be kept as small as possible, because if too few quality measurements are available within a cluster, it may be impossible to design the relevant PLS submodel. We observed that “time” is really not a good indicator to assess phase switching in this process. Run-to-run variability is extremely large (for example, Stage 2 length ranges from 27.5 to 48.9 hours), and the switching time shows a large variability, too. Therefore, submodel switching was linked not to time, but to events (as in the previous case study): there are certain processing events that do occur in all batches and change the correlation structure between the variables, although they occur at a different time from one batch to another (a similar approach has been used recently by Doan and Srinivasan, 2008). The occurrence of these events (which can be easily detected online) dictates phase switching. We believe that this approach is more general than what could be obtained if time were used to designate submodel switching. We found that the switching event was the same both for the NA-model and for the μ-model, and is related to a change of pressure in the reactor that is part of the production recipe during Stage 2. To attenuate the effect of measurement and process noise, and to provide the PLS model (which is inherently static) with “memory”, the moving-window approach was used. The process measurements included into the X matrix were averaged over a moving time window of 900 past time instants (7.5 h), this width having been determined in such a way as to minimize the mean relative prediction error on the validation dataset.
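The moving-window averaging just described can be sketched as follows (a simplified illustration: the 900-sample width is the one reported in the text, while the truncation of the window at the start of the batch and all names are assumptions of this sketch):

```python
import numpy as np

def moving_average_predictors(X, window=900):
    """Average each process variable over a trailing window of past
    time instants before feeding it to the (static) PLS submodel.

    X : (K, J) matrix of process measurements for one batch.
    Returns a matrix of the same shape in which row k is the mean of
    rows max(0, k-window+1) .. k (window truncated at batch start).
    """
    K, _ = X.shape
    X_ma = np.empty_like(X, dtype=float)
    for k in range(K):
        lo = max(0, k - window + 1)
        X_ma[k] = X[lo:k + 1].mean(axis=0)
    return X_ma
```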
Not only did this provide a significant smoothing of the estimated quality profiles, but it also increased the amount of predictive information included in the X matrix, which made the quality estimation more accurate. As observed also by Ku et al. (1995) in a different context, cross-validation proved ineffective for the determination of the number of latent variables (LVs) to be retained in the submodels. Therefore, this number was determined by minimizing the estimation error on the validation dataset. As far as the estimation of NA is concerned, 6 and 3 LVs were used for the Phase 1 submodel and the Phase 2 submodel, respectively; for the estimation of μ, 2 LVs were used during Phase 1, and 3 during Phase 2. Typical estimation results in a validation batch are shown in Figure 4.13, where quality estimations are compared to lab measurements for both the acidity number (Figure 4.13a) and the viscosity (Figure 4.13b). It can be seen that the estimated profiles of the quality indicators are smooth (noise-free), the estimation accuracy is within the accuracy of the laboratory measurements, and the estimation frequency is much higher than that of the laboratory measurements. These results are then displayed in an “industrial” monitoring chart (Figure 4.14), where quality estimations (solid line) are compared to lab measurements (dots). This is a further proof that the estimations compare well to the actual measurements, and indeed can


be used as surrogate measurements to guide the operators throughout the application of the processing recipe.


Figure 4.13 Comparison of estimated vs. measured quality variables: (a) acidity number and (b) viscosity in a typical validation batch.

acidity number

Figure 4.14 Comparison between lab measurements (circles with measurement uncertainty) and realtime estimation (solid line) of the resin quality in an industrial monitoring chart (validation batch) during Stage 2. Acidity number is reported as the abscissa (decreasing values from left to right), and viscosity as the ordinate. Non-standard units are used. The measured values of acidity number and viscosity should always fall within the bounds (broken lines). Time increases (nonlinearly) from the lower-left corner to the upper-right one.


Table 4.3 shows the summary of the results for the bi-phase PLS model (BP-PLS) and the moving-average BP-PLS model (MABP-PLS). The improvement of performance with the inclusion of dynamic information is apparent, and the accuracy of the online estimation proves to be close to that of the lab assays. It should be highlighted that, although the average relative prediction error on the acidity number of the validation batches during Phase 2 is ~20%, this error is tolerable, because the absolute value of NA is very small, and such a relative error corresponds to a very low absolute error.

Table 4.3 Comparison of the mean relative estimation errors (%) of different models for the quality estimation during Stage 2 of validation batches for the production of resin B. The optimal number of retained latent variables is shown for every model.

Model        Phase 1 NA (%)   Phase 1 μ (%)   Phase 2 NA (%)   Phase 2 μ (%)
BP-PLS       17.2 (4 LV)      10.6 (2 LV)     30.0 (5 LV)      14.1 (2 LV)
MABP-PLS     15.3 (2 LV)      11.1 (4 LV)     20.1 (2 LV)      12.3 (2 LV)

Moreover, it should be remarked that lab measurements are spaced (roughly) by 2 h, whereas quality estimations are made available at the same frequency as process measurements (about two per minute). Therefore, the recipe adjustments can be carried out much more promptly if the soft sensor is employed in real time, the chances that product quality drifts outside the acceptable bounds are minimized, and the length of the batch can be shortened. The actual implementation of this soft sensor also allows a significant reduction in the number of samples to be taken and analyzed during a batch, which contributes to cutting the lab-related expenses and allows the operators to be redirected to more qualifying duties. As for the real time implementation of the soft sensor, online detection of the switching instant from available process measurements was a key issue to guarantee a good performance of the sensor. Standard digital filters were used to protect the soft sensor from measurement noise and spurious events that might disrupt its performance by erroneously triggering a phase switch.

4.5 Concluding remarks

Partial least squares regression has proved to be a reliable tool for the online estimation of the product quality properties in industrial batch polymerization processes. The process under study (i.e. the manufacturing of resins) was characterized by a large number of available process measurements, uneven batch duration, a scarce number of quality variable measurements with uneven sampling of (and lag on) these variables, and a complex and almost entirely non-reproducible sequence of processing steps. It was shown that the product quality can be estimated in real time from the available process measurements with an accuracy similar to


that of the lab instrumentation, but at a much higher frequency, with no delay, and with no need for dedicated personnel. The frequency at which the quality estimations are made available is 2 min⁻¹, i.e. 240 times faster than the lab measurement frequency. To compensate for the nonlinear nature of the input/output mapping and the changing correlation structure between variables, a segmentation of the batches into a limited number of estimation phases was carried out by highlighting different clusters of score points in the scores plot of the reference dataset. Within each of these phases, linear PLS submodels were shown to provide accurate quality estimations. Switching from one submodel to another was triggered by clearly detectable landmark events occurring in the process. Inclusion of time information into the process data matrix was shown to substantially improve the estimation accuracy. Namely, augmenting the process measurement matrix with lagged measurements dampened most of the noise on the estimated values of the quality variables. Furthermore, endowing the soft sensor with a “memory” through a moving window approach was highly beneficial to increase the estimation accuracy, without introducing any significant complication in the structure of the soft sensor. Averaging the process measurement values on a moving window of fixed length provided valuable information on the batch evolution that proved to be useful both to suppress measurement noise and to attenuate process noise, especially during the phases with a high degree of inter-batch variability (i.e. when corrections and additions of fresh raw material take place). One of the advantages of the resulting moving-average multiphase PLS estimator is that it is very easy to implement, because it does not require any modification of the structure of the PLS algorithm.
Furthermore, using averaged measurements represents an easy way to handle noise spikes or temporarily missing values of the process measurements. However, care must be taken in selecting the length of the moving window, because too wide a window may delay the appearance of out-of-threshold values in the T2 or SPE residuals control charts. Therefore, the product quality can be estimated in real time and, to compensate for quality drifts, adjustments to the production recipe can be carried out very promptly. This minimizes the risk of obtaining off-spec products and reduces the overall processing time. Furthermore, the number of product samples that need to be taken and analyzed in the lab can be drastically reduced. Indeed, product samples can be taken only when the product is deemed to be close to specification, which contributes to cutting the lab-related expenses and allows the operators to be redirected to more qualifying duties. In summary, realtime knowledge of product quality can significantly improve the operation of a batch and cut the expenses related to sample handling and analysis.

Chapter 5

Realtime prediction of batch length

Typically, a monitoring system in the production of specialty chemicals has to tackle the challenging issue of understanding the “maturity” of a product in real time. In fact, for several specialty productions it is not possible to know in advance either the total batch length, or the length of any processing stage within the batch, since the batch duration and the stage duration are determined by several different causes (such as variable quality of the raw materials, uneven quality and quantity of midcourse corrections, delayed timing of the operations on the plant), most of which cannot be known in advance. On the other hand, the realtime estimation of the total length of the batch would be very useful for production planning, scheduling of equipment use, as well as to coordinate the operating labour resources (Marjanovic et al., 2006). In this Chapter, the realtime prediction of the batch length and of the duration of the production stages within a batch is addressed for the case study of the production of resins. Multivariate statistical techniques based on the projection onto latent structures (Wold et al., 2001) are used for this purpose. Reference is made to the fed-batch process where resins A and B are produced. It is shown that, by appropriately tailoring some existing techniques, information can be extracted from available process measurements that can significantly improve the overall performance of the process, offering an insight into the time development of a batch production.

5.1 Design of an evolving PLS model for the prediction of batch length

In batch processes the number of quality measurements within a processing stage is often too small to allow designing a PLS model for the online estimation of the product quality within that stage. To monitor the evolution of the batch and of the respective stages, the batch length τ or the length of any stage can be predicted instead. The possibility of knowing the value of τ in advance would allow the operators to perform timely interventions on the plant and to reduce


Portions of this Chapter have been published in Facco et al. (2008a) and in Faggian et al. (2009).


the number of samples needed for laboratory analysis, because samples would only be taken starting from the time when the stage is expected to terminate. Thus, if the performance of the estimator is adequate, only one or two samples may be sufficient to detect the stage termination time.

Figure 5.1 Scheme of the evolving PLS model procedure to perform the online prediction of the batch and stage length.

By checking the process variable profiles, it was noted that, whatever product was processed, during the time evolution of a single stage the profile of each process variable displays similar trends in all batches, and only the stage length seems to discriminate one batch from another one. Therefore, because a certain degree of similarity was apparent among the early stages of


all the batches, a multi-way PLS model (Nomikos and MacGregor, 1994) using BWU was developed to provide realtime estimation of the stage length. Alignment techniques based on the indicator variable approach (Kourti, 2003; Ündey et al., 2003) proved unsuccessful for synchronizing the process variable trajectories. This is probably due to the fact that the process variable trajectories are correlated to time in a highly nonlinear way. Therefore, a simpler approach was taken: the values of process measurements that had been collected at a time exceeding a threshold value t* were simply disregarded. As for the number of process measurements to be included into the X matrix, an engineering analysis suggested discarding a small subset of the available measurements, i.e. those measurements that either were expected to provide no contribution to the batch dynamics or had a markedly non-smooth profile throughout the batch. Following these indications, constant set-points and on-off variables were eliminated from the dataset. This resulted in X being unfolded into an (I×J·k) two-way matrix XBWU, where J is the number of process measurements that were eventually retained, and k is the number of time instants used to estimate τ, with k = 1,…,τ* and τ* the number of samples corresponding to the time horizon t*. The response matrix Y reduced to a column vector containing the lengths of the I batches. Before further processing, the XBWU and Y matrices were auto-scaled. The realtime estimation of τ was accomplished by designing a set of time-evolving PLS models to be used within each batch (Wold et al., 1996; Westerhuis et al., 1998; Louwerse and Smilde, 2000; Ramaker et al., 2005). Each model refers to the time instant k ≤ τ* at which a process measurement becomes available, and uses the process variable values from time instant 1 up to time instant k to estimate τ, as represented in Figure 5.1. Therefore, the dimension of XBWU increases as time progresses.
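The construction of the evolving predictor can be sketched as follows (illustrative names): at time instant k, the measurements of a batch collected from instant 1 up to k are batch-wise unfolded into a single row, so the regressor row grows to length J·k:

```python
import numpy as np

def evolving_bwu_row(x_profile, k):
    """Predictor row of the evolving PLS model at time instant k.

    x_profile : (tau_star, J) array of process measurements of one
                batch over the estimation window, with k <= tau_star.
    Returns the measurements from instant 1 up to k, batch-wise
    unfolded into a single row of length J*k.
    """
    return x_profile[:k].reshape(1, -1)
```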
However, note that this has a negligible effect on the calculation time. The goodness of prediction is checked through the time-averaged absolute error TAAEi of the stage length prediction in a batch i over the whole estimation window t*, which is defined as:

TAAEi = (1/t*) ∫₀^t* |εi(t)| dt ,    (5.1)

where:

εi(t) = τ̂i(t) − τi    (5.2)

is the instantaneous error of estimation of the stage length in batch i, τ̂i(t) is the value of the stage length in the same batch as estimated at time t (i.e., at time instant k), and τi is the actual length of the stage in the same batch. The integral in Eq. (5.1) is evaluated with a finite difference approximation. Further averaging the value of TAAEi over all the validation batches provides the value of the overall average absolute error AAE for the whole validation dataset.
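A discrete counterpart of the definition above can be sketched as follows (illustrative names; with uniformly spaced samples the integral reduces to the mean absolute prediction error over the estimation window):

```python
import numpy as np

def taae(tau_hat, tau_actual, dt=1.0):
    """Time-averaged absolute error of the stage-length prediction.

    tau_hat    : sequence of stage-length estimates over the window
    tau_actual : true stage length of the batch
    dt         : sampling interval (uniform)
    """
    eps = np.abs(np.asarray(tau_hat, dtype=float) - tau_actual)
    return eps.sum() * dt / (len(eps) * dt)   # equals eps.mean()
```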


The number of latent variables to be retained in the evolving PLS model can be determined by minimizing the value of AAE.

5.2 Prediction of batch length in the production of resin B

In the production of resin B, the total duration of a batch is very variable (from 50 to 80 h) and seemingly unpredictable, as are the lengths of Stage 1 and Stage 2. Therefore, it is hard for the management to appropriately schedule the use of equipment when several batches are to be processed in series or in parallel. Evolving PLS models were built to predict the duration of Stages 1 and 2. In particular, I = 27 batches were used as the reference dataset, while 9 batches were used for the validation step. Finally, J = 19 process measurements were selected for the purpose of stage and batch duration prediction.

5.2.1 Prediction of Stage 1 length

The length of Stage 1 in the available dataset ranged between 14.2 h and 26.1 h, i.e. the length variability (~12 h) was about 1.5 times the length of an operator’s shift window (8 h). The threshold length was set equal to the shortest Stage 1 length among all the available batches, namely t* = 14.2 h (i.e., τ* = 1700 time instants). One latent variable was retained in the model to minimize AAE. This single latent variable was not able to capture much of the variability in the XBWU space of the calibration dataset. In fact, for any batch of the calibration dataset only 23 to 26% of the XBWU variance was explained by the first latent variable, which indicates that only a small fraction of the information embedded in the measured process variable profiles is actually correlated with the length of the stage. Correspondingly, about 50 to 70% of the variance in Y was explained.

Nevertheless, the length prediction was quite satisfactory, because the value of AAE was calculated as 196 time instants (~1.6 h), i.e. ~8% of the average length of Stage 1, and ~14% of the variability in the length. Figure 5.2 shows that, with reference to the average of the nine validation batches, the batch-averaged instantaneous absolute error of estimation is relatively low (~2.1 h) at the very beginning of the operation; it soon decreases down to ~1.8 h; then, starting from k ≅ 600 time instants, it further decreases steadily and reaches a minimum of ~1.3 h at the end of the estimation window.


Figure 5.2 Resin B, Stage 1: batch-averaged instantaneous absolute error of Stage 1 length estimation in the validation dataset as a function of time within the length of the estimation window.

This is an indication that it is the information progressively collected during the evolution of the batch that proves useful for the estimation of τ. This issue is further clarified in Figure 5.3, which refers to a single validation batch. When incremental information is used to build matrix XBWU (i.e., when the column dimension of XBWU grows with time; evolving model), the estimation of τ is smooth and steadily improves after 600 time instants. However, if only instantaneous information is used to build the predictor matrix Xk (i.e., at time k, Xk is only made with measurements taken at k; local model), the estimation of τ is much more erratic; it would be hard to have the process operators trust such an estimation (note that the actual length, which is also indicated in Figure 5.3, is obviously not known when the batch is being run). The information of Figure 5.3 can be complemented online with the plots of the Hotelling T2 and squared prediction error statistics, which would provide an indication on whether the estimation is reliable or not. To appreciate how variable the validation results are, Table 5.1 provides the time-averaged results for each of the validation batches. Note that the overall AAE of the validation dataset can be calculated as the average of the curve shown in Figure 5.2, or as the average of the data reported in Table 5.1. From a practical perspective, the results shown in Figure 5.3 would be implemented in a slightly different way: the projected time of the day at which Stage 1 is expected to terminate would be shown on the operators’ display at selected time instants. About 1.6 h before the stage is expected to terminate, the operators can take one product sample and send it to the lab for analysis. Thus, the number of product samples that need to be analyzed can be minimized, which reduces the operator-related costs. Furthermore, it is possible to know in advance


whether the sample should be taken during the current shift or during the next one, which has a favourable impact on the workload organization.

Table 5.1 Resin B, Stage 1: time-averaged absolute estimation error (TAAEi) of Stage 1 length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       17.4                2.2
2       16.1                1.2
3       16.9                0.3
4       16.4                0.8
5       18.9                0.6
6       22.6                0.6
7       24.1                4.8
8       21.3                3.2
9       23.3                1.1


Figure 5.3 Resin B, Stage 1: time evolution of the estimated length of Stage 1 in one validation batch for two different arrangements of the process data matrix X. Broken line: incremental information is used to build XBWU (evolving model); solid line: only instantaneous information is used to build the matrix Xk (local models). The actual length of the stage is also indicated (thin solid line), although it can only be known at the end of the stage.

It is interesting to note that only a subset of the variables included in the XBWU matrix provides a significant contribution to the estimation of τ. To appreciate this, the index of variable importance in the projection method (VIP; Chong and Jun, 2005) can be calculated for each process variable at each time instant. The results of this “dynamic” VIP analysis are reported in Figure 5.4. Process variables with VIP > 1 are considered “important” for the estimation of τ.
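As an illustration, the VIP index can be computed from the PLS weights, scores, and Y-loadings with the standard formula (a sketch with hypothetical names; the “dynamic” analysis of Figure 5.4 simply repeats this computation for the evolving model at each time instant):

```python
import numpy as np

def vip(W, T, q):
    """Variable importance in the projection for a PLS model with A
    latent variables (cf. Chong and Jun, 2005).

    W : (J, A) X-weight matrix
    T : (N, A) X-score matrix
    q : (A,) Y-loading vector (single response)

    The importance of variable j weighs its normalized squared weights
    by the amount of Y sum-of-squares each latent variable explains.
    """
    J, A = W.shape
    ss_y = (np.ravel(q) ** 2) * np.sum(T ** 2, axis=0)  # Y-SS per LV
    W_norm = W / np.linalg.norm(W, axis=0)              # unit-norm columns
    return np.sqrt(J * (W_norm ** 2 @ ss_y) / ss_y.sum())
```

By construction the squared VIP values average to 1 over the J variables, which is why VIP > 1 is the customary threshold for an “important” variable.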



Figure 5.4 Resin B, Stage 1: profile of the VIP index over the Stage 1 estimation window length (1700 time instants) for (a) all process measurements (the numbers above the curves indicate the process variable number designation) and (b) the most “important” five measurements.

Five process variables show a value of VIP consistently larger than 1 throughout the whole estimation window. These variables are the reactor temperature (variable #16), the outlet temperature of the heating oil (variable #5), the setpoint for the inlet temperature of the heating oil (#15), the inlet temperature of the heating oil (#14), and the setpoint for the reactor temperature (#17). This suggests that the most important variables for the estimation of τ are those associated with the thermal behaviour of the reactor, which is consistent with what one would expect from engineering judgment. Furthermore, starting from k ≅ 400 time instants (i.e., 3.3 h), the VIP index keeps increasing for all of these variables, indicating that “thermal behaviour” and stage length get more and more correlated after that time. Then, after k ≅ 1500 time instants (~12.5 h), VIP decreases for these variables, meaning that the correlation between thermal behaviour and stage length starts vanishing after that time. This indicates that the “temperature footprint” of the process is almost completely traced about twelve hours after the process has been started, which is consistent with the fact that the profiles of most process variables start flattening after ~12 h from the beginning of the batch.

5.2.2 Prediction of Stage 2 length

Production planning is difficult for this process, because the length of a batch is not known a priori, and changes considerably from batch to batch (e.g., the range of variability of Stage 2 length is as large as three operator’s shift windows). Estimating the stage length in advance is very important to schedule the use of the equipment in the subsequent batches, and to plan the operating labour requirements. For these reasons, a soft sensor was designed to estimate the stage length in realtime. The approach was the same as the one used for the realtime prediction of Stage

1 length (evolving PLS model), and τ* = 3000 time instants was set (all symbols now refer to Stage 2). The results obtained were satisfactory, as shown in Figure 5.5a.

4 3 2

8

4

9

6

17 14

7 15

5 18 16 12

8 6 4

1 0

19 13 11 10

2 0

500 1000 1500 2000 2500 3000 time instant (from the beginning of Stage 2)

0

length of the estimation window (repeated)

Figure 5.5 Resin B, Stage 2: (a) batch-averaged instantaneous absolute error of τ estimation in the validation dataset as a function of time within the length of the estimation window (evolving model); and (b) profile of the VIP index over the estimation window length for all process measurements (the numbers above the curves indicate the process variable number designation).

Table 5.2 Resin B, Stage 2: time-averaged absolute estimation error (TAAEi) of Stage 2 length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       42.0                2.1
2       51.6                6.9
3       42.0                2.3
4       44.7                4.8
5       35.6                9.6
6       47.4                3.3
7       38.0                2.3
8       31.6                3.3
9       43.9                2.2

Figure 5.5b shows that no process variable is really “dominant” during Stage 2 as far as the stage length estimation is concerned. Almost all variables provide some kind of contribution to the estimation of τ, and the variables related to the “temperature footprint” of the reactor (e.g., variables #5, 14, 15, 16, and 17) are among the least important. On average, the estimation error on the validation dataset is larger than in Stage 1 (AAE = 490 time instants, i.e. ~4.1 h). However, this is only ~11 % of the average length of Stage 2, and ~20 % of the variability in the length, which is well below the length of an operator’s shift. Note that it takes only ~250 time instants (~2 h) to have a satisfactory estimation of the overall length of the stage; after ~1500 time instants from the beginning of


the stage, the average absolute estimation error further decreases by about 1 h. Table 5.2 provides time-averaged results for each of the validation batches.
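The error metrics reported in Table 5.2 can be reproduced with a few lines of code. This is only an illustrative sketch (function names are ours): TAAEi averages the absolute prediction error over the estimation window of a single batch, and the overall AAE averages the per-batch values:

```python
def taae(pred_lengths, actual_length):
    """Time-averaged absolute estimation error for one batch:
    mean of |prediction - actual| over the estimation window."""
    return sum(abs(p - actual_length) for p in pred_lengths) / len(pred_lengths)

def aae(all_preds, all_actuals):
    """Average absolute error over all validation batches:
    mean of the per-batch TAAE values."""
    return sum(taae(p, a) for p, a in zip(all_preds, all_actuals)) / len(all_actuals)

# Toy example: two batches, three prediction instants each (hours).
preds = [[40.0, 43.0, 42.0], [50.0, 52.0, 51.6]]
actuals = [42.0, 51.6]
print(aae(preds, actuals))
```

With the actual validation data, averaging the nine TAAEi values of Table 5.2 gives ~4.1 h, consistent with the AAE quoted in the text.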

5.3 Prediction of batch length in the production of resin A


An evolving PLS soft sensor for the prediction of the batch length was designed also in the case of resin A. The batches considered in this case are the same as the ones used in Chapter 4 to design the quality estimator: I=27 calibration batches were considered, while the number of validation batches is 5. The number of predictor variables is J=21. The total length of the batches ranges between 40 and 60 h. The prediction of the batch length is most interesting when accomplished in the first part of the batch, i.e. when the quality measurements are not available yet and it is not possible to design a soft sensor for the estimation of the quality indices. For this reason, the threshold value was set at t* = 8.3 h (i.e., τ* = 1000 time instants).

Figure 5.6 Resin A, start of the batch: batch-averaged instantaneous absolute error of total length prediction in the validation dataset as a function of time within the length of the estimation window.

One latent variable was retained in the model to minimize the AAE also in this case. Although a single LV is able to capture only a small fraction (~17 %) of the variability of XBWU, the prediction of the batch length is satisfactory, because the value of AAE is 3.23 h, i.e. ~6.5 % of the average length of a batch and ~16.7 % of the variability in the length. This error is much lower than one operator’s shift window length. Furthermore, although the batch-averaged instantaneous absolute error of estimation in the validation batches (Figure 5.6) is relatively high at the beginning of the operation (~6 h), it decreases down to ~1.1 h at the time


t*, because the information accumulated after 8 h of the batch becomes more correlated to the total processing time τ. Table 5.3 provides the time-averaged results for each of the validation batches.

Table 5.3 Resin A, start of the batch: time-averaged absolute estimation error (TAAEi) of the total batch length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       53.3                12.4
2       50.2                1.9
3       52.3                0.4
4       50.4                0.5
5       52.8                1.0

The time trajectories of the VIP index for each of the process variables highlight that the most relevant predictor variables are the reactor temperature, the temperature of the heating oil, and the top and bottom temperatures of the distillation column. Consequently, the most important variables for the prediction of the total batch length are associated with the thermal behaviour of the reactor and with the performance of the distillation column.

5.4 Concluding remarks

In this Chapter it was shown through an industrial case study how, by combining engineering judgment and mathematical modeling, multivariate statistical techniques can be exploited to assist the realtime monitoring of product quality and to deliver helpful information for an effective production planning in the semi-batch processing of specialty chemicals.

An evolving PLS modeling approach was exploited for the prediction of the duration of the batch. Namely, it was shown that, by incrementally using the information gathered during the evolution of the batch, a sound estimation of the length of the batch (or of any processing stage within the batch) can be obtained in realtime with an average error that is at most as large as 17 % of the inherent batch-to-batch variability. Such information is particularly useful in batch processing, as it makes it possible to schedule manual interventions, to optimize the manpower in terms of shifts and roles, to forecast the production time, and to plan the most convenient utilization of the plant equipment. The statistical analysis of the most significant process variables that contribute to determining the length of the batch confirmed that the initial heat-up stage is crucial for the development of the entire batch.

Chapter 6

Industrial implementation of a soft sensor prototype

This Chapter describes the “physical” implementation of a prototype of the soft sensors designed in the previous Chapters. In particular, it is explained how the virtual sensor technology is integrated into the supervision system of an industrial polymerization process for the production of resins. Three prototypes have been designed and implemented to work online: i) a three-phase moving-average PLS soft sensor for the realtime estimation of the quality indices of resin A; ii) a soft sensor for resin B that predicts the length of production Stage 1 in real time by means of an evolving PLS model; and iii) a soft sensor for resin B that estimates the quality indices of the resin through a bi-phase moving-average PLS model during production Stage 2.

6.1 Industrial supervision system

The supervision system adopted in the industrial production facility under study is Movicon™ 9.1, which follows a typical SCADA scheme (see §3.2). The central core of the supervision system interfaces to different modules through a client-server technology (Figure 6.1).

Figure 6.1 Structure of the communication system linking the plant, the server of the supervision system, and the communication interface that visualizes the plant status and manages the recipes and the required interventions.

The central core of the supervision system automatically manages the following items:


• visualization of the state of the equipment;
• alarms;
• recording of all the operations carried out in the production facility;
• collection and visualization of the process variables;
• interfaces for the communication with the operating personnel;
• collection and visualization of the laboratory measurements of the quality indices;
• communication with control devices.

The supervision system is based on a Visual Basic Sax platform, which supervises the “animation” of the clients, the correct functioning of the clients, and the communication with the online hardware sensors and with the controllers. Accordingly, the supervision system makes it possible to perform all the interventions required for the safety and the productivity of the manufacturing through OPCs (OLE, i.e. object linking and embedding, for process control) operating in real time. In fact, data are acquired both from the PLC controllers and from the hardware sensors to collect the process variables measured online. All the acquired data are registered in SQL databases that can be easily consulted by the operating personnel. Consequently, the operators can observe and modify the current production through a system of views and queries. Furthermore, a supervision server ensures the direct communication between the supervision system and both the PLCs and the regulators, while some client personal computers are present in the production facility. The supervision system interfaces to these clients through networking variables available on different levels. A local area network (LAN) with an Ethernet structure guarantees the communication between servers and clients and conveys all the process information to the recording system.

6.2 Implementation of the soft sensor

In Chapters 4 and 5 three different types of soft sensors have been designed for the polymerization processes for the production of both resin A and resin B. In particular, the following soft sensors were implemented:
• a three-phase moving-average PLS soft sensor (called Prototype A) for the online estimation of the quality indices, which delivers both the estimations of NA and μ and a measure of the reliability of the estimations for the entire duration of the batches for the production of resin A;
• an evolving PLS soft sensor (called Prototype B1) for the realtime prediction of the length of the operating Stage 1 of the production of resin B;
• a bi-phase moving-average PLS soft sensor (called Prototype B2) for the online estimation of the quality indices in Stage 2 of the production of resin B.


These virtual sensors have been working online at a prototype level in the production facility for the manufacturing of the abovementioned resins. These soft sensors are mathematical models based on PLS that regress the relevant characteristics (quality indices or batch/stage length) from the process variables that are available online (measured by hardware sensors). However, the “physical” implementation of a soft sensor requires more than the estimation or prediction model itself, because there is the need of:
• interfacing with the process;
• interfacing with the operators;
• acquiring the process variable measurements;
• delivering the output values, i.e. the estimation of the quality indices or the prediction of the batch/stage duration, together with the reliability of the estimations/predictions.

The models built for the online implementation refer to the entire dataset of the available past batches, including the set of the validation batches considered in Chapters 4 and 5.

6.2.1 Architecture of the soft sensor

The soft sensors for the online estimation of the quality indices and for the prediction of the batch/stage duration were developed in the Matlab™ computing environment (www.mathworks.com). Matlab™ is also provided with a specific package for the multivariate statistical tools, namely the PLS_Toolbox™ (www.eigenvectorresearch.com). The online implementation implies a complex series of operations, which are represented in Figure 6.2. This sequence goes through the following steps:
• the first time the supervision system interrogates the routine for a batch, the “memory” of the soft sensor is erased and the virtual sensor starts to process the data;
• the soft sensor acquires the array of the 34 process variables measured online, and stores them so as to create the “memory” of the soft sensor;
• some flags are applied to recognize both the current stage of the batch for the duration prediction, and the current phase for the estimation of the quality;
• the process variables are then properly selected and pre-treated. For example, the subset of the process variables most relevant to the purpose of quality estimation (or to the purpose of batch length prediction) is selected. Then, the relevant unfolding procedure is performed (BWU in the case of the length prediction and VWU in the case of the quality estimation), and the variables are scaled and mean-centred. When needed, the moving-average filter is applied to smooth the process and measurement noise, to dampen the effect of outliers, and to make the soft sensor less sensitive to short interruptions;
• at this point, data are ready to be treated by the soft sensors, and are fed into either the quality estimator or the length predictor;

110

Chapter 6

• the outputs of the online quality estimators (i.e. acidity number and viscosity) and of the realtime duration predictor (i.e. date and time of the end of the batch/stage) are provided to the supervision system with the respective reliability indices. After that, they can be displayed through the communication interface.
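The moving-average pre-treatment mentioned in the steps above can be sketched as a causal filter that is updated at every interrogation. This minimal Python analogue of the Matlab™ implementation (the class name and window size are illustrative, not taken from the actual code) returns the mean of the most recent raw samples:

```python
from collections import deque

class MovingAverageFilter:
    """Causal moving-average filter applied online to a process
    variable: at every interrogation the latest raw measurement enters
    a fixed-length window and the window mean is returned."""

    def __init__(self, window):
        self.buf = deque(maxlen=window)  # oldest sample drops out automatically

    def update(self, x):
        self.buf.append(x)
        return sum(self.buf) / len(self.buf)

# Smooth a noisy temperature reading with a window of 3 samples.
f = MovingAverageFilter(3)
print([f.update(x) for x in [10.0, 14.0, 12.0, 40.0]])  # -> [10.0, 12.0, 12.0, 22.0]
```

Note how the outlier-like last sample (40.0) is dampened to 22.0, which is the behaviour the pre-treatment step relies on.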


Figure 6.2 Scheme of the sequence of operations performed by the soft sensor implemented online in the batch polymerization process for the manufacturing of resins.

The soft sensors are embedded into the supervision system (see also Figure 6.1), which triggers the repetition of the sequence every 30 s.

6.2.2 Matlab™ codes of the soft sensors

The implemented soft sensors are built with Matlab™ codes. The “architecture” of both implemented soft sensors has the same structure, based on three Matlab™ codes (.m files) and the relevant models (.mat files). The codes are:
• an initialization code for erasing the “memory” of the soft sensor and for beginning the treatment of the data;


• a code for the management of both inputs and outputs, and for the administration of the alarms for the lack of reliability;
• a code performing the phase/stage detection and switching, the data pre-treatment (e.g., calculation of moving averages and lagged variables; variable selection; data unfolding), and the online estimation of the quality or the realtime prediction of the stage duration.

In the following sections, details are given on Prototype A and on Prototypes B1 and B2.

6.2.2.1 Prototype A

The soft sensor for resin A is a three-phase moving-average PLS virtual sensor for the online estimation of the quality indices throughout the entire duration of the batch. The soft sensor is constituted by the files:
• InizializzazioneModelloSIRCA.m;
• OnlineSensor1.m;
• SIRCAproject1v1.m;
• Modello.mat.

The file Modello.mat is a cell array that contains the three-phase moving-average PLS model. In particular, one model is present for each estimation phase and for each of the quality indices. The first code, InizializzazioneModelloSIRCA.m, is a function for the initialization of the soft sensor. It is called by the supervision system when a batch of resin A starts. This code aims at:
• erasing the “memory” of the soft sensor;
• giving the output istante.

The variable istante is a counter of the number of times the soft sensor is interrogated. This function is called at the beginning of the batch by the operating personnel through the “Start” button of the graphical interface of Figure 6.3. The soft sensor is then provided with some flags on the process variables to verify whether the initial state of the manufacturing system fits the conditions required for the soft sensor to begin working. Afterwards, the function OnlineSensor1.m is called. This code requires two inputs: the array of the 34 process variables measured online, and the value of istante. First of all, this code allows for the recording of the process variables (i.e. it starts the “memory” of the soft sensor). Then, the alarms of the estimation reliability are computed. In general, this function manages the inputs and outputs. The outputs are:
• instantaneous estimations of the acidity number and viscosity;
• instantaneous values of the Hotelling statistics with the respective limits for both the quality indices;


• instantaneous values of the SPE residuals with the respective limits for both the quality indices;
• reliability alarms for the estimations and the predictions, which provide an easy and fast representation of the Hotelling and residual statistics limits.
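The reliability alarm described above reduces to a simple check of both monitoring statistics against their confidence limits. The following sketch is only an illustration of that logic (the function name and numerical values are ours, not from the industrial code):

```python
def reliability_flag(t2, t2_limit, spe, spe_limit):
    """Return 1 when the estimation is deemed reliable, i.e. both the
    Hotelling T^2 and the SPE statistic are within their confidence
    limits, and 0 as soon as either statistic overshoots its limit."""
    return 1 if (t2 <= t2_limit and spe <= spe_limit) else 0

print(reliability_flag(5.2, 9.5, 0.8, 1.1))  # -> 1 (both within limits)
print(reliability_flag(5.2, 9.5, 1.4, 1.1))  # -> 0 (SPE overshoots)
```

A separate flag is computed for each quality index, mirroring the two reliability indices (one for NA and one for μ) mentioned below.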

Figure 6.3 Graphical user interface for the operating personnel to start the soft sensors when the resin A or B is manufactured.

The estimation is performed by calling the function SIRCAproject1v1.m, through the following steps:
• automatic detection of the current estimation phase;
• data pre-treatment (i.e. computing the moving-average variables, performing variable selection, and unfolding the input data);
• online estimation of the quality indices through the relevant model embedded into the file Modello.mat.

No output is available for the first 999 sampling instants (as described in Chapter 4), but when the variable istante reaches the value of 1000, the quality estimator starts to deliver the outputs, which are specifically:
• the quality indices, i.e. acidity number and viscosity;
• the Hotelling statistics and the residuals statistics of the abovementioned estimations, with the respective limits;
• the reliability of the estimations. The reliability index is a summary of the Hotelling statistics and the SPE residuals. There are two reliability indices, one for NA and one for μ. A reliability index assumes the value 1 if the estimation is deemed to be reliable, and the value 0 if it is deemed to be unreliable, because the actual value of at least one relevant statistic (i.e. Hotelling statistics or SPE residuals) overshoots the respective limit;
• the input matrix of the model, updated in the function records;
• the number of the current phase, identified through the flags imposed on the process variables.

The procedure to stop the interrogation of the model is operated by the plant personnel through the “Stop” button of the interface of Figure 6.3.

6.2.2.2 Prototypes B1 and B2

The soft sensor for resin B performs in series the realtime prediction of the Stage 1 duration in the first 1700 sampling instants with Prototype B1, and the online estimation of the quality during Stage 2 with Prototype B2. The soft sensor is constituted by the files:
• Inizializzazione.m;
• InOut.m;
• realtimeSensor.m;
• modelTau.mat;
• softsensor.mat.

The files Inizializzazione.m, InOut.m, and realtimeSensor.m are common to both the quality estimator and the duration predictor. The file Inizializzazione.m is the initialization procedure, started by the operators when the manufacturing of resin B begins. This function resets the “memory” of the soft sensor and detects the first instant for the interrogation of the models through the variable istante. No input is required by this function. Similarly to the case of resin A, the function InOut.m is responsible for the communication of both the estimator and the predictor with the supervision system. This function requires two inputs: the 34 process variables measured online, and the variable istante. The outputs are: during Stage 1, the prediction of the end point of the stage (date and time) and the reliability of the prediction; during Stage 2, the estimated acidity number and viscosity, and the reliability alarms together with the Hotelling and residual statistics and the respective limits. The function InOut.m calls the soft sensors, i.e.
the function realtimeSensor.m, which carries out:
• identification of, and switching between, the different production stages and the different estimation phases by means of some flags on the input process variables;
• pre-treatment of the incoming data (i.e. variable selection, BWU of the data for the duration predictor and VWU for the quality estimator, calculation of the moving average);
• prediction of the stage duration during Stage 1, interrogating the model modelTau.mat;


• estimation of the quality during the two phases of Stage 2, interrogating the model softsensor.mat.

The models of Prototypes B1 and B2 are recorded in the .mat files. The model of Prototype B1 is stored in the file modelTau.mat, while the file softsensor.mat is a cell array with the model of Prototype B2. It should be highlighted that the cell-array structure of the model of Prototype B2 (one model for each estimation phase and for each quality index) is sufficiently parsimonious, because the “physical” dimension of the file is less than 200 kB, while the dimension of the model of Prototype B1 (one model for every interrogation instant) can be nearly 2 GB when the instantaneous models are stored in different cell-array structures. Therefore, the implementation of Prototype B1 requires “shrinking” the model from both the structural and the algorithmic point of view. To reduce the computational burden, the rate of interrogation of the duration predictor was decreased by 10 times (1 interrogation every 5 min, instead of every 30 s). This implies that the number of models to be stored in the modelTau.mat file decreases by 10 times (from 1700 to 170). Further improvements can be achieved if the structure of the model is organized in a more profitable way. For example, the instantaneous models that constitute the evolving PLS model can be reorganized in such a way that they are stored in a single cell-array structure, where the model parameters of the different instantaneous models (e.g., the loading matrices, the confidence limits, etc.) are grouped in the same vector for all the models. Then, at a certain sampling instant, the parameters of the appropriate instantaneous model are extracted from the all-encompassing vector of the evolving PLS model, so as to use the correct model at the right moment. In this way the dimension of the file modelTau.mat can be reduced to about 30 MB.
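The storage reorganization suggested above can be illustrated in Python, with dictionaries standing in for the Matlab™ cell arrays (all names here are hypothetical): only one instantaneous model every tenth interrogation instant is kept, and at runtime the appropriate model is retrieved by rounding the current instant down to the nearest stored one:

```python
def build_model_store(instant_models, step=10):
    """Keep only one instantaneous model every `step` interrogation
    instants (e.g. one every 5 min instead of every 30 s), keyed by the
    first instant at which that model becomes valid."""
    return {k: m for k, m in instant_models.items() if k % step == 0}

def lookup_model(store, instant, step=10):
    """Retrieve the stored model appropriate for a given sampling
    instant by rounding down to the nearest stored key."""
    return store[(instant // step) * step]

# Toy evolving model: the "parameters" are just labels here.
models = {k: f"model@{k}" for k in range(0, 100)}
store = build_model_store(models, step=10)
print(len(store), lookup_model(store, 37))  # -> 10 model@30
```

The same idea applied to the 1700 instantaneous models of Prototype B1 yields the 170 stored models mentioned in the text.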

Chapter 7

Surface characterization through multiresolution and multivariate image analysis

In this Chapter¹ the issue of the characterization of the microscopic features of the surface of a high value-added product is faced using novel techniques for multiresolution and multivariate image analysis. In the case of a photolithography process for the production of integrated circuits, it is shown that, after applying a multiresolution filter to denoise an image of the product, it is possible to monitor the features of the product surface through multivariate statistical techniques. New multivariate techniques are proposed for the systematic monitoring of the roughness and of the surface shape. In particular, a two-level “nested” PCA model is used for surface roughness monitoring, while a new strategy based on “spatial moving windows” PCA is proposed to analyze the shape of the patterned surface. The proposed approach generates fast, meaningful and reliable information, identifying the abnormalities on the surface of a device and localizing the defects in a sensitive fashion.

7.1 Photolithography process and inspection tools

Integrated circuits (ICs) are recognized as some of the most complex manufactured products and some of the most versatile devices. The fabrication of an IC relies on a complex infrastructure of materials supply, waste treatment, logistics, and automation to support the entire process. Specifically, the semiconductor production technology develops through an extensive series of photographical, mechanical, and chemical steps in an extremely clean environment, achieved by ultra-precision engineered equipment (Helbert and Daou, 2001). Typical processing loops, which may recur several times, comprise some or all of the following phases (Figure 7.1a): oxidation; photoresist application; exposure to light; development of the resist; etching; and photoresist removal.

¹ Portions of this Chapter have been published in Facco et al. (2008b), Facco et al. (2008c) and Facco et al. (2009b).


Figure 7.1 (a) Simplified processing sequence for semiconductor manufacturing and (b) simplified graphical representation of the most important quality parameters of an edge.

In detail, the abovementioned procedure can be outlined as follows. First of all, the wafer, a thin crystalline slice cut from a semiconductor ingot (e.g., silicon), is heated to drive off the moisture from the surface, and cleaned. The wafer is then maintained in a high-temperature


environment, until an oxide layer grows on the substrate. After the addition of adhesion promoters, a thin (and as uniform as possible) layer of photoresist is applied by spin-coating, in a high-speed centrifugal whirling process. Essentially, the photoresist is a polymer mixed with light-sensitive compounds. Through light exposure, the desired pattern (often determined by a mask) can be impressed on the surface by illuminating certain portions of the resist selectively. If the light weakens the polymer, as in the so-called “positive” photoresist, the exposed resist becomes a chemically less stable aggregate and is more easily removed during the following stages. Conversely, a “negative” photoresist is strengthened by light and becomes resistant to solvents. The chemical change triggered by the light during the photolithography step allows the resist to be removed by a solution called developer. The resulting shape of the device surface should be the one shown in Figure 7.1a, alternating zones in which photoresist is present, the so-called edges, and zones (which will be indicated as valleys) in which the oxidized substrate is no longer protected and is completely free from the resist (Figure 7.1b). After the development, a hard baking is performed to give a stronger structure to the residual resist; only then, during the etching, the part of the surface that is not protected by the resist is engraved. A chemical agent (a liquid or a plasma) removes the oxide layer to prepare the surface for the following phases. Finally, the remaining resist is removed and the surface is ready for the diffusion of dopants on the part of the surface where the oxide barrier is not present. The doping induces the formation of ions that create regions with different electrical properties. During each production loop, it is crucial to meet stringent requirements in terms of quality uniformity and consistency.
At each stage, several inspections and measurements are performed, but they monitor only a few samples of different lots, and some pieces of the processing equipment. Since photolithography is performed several times on the same device and, even after it is completed, a defective device can still be reprocessed, it is common industrial practice to perform quality inspections after the photolithography step (the so-called after-development inspections). Usually, a CD-SEM is adopted to measure some significant features of the semiconductor surface. The SEM (scanning electron microscope) images are used mainly for metrology purposes. In other words, the common inspection tools measure the critical dimension (CD) or, in the most advanced instrumentation, the edge height, the side wall angle (SWA), or the line edge roughness (LER) (Figure 7.1b). Recently, the possibility of measuring and reducing the edge wall roughness using deep UV (ultra-violet) light scatterometry has been discussed (Yaakobovitz et al., 2007). However, in order to detect, distinguish and classify critical features of the manufactured device, more sensitive and reliable tools are required by the new generation of products (Guldi, 2004). For instance, there are a number of defects affecting the final product quality and performance (Figure 7.2) that


cannot be identified in terms of CD measurements, but which could be remedied if detected in a timely manner.

Figure 7.2 Ideal shape of an edge and possible microscopic defects on the edge shape and surface.

Thus, an automatic system for the frequent and accurate quality monitoring of a photolithographed surface would be highly attractive to increase the yield and the consistency of the fabrication program.

7.2 Image analysis through multiresolution and multivariate statistical techniques

Images are 2D maps summarizing the characteristics of a 3D scene. In this research, industrial SEM images representing the product surface after photolithography with positive photoresist have been used as a case study. These images are grey-scale functions of light intensities and can be used to extract meaningful information about the quality of the product, its regularity and conformity to the requirements, and the types and locations of defects. The scale of the images is such that 1 pixel corresponds to about 9.8 nm. In general, an image is a collection of well identified characteristics. As such, it is intrinsically a multivariate system, being a wide collection of pixels, where each pixel is highly correlated to its neighbors. In addition, in a surface image the apparent variability comprises both the actual roughness of the surface (which is an actual product feature) and the signal disturbance (which corresponds to noise to be removed). Thus, a number of tasks need to be taken into account by a monitoring system in order to “use” an image for quality control. Figure 7.3 illustrates the general architecture of the proposed monitoring system.


Figure 7.3 Sketch of the semiconductor monitoring system through (wavelet) image filtering and multivariate statistical techniques.

First of all, a reference model is defined by selecting a suitable reference image. As the quality problem involves different scales of resolution, a multiresolution approach is needed to filter the image and denoise the signal. Subsequently, multivariate statistical techniques are used to exploit the information content of the filtered image, to formulate the monitoring model, and to build the monitoring charts for the product quality inspection. Three quality features are described and monitored through the proposed monitoring system: the line edge roughness, the surface roughness, and the shape of an edge trans-section profile. In the following subsections, the main properties and the mathematical foundations of the multiresolution-multivariate monitoring system will be discussed.

7.2.1 Image multiresolution denoising

An image is always affected by disturbances, e.g., the random fluctuation of the pixel light intensity. In general, multivariate statistical techniques can discard the non-systematic part of a signal, distinguishing the meaningful variability from the random one. Unfortunately, in this case, the noise is somehow blurred with the roughness of the surface (which is a structural part of the device and defines the quality properties one is interested in). Therefore, a pre-treatment is needed on the image to remove the noise without discarding the structure of the surface roughness. The problem of the dual nature of the noise is faced following a multiresolution approach, which examines the different scales of the image through wavelet decomposition (details on wavelets can be found in: Kosanovich and Piovoso, 1997; and Addison, 2002). Specifically, a scale-dependent smoothing of the image is performed by subtracting the unwanted part of the noise. In fact, the smoother version i_M of the image i_o in the domain of the pixel space s ∈ ℜ² is the approximation at scale M of the original image:

iM(s) = Σ_{n=−∞}^{+∞} S_{M,n} φ_{M,n}(s) + Σ_{m=−∞}^{M} Σ_{n=−∞}^{+∞} T_{m,n}^{denoised} ψ_{m,n}(s) ,   (7.1)


Chapter 7

where the first term on the right-hand side is the summation of the products between the approximation coefficients S_{M,n} at the Mth scale and the selected father wavelet φ (lower frequencies of the signal), and the second term is the summation of the products between the denoised detail coefficients T_{m,n}^{denoised} and the mother wavelet ψ (higher frequencies of the signal). Some detail coefficients at the higher frequencies (scales m above a prescribed limit M1) are redundant and, as a consequence, can be removed:

T_{m,n}^{denoised} = 0   if m ≥ M1 ;   T_{m,n}^{denoised} = T_{m,n}   if m < M1 .   (7.2)
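The scale-dependent thresholding of Eqs. (7.1) and (7.2) can be sketched in code. The following is a minimal pure-Python illustration on a 1-D signal, using the Haar wavelet for brevity instead of the Daubechies 8 wavelet adopted in this work; the signal values and number of levels are hypothetical, and the signal length must be divisible by 2^levels.

```python
# Scale-dependent wavelet smoothing in the spirit of Eqs. (7.1)-(7.2):
# decompose, zero the detail coefficients at the finest scales (the
# redundant high-frequency scales of Eq. 7.2), and reconstruct.
# Haar wavelet, 1-D signal, for brevity; the thesis uses Daubechies 8 on 2-D images.

def haar_dwt(signal):
    """One level of the Haar transform: approximation and detail coefficients."""
    s = 2 ** -0.5
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of one Haar transform level."""
    s = 2 ** -0.5
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * s, (a - d) * s]
    return out

def denoise(signal, levels, m1):
    """Zero the detail coefficients of the m1 finest scales (here m = 0 is the
    finest scale, playing the role of the scales removed by Eq. 7.2)."""
    approx, details = signal, []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    for m in range(m1):
        details[m] = [0.0] * len(details[m])
    for detail in reversed(details):            # coarsest scale first
        approx = haar_idwt(approx, detail)
    return approx

noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]   # step signal + small noise
smoothed = denoise(noisy, levels=1, m1=1)           # pairwise means: noise removed
```

With `levels=1, m1=1`, each pair of pixels is replaced by its mean, i.e., the first-level approximation is retained and the finest detail is discarded.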

Only the lower resolutions are retained, because of their relevance to the purpose of roughness monitoring. Thus, after being decomposed through the wavelet transform into different scales of resolution, the image is reconstructed by merging together all the significant scales. Different types of wavelets were tested for the denoising of the photolithographed surface, and the Daubechies wavelet with 8 scaling coefficients was eventually selected. The use of the Daubechies wavelet is suggested in several studies (e.g., Salari and Ling, 1995), especially for segmentation and texture analysis problems. Indeed, the Daubechies 8 wavelet responds very well to the requirements discussed in Ruttiman et al. (1998), as it introduces very limited phase distortion, maintains a faithful localization in the spatial domain, and decorrelates the signal sensitively for both the smooth features and the discontinuities of the image. Once a reference wavelet has been identified, the issue is to find the best smoothing scale for roughness monitoring. We found that useful indications can be obtained by evaluating the correlation coefficients between the side-wall roughness lines at different light intensity levels. A procedure inspired by the work of Patsis et al. (2003) was followed to detect the side-wall roughness. The locus of minimum light intensity in the valleys (represented by the blue lines in Figure 7.4) is first recognized; then one moves upwards along either of the two edge walls (upper wall and lower wall²), detecting different light intensity levels at preset thresholds. In this way, the "topological lines" at light intensity thresholds of 0.5, 0.6, 0.7 and 0.8 are identified (respectively the green, yellow, red, and violet lines in Figure 7.4). Stated differently, these lines represent the pixel locations along an edge side wall where the light

² The definition of upper and lower walls simply refers to the order in which they appear in our images, i.e., for one edge the upper and lower walls are respectively the first and the second wall encountered when moving in a top-down direction. In principle, they do not have any systematic (physical) difference; however, from a statistical point of view, we observed that they "belong" to different categories (perhaps because the light hits them from a slightly different angle), and therefore we decided to distinguish them.


Surface characterization through multiresolution and multivariate image analysis


intensity assumes a certain value (iso-intensity lines). Therefore, according to this procedure, a spatial location along the edge wall is identified through a certain light intensity level.


Figure 7.4 Magnified section of an edge image: detection of topological levels at different light intensities for the identification of the side-wall roughness on a reconstructed Daubechies-8 1st level approximation.

The correlation coefficients between the different topological lines are computed for the original image and for the reconstructions at the 1st and at the 2nd approximation levels.

Table 7.1 Correlation coefficients between different topological lines in the original image and in the 1st and 2nd level Daubechies 8 approximation reconstructions.

            original image                   1st level approximation          2nd level approximation
threshold   0.5     0.6     0.7     0.8      0.5     0.6     0.7     0.8      0.5     0.6     0.7     0.8
0.5         1       0.5974  0.3881  0.3316   1       0.7506  0.7675  0.5167   1       0.8086  0.7971  0.7784
0.6         0.5974  1       0.5030  0.4573   0.7506  1       0.7164  0.6321   0.8086  1       0.8337  0.7787
0.7         0.3881  0.5030  1       0.6179   0.7676  0.7164  1       0.6761   0.7971  0.8337  1       0.8889
0.8         0.3316  0.4573  0.6179  1        0.5167  0.6321  0.6761  1        0.7784  0.7787  0.8889  1

As can be seen in Table 7.1, the correlation coefficients between the topological lines of the original image are always low (even for neighboring lines). This means that the noise corrupts the image, determining an artificial decorrelation among pixels. However, note that the reconstruction at the 1st approximation level exhibits a substantially different situation: high


correlation is shown between neighboring lines, and a significantly lower correlation between distant lines. This behavior is related to a certain level of roughness, which determines a random shaping of the edge wall (hence, of the topological lines). If the filtering level is increased further, as in the case of the reconstruction at the 2nd approximation level, the random shaping of the lines is excessively reduced, i.e., the ability to capture the roughness is lost. In fact, the correlation coefficient between distant thresholds (0.5 and 0.8) is almost the same as the one between neighboring thresholds. This means that an excessive smoothing of the signal has removed both the noise and the structural roughness from the original image. Thus, a first-level decomposition is chosen.
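The scale-selection criterion just described can be sketched as follows, in pure Python. The topological-line vectors and the correlation thresholds below are illustrative stand-ins, not the values used in this work.

```python
# Choosing the smoothing scale from the correlation between topological lines:
# neighboring lines should be highly correlated, while the farthest pair
# (0.5 vs. 0.8 thresholds) should remain clearly less correlated, so that the
# structural roughness is not smoothed away together with the noise.

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def scale_is_adequate(lines, r_neighbor=0.7, r_far_max=0.85):
    """Heuristic check with illustrative thresholds: accept the smoothing
    scale if neighboring topological lines correlate strongly while the
    farthest pair does not correlate excessively."""
    r_near = pearson(lines[0], lines[1])
    r_far = pearson(lines[0], lines[-1])
    return r_near >= r_neighbor and r_far <= r_far_max
```

In this spirit, the 1st level approximation of Table 7.1 would be accepted, whereas the 2nd level (where distant lines correlate almost as strongly as neighboring ones) would be rejected.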

Figure 7.5 (a) Original image and (b) filtered image, with an example of trans-section profiles on the same pixel column for (c) the original image and (d) the filtered one.

After normalization (i.e., scaling the black-and-white light intensities between the values 0 and 1) and wavelet filtering, the resulting smoothing of the original image is shown in Figures 7.5a and 7.5b. The trans-section profiles of the original image along a given pixel column (Figure 7.5c) show a confused trend (the noise is almost indistinguishable from the underlying pattern). Once the image is filtered and cleansed of the high-frequency


components, it acquires a clearer and more recognizable pattern (Figure 7.5d), to which multivariate statistical techniques can be applied.

7.2.2 Multivariate statistical surface monitoring methods
Multivariate monitoring tools are needed to analyze and interpret the multivariate nature of images. Whenever a measurement is technically impossible to perform, or an analysis requires the simultaneous access to multiple characteristics, a multivariate statistical monitoring system may prove significantly more powerful and effective than common metrology tools. Different multivariate statistical schemes were adopted through modified PCA approaches. An extended treatment of PCA methods can be found in the books by Jackson (1991) and by Geladi and Grahn (1996). The basis for a multivariate monitoring technique is its capability of summarizing the plurality of quality clues embodied in an image into a limited number of statistical indices. Usually these are an indicator of the mean trend (the Hotelling T2 statistic, or equivalently the scores t) and an indicator of the adequacy of the model representation (the SPE statistic). As stated before, the objective is to develop an image-based strategy for the monitoring of LER, surface roughness, and edge shape.

7.2.2.1 LER monitoring
Line edge roughness is one of the typical reference parameters in the after-development inspection of a photolithographed device (Yaakobovitz et al., 2007). In fact, it directly affects both the subsequent production phases and the performance of the final product. To monitor the LER of a single edge, information must be provided on how the light intensity is distributed in the upper and lower walls of the edge. As mentioned in the previous section, the topological lines at nlevels = 4 light intensity levels were identified. Two PCA models (Wold et al., 1987) were developed, one for the upper wall and one for the lower wall. To build each model, a reference [nel × nlevels] data matrix was used, where nel is the edge length (in pixels). Each column of this matrix contains the pixel locations, along an edge trans-section (identified by the relevant row number), where the light intensity assumes the value of 0.5, 0.6, 0.7 or 0.8. Stated otherwise, each column represents the topological line where the light intensity assumes a specified value.
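A minimal sketch of how such a reference PCA model could be fitted and then used for T² and SPE monitoring is given below, in pure Python with power iteration. The toy reference matrix is a hypothetical stand-in for a real [nel × nlevels] topological-line matrix, and no formal confidence limits are computed (in practice F- and χ²-type limits would be used).

```python
# PCA-based reference model sketch: fit a low-dimensional model on a
# mean-centered reference matrix (rows = trans-sections, columns =
# topological lines), then compute T2 and SPE for a new trans-section.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pca_fit(x, n_comp=2, iters=200):
    """Column means, loadings and score variances, obtained by power
    iteration with deflation on the covariance matrix."""
    n, k = len(x), len(x[0])
    means = [sum(col) / n for col in transpose(x)]
    xc = [[x[i][j] - means[j] for j in range(k)] for i in range(n)]
    cov = [[sum(r[a] * r[b] for r in xc) / (n - 1) for b in range(k)]
           for a in range(k)]
    loadings, variances = [], []
    for _ in range(n_comp):
        p = [1.0] * k
        for _ in range(iters):
            q = matvec(cov, p)
            norm = sum(v * v for v in q) ** 0.5
            if norm < 1e-12:          # covariance fully deflated
                break
            p = [v / norm for v in q]
        lam = sum(pi * ci for pi, ci in zip(p, matvec(cov, p)))
        loadings.append(p)
        variances.append(lam)
        cov = [[cov[a][b] - lam * p[a] * p[b] for b in range(k)]
               for a in range(k)]
    return means, loadings, variances

def t2_spe(obs, means, loadings, variances):
    """Hotelling T2 and SPE of one observation against the model."""
    xc = [o - m for o, m in zip(obs, means)]
    scores = [sum(pj * xj for pj, xj in zip(p, xc)) for p in loadings]
    t2 = sum(t * t / lam for t, lam in zip(scores, variances))
    recon = [sum(t * p[j] for t, p in zip(scores, loadings))
             for j in range(len(xc))]
    spe = sum((xj - rj) ** 2 for xj, rj in zip(xc, recon))
    return t2, spe

# Toy reference matrix: 4 trans-sections x 2 topological lines.
ref = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, loadings, variances = pca_fit(ref, n_comp=1)
t2, spe = t2_spe([2.5, 6.0], means, loadings, variances)   # off-model point: nonzero SPE
```

An observation lying on the reference correlation structure yields near-zero T² and SPE; a point breaking that structure inflates the SPE.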



Figure 7.6 Inspection of the upper and lower walls of an edge by the LER monitoring model. Two latent variables were used in the PCA model.

For each trans-section of an edge, the PCA model combines the nlevels variables into a measure of the mean and variance of the LER, and produces one point in the score space representing the edge quality on that trans-section. When a new edge is inspected, a trans-section is considered (thick black line in the upper part of Figure 7.6), and the relevant topological-level data are projected onto the reference model. Differences in the mean of the topological lines at a given light intensity (e.g., due to the presence of side-wall bumps) or in their variance (e.g., due to the presence of large feet or spikes) are highlighted by the T2 (or scores) and the SPE monitoring charts, respectively. By proceeding this way for each trans-section, the whole edge length can be scanned and inspected. This strategy makes it possible to detect and localize the imperfections on the edge side walls, which can then be related to malfunctions or drifts in the production machinery. The confidence levels for the T2 and SPE thresholds can be selected in such a way as to guarantee a good sensitivity to faults, while at the same time keeping the number of false alarms small.

7.2.2.2 Surface roughness monitoring
Semiconductor surface roughness is also a very important parameter, as it can deliver substantial information concerning the accuracy of the erosion during photolithography, the presence of resist residuals on the substrate, etc. The challenge in monitoring it is that the patterned surface of an after-photolithography photoresist shows uneven characteristics in different positions. In fact, the zones at lower light intensity (valleys) have a remarkably larger roughness than the zones at higher light intensity (edges), due to irregular erosion from the light. Therefore, not only the edge surface should be inspected, but also the valley surface


must be analyzed. This also implies that one must be able to automatically distinguish between edges and valleys along the whole surface of the device. A single model built on the entire semiconductor surface cannot simultaneously capture the uneven variance structure at different locations of the surface. The surface could be segmented into classes through k-means clustering on the light intensity; an unsupervised PCA discriminant analysis is proposed here instead to distinguish the edges from the valleys. As a result, for each of the two surface configurations (edge or valley) a specific monitoring model is built. A "nested" PCA monitoring system was designed to this purpose, which is based on a two-stage procedure: the outer level carries out the unsupervised discriminant analysis; the inner level is the actual monitoring model, where two PCA submodels are enclosed, one for the edges and one for the valleys. Thus, in the sequential surface scanning procedure, the outer level provides an automatic switching from one submodel to the other, and the correct submodel is interrogated in the inner level for the monitoring step.


Figure 7.7 Edge and valley surface roughness monitoring using the nested PCA model. Two latent variables were used.
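The two-level switching logic of the nested model can be sketched as follows, in pure Python. The first-score proxy, the classification threshold, and the dummy submodels are simplified illustrative stand-ins for the fitted models described in the text.

```python
# Nested PCA monitoring sketch: the outer level routes each image row to the
# edge or valley submodel; the chosen inner submodel returns the monitoring
# indices. Score and index computations below are illustrative stand-ins.

def outer_t1(row):
    """Illustrative proxy for the first score of the outer model:
    the mean light intensity of the row, centered (edges are brighter)."""
    return sum(row) / len(row) - 0.5

def monitor_row(row, edge_model, valley_model, t1_threshold=0.0):
    """Outer level: classify the row from t1; inner level: interrogate the
    corresponding submodel, which returns a (T2, SPE) pair."""
    label = "edge" if outer_t1(row) > t1_threshold else "valley"
    model = edge_model if label == "edge" else valley_model
    t2, spe = model(row)
    return label, t2, spe

# Usage with dummy submodels that always report in-control values:
ok = lambda row: (0.0, 0.0)
monitor_row([0.9, 0.8, 0.9], ok, ok)   # routed to the edge submodel
monitor_row([0.1, 0.2, 0.1], ok, ok)   # routed to the valley submodel
```

The design point is that classification and monitoring are decoupled: the outer level only decides *which* reference model applies, while each inner submodel captures the variance structure of its own surface configuration.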

Each submodel works on either an [niwE × nel] or an [niwV × nel] matrix, where niwE and niwV are the reference edge and valley image widths (in pixels), respectively. The reference images for edges and valleys are built by considering a statistically sound number of on-quality edges and valleys (in fact, it is sufficient that a statistically sound number niwE and niwV of on-quality edge and valley image rows is collected and assembled into two reference images). The jth column of each matrix contains the values of the light intensity on the image trans-section located at pixel 0 ≤ j ≤ nel along the image (i.e., edge) length. The PCA submodels reduce the


dimension of the problem (nel variables) into a point in a 2-D score space that characterizes most of the information content and of the variability of the original space. As illustrated in Figure 7.7, a new picture is scanned row by row; each new row is identified as belonging either to an edge or to a valley, and is then transformed by using the appropriate PCA submodel. The value of the first score t1 was found to be a useful indicator for classifying a row; a threshold value for t1 was determined by analysis of the reference images. Each submodel allows monitoring of the surface roughness through the T2 and SPE indices. Excessively high values of the Hotelling T2 statistic signal an abnormal mean light intensity along the whole length of an edge or valley (e.g., due to an uneven distribution of the photoresist coating). Excessively high values of the SPE statistic indicate an abnormal variability of the light intensity (e.g., due to the presence of holes or of photoresist residuals). Note that any abnormality can be precisely located by analyzing the contribution provided by each of the nel pixels to the altered value of the relevant statistic. Therefore, the information that becomes available through this surface monitoring system complements and extends the information provided by the LER monitoring system.

7.2.2.3 Edge shape monitoring
A more comprehensive and meaningful monitoring approach would attempt to examine the overall edge shape and to compare it to a required standard. A methodology is suggested here to achieve this goal. The proposed approach performs the inspection of a single edge shape by scanning the edge image along the pixel columns. The objective is to compare different trans-section profiles of an edge (as characterized in terms of light intensity) with respect to a reference profile, while proceeding along the edge length direction. There are two main difficulties in monitoring the edge shape. First of all, the data are clearly nonlinear and non-normally distributed along the image column trans-section, whereas the calibration of a PCA model requires linear and normally distributed historical reference information. Secondly, it is difficult to retain the spatial information when using a PCA model since, ideally, every pixel should be considered both for its position and for its spatial relation with its neighbors (Bharati et al., 2004). The strategy proposed to overcome these issues is to consider each pixel together with its nearest neighbors located in the same trans-section profile within a predefined spatial moving window (Figure 7.8), in which the correlation between the neighbors (i.e., the local edge shape) is maintained and the nonlinearity of the profile is negligible. Thus, a moving window of appropriate size (Δnmw) is defined in the space of the pixels along the edge profile trans-section (whose overall width is ntsw pixels). The moving window captures the correlation between pixels located in a limited region, within which it is reasonable to suppose a strong correlation between neighbors. For this reason, Δnmw = 5 pixels was chosen; at larger distances the correlation between upper and lower pixels within the same window vanishes.
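The moving-window decomposition of a trans-section profile can be sketched as follows, in pure Python; the profile values are hypothetical.

```python
# Spatial moving window along a trans-section profile: each window of width
# dn_mw groups a pixel with its nearest neighbors, so that the local shape
# (i.e., the correlation between neighbors) is preserved for the PCA step.

def moving_windows(profile, dn_mw=5):
    """All contiguous windows of dn_mw pixels along the profile;
    there are len(profile) - dn_mw + 1 of them."""
    return [profile[i:i + dn_mw] for i in range(len(profile) - dn_mw + 1)]

profile = [0.1, 0.3, 0.7, 0.9, 0.9, 0.8, 0.4, 0.2]   # ntsw = 8 pixels
windows = moving_windows(profile)                     # ntsw - dn_mw + 1 = 4 positions
```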



Figure 7.8 Schematic of the main concepts underlying the monitoring method for the shape analysis of one edge.

In order to smooth out nano-variations that may perturb the comparison between the local shapes of different edge profiles, a segment is taken along the edge length direction (a segment dimension of Δnsegm = 5 pixels was chosen for this study, which corresponds to ~49 nm; at larger distances the correlation between right and left pixels was not significantly high). This segment is assumed to be the smallest width over which a profile shape is analyzed; this also allows for a significant reduction of the computational burden. Therefore, each segment represents the image of Δnsegm consecutive edge profiles. The shape of an edge is then analyzed by monitoring the segments that the edge is made of. Note that a segment can be represented by an [ntsw × Δnsegm] matrix, whose entries are the light intensities of the pixels belonging to the segment. To define the optimal reference for a segment, a number Nimage of segments that conform to the required quality standards are collected from different edges and different images. The whole set of reference data is arranged in a 3-D matrix (Figure 7.9), whose dimension is [Nimage × Δnsegm × ntsw]. According to this arrangement, the Nimage images are stored as horizontal slices piled one upon another. The spatial moving-window multi-way PCA model is then defined as follows. The spatial window moves pixel by pixel along the third dimension of the matrix. For any position of the window, a subset of the 3-D matrix is defined (Figure 7.9a). This subset is unfolded "image-wise", by cutting the submatrix into Δnmw vertical slices along the trans-section width dimension and putting the slices side by side, according to the multi-way procedure developed by Nomikos and MacGregor (1994); see Figure 7.9b. This results in an [Nimage × (Δnmw ⋅ Δnsegm)] 2-D matrix that can be processed through PCA. A column of this matrix represents how the


light intensity at a given position in the segment and at a given position along the edge profile varies between different images. On the whole, (ntsw − Δnmw + 1) 2-D reference matrices are obtained, this number being equal to the number of positions that the spatial window can take along the edge trans-section width. For each of these matrices, threshold values on the T2 and SPE statistics can be determined, which allow the shape of the edge to be monitored within the corresponding spatial window.
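The image-wise unfolding step can be sketched as follows, in pure Python; the array sizes are small illustrative stand-ins for [Nimage × Δnsegm × ntsw].

```python
# Image-wise unfolding of the 3-D reference array: for a window position w,
# take the dn_mw trans-section slices and lay them side by side, yielding
# one row of length dn_mw * dn_segm per reference image.

def unfold_window(data, w, dn_mw):
    """data[i][s][t]: light intensity of image i, segment column s,
    trans-section pixel t. Returns an [N_image x (dn_mw * dn_segm)] matrix."""
    unfolded = []
    for image in data:
        row = []
        for t in range(w, w + dn_mw):        # slices along the trans-section width
            row.extend(image[s][t] for s in range(len(image)))
        unfolded.append(row)
    return unfolded

# Two reference "images", dn_segm = 2 segment columns, ntsw = 4 pixels each:
data = [[[1, 2, 3, 4], [5, 6, 7, 8]],
        [[9, 10, 11, 12], [13, 14, 15, 16]]]
unfolded = unfold_window(data, w=0, dn_mw=2)   # 2 rows of 2*2 = 4 entries each
```

Repeating this for every window position w = 0, …, ntsw − Δnmw gives the (ntsw − Δnmw + 1) reference matrices mentioned above, one PCA model per window position.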

Figure 7.9 (a) Reference data arrangement in a 3-D matrix with a spatial moving window. (b) Image-wise unfolding of the 3-D matrix over the moving window.

For any segment of a test edge whose shape needs to be monitored, the T2 statistic (or the scores) in a given edge window summarizes the mean edge shape in that window. Therefore, large T2 values indicate that the local edge shape is altered with respect to the average. Large SPE values indicate changes in the correlation structure between the profiles of a segment; if the T2 statistic is within its limit, this signals a local variation of the edge roughness.
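The joint interpretation of the two statistics can be condensed into a simple diagnostic rule, sketched below; the numerical limits are illustrative, since in practice they would come from the confidence levels of the reference model.

```python
# Joint T2/SPE diagnosis for one window of a test segment (illustrative):
# - high T2 -> the local mean edge shape deviates from the reference;
# - high SPE with T2 in control -> a local variation of the edge roughness
#   (altered correlation structure between the profiles of the segment).

def diagnose_window(t2, spe, t2_limit, spe_limit):
    if t2 > t2_limit:
        return "altered local edge shape"
    if spe > spe_limit:
        return "local roughness variation"
    return "in control"

diagnose_window(1.0, 0.5, t2_limit=9.2, spe_limit=3.0)   # in control
diagnose_window(1.0, 5.0, t2_limit=9.2, spe_limit=3.0)   # roughness variation
```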


7.3 Case study: monitoring results
The compliance of the surface product quality with the quality requirements after a photolithography process is evaluated in terms of LER, surface roughness, and edge shape, by applying the techniques described in the previous section. Hence, a multiple monitoring system is developed to monitor all the above-mentioned features through multiscale and multivariate image analysis. The monitoring results are presented in the following.

7.3.1 LER monitoring system
The LER monitoring strategy goes through the following steps: i) an edge is selected on the de-noised test image; ii) the four light intensity levels are identified on the edge; and iii) the edge is monitored through the scores plot (and/or the T2 plot) and the SPE plot. The scores plot surveys the conformity of the mean side-wall roughness of the edge. Any point in this plot is designated with a number, which represents the column position (along the edge length) to which that point refers. The location of the point on the scores plot represents how the topological lines are distributed along the edge side wall at that column position. Non-conformities are shown as points located outside the confidence ellipse. The SPE plot points to irregularities in terms of excessive variance of the edge side-wall roughness, and to changes in the correlation between the topological lines (i.e., in the "parallelism" of the topological lines). The column location along the edge length is represented on the x-axis of this plot. Non-conformities are shown as points located above the confidence threshold. In Figure 7.10, the scores plot (Figure 7.10a) and the SPE plot (Figure 7.10b) for the upper side wall of an edge are shown. The SPE plot shows four outliers within the first 30 pixel columns. An off-line visual inspection of the upper side wall (Figure 7.10c) indeed confirms that a bump is localized around column 30 (note that the scores plot, too, provides a "mild" alarm for column 30). Furthermore, Figure 7.10c shows that several side-wall irregularities are present in the first 30 columns, i.e., a non-parallelism of the topological lines exists in this section of the edge length. This confirms the indications provided by the SPE plot. Note, however, that inspecting the side-wall image is much more time consuming and does not provide a precise and unambiguous indication of the pixel column where a non-conformity is present. Conversely, the SPE chart is very quick to analyze: the response is very localized and unambiguously points to the column where a defect is deemed to be present. The scores plot is somewhat less responsive than the SPE plot, but nevertheless the information provided by the two monitoring diagrams can complement each other.


Figure 7.10 Line edge roughness monitoring: analysis of the upper side wall for one selected edge. (a) Scores plot (the numbers within the squares represent the column position along the edge length); (b) SPE plot; and (c) magnified section of the edge upper wall image.

Strong outliers are also identified by the SPE plot at pixel columns 99 and 106. These indicate a very large variability of the topological lines, or a non-parallelism between the lines. The irregularity at column 99 is also clearly detected by the scores plot. Visual inspection of the upper side wall image (Figure 7.10c) confirms that two large feet are present around pixel column 100.

7.3.2 Surface roughness monitoring system
The surface roughness monitoring strategy goes through the following steps: i) a row is selected on the de-noised test image, following a sequential scanning from the top to the


bottom of the image; ii) the row is categorized as belonging either to an edge or to a valley; iii) the edge/valley is inspected through the relevant scores plot (and/or the T2 plot) and the SPE residuals plot, which highlight potential anomalies on the surface; and iv) an analysis of the contributions of each pixel to the T2 statistic and to the e residuals, which constitute the SPE statistic (Wise and Gallagher, 1996; Nomikos, 1996), allows the imperfections to be localized precisely. Figure 7.11 shows the e contributions to the residuals for all the pixels along the length of a row within the same edge considered in Figure 7.10. If the e contribution of a pixel exceeds one of the confidence limits (Conlin et al., 2000), this is an indication that a surface imperfection is localized on that pixel. Following this rationale, several imperfections are detected in the contributions plot of Figure 7.11 around pixel columns 10 to 40, around column 100, and around column 135. Note that irregularities in the same locations were highlighted by the LER monitoring strategy, too.


Figure 7.11 Surface roughness monitoring: typical trend of the contributions to the residuals for the localization of defects on an image row. (The dashed lines represent the confidence limits).
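Step iv above can be sketched as follows, in pure Python; the residual vector and the confidence limits are illustrative, and in practice the residuals e = x − x̂ would come from the fitted PCA submodel.

```python
# Localizing imperfections from the per-pixel contributions to the residuals:
# a pixel whose e contribution exceeds either confidence limit is flagged.

def flag_defects(e, lower, upper):
    """Pixel columns whose residual contribution falls outside the
    confidence band [lower, upper]."""
    return [j for j, ej in enumerate(e) if ej < lower or ej > upper]

# Residuals of one image row (hypothetical values):
e = [0.1, -0.2, 1.8, 0.3, -1.7, 0.0, 0.2]
flag_defects(e, lower=-1.5, upper=1.5)   # flags columns 2 and 4
```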

The agreement between the results of the LER monitoring and of the surface roughness monitoring supports the reliability of both approaches, which can be used simultaneously in a robust way. In fact, the topological lines at different light intensity thresholds (obtained by cutting the image with a plane parallel to the image itself) are highly correlated with the light intensities along the trans-sections of the edge surface (obtained by cutting the image with a plane orthogonal to the image). As an example, Table 7.2 shows the correlation coefficients between the upper side-wall topological levels of an edge and the trans-section profiles along the edge length from row 5 up to row 11. The high correlation coefficients emphasize the similarity


between the trajectories of the LER lines and the profile of the light intensity on a row of pixels along the edge length. This confirms the possibility of accurately observing the LER through the surface roughness in terms of light intensity.

Table 7.2 Correlation coefficients between the topological lines at different light intensity thresholds and the light intensity along the trans-section of an edge.

threshold   row 5   row 6   row 7   row 8   row 9   row 10  row 11
0.5         0.432   0.731   0.700   0.663   0.525   0.466   0.216
0.6         0.245   0.769   0.866   0.887   0.752   0.631   0.205
0.7         0.138   0.579   0.728   0.863   0.841   0.744   0.285
0.8         0.098   0.372   0.516   0.735   0.829   0.795   0.381

Note that the surface roughness analysis can also be used to spot other defects (which usually cannot be identified in a practical way). For instance, Figure 7.12a shows the contributions plot for a different pixel row of the image (the row was categorized as a valley). It can be seen that around pixel columns 25, 50, 100, 160 and 195 the contributions to the residuals exceed the confidence limits. This means that some defects are present in terms of excessive variance of the light intensity (which could be associated with the presence of holes or of photoresist residuals).


Figure 7.12 Surface roughness monitoring: identification of defects on a valley from the analysis of (a) the contributions to the residuals and (b) the profile of light intensity along the valley length.

However, only a few of these defects would unambiguously be highlighted if an off-line inspection of the light intensity profile were carried out on the same row (Figure 7.12b).


7.3.3 Edge shape monitoring system
The edge shape monitoring strategy goes through the following steps: i) a segment is selected in the de-noised image of an edge; ii) the trans-section profiles in this segment are aligned to the reference one; iii) the segment is sequentially scanned by the spatial moving window along the whole trans-section width; and iv) the shape conformity to the desired standard is surveyed using the ti plots and the SPE plot, along all the positions that the moving window can take over the trans-section width. The confidence limits on the ti charts and on the SPE chart are not the same for the entire width, but are defined for each spatial window. In particular, the confidence limits of the SPE chart take into account the different variability between valleys and edges: higher limits are set for the valleys because of their more marked shape variability; lower limits are set for the edges, where the shape is expected to be only slightly variable. Figure 7.13a shows (full thin lines) the five light intensity (trans-section) profiles on an edge segment that conforms to the required quality standards. These profiles are compared to the average light intensity profile of the reference segments (broken thick line). All test profiles are aligned along the left-hand rising branch of the reference profile. According to a rough visual inspection of these profiles, the inspected segment seems to conform to the quality standards. However, it is not completely clear whether the shape differences at the center and at the borders of the trans-section profiles are to be regarded as "regular" or not. A more rigorous and automatic monitoring of the edge shape can be performed by analyzing the t1 plot (Figure 7.13b) and the SPE plot (Figure 7.13c). It can be seen that no violations of the confidence limits are detected along the trans-section width (which is scanned by the moving window). Hence, Figures 7.13b and 7.13c unambiguously designate the tested edge as an on-quality one. A similar comparison is presented in Figure 7.13d for a non-conforming segment. Again, although the analysis of Figure 7.13d is not completely satisfactory to assess the quality of the trans-section profile, the analysis of the t1 plot (Figure 7.13e) and of the SPE plot (Figure 7.13f) unambiguously points to the locations (along the trans-section width) where the non-conformity is present. Note that this test segment is categorized as an "on-quality" one in the valleys, although Figure 7.13d shows that there is a significant difference between the borders of the test profiles and those of the reference profile. However, the spatial moving window approach effectively allows the identification of the difference in the roughness structure between edges (smoother structure) and valleys (coarser structure).


Figure 7.13 Edge shape monitoring for an on-quality segment (left column plots) and for an off-quality segment (right column plots): (a) light intensity profiles on one segment of a test edge conforming to the required quality standards; (b) first score as a function of the moving window number for the same conforming segment; (c) SPE statistic as a function of the moving window location for the same conforming segment; (d) light intensity profiles on one segment of a test edge not conforming to the required standards; (e) first score as a function of the moving window location for the same non-conforming segment; (f) SPE statistic as a function of the moving window number for the same non-conforming segment. The dashed lines represent the confidence limits.

Surface characterization through multiresolution and multivariate image analysis


7.4 The EDGE3 monitoring interface

The above procedures for the quality monitoring of a photolithography process have been implemented in the Matlab® modeling language, building an interface that assesses the quality of manufacturing and detects potential anomalies in a user-friendly way. The EDGE3 interface is designed so that the user can switch among the three monitoring strategies (LER, surface roughness, edge shape). Once the property of interest is chosen, the interface goes through a number of automatic steps without any human intervention.

Figure 7.14 Graphical interface for the surface roughness monitoring code. The sequential scanning procedure surveys the test image (upper right image) through a nested PCA model: the discriminant analysis (scores diagram in the upper left part) between edges and valleys, and the monitoring models with T2 and SPE alarms and the contribution of each pixel to the alarms, for diagnosis and localization.

For instance, in the case of LER monitoring one edge is randomly selected from a test image and, after the profile is aligned to the reference image, the four light intensity lines are automatically identified and the monitoring procedure is carried out through the scores plots and residual plots of both edge walls. A similar approach is taken for surface roughness monitoring. An example of the interface window for surface monitoring is shown in Figure 7.14.
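The alignment step mentioned above can be illustrated with a minimal cross-correlation sketch. This is an assumed stand-in for the actual alignment procedure of the thesis: the profile is shifted by the lag that maximizes its cross-correlation with the reference. Function names are illustrative.

```python
import numpy as np

def align_to_reference(profile, reference):
    """Align a light-intensity profile to a reference profile by the
    lag that maximizes their cross-correlation (a hypothetical,
    simplified version of the alignment step described in the text)."""
    p = profile - profile.mean()
    r = reference - reference.mean()
    corr = np.correlate(p, r, mode="full")      # all possible lags
    lag = corr.argmax() - (len(r) - 1)          # best lag (0 = already aligned)
    return np.roll(profile, -lag)               # shift back by the best lag
```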


The scanning procedure surveys the test surface sequentially (row by row, as highlighted by the red line on the image in the upper right side of the interface), projecting the pixels of the row under investigation onto a 2-D scores plot (the upper left plot of Figure 7.14). This scores diagram identifies the normal surface condition within the dashed ellipse, and performs an unsupervised discriminant analysis between edges (projected inside the blue dotted ellipse) and valleys (projected inside the green dotted ellipse). This is the external level of the “nested” PCA method described in the previous sections. The lower part of the interface represents the monitoring stage of the nested PCA procedure. Two monitoring models are present: the model for the valleys (the four diagrams on the right) and the model for the edges (the four diagrams on the left). When the roughness on the edges is evaluated, only the edge model is interrogated and the valley model is in stand-by, and vice versa. The switching between the models is managed by the external level of the nested PCA scheme, while the internal level monitors the surface through an alarm signal on the T2 and SPE statistics (the penultimate row of diagrams). If the Hotelling T2 statistic and the SPE residual of a test sample are within the limits identified by the reference set, the signal remains at 0; otherwise, the alarm signal takes the value 1; the alarm signal takes the value −1 when the model is in stand-by. The variance of the contribution of every pixel of the reference data to the statistical indices is computed, and the relevant 95% confidence limits are calculated for the reference morphology (last row of diagrams in the interface). Finally, in the case of shape monitoring, the soft sensor randomly selects a limited portion of the test image.
Then, the test sample is compared to the reference, and finally the shape of the edge trans-section profile is evaluated through the ti scores and SPE residuals monitoring charts. Alarm indicators appear if the shape of the test edge is recognized as off-quality.
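The 0/1/−1 alarm logic and the model switching described above can be sketched as follows. This is a minimal illustration of the signalling convention only, with illustrative names and hand-picked limits; it is not the interface code.

```python
import numpy as np

def nested_alarm(region_labels, t2, spe, t2_lim, spe_lim):
    """For each scanned row, the model matching the region ('edge' or
    'valley') raises 1 if T2 or SPE exceeds its limit and 0 otherwise,
    while the other model stands by at -1 (the convention described
    for the EDGE3 interface)."""
    alarms = {"edge": [], "valley": []}
    for label, t2_i, spe_i in zip(region_labels, t2, spe):
        for model in ("edge", "valley"):
            if model != label:
                alarms[model].append(-1)    # model in stand-by
            else:                           # active model: check both limits
                alarms[model].append(int(t2_i > t2_lim[model]
                                         or spe_i > spe_lim[model]))
    return {k: np.array(v) for k, v in alarms.items()}
```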

7.5 Concluding remarks

Photolithography is one of the most important steps in integrated circuit fabrication, because of its economic and operational impact on the manufacturing schedule. Most of the monitoring efforts in photolithography focus on the measurement of the most important physical parameters of a photolithographed device, such as CD, LER, or SWA, and the common inspection tools are optical instruments. An image, however, retains information that largely exceeds mere metrology, and that is useful to identify the complex nature of the manufactured product quality. Through image analysis, and in particular advanced image processing techniques, it is possible to access without human intervention several meaningful clues that help to better understand the manufacturing process, to identify critical situations (for the product quality and the process progress), and to counteract the problems.


In particular, in this Chapter a monitoring system for the after-development quality evaluation was proposed. The approach, based on a combination of multiresolution techniques and multivariate statistics, can deal with the hidden characteristics of a device, performing all the tasks of the instrumental sensors while additionally grasping features of the product surface that are commonly inaccessible. Since quality is inherently a multivariate property, multivariate statistical techniques were exploited to extract the features embedded in the image. Although the signals were corrupted by noise and alterations, which affect different scales of resolution, only the relevant scales were considered through a multiscale treatment by means of the wavelet decomposition. The result was a multiscale and multivariate monitoring framework that inspects the quality of the photolithographed device through the analysis of a SEM image. The proposed monitoring system was shown to be an effective tool for the assessment of the product quality. By adding new features for the measurement of the critical physical parameters, the image analysis system performs a full scanning of the surface, identifying and localizing defects and anomalies, and detecting process drifts in advance.
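The multiscale treatment mentioned above — decompose the signal, keep only the relevant scales, reconstruct — can be sketched with a hand-rolled Haar wavelet. The Haar choice and the level counts below are illustrative assumptions, not the settings used in the thesis.

```python
import numpy as np

def haar_denoise(signal, keep_levels=2, n_levels=4):
    """Decompose a 1-D signal with the Haar wavelet, discard the detail
    coefficients of the finest (noise-dominated) scales, and reconstruct.
    Signal length must be divisible by 2**n_levels."""
    details, approx = [], np.asarray(signal, dtype=float)
    for _ in range(n_levels):                           # analysis
        a = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
        d = (approx[0::2] - approx[1::2]) / np.sqrt(2.0)
        details.append(d)                               # details[0] = finest
        approx = a
    for i in range(n_levels - 1, -1, -1):               # synthesis
        d = details[i]
        if i < n_levels - keep_levels:                  # drop the finest scales
            d = np.zeros_like(d)
        out = np.empty(2 * d.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx
```

Zeroing the fine-scale details removes most of the high-frequency noise while a feature aligned with the coarse scales (e.g. a step edge) survives the reconstruction.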

Conclusions and perspectives

Although batch processes are relatively simple to set up and to carry out, with a limited degree of automation and with only partial knowledge of the underlying mechanisms of the process, it is always difficult to ensure a consistently high and reproducible product quality. The instrumentation commonly used in industrial practice rarely provides real-time measurements of the product quality. Further complications arise from the multivariate nature of quality, which depends on a series of physical, operational, and even subjective parameters. Although not easily accessible, information on quality is embedded in the values of the process variables, which are usually collected by process computers and stored in historical databases. Multivariate statistical methods make it possible to reduce the dimensionality of the problem to a latent subspace that explains the relevant part of the variability of product quality, while dealing with noisy, redundant and highly correlated variables, as well as with outliers or missing values. The aim of this Thesis was the development of multivariate statistical techniques for quality monitoring in the batch manufacturing of high value added products. Two classes of products were considered: products whose quality is determined by chemical/physical characteristics, and products whose quality is defined by their surface properties. The main scientific contributions of the PhD project have been:
• the development of a strategy to design software sensors for the real-time estimation of product quality in batch processes;
• the non-conventional application of latent variable methods for the prediction of the length of batch processes;
• the development of an innovative methodology for the multiresolution and multivariate monitoring of quality from images of a manufactured product.
Soft sensors for the online estimation of product quality in batch polymerization processes were developed and implemented online in an industrial case study.
These soft sensors are based on partial least squares, which regresses product quality on the process measurements that are available online. The estimation accuracy proved similar to that of the laboratory measurements, but the estimator can be interrogated with high frequency (on the order of s-1), namely hundreds of times faster than the lab assays (on the order of h-1). Furthermore, the estimations are available in real time, without the delay that is typical of laboratory measurements. To compensate for data nonlinearities and changes in the correlation structure between variables, the adopted procedure splits a batch into a limited number of estimation phases. Within each of these phases, linear PLS submodels were shown to provide accurate quality estimations, and the switching from one submodel to the following one is triggered by process events that can be easily detected from process


variables. The key characteristic of the proposed soft sensor is the inclusion of dynamic information, either through lagged measurements (i.e., adding past values of some relevant process variables to the reference dataset) or by incorporating “time memory” through a moving-average approach. Including time information proved to be a highly favorable way to improve the estimation accuracy. Furthermore, averaging the process measurement values over a moving window of fixed length attenuates process noise, dampens spikes, and compensates for the effect of temporarily missing values without introducing other significant computational difficulties. Caution was suggested in the selection of the moving window width, because too wide a window may delay the appearance of alarms on the reliability of the estimation. From the operational point of view, this system can help the operating personnel to promptly detect drifts in product quality and to suggest timely adjustments of the processing recipe, minimizing off-specification final product. Moreover, the number of quality samplings can be reduced drastically, yielding gains in terms of overall processing time, laboratory-related costs, and manpower organization. A new soft sensor was also developed to assist the online monitoring of product quality in batch processes and to deliver helpful information for effective production planning: a soft sensor for the real-time prediction of batch length. This monitoring strategy uses a time-evolving PLS modeling approach, which exploits the incremental information collected during a batch to forecast the length of the batch or the length of any of the production stages. Very satisfactory prediction accuracy was obtained: the prediction error is much lower than both the variability of the batch length and the length of one operator shift.
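The moving-average PLS idea can be sketched as a trailing moving-average filter feeding a PLS regression. The fragment below is a minimal, self-contained illustration — a textbook PLS1 (NIPALS) and a naive filter, not the industrial implementation; the function names and the window width are hypothetical.

```python
import numpy as np

def moving_average(X, width):
    """Trailing moving average of each process variable over a fixed
    window, attenuating noise and spikes as described in the text."""
    Xm = np.empty_like(X, dtype=float)
    for i in range(len(X)):
        Xm[i] = X[max(0, i - width + 1):i + 1].mean(axis=0)
    return Xm

def pls1_fit(X, y, n_lv):
    """Minimal PLS1 (NIPALS): returns the regression coefficient vector
    for centered data plus the centering terms."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = Xc.T @ yc
        w = w / np.linalg.norm(w)        # weight vector
        t = Xc @ w                       # scores
        p = Xc.T @ t / (t @ t)           # X loadings
        q = (yc @ t) / (t @ t)           # y loading
        Xc = Xc - np.outer(t, p)         # deflation
        yc = yc - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    B = W @ np.linalg.solve(P.T @ W, Q)  # regression coefficients
    return B, X.mean(axis=0), y.mean()
```

A quality estimate for new (filtered) data is then `(Xnew - x_mean) @ B + y_mean`; in the approach described above, the window width and the number of latent variables would be tuned for each estimation phase.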
The initial part of the batch was confirmed to be of crucial importance for the batch length, because the initial conditions of the equipment, the state of the raw materials and the heat-up of the reactor usually have a strong influence on the batch performance. The information on the batch duration allows a better scheduling of the interventions on the plant, the optimization of the manpower both in terms of shifts and roles, and a convenient utilization of the plant equipment. The effectiveness of the soft sensors for the estimation of quality and for the prediction of the batch duration was tested by implementing the above-mentioned techniques online in an industrial batch production of resins by polymerization. Finally, multivariate statistical techniques were exploited also in the field of image analysis. In industrial practice, image-based inspections of the product are often mere measurements of the most important physical parameters, conveniently enhanced through filtering techniques; moreover, these measurements are derived from non-systematic inspections. However, much useful information is stored in images, allowing for a systematic identification of the complex nature of the manufactured product quality. A fully automatic system for the real-time monitoring of high value added products from images was developed. By exploiting multiresolution and multivariate image analysis, this monitoring


system was tested in the case study of the after-photolithography characterization of a semiconductor surface in the manufacturing of integrated circuits. Advanced multivariate image analysis techniques extracted the traces that the process always leaves on the product, helping both to identify critical situations in the process and to neutralize the problems. The proposed approach is based on a preliminary multiresolution filtering of the image through wavelet decomposition. Then, a monitoring scheme was developed to perform the parallel analysis of the surface roughness and of the surface shape of the product. For example, it was shown that the surface roughness can be explored through a “nested” principal component analysis, a two-level methodology in which the outer-level PCA performs a cluster analysis that discriminates different parts of the inspected surface, and the inner-level PCA monitors the surface roughness. The shape of the surface pattern was analyzed through a “spatial moving window” PCA approach, which retains information on spatial characteristics, taking into account both nonlinearities and the different structural features of the inspected surface. To sum up, this system can help detect and identify several hidden characteristics of the product, which are commonly inaccessible, in a fully automatic fashion and without human intervention. Furthermore, this tool proved to be fast, sensitive, reliable, and unambiguous, performing a full scanning of the product surface for the precise localization of defects and anomalies, or the detection of process drifts. In conclusion, although the proposed methodologies were tested on particular case studies, they demonstrated great potential and proved to be quite general.
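The outer level of the nested PCA — project a scanned row into score space and decide which class (and hence which inner monitoring model) it belongs to — can be sketched as follows. The nearest-centre assignment rule and all names are illustrative assumptions; the thesis describes the discrimination via confidence ellipses in the scores plot.

```python
import numpy as np

def pca_fit(X, n_pc):
    """Fit a PCA model (mean and loadings) via SVD."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_pc]

def outer_level_assign(row, mu, loadings, centers):
    """Outer level of the 'nested' PCA: project one row of pixels into
    score space and assign it to the nearest cluster centre ('edge' or
    'valley'); the inner-level model of that class is then interrogated
    while the other one stands by."""
    t = (row - mu) @ loadings.T
    dist = {k: np.linalg.norm(t - c) for k, c in centers.items()}
    return min(dist, key=dist.get)
```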
For this reason, it is possible to extend their application to different fields of research and different industrial applications (e.g. food engineering, the pharmaceutical industry, biotechnology), or to different scales of investigation, from the macro-scale to the nano-scale. In particular, future work should address some issues that are still open for investigation. First of all, the monitoring performance of dynamic/spatial multivariate statistical methods could be more powerful if the inclusion of dynamic/spatial information were tailored to the current state of the system under study. This means that the lagged-variable strategy could be greatly improved if the selection of the variables and the choice of the time lags for every lagged variable fitted the current state of the process. The moving-window strategies, too, could be improved if the size of the window could be modified during the batch to better describe the variability of the system, enlarging the window in the operating stages where low variability is experienced, and shrinking it where the system shows larger variability. The examination of the variance-covariance structure of the data can be highly beneficial to this purpose. Further research is needed also to solve the problem of model adaptation to the changeable nature of production processes, because multivariate statistical models are assumed to be time invariant. Although recursive strategies for batch processes are available in the literature, they usually assume that the best reference data to build an adaptive model is


constituted by the most recent available data. However, there are many industrial situations where the batches that are most similar to the current one are not necessarily the nearest in time. An adaptation scheme is therefore required to tailor the model update to the running batch, in the sense that the calibration can be customized to the incoming batch. The model update could be managed by an artificial intelligence that selects, for monitoring purposes, the best reference batches from a library of past ones. This decision can be based on the idea of similarity, possibly by exploiting the information content initially available in a batch, thus making it possible to evaluate whether and how to update the model from the very beginning of a new production batch.

References Addison, P. S. (2002). The illustrated wavelet transform handbook. IOP Publishing, London (U.K.). Aguado, D., A. Ferrer, A. Seco, and J. Ferrer (2006). Comparison of different predictive models for nutrient estimation in a sequencing batch reactor for wastewater treatment. Chemom. Intell. Lab. Sys., 84, 75-81. Apetrei, C., I. M. Apetrei, I. Nevares, M. del Alamo, V. Parra, M. L. Rodrìguez-Méndez, J. A. De Saja (2007). Using an e-tongue based on voltammetric electrodes to discriminate among red wines aged in oak barrels or aged using alternative methods. Correlation between electrochemical signals and analytical parameters. Electrochimica Acta, 52, 2588-2594. Arvisenet, G., L. Billy, P. Poinot, E. Vigneau, D. Bertrand, and C. Prost (2008). Effect of apple particle state on the release of volatile compounds in a new artificial mouth device. J. Agric. Food Chem., 56, 3245-3253. Baffi, G., E. B. Martin, and A. J. Morris (1999a). Non-linear projection to latent structures revisited: the quadratic PLS algorithm. Computers Chem. Eng., 23, 395-411. Baffi, G., E. B. Martin, and A. J. Morris (1999b). Non-linear projection to latent structures revisited (the neural network PLS algorithm). Computers Chem. Eng., 23, 1293-1307. Bakshi, B. R., (1998). Multiscale PCA with application to multivariate statistical process monitoring. AIChE J., 44, 1596-1610. Bakshi, B. R., M. N. Nounou, P. K. Goel, and X. Shen (2001). Multiscale bayesian rectification of data from linear steady-state and dynamic systems without accurate models. Ind. Eng. Chem. Res., 40, 261-274. Bartolacci, G., P. J. Pelletier, J. J. Tessier, C. Duchesne, P. A. Bossè, J. Fournier (2006). Application of numerical image analysis to process diagnosis and physical parameter measurement in mineral processes - Part I: Flotation control based on froth textural characteristics. Minerals Eng., 19, 734-747. Beaver, S., A. Palazoglu and J. A. Romagnoli (2007). 
Cluster analysis for autocorrelated and cyclic chemical process data. Ind. Eng. Chem. Res., 46, 3610-3622. Bharati, M. H., J. F. MacGregor and W. Tropper (2003). Softwood lumber grading through on-line multivariate image analysis techniques. Ind. Eng. Chem. Res., 42, 5345-5353. Bharati, M. H., J. J. Liu, and J. F. MacGregor (2004). Image texture analysis: methods and comparisons. Chemom. Intell. Lab. Sys., 72, 57-71.

144

References

Blais, P., M. Micheals and J. N. Helbert (2001). Issues and trends affecting lithography tool selection strategy. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.). Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems: effect of inequality of variance in one way classification. The Annals of Mathematical Statistics, 25, 290-302. Brauner, N., and M. Shacham (2000). Considering precision of data in reduction of dimensionality and PCA. Computers Chem. Eng., 24, 2603-2611. Brosnan, T., and D. W. Sun (2004). Improving quality inspection of food products by computer vision - a review. J. Food Eng., 61, 3–16. Burnham, A. J., J. F. MacGregor, and R. Viveros (1999). Latent variable multivariate regression modeling. Chemom. Intell. Lab. Sys., 48, 167-180. Camacho, J. and J. Picò (2006a). Multi-phase principal component analysis for batch processes modeling. Chemom. Intell. Lab. Sys., 81, 127-136. Camacho, J. and J. Picò (2006b). Online monitoring of batch processes using multi-phase principal component analysis. J. Process Control, 16, 1021-1035. Camacho, J., J. Picò and A. Ferrer (2008a). Bilinear modeling of batch processes. Part 1: theorical discussion. J. Chemom., 22, 299-308. Camacho, J., J. Picò and A. Ferrer (2008b). Multiphase analysis framework for handling batch process data. J. Chemom., 22, 632-643. Capron, X., B. Walczak, O. E. de Noord, and D.L. Massart (2005). Selection and weighting of samples in multivariate regression model updating. Chemom. Intell. Lab. Sys., 76, 205214. Chang, H., J. Chen and Y. P. Ho (2006). Batch process monitoring by wavelet transform based fractal encoding. Ind. Eng. Chem. Res., 45, 3864-3879. Chen, J. and K. Liu (2002). On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem. Eng. Sci., 57, 63-75. Chen, G., T. J. MacAvoy, and M. Piovoso (1998). 
A multivariate statistical controller for online quality improvement. J. Process Control, 8, 139-149. Chen, F. Z., and X. Z. Wang (2000). Discovery of operational spaces from process data for production of multiple grades of products. Ind. Eng. Chem. Res., 39, 2378-2383. Chiang, L. H., and L. F. Colegrove (2007). Industrial implementation of on-line multivariate quality control. Chemom. Intell. Lab. Sys., 88, 143-153. Chiang, L. H., E. L. Russel, and R. D. Braatz (2001). Fault detection and diagnosis in industrial systems. Springer, London (U.K.). Choi, S. W., E. B. Martin, A. J. Morris, and I. B. Lee (2006). Adaptive multivariate statistical process control for monitoring time-varying processes. Ind. Eng. Chem. Res., 45, 31083118.

References

145

Choi, S. W., and I. B. Lee (2005). Multiblock PLS-based localized process diagnosis. J. Process Control, 15, 295-306. Chong, I. G., and C. H. Jun (2005). Performance of some variable selection methods when multicollinearity is present. Chemom. Intell. Lab. Sys., 78, 103-112. Chu, Y. H., Y. H. Lee, and C. Han (2004). Improved quality estimation and knowledge extraction in a batch process by bootstrapping-based generalized variable selection. Ind. Eng. Chem. Res., 43, 2680-2690. Çinar, A., S. J. Parulekar, C. Ündey, and G. Birol (2003). Batch fermentation modeling, monitoring, and control. Marcel Dekker Inc., New York (U.S.A.). Clément, A., M. Dorais, and M. Verno (2008). Multivariate approach to the measurement of tomato maturity and gustatory attributes and their rapid assessment by vis-NIR spectroscopy. J. Agri. Food Chem., 56, 1538–1544. Conlin, A. K., E. B. Martin and A. J. Morris (2000). Confidence limits for contribution plots. J. Chemom., 14, 725-736. Costantoudis, V., G. P. Patsis, A. Tserepi, and E. Gogolides (2003). Quantification of lineedge roughness of photoresist. II. Scaling and fractal analysis and the best roughness descriptors. J. Vac. Sci. Technol. B, 21, 3, 1019-1026. Dayal, B. S., and J. F. MacGregor (1997a). Improved PLS algorithms. J. Chemom., 11, 73-85. Dayal, B. S., and J. F. MacGregor (1997b). Recursive exponentially weighted PLS and its applications to adaptive control and prediction. J. Process Control, 7, 169-179. de Jong, S.(1993). An alternative approach to partial least squares regression. Chemom. Intell. Lab. Sys., 18, 251-263. Doan, X. T. and R. Srinivasan (2008). Online monitoring of multi-phase batch processes using phase-based multivariate statistical process control. Computers Chem. Eng., 32, 230-243. Dokucu, M. T., and F. J. Doyle III (2008). Batch-to-batch control of characteristic points on the PSD in experimental emulsion polymerization. AIChE J., 54, 3171-3187. Dokken, K. M., and L. C. Davis (2007). 
Infrared imaging of sunflower and maize root anatomy. J. Agric. Food Chem., 55, 10517–10530. Donarski, J. A., S. A. Jones, and A. J. Charlton (2008). Application of cryoprobe 1H nuclear magnetic resonance spectroscopy and multivariate analysis for the verification of Corsican honey. J. Agric. Food Chem., 56, 5451–5456. Doymaz, F., A. Palazoglu, and J. A. Romagnoli (2003). Orthogonal nonlinear partial leastsquares regression. Ind. Eng. Chem. Res., 42, 5836-5849. Du, C. J., and D. W. Sun (2004). Recent developments in the applications of image processing techniques for food quality evaluation. Trends in Food Science & Technology, 15, 230-249.

146

References

Du, C. J., D. W. Sun (2008). Multi-classification of pizza using computer vision and support vector machine. J. Food Eng., 86, 234-242. Durante, C., M. Cocchi, M. Grandi, A. Marchetti, R. Bro (2006). Application of N-PLS to gaschromatographic and sensory data of traditional balsamic vinegars of Modena. Chemom. Intell. Lab. Sys., 83, 54-65. Eastment, H. T., and W. J. Krzanowski (1982). Cross-validatory of the number of components from a principal component analysis. Technometrics, 24, 73-77. Edgar, T. F., S. W. Butler, W. J. Campbell, C. Pfeiffer, C. Bode, S. B. Hwang, K. S. Balakrishnan and J. Hahn (2000). Automatic control in microelectronics manufacturing: practices, challenges, and possibilities. Automatica, 36, 1567-1603. Edgar, T. E (2004). Control and operations: when does controllability equal profitability. Computers Chem. Eng., 29, 41-49. El Chemali, C., J. Freudemberg, M. Hankinson and J. J. Bendik (2004). Run-to-run critical dimension and sidewall angle lithography control using the PROLITH simulator. IEEE Trans. Semiconductor Manuf., 17, 3, 388-401. ElMasry, G., N. Wang, C. Vigneault, J. Qiao, and A. ElSayed (2008). Early detection of apple bruises on different background colors using hyperspectral imaging. LWT, 41, 337-345. Eriksson, L., E. Johansson, N. Kettaneh-Wold and S. Wold (2001). Multi- and megavariate data analysis principles and applications. Umetrics Academy, Umeå (Sweden). Facco, P., (2005). Monitoring a semi-continuous polymerization process using multivariate statistical methods (in Italian). Tesi di Laurea, DIPIC, Università di Padova (Italy). Facco, P., M. Olivi, C. Rebuscini, F. Bezzo and M. Barolo (2007). Multivariate Statistical Estimation of Product Quality in the Industrial Batch Production of a Resin. In: Proc. DYCOPS 2007 – 8th IFAC Symposium on Dynamics and Control of Process Systems, (B. Foss and J. Alvarez, Eds.), Cancun (Mexico), June 6-8, vol. 2, 93-98. Facco, P., A. Faggian, F. Doplicher, F. Bezzo and M. Barolo (2008a). 
Virtual sensors can reduce lab analysis requirements in the industrial production of specialty chemicals. In: Proc. EMCC5 – 5th Chemical Engineering Conference for Collaborative Research in Eastern Mediterranean Countries, Cetraro (Italy), May 24-29, 178-181. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008b). “Monitoraggio multivariato e multiscala di processi di fotolitografia per la produzione di semiconduttori”. In: Proc. Congresso Gr.I.C.U 2008: Ingegneria Chimica, le nuove sfide, LaCastella (KR), September 14-17, 1383-1388. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008c). “Using digital images for fast, reliable, and automatic characterization of surface quality: a case study on the manufacturing of semiconductors”. In: Workshop on nanomaterials production, characterization and industrial applications, December 3, Milano (Italia).

References

147

Facco, P., F. Doplicher, F. Bezzo and M. Barolo (2009a). “Moving-average PLS soft sensor for online product quality estimation in an industrial batch polymerization process”. J. Process Control, in press. doi:10.1016/j.jprocont.2008.05.002 Facco, P., R. Mukherjee, F. Bezzo, M. Barolo and J. A. Romagnoli (2009b). “Monitoring Roughness and edge shape on semiconductors through multiresolution and multivariate image analysis”. AIChE J., in press. Faggian, A., P. Facco, F. Bezzo and M. Barolo (2009). “Multivariate statistical real-time monitoring of an industrial fed-batch process for the production of specialty chemical”. Chem. Eng. Res. Des., in press. doi:10.1016/j.cherd.2008.08.019 Flores-Cerrillo, J., and J. F. MacGregor (2002). Control of particle size distributions in emulsion semibatch polymerization using mid-course correction policies. Ind. Eng. Chem. Res., 41, 1805-1814. Flores-Cerrillo, J., and J. F. MacGregor (2003). Within-batch and batch-to-batch inferentialadaptive control of semibatch reactors: a partial least squares approach. Ind. Eng. Chem. Res., 42, 3334-3345. Flores-Cerrillo, J., and John F. MacGregor (2004). Control of batch product quality by trajectory manipulation using latent variable models. J. Process Control, 14, 539–553. Fransson, M., and S. Folestad (2006). Real-time alignment of batch process data using COW for on-line process monitoring. Chemom. Intell. Lab. Sys., 84, 56-61. García-Muñoz, S., T. Kourti, J. F. MacGregor, A. G. Mateos and G. Murphy (2003). Troubleshooting of an industrial batch process using multivariate methods. Ind. Eng. Chem. Res., 42, 3592-3601. Garcia-Muñoz, S., T. Kourti, and J. F. MacGregor (2004). Model Predictive Monitoring for Batch Processes. Ind. Eng. Chem. Res., 43, 5929-5941. Geladi, P., (1995). Sampling and local models for multivariate image analysis. Mikrochim. Acta, 120, 211-230. Geladi, P. and H. Grahn (1996). Multivariate Image Analysis, John Wiley & Sons, Inc., New York (U.S.A.). Geladi, P. and R. 
Kowalski (1986). Partial least squares regression: a tutorial. Anal. Chim. Acta, 185, 1-17 Giordani, D. S., A. F. Siqueira, M. L. C. P. Silva, P. C. Oliveira, and H. F. de Castro (2008). Identification of the biodiesel source using an electronic nose. Energy & Fuels, 22, 2743–2747. Giri, S., J. R. Idle, C. Chen, T. M. Zabriskie, K. W. Krausz, and F. J. Gonzalez (2006). A Metabolomic Approach to the metabolism of the areca nut alkaloids arecoline and arecaidine in the mouse. Chem. Res. Toxicol., 19, 818-827.

148

References

Guldi, R. L. (2004). Inline defect reduction from a historical perspective and its implication for future integrated circuits manufacturing. IEEE Trans. Semiconductor Manuf., 17, 4, 629-639. Gunther, J. C., J. S. Conner, and D. E. Seborg (2009). Process monitoring and quality variable prediction utilizing PLS in industrial fed-batch cell culture. J. Process Control, in press. doi:10.1016/j.jprocont.2008.11.007

Härdle, W., and L. Simar (2007). Applied multivariate statistical analysis (2nd ed.). Springer, New Jork (USA). Hare, L. (2003). SPC: from chaos to wiping the floor. Quality progress. Available at: http://www.asq.org/pub/qualityprogress/past/0703/58spc0703.html

[accessed on October 1st, 2008]. Harrison, L., P. Dastidar, H. Eskola, R. Järvenpää, H. Pertovaara, T. LuukkaalaP. L. Kellokumpu-Lehtinen, S. Soimakallio (2008). Texture analysis on MRI images of nonHodgkin lymphoma. Computers in Biology and Medicine, 38, 519-524. Helbert, J. N., and T. Daou (2001). Resist technology – Design, processing and applications. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.)/ William Andrew Publishing, LLC, Norwich, New York (U.S.A.). Höskuldsson, A., (1988). PLS regression methods. J. Chemom., 2, 211-228. Höskuldsson, A., (2001). Variable and subset selection in PLS regression. Chemom. Intell. Lab. Sys., 55, 23-38. Hwang, D. H., and C. Han (1999). Real-time monitoring for a process with multiple operating modes. Control Eng. Practice, 7, 891-902. Jackson, J. E. (1991). A user’s guide to principal components. John Wiley & Sons Inc., New York (U.S.A.). Jackson, J. E., and G. S. Mudholkar (1979). Control procedures for residuals associated with principal component analysis. Technometrics, 21, 341-349. Johnson, R. A., and D. W. Wichern (2007). Applied multivariate statistical analysis (6th ed.). Pearson International Edition, Upper Saddle River (USA). Kaistha, N., and C. F. Moore (2001). Extraction of event times in batch profiles for time synchronization and quality predictions. Ind. Eng. Chem. Res., 40, 252-260. Kamohara, H., A. Takinami, M. Takeda, M. Kano, S. Hasebe and I. Hasimoto (2004). Product quality estimation and operating condition monitoring for industrial ethylene fractionator. J. Chem. Eng. Japan, 37, 422-428. Kano, M., and Y. Nakagawa (2008). Data-based process monitoring, process control and quality improvement: recent developments and applications in steel industry. Computers Chem. Eng., 32, 12-24.

References


Kano, M., N. Showchaiya, S. Hasebe, and I. Hashimoto (2003). Inferential control of distillation composition: selection of model and control configuration. Control Eng. Practice, 11, 927-933. Kassidas, A., J. F. MacGregor, and P. A. Taylor (1999). Synchronization of batch trajectories using dynamic time warping. AIChE J., 44, 864-876. Khan, A. A., J. R. Moyne, and D. M. Tilbury (2008). Virtual metrology and feedback control for semiconductor manufacturing process using recursive partial least squares. J. Process Control, 18, 961-974. Kim, C., and C. H. Choi (2007). Image covariance-based subspace method for face recognition. Pattern Recognition, 40, 1592-1604. Kim, M., Y. H. Lee, I. S. Han, and C. Han (2005). Clustering-based hybrid soft sensor for industrial polypropylene process with grade changeover operation. Ind. Eng. Chem. Res., 44, 334-342. Kirdar, A. O., K. D. Green, and A. S. Rathore (2008). Application of multivariate data analysis for identification and successful resolution of a root cause for a bioprocessing application. Biotechnol. Prog., 24, 720-726. Knight, S., R. Dixon, R. L. Jones, E. K. Lin, N. G. Orji, R. Silver, J. S. Villarrubia, A. E. Vladár and W. Wu (2006). Advanced metrology needs for nanoelectronics lithography. C. R. Physique, 7, 931-941. Komulainen, T., M. Sourander, S. L. Jämsä-Jounela (2004). An online application of dynamic PLS to a dearomatization process. Computers Chem. Eng., 28, 2611-2619. Kosanovich, K. A., M. J. Piovoso (1997). PCA of wavelet transformed process data for monitoring. Intelligent Data Analysis, 1, 85-99. Kourti, T. (2003). Multivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom., 17, 93-109. Kourti, T. (2005). Application of latent variable methods to process control and multivariate statistical process control in industry. Int. J. Adapt. Control Signal Process., 19, 213-246. Kourti, T. and J. F. MacGregor (1995). 
Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemom. Intell. Lab. Sys., 28, 3-21. Kresta, J. V., J. F. MacGregor and T. E. Marlin (1991). Multivariate statistical monitoring of process operating performance. Canadian J. Chem. Eng., 69, 35-47. Kresta, J. V., T. E. Marlin, and J. F. MacGregor (1994). Development of inferential process models using PLS. Computers Chem. Eng., 18, 597-611. Kruger, U., Y. Zhou and G. W. Irwin (2004). Improved principal component monitoring of large scale processes. J. Process Control, 14, 879-888. Ku, W., R. H. Storer and C. Georgakis (1995). Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Sys., 30, 179-196.


Lee, F. (2001). Lithography process monitoring and defect detection. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.). Lee, J. H., and A. W. Dorsey (2004). Monitoring of batch processes through state-space models. AIChE J., 50, 1198-1210. Lee, J. H., A. W. Dorsey, and S. Russell (2004). Inferential product quality control of a multistage batch plant. AIChE J., 50, 136-148. Lee, Y. H., M. Kim, Y. H. Chu, and C. Han (2005a). Adaptive multivariate regression modeling based on model performance assessment. Chemom. Intell. Lab. Sys., 78, 63-73. Lee, D. S., J. M. Park, and P. A. Vanrolleghem (2005b). Adaptive multiscale principal component analysis for on-line monitoring of a sequencing batch reactor. J. Biotech., 116, 195-210. Lee, J. Y., S. S. Shah, C. C. Zimmer, G. Liu, and A. Revzin (2008a). Use of photolithography to encode cell adhesive domains into protein microarrays. Langmuir, 24, 2232-2239. Lee, A. C., K. Shedden, G. R. Rosania, and G. M. Crippen (2008b). Data mining the NCI60 to predict generalized cytotoxicity. J. Chem. Inf. Model., 48, 1379-1388. Lennox, B., G. A. Montague, H. G. Hiden, G. Kornfeld, P. R. Goulding (2001). Process monitoring of an industrial fed-batch fermentation. Biotechnology and Bioengineering, 74, 125-135. Li, B., J. Morris, E. B. Martin (2002). Model selection for partial least squares regression. Chemom. Intell. Lab. Sys., 64, 79-89. Li, W. and S. J. Qin (2001). Consistent dynamic PCA based on errors-in-variables subspace identification. J. Process Control, 11, 661-678. Lieftucht, D., U. Kruger, L. Xie, T. Littler, Q. Chen, and S. Q. Wang (2006). Statistical monitoring of dynamic multivariate processes – Part 2. Identifying fault magnitude and signature. Ind. Eng. Chem. Res., 45, 1677-1688. Lin, B., B. Recke, J. K. H. Knudsen, and S. B. Jørgensen (2007). A systematic approach for soft sensor development. Computers Chem. Eng., 31, 419-425. 
Lindgren, F., and S. Rännar (1998). Alternative partial least-squares (PLS) algorithms. Perspectives in Drug Discovery and Design, 12/13/14, 105-113. Liu, J. J., and J. F. MacGregor (2005). Modeling and optimization of product appearance: application to injection-molded plastic panels. Ind. Eng. Chem. Res., 44, 4687-4696. Liu, J. J., M. H. Bharati, K. G. Dunn, and J. F. MacGregor (2005). Automatic masking in multivariate image analysis using support vector machines. Chemom. Intell. Lab. Sys., 79, 42-54. Liu, J. J., D. Kim, and C. Han (2007a). Use of wavelet packet transform in characterization of surface quality. Ind. Eng. Chem. Res., 46, 5152-5158.


Liu, J. J., J. F. MacGregor, C. Duchesne, and G. Bartolacci (2007b). Flotation froth monitoring using multiresolutional multivariate image analysis. Minerals Engineering, 18, 65-76. Liu, J., Q. Li, J. Dong, J. Chen, and G. Gu (2008). Multivariate modeling of aging in bottled lager beer by principal component analysis and multiple regression methods. J. Agric. Food Chem., 56, 7106-7112. Ljung, L. (1999). System identification. Theory for the user (2nd ed.). Prentice Hall, Upper Saddle River (USA). Louwerse, D. J., and A. K. Smilde (2000). Multivariate statistical process control of batch processes based on three-way models. Chem. Eng. Sci., 55, 1225-1235. Lu, N., F. Gao and F. Wang (2004a). Sub-PCA modeling and on-line monitoring strategy for batch processes. AIChE J., 50, 255-259. Lu, N. and F. Gao (2005a). Stage-based process analysis and quality prediction for batch processes. Ind. Eng. Chem. Res., 44, 3547-3555. Lu, N. and F. Gao (2006). Stage-based online quality control for batch processes. Ind. Eng. Chem. Res., 45, 2272-2280. Lu, N., Y. Yang, F. Gao and F. Wang (2004b). Multirate dynamic inferential modeling for multivariable processes. Chem. Eng. Sci., 59, 855-864. Lu, N., Y. Yang, F. Gao and F. Wang (2004c). PCA-based modeling and on-line monitoring strategy for uneven length batch process. Ind. Eng. Chem. Res., 43, 3343-3352. Lu, N., Y. Yao, F. Gao, and F. Wang (2005b). Two-dimensional dynamic PCA for batch process monitoring. AIChE J., 51, 3300-3304. MacGregor, J. F., T. E. Marlin, J. Kresta and B. Skagerberg (1991). Multivariate statistical methods in process analysis and control. In: Chemical process control (Y. Arkun and W. H. Ray, Eds.), CACHE Austin, AIChE New York (U.S.A.). Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674-693. Marjanovic, O., B. Lennox, D. Sandoz, K. Smith and M. Crofts (2006). 
Real-time monitoring of an industrial batch process. Computers Chem. Eng., 30, 1476-1481. Marín, S., M. Vinaixa, J. Brezmes, E. Llobet, X. Vilanova, X. Correig, A. J. Ramos, V. Sanchis (2007). Use of a MS-electronic nose for prediction of early fungal spoilage of bakery products. International Journal of Food Microbiology, 114, 10-16. Maulud, A., D. Wang, and J. A. Romagnoli (2006). A multi-scale orthogonal nonlinear strategy for multi-variate statistical process monitoring. J. Process Control, 16, 671-683. Misra, M., H. H. Yue, S. J. Qin, and C. Ling (2002). Multivariate process monitoring and fault diagnosis by multi-scale PCA. Computers Chem. Eng., 26, 1281-1293.


Mosteller, F., and D. L. Wallace (1963). Inference in an authorship problem. J. Amer. Statist. Assoc., 58, 275-309. Montgomery, D. C. (2005). Introduction to statistical quality control (5th ed.). John Wiley & Sons Inc., Danvers (USA). Montgomery, D. C., and G. C. Runger (2003). Applied statistics and probability for engineers (3rd ed.). John Wiley & Sons Inc., Danvers (USA). Neogi, D., and C. E. Schlags (1998). Multivariate statistical analysis of an emulsion batch process. Ind. Eng. Chem. Res., 37, 3971-3979. Nomikos, P. (1996). Detection and diagnosis of abnormal batch operations based on multiway principal component analysis. ISA Trans., 35, 259-266. Nomikos, P., and J. F. MacGregor (1994). Monitoring batch processes using multiway principal component analysis. AIChE J., 40, 1361-1375. Nomikos, P., and J. F. MacGregor (1995a). Multivariate SPC charts for monitoring batch processes. Technometrics, 37, 41-59. Nomikos, P., and J. F. MacGregor (1995b). Multi-way partial least squares in monitoring batch processes. Chemom. Intell. Lab. Sys., 30, 97-108. Patsis, G. P., V. Constantoudis, A. Tserepi, E. Gogolides and G. Grozev (2003). Quantification of line-edge roughness of photoresists. I. A comparison between off-line and on-line analysis of top-down scanning electron microscopy images. J. Vac. Sci. Technol. B, 21, 1008-1018. Qiao, J., M. O. Ngadi, N. Wang, C. Gariépy, S. O. Prasher (2007). Pork quality and marbling level assessment using a hyperspectral imaging system. J. Food Eng., 83, 10-16. Qin, S. J. (1998). Recursive PLS algorithms for adaptive data modeling. Computers Chem. Eng., 22, 503-514. Qin, S. J., and R. Dunia (2000). Determining the number of principal components for best reconstruction. J. Process Control, 10, 245-250. Quevedo, R., L. G. Carlos, J. M. Aguilera, and L. Cadoche (2002). Description of food surfaces and microstructural changes using fractal image texture analysis. J. Food Eng., 53, 361-371. 
Ramaker, H. J., E. N. M. van Sprang, S. P. Gurden, J. A. Westerhuis, and A. K. Smilde (2002). Improved monitoring of batch processes by incorporating external information. J. Process Control, 12, 569-576. Ramaker, H. J., E. N. M. van Sprang, J. A. Westerhuis and A. K. Smilde (2005). Fault detection properties of global, local and time evolving models for batch process monitoring. J. Process Control, 15, 799-805. Rännar, S., J. F. MacGregor and S. Wold (1998). Adaptive batch monitoring using hierarchical PCA. Chemom. Intell. Lab. Sys., 41, 73-81.


Rao, A. R. (1996). Future directions in industrial machine vision: a case study of semiconductor manufacturing applications. Image and Vision Computing, 14, 3-19. Reis, M. S., P. M. Saraiva, and B. R. Bakshi (2008). Multiscale statistical process control using wavelet packets. AIChE J., 54, 2366-2378. Romagnoli, J. A., and A. Palazoglu (2006). Introduction to process control. Taylor & Francis, Boca Raton (FL, U.S.A.). Russell, S. A., P. Kesavan, J. H. Lee and B. A. Ogunnaike (1998). Recursive data-based prediction and control of batch product quality. AIChE J., 44, 2442-2458. Ruttimann, U. E., M. Unser, R. R. Rawlings, D. Rio, N. F. Ramsey, V. S. Mattay, D. W. Hommer, J. A. Frank and D. R. Weinberger (1998). Statistical analysis of functional MRI data in the wavelet domain. IEEE Trans. Med. Imag., 17, 142-154. Salari, E., and Z. Ling (1995). Texture segmentation using hierarchical wavelet decomposition. Pattern Rec., 28, 1818-1824. Schievano, E., G. Pasini, G. Cozzi, and S. Mammi (2008). Identification of the production chain of Asiago d’Allevo cheese by nuclear magnetic resonance spectroscopy and principal component analysis. J. Agric. Food Chem., 56, 7208-7214. Seborg, D. E., T. F. Edgar, and D. A. Mellichamp (2004). Process dynamics and control (2nd ed.). John Wiley & Sons Inc., New York (U.S.A.). Sharmin, R., U. Sundararaj, S. Shah, L. V. Griend and Y. J. Sun (2006). Inferential sensor for estimation of polymer quality parameter: industrial application of a PLS-based soft sensor for a LDPE plant. Chem. Eng. Sci., 61, 6372-6384. Shewhart, W. A. (1931). Statistical method from the viewpoint of quality control. The Graduate School of Agriculture, Washington DC (USA). [Reprinted in 1986 by Dover Publishing, Toronto (Canada)]. Shao, R., F. Jia, E. B. Martin, and A. J. Morris (1999). Wavelets and non-linear principal components analysis for process monitoring. Control Eng. Practice, 7, 865-879. Shi, R., and J. F. MacGregor (2000). 
Modeling of dynamic systems using latent variable and subspace methods. J. Chemom., 14, 423-439. Škrbić, B., and A. Onjia (2007). Multivariate analyses of microelement contents in wheat cultivated in Serbia (2002). Food Control, 18, 338-345. Srinivasan, R., and M. Qian (2005). Off-line temporal signal comparison using singular points augmented time warping. Ind. Eng. Chem. Res., 44, 4697-4716. Srinivasan, R., and M. Qian (2007). Online temporal signal comparison using singular points augmented time warping. Ind. Eng. Chem. Res., 46, 4531-4548. Szatvanyi, G., C. Duchesne, and G. Bartolacci (2006). Multivariate image analysis of flames for product quality and combustion control in rotary kilns. Ind. Eng. Chem. Res., 45, 4706-4715.


Tan, Y., L. Shi, W. Tong and C. Wang (2005). Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data. Nucleic Acids Research, 33, 56-65. Teppola, P., and P. Minkkinen (2000). Wavelet–PLS regression models for both exploratory data analysis and process monitoring. J. Chemom., 14, 383-399. Tessier, J., C. Duchesne, and G. Bartolacci (2007). A machine vision approach to on-line estimation of run of mine ore composition on conveyor belts. Minerals Eng., 20, 1129-1144. Tessier, J., C. Duchesne, C. Gauthier, and G. Dufour (2006). Estimation of alumina content of anode cover materials using multivariate image analysis techniques. Chem. Eng. Sci., 63, 1370-1380. Treasure, R. J., U. Kruger, and J. E. Cooper (2004). Dynamic multivariate statistical process control using subspace identification. J. Process Control, 14, 279-292. Trendafilova, I., M. P. Cartmell, and W. Ostachowicz (2008). Vibration-based damage detection in an aircraft wing scaled model using principal component analysis and pattern recognition. Journal of Sound and Vibration, 313, 560-566. Übeyli, E. D. (2007). Implementing automated diagnostic systems for breast cancer detection. Expert Systems with Applications, 33, 1054-1062. Ündey, C., and A. Çinar (2002). Statistical process monitoring of multistage, multiphase batch processes. IEEE Control Systems Magazine, 22, 40-52. Ündey, C., B. A. Williams, and A. Çinar (2002). Monitoring batch pharmaceutical fermentations: data synchronization, landmark alignment and real-time monitoring. 15th IFAC World Congress, July 22-26, Barcelona (Spain). Ündey, C., S. Ertunç, and A. Çinar (2003a). Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Ind. Eng. Chem. Res., 42, 4645-4658. Ündey, C., E. Tatara, and A. Çınar (2003b). 
Real-time batch process supervision by integrated knowledge-based systems and multivariate statistical methods. Engineering Applications of Artificial Intelligence, 16, 555-566. Ündey, C., E. Tatara, and A. Çınar (2004). Intelligent real-time performance monitoring and quality prediction for batch/fed-batch cultivations. J. Biotech., 108, 61-77. Valle, S., W. Li, and S. J. Qin (1999). Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods. Ind. Eng. Chem. Res., 38, 4389-4401. Viggiani, L., and M. A. Castiglione Morelli (2008). Characterization of wines by nuclear magnetic resonance: a work study on wines from the Basilicata region in Italy. J. Agric. Food Chem., 56, 8273-8279.


Viñas, R., A. Eff-Darwich, V. Soler, M. C. Martín-Luis, M. L. Quesada, J. de la Nuez (2007). Processing of radon time series in underground environments: implications for volcanic surveillance in the island of Tenerife, Canary Islands, Spain. Radiation Measurements, 42, 101-115. Xie, L., U. Kruger, D. Lieftucht, T. Littler, Q. Chen, and S. Q. Wang (2006). Statistical monitoring of dynamic multivariate processes – Part 1. Modelling autocorrelation and crosscorrelation. Ind. Eng. Chem. Res., 45, 1659-1676. Xu, L., J. H. Jiang, H. L. Wu, G. L. Shen, R. Q. Yu (2007). Variable-weighted PLS. Chemom. Intell. Lab. Sys., 85, 140-143. Yaakobovitz, B., Y. Cohen and Y. Tsur (2007). Line edge roughness detection using deep UV light scatterometry. Microelectronic Eng., 84, 619-625. Yabuki, Y., T. Nagasawa, and J. F. MacGregor (2002). Industrial experiences with product quality control in semi-batch processes. Computers Chem. Eng., 26, 205-212. Yabuki, Y., and J. F. MacGregor (1997). Product quality control in semibatch reactors using midcourse correction policies. Ind. Eng. Chem. Res., 36, 1268-1275. Yao, Y., and F. Gao (2009). Phase and transition based batch process modeling and online monitoring. J. Process Control, in press. doi:10.1016/j.jprocont.2008.11.001 Yoon, S., and J. F. MacGregor (2004). Principal-component analysis of multiscale data for process monitoring and fault diagnosis. AIChE J., 50, 2891-2903. Yu, H., and J. F. MacGregor (2003). Multivariate image analysis and regression for prediction of coating content and distribution in the production of snack foods. Chemom. Intell. Lab. Sys., 67, 125-144. Yu, H., J. F. MacGregor, G. Haarsma and W. Bourg (2003). Digital imaging for online monitoring and control of industrial snack food processes. Ind. Eng. Chem. Res., 42, 3036-3044. Yue, H. H., S. J. Qin, R. J. Markle, C. Nauert and M. Gatto (2000). Fault detection of plasma etcher using optical emission spectra. IEEE Trans. Semicond. Manufact., 13, 374-385. 
Waits, C. M., B. Morgan, M. Kastantin and R. Ghodssi (2005). Microfabrication of 3D silicon MEMS structures using grey-scale lithography and deep reactive ion etching. Sensors and Actuators A, 119, 245-253. Waldo, W. (2001). Techniques and tools for optical lithography. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.). Wang, X., U. Kruger, and B. Lennox (2003). Recursive partial least squares algorithms for monitoring complex industrial processes. Control Eng. Practice, 11, 613-632. Warne, K., G. Prasad, S. Rezvani, and L. Maguire (2004). Statistical and computational intelligence techniques for inferential model development: a comparative evaluation and


a novel proposition for fusion. Engineering Applications of Artificial Intelligence, 17, 871-885. Westerhuis, J. A., S. P. Gurden, and A. K. Smilde (2000). Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Sys., 51, 95-114. Westerhuis, J. A., T. Kourti, and J. F. MacGregor (1998). Analysis of multiblock and hierarchical PCA and PLS models. J. Chemom., 12, 301-321. Westerhuis, J. A., T. Kourti, and J. F. MacGregor (1999). Comparing alternative approaches for multivariate statistical analysis of batch process data. J. Chemom., 13, 397-413. Whelehan, O. P., M. E. Earll, E. Johansson, M. Toft, L. Eriksson (2006). Detection of ovarian cancer using chemometric analysis of proteomic profiles. Chemom. Intell. Lab. Sys., 84, 82-87. Wise, B. M., and N. B. Gallagher (1996). The process chemometrics approach to process monitoring and fault detection. J. Process Control, 6, 329-348. Wold, S. (1978). Cross-validatory estimation of number of components in factor and principal components models. Technometrics, 20, 397-405. Wold, S. (1994). Exponentially weighted moving principal components analysis and projection on latent structures. Chemom. Intell. Lab. Sys., 23, 149-161. Wold, S., P. Geladi, K. Esbensen, J. Öhman (1987). Multi-way principal components and PLS-analysis. J. Chemom., 1, 47-56. Wold, S., N. Kettaneh, H. Fridén and A. Holmberg (1998). Modeling and diagnostics of batch processes and analogous kinetics experiments. Chemom. Intell. Lab. Sys., 44, 331-340. Wold, S., N. Kettaneh-Wold, and B. Skagerberg (1989). Non-linear PLS modelling. Chemom. Intell. Lab. Sys., 7, 53-65. Wold, S., N. Kettaneh, and K. Tjessem (1996). Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection. J. Chemom., 10, 463-482. Wold, S., M. Sjöström, and L. Eriksson (2001). PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Sys., 58, 109-130. Zamprogna, E., M. Barolo and D. E. 
Seborg (2004). Estimating product composition profiles in batch distillation via partial least squares regression. Control Eng. Practice, 12, 917-929. Zhai, H. L., X. G. Chen, and Z. D. Hu (2006). A new approach for the identification of important variables. Chemom. Intell. Lab. Sys., 80, 130-135. Zhang, Y., and M. Dudzic (2006). Industrial application of multivariate SPC to continuous caster start-up operations for breakout prevention. Control Eng. Practice, 14, 1357-1375. Zhang, H., and B. Lennox (2004). Integrated condition monitoring and control of fed-batch fermentation processes. J. Process Control, 14, 41-50.


Zhang, Q., K. Poolla, and C. J. Spanos (2007). Across wafer critical dimension uniformity enhancement through lithography and etch process sequence: concept, approach, modeling, and experiment. IEEE Trans. Semicond. Manufact., 20, 488-505. Zhang, Q., K. Poolla, and C. J. Spanos (2008). One step forward from run-to-run critical dimension control: across-wafer level critical dimension control through lithography and etch process. J. Process Control, 18, 937-945. Zhao, C., F. Wang, F. Gao, N. Lu, and M. Jia (2007a). Adaptive monitoring method for batch processes based on phase dissimilarity updating with limited modeling data. Ind. Eng. Chem. Res., 46, 4943-4953. Zhao, C., F. Wang, N. Lu, and M. Jia (2007b). Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes. J. Process Control, 17, 728-741. Zhao, C., F. Wang, Z. Mao, N. Lu, and M. Jia (2008a). Improved batch process monitoring and quality prediction based on multiphase statistical analysis. Ind. Eng. Chem. Res., 47, 835-849. Zhao, C., F. Wang, Z. Mao, N. Lu, and M. Jia (2008b). Quality prediction based on phase-specific average trajectory for batch processes. AIChE J., 54, 693-705. Zhao, S. J., J. Zhang and Y. M. Xu (2006a). Performance monitoring of process with multiple operating modes through multiple PLS models. J. Process Control, 16, 763-772. Zhao, S. J., J. Zhang, Y. M. Xu, and Z. H. Xiong (2006b). Nonlinear projection to latent structures method and its applications. Ind. Eng. Chem. Res., 45, 3843-3852.
Web sites
http://www.asq.org/index.html/ [accessed on October 1st, 2008]
http://www.eigenvectorresearch.com/ [accessed on January 5th, 2009]
http://www.mathworks.com/ [accessed on January 5th, 2009]
http://www.progea.com/ [accessed on January 5th, 2009]
http://www.sirca.com/ [accessed on December 31st, 2008]

Acknowledgements The author would like to express his gratitude to the people and the institutions whose technical and intellectual support has been involved in this project: Prof. Massimiliano Barolo and Dr. Fabrizio Bezzo of the Department of Chemical Engineering Principles and Practice (DIPIC - Università di Padova, Italia), and Prof. J. A. Romagnoli of the “Gordon A. and Mary Cain” Department of Chemical Engineering (Louisiana State University, Baton Rouge, Louisiana, USA), for their invaluable guidance and their essential advice. Furthermore, the author would like to gratefully acknowledge SIRCA S.p.A. for the financial support and Fondazione “Ing. Aldo Gini” for the scholarship. … and now the personal thanks! Thank you, Mum and Dad: you are a true blessing. I will never be able to thank you enough, and my rough manners certainly do not repay you as you deserve. Your helpfulness and devotion to me are proverbial. I really believe that one day there will be a saying: “as helpful as Luciana and Adriano”. The world is a warm embrace because you are in it! Claudia, you are a star I stole from the sky! You and your family have given me incredible enthusiasm and zest for life. If only I were able to, I would place all the happiness in the world in your sweet hands, I would grant your life every grace, I would fulfil your every wish. What I can do is give you all my Love. Forever! Thanks to my true friends, because an evening at dinner with them is always a priceless gift. Thanks to Max and Fabrizio, not only for the opportunities for growth they gave me, for what they taught me, and for how much they helped me, but also for the sandwiches shared at lunchtime, the CAPE-luganega, the great good humour, the humanity, the punctuality, … Thanks also to Federico: who would have expected to discover a new friend in the half hours spent chatting about the variance of the covariance matrix or the image analysis of mushrooms? Priceless.
And thanks to all my office mates, for the words exchanged in jest that perhaps lightened our day. And a thank-you to “my” undergraduate students, who are always friendly and brilliant.


Foreword The realization of this work has involved the intellectual and financial support of many people and institutions, to whom the author is most grateful. Most of the research activity that led to the results summarized in this Thesis has been carried out at DIPIC, the Department of Chemical Engineering Principles and Practice of the University of Padova, under the supervision of Prof. Massimiliano Barolo and Dr. Fabrizio Bezzo. Part of the work has been conducted under the supervision of Prof. José A. Romagnoli, at the “Gordon A. and Mary Cain” Department of Chemical Engineering of the Louisiana State University, Baton Rouge (LA, U.S.A.). The realization of this study has also been made possible through the financial support of SIRCA S.p.A. (Massanzago, Padova, Italy; www.sirca.it) and the scholarship of the “Fondazione Ing. Aldo Gini” (Padova, Italy). All the material reported in this Thesis is original, unless explicit references to the authors are provided. In the following, a list of the publications that stemmed from this project is reported. PUBLICATIONS IN INTERNATIONAL JOURNALS Facco, P., F. Doplicher, F. Bezzo and M. Barolo (2009). Moving-average PLS soft sensor for online product quality estimation in an industrial batch polymerization process. J. Process Control, in press. doi:10.1016/j.jprocont.2008.05.002 Faggian, A., P. Facco, F. Doplicher, F. Bezzo and M. Barolo (2009). Multivariate statistical real-time monitoring of an industrial fed-batch process for the production of specialty chemicals. Chem. Eng. Res. Des., in press. doi:10.1016/j.cherd.2008.08.019 Facco, P., R. Mukherjee, F. Bezzo, M. Barolo and J. A. Romagnoli (2009). Monitoring roughness and edge shape on semiconductors through multiresolution and multivariate image analysis. AIChE J., in press. PUBLICATION IN CONFERENCE PROCEEDINGS Facco, P., M. Olivi, C. Rebuscini, F. Bezzo and M. 
Barolo (2007) Multivariate Statistical Estimation of Product Quality in the Industrial Batch Production of a Resin Proc. DYCOPS 2007 – 8th IFAC Symposium on Dynamics and Control of Process Systems (B. Foss and J. Alvarez, Eds.), Cancun (Mexico), June 6-8, vol. 2, 93-98. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008) Using digital images for fast, reliable, and automatic characterization of surface quality: a case study on the manufacturing of semiconductors In: Workshop on nanomaterials production, characterization and industrial applications, December 3, Milan (Italy). Facco, P., A. Faggian, F. Doplicher, F. Bezzo and M. Barolo (2008) Virtual sensors can reduce lab analysis requirements in the industrial production of specialty chemicals In: Proc. EMCC5 – 5th Chemical Engineering Conference for Collaborative Research in Eastern Mediterranean Countries, Cetraro (CS, Italy), May 24-29, 178-181. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008). Monitoraggio multivariato e multiscala di processi di fotolitografia per la produzione di semiconduttori In: Proc. Congresso Gr.I.C.U 2008: Ingegneria Chimica, le nuove sfide, Le Castella (KR, Italy), September 14-17, 1383-1388. PUBLICATION IN TECHNICAL JOURNALS Barolo, M., F. Bezzo and P. Facco (2008) Sensori virtuali per monitorare la qualità di prodotti e processi ICP, 36 (4), 82-84.

Abstract Although batch processes are “simple” in terms of equipment and operation design, it is often difficult to ensure consistently high product quality. The aim of this PhD project is the development of multivariate statistical methodologies for the real-time monitoring of quality in batch processes for the production of high value added products. Two classes of products are considered: those whose quality is determined by chemical/physical characteristics, and those where surface properties define quality. In particular, the challenges related to the instantaneous estimation of product quality and the real-time prediction of the time required to manufacture a product in batch processes are addressed using multivariate statistical techniques. Furthermore, novel techniques are proposed to characterize the surface quality of a product using multiresolution and multivariate image analysis. For the first class of products, multivariate statistical soft sensors are proposed for the real-time estimation of the product quality and for the online prediction of the length of batch processes. It is shown that, for the purpose of real-time quality estimation, the complex series of operating steps of a batch can be simplified to a sequence of estimation phases in which linear PLS models can be applied to regress the quality from the process data available online. The resulting estimation accuracy is satisfactory, but can be substantially improved if dynamic information is included in the models. Dynamic information is provided either by augmenting the process data matrix with lagged measurements, or by averaging the process measurement values over a moving window of fixed length. The process data progressively collected from the plant can also be exploited by designing time-evolving PLS models to predict the batch length. 
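The idea of adding dynamic information to a PLS soft sensor by augmenting the process data matrix with lagged measurements can be sketched as follows. This is a minimal illustration on synthetic data, not the industrial implementation described in the Thesis: the NIPALS PLS1 routine and all function names (`lag_augment`, `pls1_fit`, `pls1_predict`) are illustrative.

```python
import numpy as np

def lag_augment(X, lags):
    """Augment a process-data matrix with lagged copies of every variable,
    one way of injecting process dynamics into a PLS soft sensor."""
    rows = [np.hstack([X[k - j] for j in range(lags + 1)])
            for k in range(lags, X.shape[0])]
    return np.asarray(rows)

def pls1_fit(X, y, n_lv):
    """Minimal NIPALS PLS1: returns the regression vector and the means
    used for centering."""
    X = np.array(X, dtype=float)
    y = np.array(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    X -= x_mean
    y = y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = X.T @ y                      # weight: covariance direction
        if np.linalg.norm(w) < 1e-12:    # y residual exhausted: stop early
            break
        w /= np.linalg.norm(w)
        t = X @ w                        # score vector
        tt = t @ t
        p = X.T @ t / tt                 # X loading
        q = (y @ t) / tt                 # y loading
        X -= np.outer(t, p)              # deflate X and y
        y = y - t * q
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    b = W @ np.linalg.solve(P.T @ W, Q)  # PLS regression vector
    return b, x_mean, y_mean

def pls1_predict(X_new, b, x_mean, y_mean):
    return (X_new - x_mean) @ b + y_mean

# --- illustrative use on synthetic data -------------------------------
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))           # three "process variables" in time
Xa = lag_augment(X, lags=2)            # current values plus two lags
y = Xa @ np.arange(1.0, 10.0)          # synthetic quality variable
b, xm, ym = pls1_fit(Xa, y, n_lv=9)
y_hat = pls1_predict(Xa, b, xm, ym)
```

A moving-average variant would simply replace each lagged block with the mean of the measurements over a window of fixed length before the regression step.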
These monitoring strategies are tested on a real-world industrial batch polymerization process for the production of resins, and prototypes of the soft sensor are implemented online. For products where surface properties define the overall quality, novel multiresolution and multivariate techniques are proposed to characterize the surface of a product through image analysis. After an image of the product surface is analyzed at different levels of resolution via wavelet decomposition, the application of multivariate statistical monitoring tools allows an in-depth examination of the product features. A two-level “nested” principal component analysis (PCA) model is used for surface roughness monitoring, while a new strategy based on “spatial moving window” PCA is proposed to analyze the shape of the surface pattern. The proposed approach identifies abnormalities on the surface and localizes defects in a sensitive fashion. Its effectiveness is tested on scanning electron microscope images of semiconductor surfaces after the photolithography step in the production of integrated circuits.
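The image-monitoring pipeline above — wavelet decomposition of a surface image followed by PCA on spatial windows, flagging windows whose residual is abnormally large — can be illustrated with a minimal sketch. This is not the Thesis’ actual nested-PCA models: a one-level Haar transform stands in for the wavelet analysis, the squared prediction error (SPE) is used as the window statistic, and all names are illustrative.

```python
import numpy as np

def haar2d(img):
    """One-level 2-D Haar wavelet transform: approximation (LL) plus
    horizontal (LH), vertical (HL) and diagonal (HH) detail sub-images."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    LL = (a + b + c + d) / 2.0
    LH = (a - b + c - d) / 2.0
    HL = (a + b - c - d) / 2.0
    HH = (a - b - c + d) / 2.0
    return LL, LH, HL, HH

def windows(img, w, step):
    """Vectorize all w-by-w spatial windows of an image (row-major order)."""
    return np.asarray([img[i:i + w, j:j + w].ravel()
                       for i in range(0, img.shape[0] - w + 1, step)
                       for j in range(0, img.shape[1] - w + 1, step)])

def window_spe(ref_windows, test_windows, n_pc):
    """Fit a PCA model on windows from a normal surface, then return the
    squared prediction error (SPE) of each test window: windows whose
    texture the model cannot reconstruct get a large SPE."""
    mu = ref_windows.mean(axis=0)
    _, _, Vt = np.linalg.svd(ref_windows - mu, full_matrices=False)
    L = Vt[:n_pc]                                  # retained loadings
    E = (test_windows - mu) - (test_windows - mu) @ L.T @ L
    return (E ** 2).sum(axis=1)

# --- illustrative use: localize a defect on a synthetic surface -------
rng = np.random.default_rng(1)
normal_img = rng.normal(size=(32, 32))             # "good" surface texture
test_img = rng.normal(size=(32, 32))
test_img[10:14, 10:14] += 10.0                     # synthetic defect
LL, LH, HL, HH = haar2d(test_img)                  # sub-images can be
                                                   # screened the same way
spe = window_spe(windows(normal_img, 4, 2), windows(test_img, 4, 2), n_pc=3)
```

Windows overlapping the defect yield SPE values far above those of normal-texture windows, which is the mechanism by which the moving-window statistic localizes surface abnormalities.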

Riassunto (Summary) Although batch processes are relatively simple to set up and to operate, even with a limited level of automation and a reduced knowledge of the underlying mechanisms, it is often difficult to ensure a high and reproducible quality of the final product. The instrumentation commonly used in industrial practice can only rarely provide real-time measurements of product quality. Moreover, many complications arise from the multivariate nature of quality, which depends on a series of physical, operating, or even subjective parameters. Although the information on product quality is not easily accessible, it is embedded in the process variables routinely recorded by process computers and stored in historical databases. Multivariate statistical methods make it possible to reduce the dimensionality of the problem by projecting the process variables onto a lower-dimensional space of fictitious variables that retain the entire information content on quality, overcoming the problems of measurement noise, redundancy, and the high degree of correlation among the variables. Furthermore, these methods can handle outliers and missing data. The aim of this PhD Thesis is to develop innovative systems for the quality monitoring of high value added products by means of multivariate statistical techniques.
In particular, the scientific contributions of this PhD project are: • the development of techniques for the design of soft sensors for the real-time estimation of product quality in batch manufacturing systems; • the unconventional application of latent-variable projection techniques to predict the length of a batch or of its operating phases; • the development of innovative methodologies for the multiresolution and multivariate monitoring of quality through image analysis of a high value added product. First of all, soft sensors for the online estimation of product quality are proposed in this Thesis. They have been developed and implemented considering as a case study a real industrial process for the production of resins by batch polymerization. The proposed soft sensors are based on the multivariate statistical technique of projection on latent structures (PLS), which regresses product quality on the process measurements usually available online in real time. This system guarantees a quality estimation accuracy that is of the same order of magnitude as that of laboratory quality measurements, with the advantage that the online estimates are available at very high frequency (on the order of one estimate per second), i.e. hundreds of times higher than the frequency of the measurements that can be made in the laboratory (on the order of one measurement per hour). Moreover, the estimates are available in real time and without the delay that is typical of laboratory measurements.

To compensate for the nonlinearities in the data and for the changes in the correlation structure among the variables, the adopted procedure splits the batch into a sequence of a limited number of estimation phases, within which the soft sensor can provide very accurate estimates by means of linear PLS models. The transition from one phase to the next occurs at some easily recognizable “events” in the process variables themselves. The main feature of the proposed soft sensor is that it accounts for information on the process dynamics by means of “lagged-variable” models (which add information on the process dynamics from past values of the process variables) or moving-average models. The moving-average filter adds a “temporal memory” to the soft sensor that improves the estimation accuracy and, by averaging the process variables within a time window of fixed length, removes measurement noise, attenuates process noise, flattens outliers, and compensates for the effect of temporarily missing data. The window width must nevertheless be chosen with care, since a too wide time window may delay the alarms on the reliability of the estimates. From an operational point of view, the proposed system helps the plant personnel to detect product quality drifts, promptly suggests the corrections to be made to the process recipe, and helps to minimize off-specification final product. Furthermore, the number of samples for laboratory quality measurement can be reduced drastically, which results in savings both in the total batch time and in the costs related to the laboratory, the workforce, and its organization.
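The phase-wise PLS estimation with moving-average filtering described above can be sketched in a few lines of code. The following is a minimal illustration on synthetic data, not the industrial implementation developed in the Thesis: the NIPALS PLS1 routine, the random-walk “process”, the window length, and the number of latent variables are all invented for the example.

```python
import numpy as np

def moving_average(X, window):
    # trailing moving average of each process variable (column) over a
    # fixed-length window, as in the moving-average filter described above
    Xf = np.empty_like(X, dtype=float)
    for k in range(X.shape[0]):
        Xf[k] = X[max(0, k - window + 1):k + 1].mean(axis=0)
    return Xf

def pls1_fit(X, y, n_lv):
    # minimal NIPALS PLS1: regress one quality variable on centered process
    # data; returns the regression coefficients in the original X space
    Xc, yc = X - X.mean(0), y - y.mean()
    W, P, b = [], [], []
    for _ in range(n_lv):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)
        t = Xc @ w
        p = Xc.T @ t / (t @ t)
        c = (yc @ t) / (t @ t)
        Xc -= np.outer(t, p)       # deflate X
        yc -= c * t                # deflate y
        W.append(w); P.append(p); b.append(c)
    W, P, b = np.array(W).T, np.array(P).T, np.array(b)
    B = W @ np.linalg.solve(P.T @ W, b)
    return B, X.mean(0), y.mean()

# synthetic slowly-varying "process": quality is a noisy linear
# combination of four process variables (all numbers are invented)
rng = np.random.default_rng(0)
X = np.cumsum(rng.normal(size=(200, 4)), axis=0)
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.05 * rng.normal(size=200)

Xf = moving_average(X, window=5)           # filtered process data
B, xm, ym = pls1_fit(Xf, y, n_lv=4)
y_hat = (Xf - xm) @ B + ym                 # soft-sensor quality estimate
```

In the multi-phase scheme of the Thesis, one such model would be fitted per estimation phase, with the phase switch triggered by recognizable events in the process variables.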
A second type of soft sensor has also been developed to support the online monitoring of product quality and to provide useful information for effective production scheduling: a soft sensor for the real-time prediction of batch length. This monitoring strategy relies on evolving PLS models that exploit the information progressively collected during the batch to predict the length of the batch or of each of its operating stages. The accuracy of the predictions obtained with this soft sensor is also fully satisfactory, since the prediction error is much smaller than both the variability of the batch length and the length of the operators' work shifts. Moreover, the initial part of the batch is confirmed to be of fundamental importance for the batch length, since the initial conditions of the equipment, the state of the raw materials, and the initial heating phase of the reactor exert a very strong influence on the performance of the batch itself. The information on the batch length obtained well in advance of the end of the batch allows a better organization of plant interventions, of plant operators, and of equipment utilization. The effectiveness of the soft sensors for quality estimation and for batch length prediction has been verified by applying and implementing them online in the case of resin production by batch polymerization.

Finally, multivariate statistical methods have also been applied to image analysis. In industrial practice, product inspection through image analysis is usually carried out with simple measurements of the most important physical parameters, suitably highlighted by means of filtering techniques; moreover, these measurements are obtained in a non-systematic way. Much useful information, however, remains “hidden” in the images, and this information makes it possible to identify the complex nature of the quality of the final product. For this reason, a totally automated system has been developed for the real-time monitoring of a high value added product from images. This monitoring system, based on multiresolution and multivariate techniques, has been applied to the characterization of a semiconductor surface after photolithography, one of the most important operations in the fabrication of integrated circuits. Advanced multivariate image analysis techniques extract the signatures that the process leaves on the product, supporting both the detection of critical process situations and the intervention with corrective actions to neutralize possible problems. The approach proposed in this Thesis relies on a preliminary multiresolution filtering of the image by means of wavelets, followed by a monitoring scheme that carries out in parallel an analysis of the surface roughness and of the surface shape of a product. For instance, the surface roughness can be examined with a “nested” principal component analysis. This strategy is organized on two different levels: the outer level discriminates different parts of the surface by means of a PCA-based cluster analysis, while the inner level monitors the surface roughness with PCA.
The surface shape is analyzed by means of a “spatial moving window” PCA approach, which captures the image information according to its spatial order and can also account for both the nonlinearities and the structural differences of the surface. This system is able to detect some quality features of the product that are not usually accessible without human intervention. Moreover, the monitoring turns out to be fast, reliable, and unambiguous, and scans a product image, precisely localizing defects and anomalies and detecting possible process drifts. In conclusion, although the proposed methodologies have been tested on specific case studies, they have proved to be general and show great potential. For this reason, it is believed that they can be extended to different research fields and to different industrial applications (e.g., food engineering, the pharmaceutical industry, biotechnology, etc.), as well as to different scales of investigation, from the macroscopic to the microscopic or nanoscopic scale.
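As a toy illustration of the two image-analysis ingredients just described, the sketch below combines a one-level Haar decomposition (standing in for the multiresolution wavelet filtering of the Thesis) with a window-scanning PCA that flags image regions with an unusually large residual (SPE). The 64×64 synthetic “surface”, the window size, the two-component model, and the planted defect are all invented for the example, and non-overlapping windows are used for brevity.

```python
import numpy as np

def haar2d(img):
    # one level of 2D Haar decomposition: approximation + 3 detail subbands
    a = (img[0::2, :] + img[1::2, :]) / 2
    d = (img[0::2, :] - img[1::2, :]) / 2
    A = (a[:, 0::2] + a[:, 1::2]) / 2      # approximation
    H = (a[:, 0::2] - a[:, 1::2]) / 2      # horizontal detail
    V = (d[:, 0::2] + d[:, 1::2]) / 2      # vertical detail
    D = (d[:, 0::2] - d[:, 1::2]) / 2      # diagonal detail
    return A, H, V, D

def windows(img, w):
    # scan the image with non-overlapping w-by-w windows, each flattened
    # to a row ("spatial moving window" in a simplified form)
    r, c = img.shape
    return np.array([img[i:i + w, j:j + w].ravel()
                     for i in range(0, r - w + 1, w)
                     for j in range(0, c - w + 1, w)])

rng = np.random.default_rng(1)
normal = rng.normal(size=(64, 64))          # reference "in-control" surface
test = rng.normal(size=(64, 64))
test[40:48, 40:48] += 5.0                   # planted localized defect

A_ref, *_ = haar2d(normal)                  # monitor the denoised approximation
A_tst, *_ = haar2d(test)

X = windows(A_ref, 4)                       # reference windows
mu = X.mean(0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
P = Vt[:2].T                                # 2-component PCA loadings

Z = windows(A_tst, 4) - mu
E = Z - (Z @ P) @ P.T                       # residuals outside the PCA plane
spe = (E ** 2).sum(axis=1)                  # SPE per window
flagged = int(np.argmax(spe))               # window with the largest SPE
```

Scanning in this way localizes the defect to a specific window of the surface; in the full scheme of the Thesis the SPE would be compared against a confidence limit computed from the reference data rather than simply maximized.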

Table of contents LIST OF SYMBOLS ............................................................................................................................................ 1 General symbols, vectors and matrices .......................................................................................................... 1 Greek symbols................................................................................................................................................ 7 Acronyms ....................................................................................................................................................... 8

CHAPTER 1 - THESIS OVERVIEW AND LITERATURE SURVEY......................................................... 11 1.1 AIM OF THE PROJECT................................................................................................................................ 11 1.2 INTRODUCTION TO QUALITY AND STATISTICAL QUALITY MONITORING ................................................... 12 1.3 MULTIVARIATE STATISTICAL TECHNIQUES FOR PROCESS MONITORING ................................................... 14 1.3.1 Multivariate statistical process control for batch processes ....................................................... 18 1.3.1.1 Nonlinear multivariate models ......................................................................................................... 19 1.3.1.2 Multiway multivariate models.......................................................................................................... 19 1.3.1.3 Multiple multivariate models............................................................................................................ 21 1.3.1.4 Preliminary data treatment for multivariate statistical methods........................................................ 22

1.3.2 Multivariate image analysis......................................................................................................... 23 1.4 THESIS OVERVIEW ................................................................................................................................... 24 1.4.1 Realtime quality estimation and length prediction in batch processes ........................................ 24 1.4.2 Multivariate statistical quality monitoring through image analysis............................................ 27 1.4.3 Thesis roadmap............................................................................................................................ 29 CHAPTER 2 - MATHEMATICAL AND STATISTICAL BACKGROUND ............................................... 31 2.1 MULTIVARIATE STATISTICAL TECHNIQUES .............................................................................................. 31 2.1.1 Principal component analysis (PCA)........................................................................................... 31 2.1.1.1 PCA algorithm ................................................................................................................................. 34 2.1.1.2 Data collection, variable selection and data pre-treatment............................................................... 36 2.1.1.3 Selection of the principal component subspace dimension .............................................................. 37

2.1.2 Projection on latent structures (PLS; partial least squares regression)...................................... 38 2.1.2.1 Non-iterative partial least squares algorithm ................................................................................... 40 2.1.2.2 Variable selection in PLS models .................................................................................................... 41

2.1.3 Monitoring charts ........................................................................................................................ 42 2.1.3.1 Contribution plots, limits on the contribution plots, and relative contributions ............................... 47

2.1.4 Enhancement for multivariate statistical methods....................................................................... 50 2.1.4.1 Multi-way methods, data unfolding and data synchronization/alignment........................................ 51

2.2 MULTIRESOLUTION DECOMPOSITION METHODS ....................................................................................... 54

2.2.1 Continuous and discrete wavelet transform ................................................................................ 55 2.2.1.1 Bi-dimensional wavelet transform ................................................................................................... 59

CHAPTER 3 - INDUSTRIAL PROCESS FOR THE PRODUCTION OF RESINS BY BATCH POLYMERIZATION......................................................................................................................................... 63 3.1 THE INDUSTRIAL PRODUCTION PLANT AND THE OPERATING RECIPE ........................................................ 63 3.1.1 Resin A......................................................................................................................................... 64 3.1.2 Resin B......................................................................................................................................... 66 3.1.3 P&ID of the production facility ................................................................................................... 67 3.2 DATA ACQUISITION .................................................................................................................................. 68 3.2.1 Monitoring of the process variables ............................................................................................ 68 3.3 EMPIRICAL MONITORING OF THE PRODUCT QUALITY ............................................................................... 70 3.4 CHALLENGES FOR THE STATISTICAL MONITORING OF PRODUCT QUALITY................................................ 72 3.5 AUTOMATED QUALITY MONITORING THROUGH SOFT-SENSORS ............................................................... 72 CHAPTER 4 - SOFT SENSORS FOR THE REALTIME QUALITY ESTIMATION IN BATCH PROCESSES ....................................................................................................................................................... 75 4.1 QUALITY ESTIMATION IN RESIN A USING PLS MODELS ............................................................................ 75 4.1.1 Single-phase PLS model .............................................................................................................. 77 4.1.2 Multi-phase PLS model................................................................................................................ 78 4.2 INCLUDING TIME INFORMATION TO IMPROVE THE ESTIMATION PERFORMANCE ....................................... 82 4.2.1 Improving soft sensor performance through lagged process variables ....................................... 82 4.2.2 Improving soft sensor performance through moving-average process data................................ 85 4.3 COMPARISON OF THE ESTIMATION PERFORMANCES ................................................................................. 87 4.3.1 Reliability of the estimations........................................................................................................ 89 4.3.2 Diagnosis of the soft sensors faults.............................................................................................. 90 4.4 SOFT SENSOR FOR ESTIMATION OF QUALITY IN RESIN B........................................................................... 91 4.4.1 Estimation of the quality indicators............................................................................................. 92 4.5 CONCLUDING REMARKS ........................................................................................................................... 95 CHAPTER 5 - REALTIME PREDICTION OF BATCH LENGTH ............................................................. 97 5.1 DESIGN OF AN EVOLVING PLS MODEL FOR THE PREDICTION OF BATCH LENGTH ..................................... 97 5.2 PREDICTION OF BATCH LENGTH IN THE PRODUCTION OF RESIN B .......................................................... 100 5.2.1 Prediction of Stage 1 length....................................................................................................... 100 5.2.2 Prediction of Stage 2 length....................................................................................................... 103 5.3 PREDICTION OF BATCH LENGTH IN THE PRODUCTION OF RESIN A .......................................................... 105 5.4 CONCLUDING REMARKS ......................................................................................................................... 106 CHAPTER 6 - INDUSTRIAL IMPLEMENTATION OF A SOFT SENSOR PROTOTYPE................... 107

6.1 INDUSTRIAL SUPERVISION SYSTEM ........................................................................................................ 107 6.2 IMPLEMENTATION OF THE SOFT SENSOR ................................................................................................ 108 6.2.1 Architecture of the soft sensor ................................................................................................... 109 6.2.2 MatlabTM codes of the soft sensors ............................................................................................ 110 6.2.2.1 Prototype A.................................................................................................................................... 111 6.2.2.2 Prototypes B1 and B2 .................................................................................................................... 113

CHAPTER 7 - SURFACE CHARACTERIZATION THROUGH MULTIRESOLUTION AND MULTIVARIATE IMAGE ANALYSIS......................................................................................................... 115 7.1 PHOTOLITHOGRAPHY PROCESS AND INSPECTION TOOLS ........................................................................ 115 7.2 IMAGE ANALYSIS THROUGH MULTIRESOLUTION AND MULTIVARIATE STATISTICAL TECHNIQUES .......... 118 7.2.1 Image multiresolution denoising ............................................................................................... 119 7.2.2 Multivariate statistical surface monitoring methods ................................................................. 123 7.2.2.1 LER monitoring ............................................................................................................................. 123 7.2.2.2 Surface roughness monitoring ....................................................................................................... 124 7.2.2.3 Edge shape monitoring .................................................................................................................. 126

7.3 CASE STUDY: MONITORING RESULTS ..................................................................................................... 129 7.3.1 LER monitoring system.............................................................................................................. 129 7.3.2 Surface roughness monitoring system ....................................................................................... 130 7.3.3 Edge shape monitoring system .................................................................................................. 133 7.4 THE EDGE3 MONITORING INTERFACE ................................................................................................... 135

7.5 CONCLUDING REMARKS ......................................................................................................................... 136 CONCLUSIONS AND PERSPECTIVES....................................................................................................... 139 REFERENCES.................................................................................................................................................. 143 Web sites .................................................................................................................................................... 157

ACKNOWLEDGEMENTS.............................................................................................................................. 159

List of symbols

General symbols, vectors and matrices

a = wavelet scale
A = total number of latent variables
AAE = overall average absolute error
b = wavelet location
br = regression coefficient of the rth latent variable
B = matrix of regression coefficients
c^T2_i,j = contribution of variable j to the T2_i statistic of the ith observation
c^t_i,j = contribution of variable j to the scores that compose the T2_i of the ith observation
c^E_i,j = contribution of variable j to the squared prediction error SPEi of the ith observation
c^E_j = average contribution of variable j over all the I observations of the reference for the SPE statistic
c^T2_j = average contribution of variable j over all the I observations of the reference for the Hotelling statistic
c^E_j,lim(α) = 100(1−α)% confidence interval for the contributions c^E_i,j
c^T2_j,lim(α) = 100(1−α)% confidence interval for the contributions c^T2_i,j
C^E = matrix of the contributions to SPE of all the J variables for all the I observations of the X matrix
C^T2 = matrix of the contributions to T2 of all the J variables for all the I observations of the X matrix
dm = detail of the signal x at the mth wavelet decomposition scale
D^h_m = reconstruction of the horizontal detail T^h_m
D^v_m = reconstruction of the vertical detail T^v_m
D^d_m = reconstruction of the diagonal detail T^d_m

ΔK = lag on the process variables in the TP-PLS models
ΔK' = length of the moving window in the MATP-PLS models
Δnsegm = edge segment width (pixel)
Δnmw = size of the moving window (pixel)
ei,j = element of row i and column j of the residual matrix E
e = residual of a test sample
eI+1 = error of reconstruction for the projection of xI+1 onto the latent variable space
e^T_I+1 = transpose of eI+1
E(xI+1) = expected value of xI+1
E = 2D residual matrix of X
F_J,I−J,α = upper 100αth percentile of the F-distribution with J and I−J degrees of freedom
F = residual matrix of Y
h = sampling instant of the quality variables in the 3D data matrix of regular shape
hi = sampling instant of the quality variables in the 3D data matrix of irregular shape
h0 = parameter of the Jackson-Mudholkar equation
Hi = total number of quality samples for observation i
H0 = null hypothesis
H1 = alternative hypothesis
i = observation of the reference dataset
i0 = image (pixel)
iM = filtered image at the Mth scale of wavelet decomposition (pixel)
I = total number of observations in the reference dataset
I = identity matrix
L2(ℜ) = Hilbert space of square integrable functions in ℜ
j = variable of the reference dataset
J = total number of variables of the reference dataset
k = sample for the process variables in the 3D data matrix of regular shape

ki = sample for the process variables in the 3D data matrix of irregular shape
Ki = total number of samples of observation i
m = decomposition scale
M = selected decomposition level
M1 = decomposition level selected for image denoising
Mr = matrix of rank 1 of the rth latent variable
MRPEi,q = mean relative prediction error for quality variable q in batch i during a single estimation phase
n = counter
nel = size of the edge length (pixel)
niwE = size of the image width for the edges
niwV = size of the image width for the valleys
nlevels = number of selected topological levels
nsample = total number of quality samples in an estimation phase
ntsw = size of the trans-section width (pixel)
Nx = length of the signal x
NA = acidity number (mg KOH / g resin)
Nimage = number of images of edge segments
pi,j = element of the ith row and jth column of the matrix P
pj = row vector referring to the jth variable of the loading matrix P
pr = loading of the rth latent variable of X
p^T_r = transpose of the loading of the rth latent variable of X
P = probability function
P = loading matrix of X
P^T = transpose of P
Pr = matrix of the loadings for all the J variables and all the K samples
PRESS = prediction error sum of squares
q = quality variable
qr = loading of the rth latent variable of Y
Q = total number of quality variables

Q = loading matrix of Y
Q^T = transpose of Q
r = generic counter
R = rank of the X matrix
ℜ = space of the real numbers
RMSECV = root-mean-square error of cross-validation
R(X) = 100(1−α)% confidence region of the likely value of a population containing X
s = generic counter and spatial coordinate
s_c^T2_j = standard deviation of the contributions of variable j over all the I observations of the reference for the Hotelling statistic
s_c^E_j = standard deviation of the contributions of variable j over all the I observations of the reference for the SPE statistic
sr = semi-axis of the confidence ellipse for the rth latent variable
s = coordinate in the domain of pixel space (squared pixel)
Sm,n = approximation coefficient at the mth scale of wavelet decomposition
S_m+1,(n1,n2) = approximation coefficients of the multiresolution decomposition of an image
SPEi = squared prediction error for observation i
SPEI+1 = squared prediction error for the validation observation xI+1
SPElim(α) = upper limit of SPEi at a confidence level of 100(1−α)%
S = estimated value of the covariance matrix Σ
Sm = approximation matrix of an image at the mth wavelet decomposition scale
t* = maximum time horizon for the prediction of the batch length (h)
t_I−1,α/2 = Student t-distribution with I−1 degrees of freedom at significance level α/2
tlim(r,α) = univariate limit at the 100(1−α)% confidence level for score tr
t1 = score vector of the first principal component
ti = row vector referring to the ith observation of the score matrix T
tˆI+1 = projection of the validation observation xI+1 onto the latent subspace
tr = score vector of the rth principal component of X
t^T_r = transpose of tr

T2 = Hotelling statistic
TAAEi = time-averaged absolute error of batch i (h)
T(a,b) = continuous approximation of the signal x for the wavelet decomposition by means of ψ at location b and scale a
T2_i = value of the Hotelling statistic for the ith observation
T2_I+1 = value of the Hotelling statistic for a validation observation xI+1
T2_lim(A,I,α) = confidence limit of the Hotelling statistic at the 100(1−α)% confidence level for a system of A latent variables and I samples
Tm,n = detail coefficient at the mth scale of wavelet decomposition
T^denoised_m,n = denoised approximation of the wavelet decomposition of i0 at the M1 scale
T^h_m+1,(n1,n2) = horizontal detail coefficient of an image
T^v_m+1,(n1,n2) = vertical detail coefficient of an image
T^d_m+1,(n1,n2) = diagonal detail coefficient of an image
T = score matrix of X
T^h_m = horizontal detail matrix of an image at the mth wavelet decomposition scale
T^v_m = vertical detail matrix of an image at the mth wavelet decomposition scale
T^d_m = diagonal detail matrix of an image at the mth wavelet decomposition scale
ur = score vector of the rth latent variable of Y
u^T_r = transpose of ur
U = score matrix of Y
VIPj = importance of the variable j in the projection methods
wr = weight of the rth latent variable
w^T_r = transpose of wr
W = matrix of the weights
x = generic signal
xi,j = element of row i and column j of the X matrix
xi,j,k = element of the X matrix
x̄i,j,k = moving average of variable j in batch i at the kth time instant, element of the X̄i matrix

xm = approximation of the signal x at the mth wavelet decomposition scale
xi = row vector of the ith observation of the X matrix
xI+1 = vector of a validation observation
xˆI+1 = projection of xI+1 onto a latent space
xi,j = jth variable time profile in batch i, in the form of a column array of the Xi matrix
xj = jth variable column vector of the X matrix
x̄j = average value of the jth variable (column) of X
x^−ΔK_i,j = vector of the jth variable time trajectory for the ith batch, lagged by −ΔK time instants
X = reference data matrix of the process
X̄ = array of the mean values of the variables of X
Xˆ = projection of the X matrix onto the space of the latent variables
X = 3D reference data matrix of the process variables
X0 = 2D data matrix at the zero decomposition scale
XBWU = 2D data matrix derived from X by batch-wise unfolding
X̄BWU = input matrix of the moving averages for the MATP-PLS model
XD = matrix of lagged variables
Xi = ith horizontal slice of X, i.e. the matrix of the trajectories of all the J variables over all the Ki samples in time or space for observation i
X̄i = matrix of the moving-average data of the ith batch
X^D_i = matrix of lagged variables for the ith batch
Xj = jth vertical slice of X, i.e. the matrix of the time/space evolution of variable j for all the samples K and all the observations I
Xk = kth vertical slice of X, i.e. the matrix of the time/space sample k for all the J variables and all the I observations
XL = augmented matrix with lagged variables for the LTP-PLS model
XM = 2D data matrix at the Mth decomposition scale
X^T = transpose of X
XVWU = 2D data matrix derived from X by variable-wise unfolding
yi,q,h = element of the Y matrix
yˆi,q,h = estimated value of yi,q,h

yˆ I +1

= estimated value of a quality index for the (I+1)th observation

Y

= matrix of the quality variables

Y

= three dimensional reference matrix of the quality variables

Yi

= ith horizontal slice of Y, i.e. matrix of the trajectories of all the Q quality variables in all the Hi samples in time or space for the observation i

zα

= normal standard deviate corresponding to the upper 100(1-α)% percentile

Greek symbols α

= percentile of the confidence limits

δ r ,s

= Kronecker delta

εi

= instantaneous error of estimation of stage length in batch i

θ

= generic parameter

θn

= parameter of the Jackson-Mudholkar equation

Θ

= space of all the possible parameters θ

λ

= forgetting factor

Λ

= diagonal matrix of the eigenvalues λr

λr

= eigenvalue of the rth latent variable

μ

= viscosity

μ0

= vector of the expected values of the J variables of the matrix X

φm,n

= discretized father wavelet

φ(s)

= bidimensional wavelet function

Σ

= covariance matrix

τ

= batch length (h)

τi

= actual length of the stage in batch i

τ*

= number of samples corresponding to the time horizon t*

τ̂i(t)

= prediction at time t of the stage length in batch i

ψ

= mother wavelet function


ψa,b


= mother wavelet function for a dilation parameter a and a location parameter b

ψ*a,b

= complex conjugate of the mother wavelet function ψa,b

ψm,n

= discretization of the mother wavelet function

ψh(s)

= bidimensional horizontal wavelet

ψv(s)

= bidimensional vertical wavelet

ψd(s)

= bidimensional diagonal wavelet

χ²v,α

= χ²-distribution with v degrees of freedom at significance level α

Acronyms

2D

= bi-dimensional

3D

= three-dimensional

AR

= autoregressive

ARMA

= auto-regressive moving average

BWU

= batch-wise unfolding

CA1

= carboxylic acid 1

CA2

= carboxylic acid 2

CD

= critical dimension

CD-SEM

= tool for the measurement of the CD through a SEM

D1

= diol 1

D2

= diol 2

DA1

= dioic acid

DPCA

= dynamic PCA

DPLS

= dynamic PLS

IC

= integrated circuit

IID

= independent identically distributed

LAN

= local area network

LER

= line edge roughness

LTP-PLS

= lagged three-phase PLS


LV

= latent variable

LV1

= first latent variable

LV2

= second latent variable

MATP-PLS

= moving-average three-phase PLS

MIA

= multivariate image analysis

MPCA

= multiway PCA

MPLS

= multiway PLS

NIPALS

= nonlinear iterative partial least squares algorithm

NOC

= normal operating conditions

OLE

= object linking and embedding

OPC

= OLE for process control

PC

= principal component

PC1

= first principal component

PC2

= second principal component

PCA

= principal component analysis

PLC

= programmable logic controller

PLS

= partial least squares method (projection on latent structures)

PV

= process value

P&ID

= piping and instrumentation diagram

RGB

= red, green, blue

RTU

= remote terminal unit

SCADA

= supervisory control and data acquisition

SEM

= scanning electron microscopy (or microscope)

SIMPLS

= statistically inspired modification of PLS

SP

= setpoint

SPC

= statistical process control

SQC

= statistical quality control

SQL

= structured query language

SWA

= side wall angle

TP-PLS

= three-phase PLS method

UV

= ultraviolet

VIP

= variable importance in projection


VO

= valve opening

VWU

= variable-wise unfolding

Chapter 1

Thesis overview and literature survey

This Thesis is concerned with the development of technologies for product quality monitoring in the batch manufacturing of high value added goods. Two kinds of products are considered: those whose “quality” is determined by chemical/physical characteristics (e.g., viscosity, concentration, …), and those where surface properties (e.g., texture, roughness, …) define “quality”. Two main issues are investigated: i) the development of a strategy to design soft sensors for the online estimation of product quality and the realtime prediction of batch length in batch chemical processes; and ii) the development of a strategy to design automatic systems for surface characterization in the manufacturing of hardware devices. Tools from multivariate statistical analysis (namely, projection onto latent subspaces) are used to develop the proposed technologies. In this Chapter, after an outline of the aims of the Thesis, the concepts of quality and statistical quality monitoring are briefly reviewed. Then, a survey follows on the use of multivariate statistical tools for statistical process control, with particular reference to batch processes, for which several challenges are still open for investigation. A roadmap for the reading of the Thesis concludes the Chapter.

1.1 Aim of the project

Ensuring the conformance of the final product to a predetermined standard is of vital importance in high value added manufacturing, in order to succeed in the increasingly competitive global market. However, satisfying the requirements of the customers while achieving reproducibility and high quality of the final product is particularly difficult in most processes. Furthermore, most manufacturing processes are inherently multivariate, and quality itself is the multivariate expression of a plurality of indices that are related to the process, possibly to visual features, and sometimes to personal judgement as well. The aim of this project is the development of multivariate statistical tools that enable product quality in batch manufacturing systems to be monitored in a systematic manner, in such a way as to analyze quality through the information embedded in process data or in images of the product. The proposed techniques are applied to different case studies:


• the development of a strategy to design multivariate statistical soft sensors for the estimation of the product quality and for the prediction of the batch length in batch processes;
• the development of a strategy to design an automatic method for the monitoring of the surface quality of a product through multiresolution and multivariate image analysis.

The systems for the realtime estimation of product quality and for the realtime prediction of the batch length are applied to the case of a real-world industrial process for the production of resins by batch polymerization. This case study demonstrates that the proposed techniques are effective strategies to support the online adjustment of the process recipe when the quality deviates from the nominal conditions, before the final product is affected. Furthermore, they provide valuable support for the organization of the production, for the scheduling of the use of the equipment, and for the coordination of the labour resources. The novel methodologies developed for the automatic characterization of the surface quality by image analysis are applied to the case of surface monitoring in the after-photolithography inspections that are carried out in the manufacturing of integrated circuits. In detail, a fully automatic system for the assessment of the surface characteristics of a semiconductor is developed to monitor both the surface roughness and the surface patterns. To sum up, the main contributions of the PhD project are:
• the development of innovative technologies for the online estimation of the product quality in batch processes;
• the non-conventional application of latent variable subspace methods for the prediction of the length of batch processes;
• the development of new methodologies for the multiresolution and multivariate systematic monitoring of the product quality from images of manufactured products.

1.2 Introduction to quality and statistical quality monitoring

The quality movement traces its roots back to the late 13th century, when European craftsmen began organizing into “guilds”, responsible for setting strict rules on product and service quality, for adopting inspection committees, and for promoting special marks for flawless goods. Later, the industrial revolution followed this example. However, it was only after World War II that the idea of “total quality” was introduced, and the notion of “inspection” extended to process technology improvement. Nowadays, “quality” embraces the entire organization of a company and, under the increasing competition of the global market, it is of critical importance that every process can manufacture high quality products with maximum yield. Meeting quality requirements is especially difficult when products consist of large numbers of components, or when processes consist of dozens, even hundreds, of


individual steps (Seborg et al., 2004). For example, batch processes for chemical manufacturing and microelectronic fabrication are carried out through a series of operating steps, where the quality of each stage is strictly related to the quality of the other stages and heavily influences the final product quality. This results in the need for quality-oriented technologies. On October 1st, 2008, during the meeting on the “Future of quality” of the American Society for Quality (Milwaukee, WI, USA), it was pointed out that 21st century technologies are one of the key forces that will shape the future of quality (http://www.asq.org/index.html). This PhD Thesis fits into this scenario, developing automatic techniques for realtime quality assessment in high value added productions. The concept of quality is still not completely defined. In the common sense, quality is the degree of excellence of a product, a process, or a service. From the engineering point of view, quality is assumed to be a measure of the conformance to a required standard, to guarantee high performance in terms of reliability, serviceability, durability, etc… (Montgomery, 2005). Namely, the purpose of quality is not only to force a product or a process to respond to predetermined features in order to reach a target or a nominal value in terms of physical, sensory, or time-oriented characteristics (quality of design), but also to improve the product and process performances in order to reduce defectiveness, scraps, customer complaints, and the rates of waste and rework (quality of conformance). Therefore, the aim of quality monitoring is to monitor not only the quality of design, but also the quality of conformance (Montgomery and Runger, 2003). In summary, quality is inversely proportional to variability.
Since variability introduces unevenness and is a major source of poor quality, quality improvement can be achieved by decreasing the variability in products and processes. One of the most effective tools for reducing variability is the systematic use of statistics. In his pioneering work, Shewhart (1931) showed how the fundamental steps of engineering quality control (i.e., specification of the process goals, fabrication of in-spec products, and tests on the fabricated devices) can be traced by statistical quality control (SQC). SQC fixes (statistical) limits on the state of the production and improves the uniformity of the quality, assessing the agreement of the product/process with an optimal reference. SQC has gained increasing interest from both the research and the industrial communities (Hare, 2003). It should be acknowledged that quality is a synopsis of multiform attributes, depending on a composite combination of related parameters, which are often not accessible by common instrumentation hardware, and sometimes not even measurable or quantifiable. Otherwise stated, quality is an inherently multivariable attribute. Furthermore, quality is often related to the values of all the process variables that can be measured during the product manufacturing. On this basis, classical SQC has moved a step forward to statistical process control (SPC) (Geladi and Kowalski, 1986; Wold et al., 1987; MacGregor et al., 1991; Jackson, 1991). SPC unveils


the multivariate nature of a system and, furthermore, it can relate the quality parameters to the conditions in which the production process is carried out (Kresta et al., 1991; MacGregor et al., 1991).

1.3 Multivariate statistical techniques for process monitoring

Generally speaking, SPC is an expanding field of technology whose philosophy is to supervise the process performance over time in order to highlight the anomalous events that lead to the degradation of the quality specifications (Kresta et al., 1991; Romagnoli and Palazoglu, 2006). Therefore, the goal of SPC is the quick and reliable detection of the existence, the amplitude, and the time of occurrence of the changes that cause a process or a quality feature to deviate from a prescribed standard in the manufacturing of a product. SPC supports this task (MacGregor et al., 1991; Kourti and MacGregor, 1995; Seborg et al., 2004) and helps quantify the probability of observing a process behaviour that does not conform to the expected one (Nomikos and MacGregor, 1994; Flores-Cerrillo and MacGregor, 2002 and 2003; García-Muñoz et al., 2003). Consequently, SPC not only provides underlying information on the state of a plant or of a product, but also assists the operators and the process engineers in remedying a process abnormality (fault¹). The results are safer operations, downtime minimization, yield maximization, quality improvement, and reduced manufacturing costs (Chiang et al., 2001; Edgar, 2004). Since in industrial practice every process exhibits some variability regardless of how well it is designed, operated, and instrumented, it is important to discriminate between the common cause (natural and random) variability, which is the cumulative outcome of a series of unavoidable phenomena, and the abnormal (non-random) variability triggered by assignable causes, such as process changes, faulty conditions, errors, etc… The common cause variability is a sort of “background noise”: a process subject only to “chance causes of variation” (Montgomery, 2005) is said to be in a state of statistical control.
Unfortunately, other kinds of variability may occasionally be present in the output of a process, arising from improperly maintained (or controlled) machinery, operator errors, defective raw materials, unavoidable events, etc… The assignable causes lead to unacceptable levels of process performance or product defectiveness, and determine an out-of-control state. SPC helps investigate what does not work in a process and assists in undertaking corrective actions before non-conforming products are manufactured. Therefore, monitoring means not only understanding the status of the process, but also being able to control the product quality. Direct inspection of the quality is usually impractical or, at the least, delays the discovery of abnormal process conditions, because the appearance of defects in the final product takes time. However, information about the quality is encoded in the process variables, which are often measured online, frequently and automatically, thus enabling the refinement of the measurement information and the inference of the product quality (Kresta et al., 1994; Çinar et al., 2003). In this way one can examine both the process performance and the product quality, ensuring repeatability, stability, and the capability of the process to operate with little variability around an assigned target (i.e., the nominal conditions). Accordingly, SPC is a powerful tool to achieve process stability and improve process capability (Montgomery and Runger, 2003). Traditional monitoring methods consist of limit sensing and discrepancy detection (Chiang et al., 2001). Limit sensing raises an alarm if the state of the observed system crosses predetermined thresholds, while discrepancy detection raises an alarm depending on model accuracy. Limit sensing imposes limits on the observations of every process variable, but ignores the relation of each variable with the others (i.e., it is univariate). To detect departures from a prescribed state of statistical control, control charts can be used. Their use is trusted because they are proven techniques for improving productivity, are effective in defect avoidance, prevent unnecessary process adjustments, and provide diagnostic and process-capability information. In statistical terms, control charts are hypothesis testing techniques² that verify whether a process/product is in a state of statistical control. The in-statistical-control condition is the null hypothesis³ to be proved. The null hypothesis is verified, with a certain degree of uncertainty (level of confidence or significance), when the status of the observed phenomenon stays in proximity of the nominal conditions.

¹ A fault is an unpermitted deviation in a system (i.e., process changes, disturbances, problems with sensors or actuators), which is often not handled adequately by process controllers.
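The control-chart logic just outlined can be sketched in a few lines of illustrative Python (a toy univariate example on synthetic data; all names and numbers are made up, not taken from the Thesis): limits are placed a few standard deviations away from the NOC mean, and any sample falling outside them rejects the in-control null hypothesis.

```python
import numpy as np

def shewhart_limits(noc_data, n_sigma=3.0):
    """Estimate lower/upper control limits from NOC (in-control) data."""
    mu = noc_data.mean()
    sigma = noc_data.std(ddof=1)
    return mu - n_sigma * sigma, mu + n_sigma * sigma

def out_of_control(samples, lcl, ucl):
    """Reject the in-control null hypothesis for samples beyond the limits."""
    return (samples < lcl) | (samples > ucl)

rng = np.random.default_rng(0)
noc = rng.normal(50.0, 2.0, size=500)       # synthetic in-control reference
lcl, ucl = shewhart_limits(noc)

new = np.array([49.8, 51.2, 58.9, 50.3])    # third sample simulates a fault
alarms = out_of_control(new, lcl, ucl)      # only the third sample alarms
```

With 3σ limits and normally distributed data, the type I error probability is about 0.27%; moving the limits outward lowers the false-alarm rate but makes small shifts harder to detect, which is exactly the type I/type II trade-off of hypothesis testing.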
Since the nominal conditions are identified by the process average conditions, and the amplitude of the confidence limits by the common cause variability, moving the limits farther from the average conditions (raising the degree of uncertainty) decreases the risk of type I errors⁴ (false alarms) and increases the chance of type II errors⁵ (scarce sensitivity). The procedure suggested by Kourti (2003) for statistical process control develops through:
• selection of the most representative observations (process data) from a historical database for model building; the selected observations should identify the so-called normal operating conditions (NOC);
• pre-treatment of the input data to facilitate the statistical analysis;
• model calibration;
• checking the “observability” of the model, i.e. testing the efficiency of the monitoring model through a validation procedure;
• checking the performance of the monitoring model in the diagnosis of the special causes that affect a process or a product and determine a detriment of the quality or a loss of process performance.

² Statistical hypothesis testing is a methodology to make statistical decisions based on experimental data, almost always by rejecting, or failing to reject, a null hypothesis.
³ The null hypothesis is a statement about a plausible scenario which may explain a given set of data and is presumed to hold unless contradicted by statistical evidence. The null hypothesis is tested to determine whether the data provide sufficient reasons to pursue some alternative hypothesis.
⁴ The type I error (or α-error, or false positive) is rejecting a correct null hypothesis, i.e. a false alarm. It occurs every time an out-of-control state is called by the monitoring charts when there is no assignable cause.
⁵ The type II error (or β-error, or false negative) is failing to reject a null hypothesis when it is false, i.e. an inadequate sensitivity. This is the risk that a point may still fall within the confidence limits of the monitoring charts when the status is really out of control.

In typical industrial scenarios, hundreds, if not thousands, of process data are available every few seconds, collected online from process computers and stored in the supervision systems (Nomikos and MacGregor, 1995a; Nomikos, 1996). These data are characterized by spatial correlation (i.e., relations among variables) and serial correlation (i.e., relations among measurements of the same variable taken at different times or locations). Spatial correlation is due to the fact that several process variables are usually sampled throughout the process, and the response to a certain assignable cause affects several process variables. This means that the process variability is usually restricted to a much lower dimension than the number of variables collected in a process. The process data are serially correlated as well, because of the relatively small sampling intervals. Furthermore, missing data and noise are often present. The need to handle correlation, noise, and missing data, and the requirement to keep the dimensionality of highly correlated data to a reasonably low level, call for the calibration of multivariate statistical models, such as principal component analysis (PCA) and projection to latent structures (PLS, or partial least squares regression). PCA and PLS are data-driven methodologies with computationally inexpensive input-output model structures (Kresta et al., 1994; Çinar et al., 2003), whose frame is a typical black-box representation derived from the historical data collected during experiments or industrial practice.
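The claim that correlated process data span a much lower-dimensional space than the number of measured variables can be illustrated with a small synthetic sketch (dimensions and data are made up): ten "sensors" driven by only two underlying phenomena yield a covariance matrix whose first two eigenvalues capture nearly all the variance.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J = 300, 10                       # observations x measured variables
sources = rng.normal(size=(I, 2))    # two hidden driving phenomena
mixing = rng.normal(size=(2, J))     # every sensor responds to both sources
X = sources @ mixing + 0.05 * rng.normal(size=(I, J))   # plus sensor noise

Xc = X - X.mean(axis=0)              # mean-centre each variable
eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
explained = eigvals / eigvals.sum()  # variance fraction per direction
# almost all of the variability lies along the first two directions
```

Here ten noisy, collinear signals collapse onto two latent directions, which is precisely the dimensionality reduction that PCA and PLS exploit.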
For the purpose of SPC, PCA and PLS can be used to analyze process data and to develop inferential models or statistical process control schemes (MacGregor et al., 1991). Both PCA and PLS extract the most important, systematic information hidden in process data, usually assembled in bidimensional (2D) matrices (observations×variables), and compress it through algebraic concepts, in such a way that the information is found in the correlation pattern rather than in the individual variables’ signals (Eriksson et al., 2001). Hence, massive volumes of highly collinear and noisy variables can be examined by projecting them onto a subspace made of a few fictitious variables, called principal components (PCs) or latent variables (LVs), which explain the directions of maximum variability of the data and contain the greatest part of the relevant information embedded in the data. Therefore, both methods are concerned with explaining the variance and covariance structure of a dataset through linear combinations (i.e., PCs and LVs) of the original variables. This is the reason why PCA and PLS models are linear correlative representations, but not causal models. Note that PCA and PLS have slightly different purposes. In particular, if the goal is interpreting and modelling one block of data (e.g., process data), PCA is the proper solution (Jackson, 1991; MacGregor et al., 1991;


Kourti and MacGregor, 1995). If it is necessary to investigate the relationship between two groups of data (e.g., process variables and quality variables) to solve a regression problem, the proper method is PLS, which can estimate or predict some response variables from a collection of predictor variables (Geladi and Kowalski, 1986; Höskuldsson, 1988; Kresta et al., 1991; Burnham et al., 1999; Wold et al., 2001). In summary, the former method maximizes the variance captured from the input data, while the latter maximizes the covariance between the predictor variables and the predicted ones. Although in this Thesis the main interest is in process engineering applications of multivariate statistical methods, several applications of these techniques are reported in the most diverse fields. An incomplete excerpt of some recent applications outside the process engineering community is reported in Table 1.1.

Table 1.1 Topics of recent papers on applications of multivariate statistical methods in non-process engineering areas.

Reference | Area | Topic
Dokker and Devis (2007) | biology | sunflower and maize root cell structure study
Giri et al. (2006) | biology | examination of the metabolism of nut alkaloids in mice
Škrbić and Onjia (2007) | biology | detection of microelement content of wheat
Viñasa et al. (2007) | geology | volcano surveillance
Harrison et al. (2006) | medicine | texture analysis of non-Hodgkin lymphoma
Lee et al. (2008b) | medicine | cytotoxicity of substances for cancer treatment
Tan et al. (2005) | medicine | persistence of pollutants in adipose tissue
Whelehan (2006) | medicine | detection of ovarian cancer by proteomic profiles
Übeyli (2007) | medicine | automated diagnostic system for breast cancer
Giordani et al. (2008) | energy | electronic nose for bio-diesel sources identification
Kirdar et al. (2008) | bioprocessing | supporting key activities for bioprocessing
Trendafilova (2008) | mechanics | vibration-based damage detection in aircraft wings
Durante et al. (2006) | food processing | fragrance sensing and taste estimation
Apetrei et al. (2007) | food processing | fragrance sensing and taste estimation
Marín et al. (2007) | food processing | fragrance sensing and taste estimation
Arvisenet et al. (2008) | food processing | fragrance sensing and taste estimation
Clément et al. (2008) | food processing | fragrance sensing and taste estimation
ElMasry et al. (2008) | food processing | defects detection
Quevedo et al. (2002) | food processing | food classification and characterization
Doneski et al. (2008) | food processing | food classification and characterization
Schievano et al. (2008) | food processing | food classification and characterization
Viggiani et al. (2008) | food processing | food classification and characterization
Liu et al. (2008) | food processing | evaluation of aging or maturity
Qiao et al. (2007) | food processing | quality survey
Kim and Choi (2007) | image analysis | face recognition
Liu et al. (2007b) | image analysis | mineral processing
Liu et al. (2005) | image analysis | wood manufacturing
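The difference between the two objectives (variance captured vs covariance with the response) can be made concrete with a minimal numpy sketch on synthetic data; the one-component PLS weight below follows the classic w ∝ Xᵀy choice of NIPALS-type algorithms, and all variable roles are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
X[:, 0] *= 3.0                               # variable 1 has the largest variance
y = X[:, 4] + 0.1 * rng.normal(size=100)     # quality is driven by variable 5

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA: loading of the first PC = direction of maximum variance of X
pca_loading = np.linalg.svd(Xc, full_matrices=False)[2][0]

# PLS (first LV): weight proportional to X'y = direction of maximum
# covariance between the predictors and the response
w = Xc.T @ yc
pls_weight = w / np.linalg.norm(w)

pca_pick = np.argmax(np.abs(pca_loading))    # the high-variance variable
pls_pick = np.argmax(np.abs(pls_weight))     # the y-related variable
```

PCA latches onto the noisy high-variance variable, whereas PLS singles out the variable actually correlated with the quality index, mirroring the variance-vs-covariance distinction made in the text.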

Finally, multivariate statistical techniques can be extremely useful in the analysis of data from non-conventional sensors (e.g., cameras) and are applied to the field of image analysis as multivariate image analysis (MIA; Geladi, 1995), either in some of the classic fields of chemical engineering, such as plastic material processing (Liu and MacGregor, 2005), the steel industry (Bharati et al., 2004; Liu et al., 2007a), and furnace flame control (Szatvanyi et al.,


2006), or in other applications for high value added productions, namely wood manufacturing (Bharati et al., 2003), snack-food statistical quality monitoring and control (Yu and MacGregor, 2003; Yu et al., 2003), and food processing and packaging (Du and Sun, 2004; Brosnan and Sun, 2004; Du and Sun, 2008). Because batch manufacturing is the main focus of this project, the following subsections present a survey on how SPC is applied to batch processes.

1.3.1 Multivariate statistical process control for batch processes

Batch and semi-batch processes are used to manufacture high value added goods, such as specialty chemicals and biochemicals, polymers, composites, pharmaceuticals, and materials for food, agriculture or microelectronics. With respect to their continuous counterparts, batch processes can accommodate multiple products in the same production facility, are flexible, easy to set up, and relatively simple to carry out, because the processing recipe usually evolves through a sequence of elementary steps performed in an assigned order to yield relatively small volumes of product with specified quality. Furthermore, for a batch process to be set up, it is often sufficient to have limited fundamental knowledge of the underlying process mechanisms. Although the batch manufacturing of a product is performed according to a given recipe, the product quality may show great variability if no corrective actions are taken, and it is often difficult to manufacture multiple consistent products in accordance with strict requirements. In many instances, only the batch duration is adjusted to meet the quality specification; sometimes, the operating recipe can additionally be corrected in real time. Several reasons make batch monitoring and control a hard task (Seborg et al., 2004): the time-varying characteristics of batch processes; their nonlinear and irreversible behaviour; the lack of adequate mechanistic and fundamental models; the lack of online sensors, sensor inaccuracy, and infrequent sampling of quality indices; the existence of constraints; and unmeasured disturbances (i.e., operator errors, fouling, impurities in raw materials, etc…). The data routinely obtained online from batch processes are not only multivariate in nature, but also nonlinear, highly auto-correlated and cross-correlated⁶, and time varying. The time variation implies that a new dimension should be taken into account in the data, i.e. time.
Namely, the data from batch processes can be collected in three-dimensional (3D) matrices (observations×variables×time) that hold both the variation between batches and the variation in time within a batch. PCA and PLS models are linear correlative models, which are valid when the correlation structure of the data remains unchanged in time. However, the correlation structure of the data usually changes during a batch run (Kourti, 2003). Moreover,

⁶ Auto-correlation identifies repeating patterns in time or along space within a periodic signal. Cross-correlation is a measure of the similarity between signals.


it changes not only within a batch, but also between batches, due to process changes, plant maintenance, sensor drifts, seasonal effects, etc… For this reason, multivariate statistical techniques evolved to embody not only the multivariable and correlative structure of the data, but also the nonlinearity and the time-varying nature of batch data. Several methods have been suggested to face the problem of time variation and of changes in the correlation structure of the data. Basically, four classes of approaches are highlighted in the literature:
• nonlinear multivariate statistical methods, i.e. the traditional multivariate statistical techniques modified in a nonlinear manner and tailored to the nonlinear nature of the input data and the nonlinear correlation structure of the data;
• multiway models, in which time is considered as an additional dimension of the data, so that the variability during the time evolution can be assessed;
• multiphase models, which split the data into series of segments within which a steady correlation structure of the data is preserved;
• preliminary treatment of the data, in such a way as to rectify the inputs to a multivariate statistical method, either by decomposing the data signals into different resolution scales (e.g., through the wavelet transform), or by de-correlating the dataset through auto-regressive moving average (ARMA) models or state-space modelling.
In the following sub-sections, the main characteristics and the limits of the abovementioned four classes of multivariate statistical methodologies are overviewed.

1.3.1.1 Nonlinear multivariate models

Nonlinear multivariate statistical techniques were developed to overcome the problem of the nonlinearity of the input data and of the nonlinear correlation structure of the data.
The key strategy is to alter the PCA and PLS algorithms to include the nonlinearity in the model, either by imposing nonlinear relations between variables (Wold et al., 1989; Baffi et al., 1999a), or through a neural network framework (Baffi et al., 1999b; Doymaz et al., 2003; Zhao et al., 2006b). The search for the right nonlinear structure of the model can be very demanding.

1.3.1.2 Multiway multivariate models

When batch processes have to be examined and the third dimension (i.e., time) is present in the data, the most popular multivariate statistical strategy is multiway SPC (Nomikos and MacGregor, 1994). Multiway PCA (MPCA) and multiway PLS (MPLS) are statistically and algorithmically consistent with PCA and PLS, respectively. In fact, MPCA and MPLS are equivalent to performing PCA and PLS, respectively, on an augmented 2D matrix derived by unfolding the 3D matrix.


In the so-called batch-wise unfolding (BWU) method, the data are spread out in a 2D matrix that considers the data time order (Wise and Gallagher, 1996), putting the time slices of the original 3D matrix side by side. Simple pre-treatment of the input data (i.e., mean-centring⁷) can remove the major nonlinearity of the variables (Nomikos and MacGregor, 1995b). The result is that BWU-MPCA and BWU-MPLS summarize the variability of the data with respect to both the variables and their time evolution (Kourti and MacGregor, 1995). Accordingly, the cross-correlation between variables is explained together with the auto-correlation within each variable. Namely, the entire history of the batch is taken into account and the batch dynamics is properly represented in the model. This is an effective approach for a batch-to-batch monitoring strategy, but some problems arise in realtime monitoring during a batch run. In fact, the BWU approach starts to work well only once at least 10% of the batch history is available (Nomikos and MacGregor, 1995b), and it has two main drawbacks: i) the batch processes to be monitored must all have the same length, and ii) the entire history of a batch should be available during the batch evolution in order to complete the 2D process data matrix. To solve the latter problem, Nomikos and MacGregor (1995a) suggested filling the incomplete matrix under the hypothesis that either the future unknown observations conform to the mean reference conditions, or the current deviation from the mean variable trajectories remains unchanged for the rest of the batch duration. The problem of uneven batch duration is very demanding: using BWU-MPCA or BWU-MPLS requires effective methods for the alignment and synchronization of the variable time trajectories, by stretching or shrinking the batch run to the length of a reference one.
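Mechanically, batch-wise unfolding is a rearrangement of the 3D array into a 2D matrix with one row per batch; a short numpy sketch (with made-up dimensions I×J×K = 30×4×50) makes the operation and the subsequent mean-centring explicit.

```python
import numpy as np

I, J, K = 30, 4, 50                  # batches x variables x time samples
rng = np.random.default_rng(3)
X3d = rng.normal(size=(I, J, K))     # stand-in for the 3D batch data array

# Batch-wise unfolding: one row per batch, with the K time slices
# (J variables each) placed side by side -> shape (I, K*J)
X_bwu = X3d.transpose(0, 2, 1).reshape(I, K * J)

# Mean-centring across batches subtracts the average trajectory,
# which removes much of the nonlinearity of the variable profiles
X_bwu_centred = X_bwu - X_bwu.mean(axis=0)
```

Each row holds one batch's complete history, which is why BWU requires all batches to have the same length and, online, requires the still-unknown part of the current batch to be filled in.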
The most popular methods for the synchronization of the variable profiles are dynamic time warping (Kassidas et al., 1999) and the indicator variable (Westerhuis et al., 1999). The latter method uses a monotonic variable as a batch maturity index, so that the batches can be aligned, the indicator variable acting as an index of the percentage of batch completion. However, an indicator variable is not always available among the data. Dynamic time warping, on the other hand, is a signal synchronization technique based on a pattern-matching scheme between pairs of trajectories, expanding or compressing a variable profile to match a reference one. Despite some attempts to streamline the computational burden (Kaistha and Moore, 2001; Ündey et al., 2002), warping requires a computationally very expensive algorithm, and only few online applications of synchronization strategies have been reported (Fransson and Folestad, 2006; Srinivasan and Qian, 2005 and 2007). Additionally, synchronization is not always practicable, because it often entails interpolating the existing data at fictitious time points, which can alter the auto- and cross-correlation structure of the data (Kourti, 2003).
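The indicator-variable idea can be illustrated with a minimal, hypothetical sketch: a monotonic conversion signal is used to resample a temperature profile at fixed percentages of batch completion (all names and data here are invented for illustration):

```python
import numpy as np

# Align a batch of arbitrary length onto a common "batch maturity" grid using a
# monotonic indicator variable (e.g. conversion); data are synthetic.
rng = np.random.default_rng(1)
n = 80
conversion = np.sort(rng.uniform(0.0, 1.0, n))               # monotonic indicator, 0 -> 1
temperature = 300 + 20 * conversion + rng.normal(0, 0.5, n)  # a process variable

# Resample the variable at fixed percentages of batch completion:
grid = np.linspace(0.0, 1.0, 101)                            # 0%, 1%, ..., 100%
temperature_aligned = np.interp(grid, conversion, temperature)
```

Once every batch is resampled onto the same maturity grid, batches of uneven duration become directly comparable row by row.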

7 Mean-centring is a pre-treatment procedure that subtracts the mean of each variable from its actual value.

Thesis overview and literature survey


Alternative MPCA or MPLS strategies have been developed. One such approach refers to a different unfolding methodology for the 3D data structure, i.e. the so-called variable-wise unfolding (VWU). VWU (Wold et al., 1987) spreads out the batch data in a 2D matrix that preserves the direction of the variables, but does not consider the data time order. Variable-wise unfolded matrices are built by putting the horizontal slices of the original 3D matrix (i.e., the observations) one underneath the other. Using this procedure, neither estimating the future unknown part of the batch nor synchronizing the batches is necessary. This results in an easier online application than the BWU approach, because filling the incomplete matrix with fictitious observations and aligning variable profiles of uneven length would introduce a certain degree of arbitrariness. On the other hand, VWU has two disadvantages: i) it does not consider the time order, so the dynamics of the batch is lost and the autocorrelation of the variables' signals is not considered; and ii) the correlation structure is forced to be constant during the entire batch (Kourti, 2003). Accordingly, the issue in the VWU scheme is to take into account the dynamics of the process, the data auto-correlation, and the change of the cross-correlation over time. The dynamics of a process can be included in a VWU framework by assuming an autoregressive (AR) structure. An AR model regresses the present (or future) values of a variable through a linear combination of the values of the same variable at the previous time instants. This is completely consistent with the fact that in dynamic processes the current state depends on the past time points (Ku et al., 1995). This idea can be easily integrated into the VWU scheme by putting the VWU data side by side with the lagged versions of the variables' time signals, in the so-called dynamic PCA (DPCA) and dynamic PLS (DPLS) procedures. Lu et al. (2005b) introduced a dynamic structure to compute the dynamic effect both within a batch and between consecutive batches. In general, DPCA and DPLS are straightforward methods to take the process dynamics into account, and the result is a much more limited correlation of the system (Chen and Liu, 2002). However, the issues of data nonlinearity and of the change in the correlation structure are still present in the VWU approach.

1.3.1.3 Multiple multivariate models

Multiple model approaches based on a BWU strategy are: i) the local models (one model per sampling instant; Rännar et al., 1998); ii) the evolving models (one model for every sampling instant and all the past sampling instants; Louwerse et al., 2000; Ramaker et al., 2005); and iii) the moving window models (models for a limited part of the batch, i.e. the current sampling instant and a few past observations; Lennox et al., 2001; Lee et al., 2004). The abovementioned multiple model approaches do not necessitate filling the incomplete data matrix with future observations. However, they require the synchronization of the batches, and involve a very large number of models, which is not always feasible.
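The variable-wise unfolding and the lagged-matrix augmentation used by DPCA/DPLS, described above, can be sketched as follows (synthetic dimensions, illustrative only):

```python
import numpy as np

# Variable-wise unfolding (VWU) and a DPCA-style lagged augmentation,
# sketched on a synthetic 3D array of I batches x J variables x K time points.
I, J, K = 8, 3, 50
rng = np.random.default_rng(2)
X3 = rng.normal(size=(I, J, K))

# VWU: stack the (K x J) slice of each batch one underneath the other.
X_vwu = X3.transpose(0, 2, 1).reshape(I * K, J)

# DPCA-style augmentation with L lags: put each observation side by side
# with the L previous observations of the same batch.
L = 2
lagged = [np.concatenate([X3[i, :, k - l] for l in range(L + 1)])
          for i in range(I) for k in range(L, K)]
X_dyn = np.array(lagged)          # shape: (I*(K-L), J*(L+1))
```

PCA on `X_vwu` ignores the time order, while PCA on `X_dyn` sees each observation together with its recent past, which is the essence of the AR structure discussed above.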


The alternative is splitting the process into a sequence of approximately linear segments (Kourti, 2003), following a multi-model structure based on the VWU-MPCA (or MPLS) analysis. At first, the need for phase division was introduced for the monitoring of multiple operating modes in continuous processes (Hwang and Han, 1999), but it proved to be a viable and efficient solution for batch processes, too (Ündey and Çinar, 2002; Ündey et al., 2003a; Camacho and Picò, 2008a and 2008b). Therefore, more than one model is derived for a batch, one for each different phase within the batch (Zhao et al., 2006a). Multiple-phase modeling attenuates the problems related to the nonlinearity, and tracks the changes of the correlation between variables during the batch. Camacho and Picò (2006a and 2006b), Lu et al. (2004a, 2004b and 2004c), Lu and Gao (2005a and 2006), Zhao et al. (2007a and 2007b) and Yao and Gao (2009) have designed different strategies for automatic phase detection and switching.

1.3.1.4 Preliminary data treatment for multivariate statistical methods

The preliminary treatment of the multivariate input data can be performed through: i) multiresolution methodologies that decompose the input signals on different frequency scales, or ii) ARMA models and state-space modelling to remove the correlation between data. The latter methods intend to erase any correlation in the latent space of the PCs (or LVs). Indeed, the multivariate statistical representations usually show a high degree of autocorrelation of the PCs and the LVs, which determines a high rate of false alarms in SPC systems. Filtering the PCs (or LVs) with ARMA models can remove the autocorrelation. However, the univariate ARMA approach may not be sufficient to clear the correlation, as demonstrated by Xie et al. (2006).
Furthermore, the fault magnitude and time signatures of a process may be distorted by the ARMA filtering action (Lieftucht et al., 2006), so Kalman innovation or state-space models are preferable (Table 1.2), as they better represent the multivariate case (Ljung, 1999).

Table 1.2 Some papers on the methods for the data linearization based on Kalman innovations and state-space models.

Paper                      Topic
Xie et al. (2006)          Kalman innovation
Lieftucht et al. (2006)    Kalman innovation
Shi and MacGregor (2000)   state-space models
Li and Qin (2001)          state-space models
Treasure et al. (2004)     state-space models
Lee and Dorsey (2004)      state-space models
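The simplest instance of the ARMA filtering idea discussed above is whitening a single autocorrelated score with an AR(1) model; the sketch below uses synthetic data and a least-squares coefficient estimate (illustrative only, not the multivariate schemes of Table 1.2):

```python
import numpy as np

# Filter an autocorrelated "score" series with an AR(1) model and keep the
# residuals (innovations), which are approximately white.
rng = np.random.default_rng(3)
n = 500
t = np.empty(n)
t[0] = 0.0
for k in range(1, n):                 # synthetic AR(1) score with phi = 0.8
    t[k] = 0.8 * t[k - 1] + rng.normal()

# Least-squares estimate of the AR(1) coefficient: t[k] ~ phi * t[k-1]
phi = np.dot(t[:-1], t[1:]) / np.dot(t[:-1], t[:-1])
residuals = t[1:] - phi * t[:-1]      # innovations, close to white noise

def lag1_autocorr(x):
    """Lag-1 autocorrelation estimate of a series."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)
```

Monitoring limits applied to the whitened `residuals` rather than to the raw score avoid the inflated false-alarm rate caused by autocorrelation, at the cost of possibly distorting fault signatures, as noted above.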

In order to de-correlate the variables and to extract the deterministic features of a signal, the wavelet transform can be used. The wavelet transform can rectify any aperiodic, noisy, intermittent or transient signal, examining it in both the time and the frequency domains (Addison, 2002). Mathematically speaking, the wavelet transform is a convolution of a wavelet function with a signal, which converts the signal into a more amenable form (Addison, 2002). In fact, the transformed version of the signal is filtered in such a way as to be more easily manageable (linear, with stable white noise) by multivariate statistical techniques, making the approach suitable for data that are typically non-stationary and represent the cumulative effect of many underlying phenomena, each operating at a different scale, as in batch processes (Kosanovich and Piovoso, 1997). In this way, the contributions of the different resolution scales are detected for all the events whose behaviour changes over time and frequency. Once the signal is decomposed into different resolution scales, the multivariate statistical model can be built both in the frequency domain (through the approximations and the details of the signal) and in the time domain (reconstructing the filtered version of the signal). Usually, one model is built for each decomposition scale (Bakshi, 1998; Bakshi et al., 2001; Yoon and MacGregor, 2004; Lee et al., 2005b; Maulud et al., 2006; Chang et al., 2006), considering only the scales that are most interesting for the purpose of the monitoring, either by denoising the signal (Shao et al., 1999) or by removing the higher frequencies to avoid the effects of process drifts or seasonal fluctuations (Teppola and Minkkinen, 2000). Moreover, these techniques are very useful for unambiguous fault detection (Misra et al., 2002) and isolation (Reis et al., 2008).
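A minimal sketch of the approximation/detail split is one level of the Haar wavelet transform, the simplest possible wavelet (real applications typically use several decomposition levels and smoother wavelets):

```python
import numpy as np

# One level of the Haar wavelet transform: split a signal (of even length)
# into a low-frequency approximation and a high-frequency detail.
def haar_step(x):
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass: local averages
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass: local differences
    return approx, detail

def haar_inverse(approx, detail):
    """Perfect reconstruction of the original signal from one Haar level."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_step(signal)
```

Applying `haar_step` recursively to the approximation yields the multiscale decomposition; keeping the approximation and discarding (or thresholding) the details is the denoising strategy mentioned above.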

1.3.2 Multivariate image analysis

In recent years, some attractive industrial applications have involved the use of non-conventional and non-invasive sensors, such as cameras, for product quality characterization. Images are 2D light-intensity mappings of a 3D scene, and are characterized by several challenging issues:
• high dimensionality, because images may not only be monochromatic representations on gray levels, but may also have several transmission channels (e.g., RGB8 images, hyperspectral images, etc.);
• multivariate nature, because an image is an aggregation of a wide plurality of pixels9;
• different characteristics at different scales of resolution;
• high spatial correlation, because of the neighbourhood effect of the pixels;
• nonlinearity, because of the physical structure of the object represented in the image;
• combination of spatial and spectral information;
• presence of noise, a random fluctuation of the light intensity that is an artefact of the signal.

8 RGB is a representation of the colours in an additive model based on the primary colours red, green and blue.
9 In digital imaging, a pixel is the smallest piece of information of an image, arranged in a 2D grid.


Multivariate statistical methods are ideal techniques to deal with the high dimensionality of the images and their inherent multivariate nature. Accordingly, multivariate image analysis (MIA) has gained increasing interest (Geladi and Grahn, 1996) for both inferential modeling and statistical process control. MIA is a set of multivariate statistical techniques that allow images to be analyzed in a reduced-dimension space rather than in the image space (Kourti, 2005). The aim of this approach is to extract subtle multivariate information from the image, differently from the usual digital image processing, where the image is enhanced in such a way that its features become visible. Note that the problems of spatial correlation (correlation between pixels), neighbourhood, nonlinearity and noise can be faced analogously to what was suggested in Section 1.3.1. Indeed, nonlinear models, as well as multiway, multi-model and multiresolution approaches, can be extremely useful and well tailored to the purpose of image inspection. In fact, to a certain extent, it is possible to associate the concepts of neighbourhood and spatial correlation with those of process dynamics, auto- and cross-correlation, and the concept of spatial nonlinearity with that of temporal nonlinearity. Moreover, images combine spectral (in terms of both light intensity and colour) and spatial information. In the literature, the use of multiresolution MIA is often suggested (Liu and MacGregor, 2007; Bortolacci et al., 2006), where the spectral information is properly studied by the classical MIA approach, while the wavelet transform (Mallat, 1989; Ruttimann et al., 1998) is adopted to grasp the spatial information. Furthermore, the spatial information can be assessed by including the study of the textural features of the inspected image (Salari and Ling, 1995; Tessier et al., 2007).
In this way, effective image-analysis frameworks have been developed for the tasks of quality monitoring and control (Yu and MacGregor, 2003; Yu et al., 2003; Borah et al., 2007), quality classification (Bharati et al., 2004), and quality prediction (Tessier et al., 2006).
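The core MIA operation described above, analyzing an image in a reduced score space rather than in the image space, can be sketched on a synthetic RGB image (sizes and data are illustrative):

```python
import numpy as np

# MIA sketch: unfold an RGB image into a (pixels x channels) matrix, compute
# principal components of the spectral (colour) information, and map the first
# score back to the image plane.
rng = np.random.default_rng(4)
H, W = 32, 32
img = rng.uniform(0.0, 1.0, size=(H, W, 3))        # synthetic RGB image

X = img.reshape(-1, 3)                             # (H*W) x 3 two-way matrix
Xc = X - X.mean(axis=0)                            # mean-centre each channel
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA via SVD
scores = Xc @ Vt.T                                 # pixel scores on the PCs
score_image = scores[:, 0].reshape(H, W)           # first-PC score "image"
```

The score image (and the score-space density) is what MIA monitors, rather than the raw pixels; spatial and textural information would be added through the wavelet and texture analyses cited above.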

1.4 Thesis overview

As was mentioned earlier, the two main topics of this Thesis are the design of multivariate statistical techniques for: i) the realtime product quality estimation and length prediction in batch chemical processes, and ii) product quality monitoring through image analysis in batch manufacturing. The challenges of both topics are presented and discussed in the following.

1.4.1 Realtime quality estimation and length prediction in batch processes

In principle, the operation of a batch process is easy, because the processing usually evolves through a "recipe", i.e. a series of elementary steps (e.g.: charge; mix; heat-up/cool; react; discharge) that can be easily carried out even without supervision, if the production facility is outfitted with a fairly large degree of automation. However, it is often the case that batch plants are poorly instrumented and automated, and may require intervention by the operating personnel to provide online adjustments of the operating recipe with midcourse corrections, to avoid the production of off-specification products. In fact, if the instantaneous product quality is not found to track a specified trajectory, the processing recipe must be adjusted in real time (possibly several times during a batch), and the batch is kept running until the end-point quality meets the specification. Unfortunately, most batch processes are run in an open-loop fashion with respect to product quality control, because information about product quality is not available online, but is obtained offline from laboratory assays of a few product samples. To contain the laboratory-related expenses (in terms of need for dedicated personnel, consumption of chemicals, use of analysis equipment, etc.), only a few product samples are taken during the course of a batch and sent to the lab for analysis. Even so, in a typical industrial scenario where several productions are run in parallel, 15,000-20,000 samples may need to be taken and analyzed each year, the cost of which adds up to an important fraction of the total product cost. Because of the lack of real-time information on the product quality, it may be difficult to promptly detect quality shifts and to counteract them by adjusting the operating recipe accordingly. Therefore, significant drifts in the quality profiles may be experienced before any intervention can be done on the batch. The net result is that the recipe adjustments are delayed, the total length of the batch is increased, and the economic performance of the process is further penalized.
In this context, two typical challenges need to be addressed by a monitoring system in the production of specialty chemicals: the real-time estimation of the instantaneous quality of the product, and the real-time estimation of the length of the batch (or the length of any production stage within the batch). In fact, the performance of a batch process could be greatly improved if accurate and frequent information on the product quality were available. Software sensors (also called virtual sensors or inferential estimators) are powerful tools for this task. They are able to reconstruct online the estimate of "primary" quality variables from the measurements of some "secondary" process variables (typically, temperatures, flow rates, pressures, valve openings), by using a model that relates the secondary variables to the primary ones. These issues are addressed in this Thesis with reference to a real-world industrial case study, i.e. a batch process for the production of resins by polymerization. It is well known that developing a first-principles model to accurately describe the chemistry, mixing, and heat, mass and energy transfer phenomena occurring in a batch process (e.g.: polymerization; crystallization; etc.) requires a very significant effort. Several designed experiments may be needed to identify the most representative set of equations and all the related parameters. Furthermore, if the plant is a multi-purpose one, this effort must be replicated for all the products obtained in the same facility. Finally, the resulting first-principles soft sensor may be computationally very demanding for online use.


Multivariate statistical soft sensors may overcome these difficulties (Kresta et al., 1994; Chen et al., 1998; Neogi and Schlags, 1998; Chen and Wang, 2000; Kano et al., 2003; Kamohara et al., 2004; Zamprogna et al., 2004; Lin et al., 2007; Kano and Nakakagawa, 2008; Gunther et al., 2009). This class of inferential estimators does not require developing extra information on the process in terms of mechanistic equations or values assigned to physical parameters. Rather, they extract and exploit the information already embedded in the data, as these data become available in real time from the measurement sensors. Very often, a multivariate statistical method such as PLS can be exploited to design a soft sensor for the online estimation of quality properties. Several studies on the online estimation of product quality through multivariate statistical techniques are available for continuous polymerization processes. Most of the literature on the application of multivariate statistical methods to batch polymerization processes is related to the prediction of the end-point product quality only, or to batch classification, or is limited to simulation studies, as can be seen in Table 1.3.

Table 1.3 Literature review on the estimation of the product quality in polymerization processes: papers and topics.

Reference                              Processing   Problem               Data
Russel et al. (1998)                   continuous   realtime estimation   industrial
Komulainen et al. (2004)               continuous   realtime estimation   industrial
Lee et al. (2004)                      continuous   realtime estimation   industrial
Lu et al. (2004b)                      continuous   realtime estimation   industrial
Warne et al. (2004)                    continuous   realtime estimation   industrial
Kim et al. (2005)                      continuous   realtime estimation   industrial
Aguado et al. (2006)                   continuous   realtime estimation   industrial
Sharmin et al. (2006)                  continuous   realtime estimation   industrial
Zhang and Dudzic (2006)                continuous   realtime estimation   industrial
Zhao et al. (2006a)                    continuous   realtime estimation   industrial
Yabuki and MacGregor (1997)            batch        end-point estimation  industrial
Kaistha and Moore (2001)               batch        end-point estimation  industrial
Flores-Cerrillo and MacGregor (2004)   batch        end-point estimation  industrial
Ündey et al. (2004)                    batch        end-point estimation  industrial
Zhao et al. (2008b)                    batch        end-point estimation  industrial
Nomikos and MacGregor (1995)           batch        realtime estimation   simulation
Rännar et al. (1998)                   batch        realtime estimation   simulation
Chen and Liu (2002)                    batch        realtime estimation   simulation
Ündey et al. (2003a)                   batch        realtime estimation   simulation
Ündey et al. (2003b)                   batch        realtime estimation   simulation
Zhang and Lennox (2004)                batch        realtime estimation   simulation
Lu and Gao (2005)                      batch        realtime estimation   simulation
Camacho and Picò (2006)                batch        realtime estimation   simulation
Doan and Srinivasan (2008)             batch        realtime estimation   simulation
Zhao et al. (2008a)                    batch        realtime estimation   simulation
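As a hedged illustration of the PLS soft-sensor idea (a sketch on synthetic data, not the industrial implementation discussed in this Thesis), a minimal PLS1 regression with the NIPALS algorithm relating secondary measurements to a quality variable can be written as:

```python
import numpy as np

def pls1(X, y, n_lv):
    """PLS1 via NIPALS: returns the regression coefficients for centred data."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = X.T @ y
        w /= np.linalg.norm(w)            # weight vector
        t = X @ w                         # score
        p = X.T @ t / (t @ t)             # X loading
        q = (y @ t) / (t @ t)             # y loading
        X = X - np.outer(t, p)            # deflate X
        y = y - q * t                     # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.inv(P.T @ W) @ Q  # regression coefficients

# Synthetic "secondary" process measurements and "quality" variable:
rng = np.random.default_rng(5)
n, m = 200, 6
Xs = rng.normal(size=(n, m))
beta = np.array([1.0, -0.5, 0.3, 0.0, 0.0, 0.0])
ys = Xs @ beta + 0.05 * rng.normal(size=n)

B = pls1(Xs, ys, n_lv=3)
y_hat = (Xs - Xs.mean(axis=0)) @ B + ys.mean()     # soft-sensor estimate
```

In an actual soft sensor, `Xs` would hold temperatures, flow rates and pressures, and the model would be calibrated against the scarce laboratory measurements of the quality variable.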

Very few papers present industrial applications of multivariate statistical software sensors for the realtime estimation of the product quality in batch processes (Marjanovic et al., 2006; Chiang and Colegrove, 2007). In this PhD Thesis, multivariate statistical techniques are proposed to provide the online estimation of product quality in industrial batch polymerization processes. There are several specialty productions for which the total batch length is not known a priori, nor are the length and the number of the processing stages within the batch. Knowing the processing time in advance is useful for several reasons. In fed-batch processes, for example, fresh raw material and catalysts should be loaded into the process vessels at a convenient time instant to adjust the batch run in real time. The ability to estimate this instant in real time (it may change from batch to batch) can result in savings both in the number of quality measurements to be processed by the laboratory and in the required total processing time (Marjanovic et al., 2006). From a different perspective, the realtime estimation of the total length of the batch can be very useful for production planning and the scheduling of equipment use, as well as to coordinate the operating labor resources. For these reasons, the non-conventional use of multivariate statistical techniques for the realtime prediction of the batch length is suggested and discussed in this Thesis. The abovementioned multivariate statistical techniques are applied to an industrial case study of batch polymerization for the production of resins. This process is monitored online through a fairly large number of process measurements.
Several challenging features are present in this case study:
• process measurements are noisy, auto-correlated and cross-correlated;
• quality measurements are available offline from lab assays, but are scarce, delayed with respect to the sampling instant, and unevenly spaced in time (a case rarely considered in the literature);
• the batches evolve through a nominal recipe, which is subject to several online adjustments made by the plant personnel depending on the actual evolution of the batch (as monitored through the offline quality measurements) and on their personal judgment;
• the process is poorly automated;
• the batch length exhibits a large variability.
All of these features make each batch hardly reproducible, and the online quality estimation a challenge.

1.4.2 Multivariate statistical quality monitoring through image analysis

There is a class of products whose quality is related not to chemical or physical properties, but to surface properties (like roughness, pattern, colour, texture, and the like). For these products, quality is assessed by the analysis of an image of the manufactured device. In semiconductor manufacturing, for example, image analysis is used for quality monitoring, but only for the task of measuring the most important physical parameters of the manufactured device, although several other key features of the semiconductor which determine the device quality remain hidden and unmeasured. In particular, image inspections are used in photolithography. Photolithography is a process that selectively removes parts from a thin film using light, so that a geometric pattern can be transferred (often from a mask) to a light-sensitive chemical (the resist) deposited on a substrate. This process is used during the fabrication of integrated circuits (IC), as well as in many other micro-fabrication processes (e.g., micro-compressors in mechanics: Waits et al., 2005; biotechnology applications: Lee et al., 2008a). In particular, a microelectronics manufacturing process comprises an extensive sequence of complex semi-batch processes (Helbert and Daou, 2001), among which photolithography is regarded as one of the most important (Blais et al., 2001). In fact, photolithography: i) recurs up to 35 times for a given device; ii) defines the wafer critical dimension (CD) and the other most influential parameters; and iii) affects all the successive processing phases (e.g., the doping) and the interconnection between different segments of the device. From an economic point of view, lithography is responsible for about 60% of the processing time and 35-40% of the total cost of IC fabrication (Blais et al., 2001). As a consequence, it is quite clear that monitoring the product quality during photolithography through a fast, sensitive, and reliable system is highly desirable. Although considerable effort has been dedicated to defining technologies and procedures to meet the requirements on the product quality (Guldi, 2004; Yaakobovitz et al., 2007), automatic process control has not yet been implemented on a large scale in semiconductor manufacturing, and the industrial practice is often carried out empirically, with relatively little understanding of the underlying physics and chemistry (Edgar et al., 2000), or through run-to-run control strategies (Zhang et al., 2007 and 2008).
Statistical process control techniques, too, are sometimes adopted (Edgar et al., 2000; Yue et al., 2000; Waldo, 2001) in order to monitor the variability of the process, to detect abnormal conditions, and to identify the cause of a perceived anomaly. Currently, the most advanced monitoring strategies exploit hardware and software devices for both signal filtering and image processing (Rao, 1996; Lee, 2001). For instance, the use of scanning electron microscopy (SEM) images is common for the measurement of the physical parameters of a device (Knight et al., 2006), such as the CD (Constantoudis et al., 2003; Patsis et al., 2003). However, the typical inspection tools focus on inline optical metrology systems measuring the CD of the pattern and its variability; only the most sophisticated instruments also determine the edge height and the side-wall angle (SWA; El Chemali et al., 2004). Several important quality features, like the line edge roughness (LER), the edge surface smoothness, and the actual shape of an edge (and its variability), still elude effective, fast and low-cost monitoring technologies. Only recently have some researchers (e.g., Zhang et al., 2007; Yaakobovitz et al., 2007; Khan et al., 2008) suggested procedures to start tackling some of the above issues. Thus, the demand to satisfy the multiple requirements of wafer fabrication and the dynamics of a quickly changing microelectronics market call for new and more powerful monitoring tools. The quality of the manufacturing could be greatly improved if fast and more meaningful information were retrieved in a reliable fashion. For this reason, an innovative methodology is presented to inspect the surface of a product. In particular, the main components of the proposed quality monitoring strategy are:
• a sensitive filtering pre-treatment, to denoise the image signal by removing the artifacts (i.e., the non-systematic fluctuations of the image light intensity) without affecting the featured parts and their peculiar characteristics (i.e., the real surface roughness);
• tailored multivariate statistical monitoring models, based on a principal component analysis approach, which extract the information content on surface roughness and patterned shape.
In particular, the analysis is performed by PCA on different scales of resolution. Innovative modifications of the PCA model are proposed to analyze both the surface roughness and the shape of the patterned surface. The effectiveness of the proposed approach is tested on SEM images of semiconductor surfaces after the photolithography process, but the approaches are general and can also be applied to inspect a product through different types of images, in different phases of the same production system, or in different types of processes.

1.4.3 Thesis roadmap

Chapter 2 overviews the mathematical and statistical background of the methods adopted in this Thesis, i.e. multivariate statistical models and multiresolution techniques. In particular, PCA and PLS are presented, and the issues of both data pre-treatment and model enhancement are discussed. Finally, multiresolution methodologies are recalled. Chapter 3 describes the industrial process under study (i.e. the production of resins by batch polymerization). Details on the plant and on the production recipe are provided. The industrial supervision system is briefly presented. Chapter 4 shows how to design multivariate statistical estimators of the product quality for the process under study. Different architectures of the soft sensor are presented, and improvements of the estimation performance are proposed by including a multiphase structure and dynamic information on the process. The problem of the prediction of the batch length is the topic of Chapter 5, in which the effectiveness of time-evolving methods is demonstrated. In Chapter 6, the industrial implementation of prototypes of the abovementioned soft sensors is briefly described. Chapter 7 deals with the development of a fully automatic monitoring system for the characterization of the surface of high-value-added products by means of multiresolution and multivariate image analysis. Reference is made to the manufacturing of integrated circuits. A prototype interface for photolithography monitoring is also presented.


Final remarks conclude the Thesis.


Chapter 2

Mathematical and statistical background

This Chapter overviews the mathematical and statistical techniques that are adopted in the development of the multivariate and multiresolution quality monitoring strategies. Details about the multivariate statistical techniques and the multiresolution wavelet transformation are presented and discussed. In particular, the theoretical formulation of PCA and PLS is recalled. After that, it is shown how these techniques can be integrated into monitoring frameworks for batch processes. Finally, the wavelet transform techniques are reviewed, describing their ability to extract the properties of a signal through a multiscale decomposition.

2.1 Multivariate statistical techniques

In the following sections, the mathematical and statistical background of the multivariate statistical techniques used in this Thesis is overviewed. In particular, details are given on both principal component analysis and projection to latent structures, from both the theoretical and the algorithmic points of view.

2.1.1 Principal component analysis (PCA)

PCA is a multivariate statistical method that summarizes the information of a wide set of correlated data by projecting them onto a few fictitious orthogonal variables, which capture the variability of, and the correlation between, the original data. Suppose that a set of data (i.e., I observations of J variables) is collected in an (I×J) matrix X from an in-control reference, after being conveniently pre-treated (see §2.1.1.2). PCA performs a decomposition of the original variables onto the eigenvectors of the covariance matrix of X through a principal axis transformation (Jackson, 1991). In this way, PCA can find the combinations of the J original variables that describe the most meaningful trends of the dataset. From a mathematical point of view, PCA relies on an eigenvector decomposition of the covariance matrix of X:

$$ \boldsymbol{\Sigma} = \operatorname{cov}(\mathbf{X}) . \quad (2.1) $$

This method splits the matrix X of rank R into a sum of R matrices M_r of rank 1:

$$ \mathbf{X} = \mathbf{M}_1 + \mathbf{M}_2 + \dots + \mathbf{M}_r + \dots + \mathbf{M}_R , \quad (2.2) $$

in which every matrix M_r can be represented by the outer product of two vectors, the scores t_r and the loadings p_r:

$$ \mathbf{X} = \mathbf{t}_1\mathbf{p}_1^{\mathrm{T}} + \mathbf{t}_2\mathbf{p}_2^{\mathrm{T}} + \dots + \mathbf{t}_r\mathbf{p}_r^{\mathrm{T}} + \dots + \mathbf{t}_R\mathbf{p}_R^{\mathrm{T}} , \quad (2.3) $$

where $\mathbf{p}_r^{\mathrm{T}}$ is the transpose of $\mathbf{p}_r$. This operation is a principal axis transformation that shifts the data into a set of uncorrelated scores $\mathbf{t}_r$ described by orthogonal loading vectors $\mathbf{p}_r$. In fact, the simplest way to reduce the dimensionality of the original dataset is to find a standardized linear combination $\mathbf{p}_r^{\mathrm{T}}\mathbf{X}$ of the original variables (Härdle and Simar, 2007), which maximizes the covariance of the system to deal with the correlation between the original J variables:

$$ \operatorname*{arg\,max}_{\{\mathbf{p}_r:\,\|\mathbf{p}_r\|=1\}} \operatorname{cov}\!\left(\mathbf{p}_r^{\mathrm{T}}\mathbf{X}\right) = \operatorname*{arg\,max}_{\{\mathbf{p}_r:\,\|\mathbf{p}_r\|=1\}} \left[\mathbf{p}_r^{\mathrm{T}} \operatorname{cov}(\mathbf{X})\,\mathbf{p}_r\right] = \operatorname*{arg\,max}_{\{\mathbf{p}_r:\,\|\mathbf{p}_r\|=1\}} \left(\mathbf{p}_r^{\mathrm{T}} \boldsymbol{\Sigma}\,\mathbf{p}_r\right) \quad \text{with } r = 1,\dots,R . \quad (2.4) $$

The solution of the optimization problem corresponds to the maximization of a quadratic form for points on a unit sphere, which is the following eigenvector problem (Johnson and Wichern, 2007):

max_{p_r : ||p_r||=1} ( p_r^T Σ p_r ) = λ_r   with r = 1, …, R ,   (2.5)

where the loadings p_r are eigenvectors of Σ, and the λ_r are the eigenvalues associated with the p_r:

Σ p_r − λ_r I p_r = 0 ,   (2.6)

where I is the identity matrix, and the p_r are the direction cosines of the new coordinate system onto which the original data are projected. As a result, λ_r is a measure of the variance explained by the product t_r p_r^T, where variance assumes the meaning of quantity of information embedded in the model. Geometrically, the scores are orthogonal:

var(t_r) = p_r^T Σ p_r = λ_r ,
cov(t_r, t_s) = p_r^T Σ p_s = 0   for r ≠ s ,   (2.7)

while the loadings are orthonormal:

p_r^T p_s = 0   for r ≠ s ,
p_r^T p_r = 1 .   (2.8)

Furthermore, since the PCs have zero mean and zero mutual covariance, it follows that:


Mathematical and statistical background

∑_{r=1}^{R} var(t_r) = tr(Σ)   and   ∏_{r=1}^{R} var(t_r) = |Σ| ,   (2.9)

where tr(Σ) is the trace of the covariance matrix and |Σ| is its determinant. As underlined by Jackson (1991), the new variables t_r are the principal components (PCs) of X, and the terms of equation (2.3) are usually presented in descending order of the eigenvalues (explained variance). When the data comprise a large number of highly correlated variables, X is not a full-rank matrix and it is possible to represent it through a small number of PCs, in such a way that the greatest part of the variance is captured by a limited number of latent variables, defining A PCs with A ≪ R.
A new observation x_{I+1} is flagged as abnormal when it violates the confidence limits of the monitoring statistics, i.e. within the model (T²_{I+1} > T²_lim) or outside the model (SPE_{I+1} > SPE_lim(α)). This monitoring procedure is equivalent to testing the hypothesis of conformance of x_{I+1} to the reference set for both T²_{I+1} and SPE_{I+1}. However, a word of caution regarding the use of confidence limits on scores and Hotelling statistics is advised by Wise and Gallagher (1996), since the confidence limits can be found only under specified conditions. When the hypothesis of normal and uncorrelated input data (IID, independent and identically distributed variables) is assumed, the central limit theorem¹ can be invoked. The assumption that a sample is drawn from an IID population is necessary to obtain the distribution of the test statistics, to build the confidence limits, and to estimate the proportion of a population that falls within certain limits (Jackson, 1991). In fact, the central limit theorem, which ensures that the scores (linear combinations of the original variables) derived from a sufficiently large dataset X are normally distributed, can be invoked only if the J variables of X are IID random variables. On the contrary, if the original variables are not IID, this fundamental assumption is violated, the scores are not normally distributed, and the confidence limits are not valid in the aforesaid form.
Therefore, PCA and PLS models can be adequate representations of a phenomenon only if there is no data autocorrelation, the cross-correlation among variables is constant across the available samples, and the original variables are normally distributed (Kourti, 2003). In other words, multivariate statistical techniques are successful only when common-cause variability affects a process, and when the process variables are normally distributed and independent over time or space. These conditions are rarely satisfied: processes and products often show clearly non-linear behaviour and changes in the correlation structure between variables, in space and in time. Finally, a geometrical interpretation of PCA and PLS is shown in Figure 2.2. The samples of an optimal reference are projected from the original space onto a space of reduced dimension made of latent variables, which are the directions of maximum variability of the data. Within this sub-space, the new observations can be analyzed through the T² and SPE indices. In particular, the T² index indicates the distance of the new observation from the average conditions of the reference, while the SPE indicates the distance of the new observation from the hyper-plane of the latent variables.
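As a numerical illustration, the eigendecomposition of equations (2.5)–(2.9) and the computation of the two monitoring indices can be sketched as follows. This is a minimal sketch on synthetic data: the dataset, the variable names, and the number of retained PCs are illustrative assumptions, not values from the Thesis, and the confidence limits T²_lim and SPE_lim are omitted since their expressions are derived elsewhere in this Chapter.

```python
# Sketch: PCA by eigendecomposition of the covariance matrix, plus the
# Hotelling T^2 and SPE statistics for a new observation. Synthetic data.
import numpy as np

rng = np.random.default_rng(0)

# Reference dataset X (I observations x J variables), autoscaled.
I, J, A = 200, 5, 2                       # A = number of retained PCs (assumed)
latent = rng.normal(size=(I, 2))
X = latent @ rng.normal(size=(2, J)) + 0.1 * rng.normal(size=(I, J))
X = (X - X.mean(0)) / X.std(0, ddof=1)

# Eigendecomposition of Sigma = cov(X), eigenvalues sorted in descending order.
Sigma = np.cov(X, rowvar=False)
eigval, eigvec = np.linalg.eigh(Sigma)    # eigh returns ascending order
order = np.argsort(eigval)[::-1]
lam, P = eigval[order][:A], eigvec[:, order][:, :A]   # loadings P is (J x A)

T = X @ P                                 # scores
# var(t_r) = lambda_r, as in equation (2.7):
assert np.allclose(T.var(0, ddof=1), lam)

# Monitoring statistics for a new observation x_new:
x_new = X[0]
t_new = x_new @ P
T2 = np.sum(t_new**2 / lam)               # distance within the model
res = x_new - t_new @ P.T                 # residual in the original space
SPE = np.sum(res**2)                      # distance from the latent hyper-plane
print(T2, SPE)
```

Note that the scores are computed by projection (T = XP); the example only verifies numerically that their variances reproduce the eigenvalues, as equation (2.7) states.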

¹ The central limit theorem states that the sum of a sufficiently large number of IID random variables tends to be normally distributed.

Figure 2.2 Geometrical interpretation of the confidence limits in multivariate SPC and SQM. A high T²_i indicates anomalies within the model; a high SPE_i indicates a lack of model representativity.

A large value of the Hotelling statistic (i.e., the new observation falls outside the elliptical limits) indicates an unusual variation within the model, while a large SPE value (i.e., the new observation exceeds the limit perpendicular distance from the hyper-plane of the latent variables) identifies anomalies outside the model.
2.1.3.1 Contribution plots, limits on the contribution plots, and relative contributions
When a new observation x_{I+1} does not meet the NOC and an abnormal variation is detected by

the monitoring charts, further analyses are needed to find which variable (or set of variables) causes the current state of the process (product) to be out of control (out of spec). The contributions of the J variables to the observed value of T²_{I+1} or SPE_{I+1} help to make a sound guess about the assignable causes of the abnormality (Nomikos, 1996). The use of contribution plots is the most common approach to detect the root cause of the problem. The contribution plots evaluate the contribution of each original variable j to the relevant monitoring statistic, either T² or SPE. When an anomaly is detected by the Hotelling statistic or by the residuals, it is helpful to compare the contribution of every original variable to the relevant statistic with the usual value of that contribution under the NOC identified by the reference dataset. For this reason, the use of confidence bounds for the contributions to the Hotelling statistic and to the residuals was proposed (Conlin et al., 2000).


The contribution c^{T²}_{i,j} of every variable j to T²_i for an observation x_i is determined from the square root of the Hotelling statistic of equation (2.44):

c^{T²}_{i,j} = t_i Λ^{−1/2} p_j^T .   (2.61)

This is derived from the contribution c^t_{i,j} of every variable j to the scores that compose T²_i:

c^t_{i,j} = x_i p_{i,j} ,   (2.62)

where x_i is the row vector of the data matrix X referring to the i-th observation, t_i is the row vector referring to the i-th observation of the score matrix T, p_j is the row vector referring to the j-th variable of the loading matrix P, and p_{i,j} is an element of P. Similarly, the contribution c^E_{i,j} of every variable j to the squared prediction error SPE_i of the i-th observation is a single element e_{i,j} of the residual matrix E:

c^E_{i,j} = e_{i,j} .   (2.63)

The values of c^{T²}_{i,j}, c^t_{i,j}, and c^E_{i,j} describe how each variable contributes to the Hotelling statistic, to the scores, and to the residuals, respectively, and can be positive or negative (Westerhuis et al., 2000). In summary, it is possible to collect the contributions of all the J variables for all the I observations in the (I×J) matrices of the contributions to T² and SPE:

C^{T²} = { c^{T²}_{i,j} }   with i = 1, …, I and j = 1, …, J ,   (2.64)

and:

C^E = { c^E_{i,j} }   with i = 1, …, I and j = 1, …, J ,   (2.65)

respectively.
Based on the assumption that both the contributions c^{T²}_{i,j} to the T² statistic and the contributions c^E_{i,j} to the residuals are IID, the 100(1−α)% confidence intervals for the contributions can be found by:

c^{T²}_{j,lim}(α) = c̄^{T²}_j ± z_{α/2} s_{c^{T²}_j} ,   (2.66)

c^E_{j,lim}(α) = c̄^E_j ± z_{α/2} s_{c^E_j} ,   (2.67)


which are the upper (when the sign + is retained) and lower (when the sign − is retained) confidence bounds. The confidence limits of the contribution plots of equations (2.66) and (2.67) are calculated from the average contributions that a given variable j assumes over all the I observations of the reference, c̄^{T²}_j and c̄^E_j:

c̄^{T²}_j = (1/I) ∑_{i=1}^{I} c^{T²}_{i,j} ,   (2.68)

c̄^E_j = (1/I) ∑_{i=1}^{I} c^E_{i,j} ,   (2.69)

and from the respective standard deviations s_{c^{T²}_j} and s_{c^E_j}:

s_{c^{T²}_j} = [ (1/I) ∑_{i=1}^{I} ( c^{T²}_{i,j} − c̄^{T²}_j )² ]^{1/2} ,   (2.70)

s_{c^E_j} = [ (1/I) ∑_{i=1}^{I} ( c^E_{i,j} − c̄^E_j )² ]^{1/2} ,   (2.71)

where the mean values c̄^{T²}_j and c̄^E_j should be zero, because they should derive from standard normal distributions. Therefore, when the value of T²_{I+1} or SPE_{I+1} exceeds the respective confidence limit during the monitoring of a new observation x_{I+1}, instead of considering the absolute value of the contributions, their relative size has to be inspected (Choi and Lee, 2005); this can be done by comparing the contribution of a single variable to the average contribution of the same variable in the reference NOC. The cause of the T²_{I+1} or SPE_{I+1} alarm can be diagnosed by comparing the current values of the contributions c^{T²}_{i,j} or c^E_{i,j} to the respective limits for the entire set of original variables j = 1, …, J. In particular, if T²_{I+1} > T²_lim(A, I, α) and a variable j* (with j* = 1, …, J) is found to satisfy:

c^{T²}_{I+1,j*} > c^{T²}_{j*,lim}(α) ,   (2.72)

then j* is the variable that “feels” the effect of the fault on T²_{I+1}. In the same way, if SPE_{I+1} > SPE_lim(α) and:

c^E_{I+1,j*} > c^E_{j*,lim}(α) ,   (2.73)

for a given j*, then j* is suspected to be the variable mainly affected by the root cause of the anomaly. When (2.72) or (2.73) is satisfied for more than one value of j*, the effect of the fault is distributed over different variables. If the anomaly distributes over all the J variables, the variable with the highest contribution-to-contribution-limit ratio, c^{T²}_{I+1,j}/c^{T²}_{j,lim}(α) or c^E_{I+1,j}/c^E_{j,lim}(α), is the one most responsible for the perturbation of the system. Therefore, interrogating the relative contributions proves to be one of the most powerful methods to obtain a diagnosis whenever a fault is detected (Facco, 2005; Choi and Lee, 2005).
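The contribution machinery of equations (2.61)–(2.73) can be sketched numerically as follows. This is an illustrative sketch on synthetic data: the simulated fault (a bias added to one variable), the z-value for a two-sided 95% interval, and all array names are assumptions made for the example, not values from the Thesis.

```python
# Sketch: contributions to T^2 and SPE, their confidence bounds, and the
# relative-contribution diagnosis of a faulty observation. Synthetic data.
import numpy as np

rng = np.random.default_rng(1)
I, J, A = 200, 4, 2
X = rng.normal(size=(I, 2)) @ rng.normal(size=(2, J)) + 0.1 * rng.normal(size=(I, J))
X = (X - X.mean(0)) / X.std(0, ddof=1)

eigval, eigvec = np.linalg.eigh(np.cov(X, rowvar=False))
idx = np.argsort(eigval)[::-1]
lam, P = eigval[idx][:A], eigvec[:, idx][:, :A]
T = X @ P
E = X - T @ P.T                                   # residual matrix

# Contributions of each variable j (eqs. 2.61 and 2.63):
C_T2 = T @ np.diag(lam**-0.5) @ P.T               # (I x J) matrix of c^{T2}_{i,j}
C_E = E                                           # c^E_{i,j} = e_{i,j}

# Upper confidence bounds per variable (eqs. 2.66-2.71), z for ~95% two-sided:
z = 1.96
cT2_lim = C_T2.mean(0) + z * C_T2.std(0)
cE_lim = C_E.mean(0) + z * C_E.std(0)

# Diagnose a faulty observation: a bias added to variable 2 (illustrative).
x_fault = X[0].copy()
x_fault[2] += 5.0
t_f = x_fault @ P
c_t2 = t_f @ np.diag(lam**-0.5) @ P.T             # contributions of x_fault to T^2
c_e = x_fault - t_f @ P.T                         # contributions of x_fault to SPE

# The variable with the largest contribution-to-limit ratio is the prime suspect:
suspect_t2 = int(np.argmax(np.abs(c_t2) / np.abs(cT2_lim)))
suspect_spe = int(np.argmax(np.abs(c_e) / np.abs(cE_lim)))
print("suspects:", suspect_t2, suspect_spe)
```

Note that a fault on a single variable generally spreads over both the score space and the residual space, so the two suspects need not coincide; inspecting both ratios, as the text suggests, gives a more robust diagnosis.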

2.1.4 Enhancements of multivariate statistical methods
In Chapter 1 it was mentioned that some of the main complications that may arise when dealing with data through multivariate statistical methods are: i) the varying nature of the data along time or space, and ii) the changing correlation structure between data. In addition to being multivariate in nature, process data are often highly auto- and cross-correlated, and often non-linear. As a result, data are far from being normally distributed or independent, both across variables and across observations that are neighbours in time or space. For these reasons, the time/space-varying nature and the change in the correlation structure of the data have to be taken into account. Furthermore, non-normally distributed input data make the application of the abovementioned control limits in the monitoring charts impossible. In particular, the methods described in the previous sections can be applied only to two-dimensional matrices X (I×J) and Y (I×Q) of IID data. These methodologies are good mathematical representations of the relationship between variables only when the correlation between the J variables remains the same throughout the evolution of a batch. It is often the case that data have a defined order in time or space. This adds a third dimension to the data array, and the variability in the third dimension should be considered in addition to the correlation between variables. For example, in chemical batch processes the variables are not at steady state, but show time trajectories. Another example is that of images, which can be represented by matrices of light intensities (also in different spectral channels), where neighbouring data (i.e., pixels) are correlated in space.
In these examples, process/product data can be collected in 3D arrays X (I×J×K_i) or Y (I×Q×H_i), where the I different batches (or different images) are treated as different observations, while time, space, or the different spectral channels represent the third dimension. K_i and H_i are the numbers of samples collected along time (space) for observation i, in the matrices X and Y respectively. Correlation is present both in the direction of the variables j (cross-correlation) and in the direction of the time/space samples k_i or h_i (auto-correlation). Further complications arise when the 3D matrices X and Y have an irregular shape, due to differences in the number of samples (k_i or h_i) taken in time (or space). Moreover, the time trajectories of the different process variables or quality variables are sometimes not


synchronized between the I batches, or the spatial characteristics of an image are not aligned between the I observations, so that K_r ≠ K_s and H_r ≠ H_s for r ≠ s, with r = 1, …, I and s = 1, …, I. To deal with the changing nature of the correlation structure between data and with the varying nature of the data, multi-way multivariate statistical techniques are commonly used.
2.1.4.1 Multi-way methods, data unfolding and data synchronization/alignment
When the third dimension (i.e., time or space) is present in the data, and when the data collected in an ordered manner are assembled in regular 3D matrices X (I observations × J variables × K samples), multi-way SPC (Nomikos and MacGregor, 1994) has proved to be a very effective strategy. Multi-way PCA (MPCA) and multi-way PLS (MPLS) are consistent with PCA and PLS, respectively, from both the mathematical and the algorithmic points of view. In fact, MPCA/MPLS have the same aims and benefits as PCA/PLS, because they are equivalent to performing PCA/PLS on enlarged 2D matrices derived by unfolding the 3D data matrices:

X = ∑_{r=1}^{A} t_r P_r + E ,   (2.74)

where the t_r are the score vectors, the P_r are the loading matrices (collecting the loadings for all the J variables and all the K samples, i.e. the directions of maximum variability for every variable at every sample in time or space), and E is the residual matrix. “Unfolding” is a technique to derive 2D matrices by spreading out the original 3D matrices in a meaningful way, so as to highlight the relevant variability to be inspected. Different unfolding methods have been developed (Kourti, 2003), corresponding to different ways of unfolding the 3D matrix, but two of them are the most significant (Figure 2.3):
• batch-wise unfolding (BWU);
• variable-wise unfolding (VWU).
BWU spreads out the 3D data into 2D matrices X_BWU that preserve the time/space order of the data (Wise and Gallagher, 1996), putting the time slices of the original 3D matrix side by side along the direction of the batches. Consider a 3D matrix X = {x_{i,j,k}} with i = 1, …, I, j = 1, …, J, k = 1, …, K, where:
• the i-th horizontal slice X_i is the matrix of the trajectories of all the J variables over all the K samples in time or space for observation (i.e., batch or image) i;
• the j-th vertical slice X_j is the matrix of the time/space evolution of variable j for all the K samples and all the I observations;
• the k-th vertical slice X_k is the matrix of the time/space sample k for all the J variables and all the I observations.


Figure 2.3 Unfolding of the three-dimensional data matrix in both the variable-wise direction and the batch-wise direction.

MPCA and MPLS can be performed by applying PCA and PLS, respectively, to the batch-wise unfolded 2D matrix:

X_BWU = [ X_{k=1}  X_{k=2}  ⋯  X_{k=K} ] ,   (2.75)

which is an (I×JK) matrix. Mean-centring the batch-wise unfolded data matrix (i.e., subtracting the mean trajectory of each variable) removes the major non-linearity of the input variables (Nomikos and MacGregor, 1995b), and summarizes the variability with respect to both the variables and their time/space evolution (Kourti and MacGregor, 1995). Accordingly, the


cross-correlation between variables is analyzed together with the auto-correlation in time/space within each variable. This means that, in the example of batch processes, the entire history of the batch is taken into account and the batch dynamics are properly represented in the model. In the example of images, the spatial structure (or the set of spectral channels) is considered throughout the entire image in BWU. However, some difficulties arise in the real-time application of BWU to batch processes, because data are collected sequentially and are available for the entire batch only after the completion of the batch itself. BWU is therefore successfully applied to run-to-run monitoring and control strategies, but when online applications are required some issues have to be faced; for instance, before batch completion BWU works well only if at least 10% of the batch history is already available (Nomikos and MacGregor, 1995b). Furthermore, BWU presents two main limitations regarding the data collected in real time:
• data are often not synchronized/aligned;
• data are not available for the entire batch, which prevents a sequential test during a batch run.
The latter problem can be solved by filling the incomplete matrix for the future unknown samples under three alternative hypotheses (Nomikos and MacGregor, 1995a):
• the future samples conform to the mean reference conditions;
• the current deviation from the mean variable trajectories remains unchanged for the rest of the batch duration;
• the ability of PCA and PLS to handle missing data is exploited, using the abovementioned missing-data treatment methods.
The synchronization of batches of uneven duration can be a very demanding issue. Using MPCA or MPLS on batch-wise unfolded data requires effective methods for the alignment/synchronization of the variables’ image features or time trajectories, stretching or shrinking the actual observation to the length of a reference one.
The most popular synchronization methods are:
• dynamic time warping (Kassidas et al., 1999);
• the indicator variable (Westerhuis et al., 1999).
VWU (Wold et al., 1987) represents the data in 2D matrices X_VWU (IK×J) that preserve the direction of the variables (Eriksson et al., 2001) and do not consider the time or space order of the data, because they are constituted by stacking the observation slices X_i of the original 3D matrix vertically, one underneath the other:

X_VWU = [ X_{i=1} ; X_{i=2} ; ⋯ ; X_{i=I} ] .   (2.76)


With this procedure it is neither necessary to estimate the future unknown part of the batch, nor to synchronize or align the signals, so the VWU approach is easier to implement online. However, since the time/space order is not considered, the dynamics of the data are lost in batch processing, and the effect of the neighbourhood is lost in images: in summary, the auto-correlations are not considered in VWU. Furthermore, VWU forces the correlation structure between data to be constant within the entire batch or image (Kourti, 2003), but considering a fixed and unchangeable correlation structure of the data is too restrictive a condition. Accounting for auto-correlation and for the change of cross-correlation over time is the main difficulty of the VWU scheme.
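The two unfolding schemes of equations (2.75) and (2.76) can be sketched with numpy on a small, regular 3D array. The array contents are illustrative, and the chosen column ordering of X_BWU (all J variables at the first sample, then all J variables at the second, and so on) is one of several equally valid arrangements of the JK columns.

```python
# Sketch: batch-wise (BWU) and variable-wise (VWU) unfolding of a regular
# 3D data array X (I batches x J variables x K time samples).
import numpy as np

I, J, K = 3, 4, 5
X3 = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# BWU: put the K time slices X_k (each I x J) side by side -> (I x JK).
# Column order here: [all J variables at k=0, all J at k=1, ...].
X_bwu = X3.transpose(0, 2, 1).reshape(I, K * J)
assert X_bwu.shape == (I, J * K)

# VWU: stack the I batch slices (each K x J) on top of each other -> (IK x J).
X_vwu = X3.transpose(0, 2, 1).reshape(I * K, J)
assert X_vwu.shape == (I * K, J)

# Check: row i*K + k of the VWU matrix is the vector of the J variables
# at time sample k of batch i.
assert np.array_equal(X_vwu[1 * K + 2], X3[1, :, 2])
```

After BWU, PCA/PLS on X_bwu treats each batch as one observation (run-to-run view); after VWU, each time sample is an observation, which is why the time order, and hence the auto-correlation, is discarded.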

2.2 Multiresolution decomposition methods

Multiresolution decomposition methods are techniques that transform a signal into a representation that is more useful and more easily manageable (Addison, 2002). This decomposition is carried out by a transformation process: the wavelet transform. The procedure employs wavelet functions, i.e. localized waveforms, to map the signal from its original domain to the frequency domain. In this way the signal can be converted into a series of profiles which are more linear and more normally distributed.

Figure 2.4 Example of a “Mexican hat” wavelet: (a) location along the domain and (b) dilation of the scale.


The wavelet transform mechanism entails the comparison between a wavelet of scale a and location b and an arbitrary signal x. To carry out a proper decomposition, the waveform can be translated by varying its location b (i.e., moved along the domain in which it is defined, such as time or space) or dilated by varying its scale a (i.e., shrinking or widening the wavelet; Figure 2.4). The transform gives a positive contribution where the signal and the wavelet are both positive or both negative, while it is negative where the signal and the wavelet have opposite signs (Figure 2.5). The higher the correlation between signal and wavelet, the higher the absolute value of the transform.

Figure 2.5 Mechanism of transformation of a signal through wavelet transform.

This means that, when the signal trajectory has approximately the same shape and size as the wavelet profile at a given location, the transform produces a large positive value, and vice versa when signal and wavelet are out of phase. As a consequence, the smallest wavelet scales are correlated with the highest frequencies of the signal (i.e., noise), while the widest wavelet scales are related to the long-term fluctuations of the signal, such as drifts or seasonal effects. Note that an advantage of multiresolution techniques is that the signatures of a signal in its domain (i.e., time or space) are maintained in the transformation of the signal from the original domain to the frequency domain. In fact, the transformed signal can be rebuilt preserving the time/space information (which is unfeasible, for example, with the Fourier transform). In the following sections the mathematical and algorithmic aspects of the wavelet transform are presented, together with their main applications.
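The translation/dilation mechanism just described can be sketched numerically with a Mexican hat wavelet (cf. Figure 2.4). The test signal (a single Gaussian bump), the discretization grid, and the function names are illustrative assumptions, and normalization constants are omitted for simplicity.

```python
# Sketch: sliding a "Mexican hat" wavelet of scale a and location b along a
# signal; the transform T(a, b) measures their local similarity.
import numpy as np

def mexican_hat(t):
    # Second derivative of a Gaussian (up to a normalization constant).
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

s = np.linspace(0.0, 10.0, 1000)
ds = s[1] - s[0]
x = np.exp(-((s - 3.0) ** 2))             # a bump centred at s = 3

def cwt(x, s, a, b):
    # Discretized T(a, b) = integral of x(s) * psi((s - b)/a) / sqrt(a) ds.
    psi = mexican_hat((s - b) / a) / np.sqrt(a)
    return np.sum(x * psi) * ds

T_on = cwt(x, s, a=1.0, b=3.0)            # wavelet centred on the bump
T_off = cwt(x, s, a=1.0, b=8.0)           # wavelet far from the bump
print(T_on, T_off)                         # T_on is large; T_off is near zero
```

Scanning b over the whole domain and a over a range of scales would produce the full time–scale map of the transform; here only two locations are evaluated to show the correlation mechanism.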

2.2.1 Continuous and discrete wavelet transform
In mathematical terms, the wavelet transform is the convolution of a signal x with ψ*_{a,b}(s), the complex conjugate of a “mother” wavelet function ψ_{a,b}, integrated over the signal range:

T(a, b) = ∫_{−∞}^{+∞} x(s) ψ*_{a,b}(s) ds ,   (2.77)

where s ∈ ℝ identifies the domain. The localized and normalized waveform of the mother wavelet is:

ψ_{a,b}(s) = (1/√a) ψ((s − b)/a) ,   (2.78)

where a is a dilation parameter and b a location parameter. This family of translations and dilations is a basis of the Hilbert space L²(ℝ) of square-integrable functions. The transformation procedure compares the signal to the mother wavelet, shifting its location b and shrinking or stretching its scale a. If the signal and the wavelet are both positive or both negative in the original domain, the transform is positive; otherwise it is negative. The practical implementation of the wavelet transform entails the discretization of the scales a and of the step size between different locations b. The discretization can be given by:

ψ_{m,n}(s) = (1/√(a₀^m)) ψ((s − n b₀ a₀^m)/a₀^m) ,   (2.79)

where m and n are integer parameters which control the wavelet dilation and translation, respectively. The size of the translation step is Δb = b₀ a₀^m, and the transform becomes:

T_{m,n} = ∫_{−∞}^{+∞} x(s) (1/a₀^{m/2}) ψ(a₀^{−m} s − n b₀) ds = ⟨x, ψ_{m,n}⟩ .   (2.80)

The inner products of x and ψ_{m,n} are called the detail coefficients T_{m,n}. The simplest and most efficient discretization is the so-called dyadic grid, which generates orthonormal wavelets, with a₀ = 2 and b₀ = 1:

ψ_{m,n}(s) = 2^{−m/2} ψ(2^{−m} s − n) .   (2.81)

In a discretized wavelet transform there is a finite number of wavelet coefficients, each requiring the evaluation of an integral. After the signal has been passed through the abovementioned high-pass filter ψ_{m,n}, another function φ_{m,n} (the so-called “father” wavelet) is needed to avoid this numerical complication. The father wavelet has the same form as the mother wavelet:

φ_{m,n}(s) = 2^{−m/2} φ(2^{−m} s − n) ,   (2.82)


which is orthogonal to its translations, but not to its dilations, and performs low-pass filtering; it is a scaling function, which establishes the multiresolution features of the wavelet decomposition. The convolution of the scaling function with the signal produces the approximation coefficients:

S_{m,n} = ∫_{−∞}^{+∞} x(s) (1/a₀^{m/2}) φ(a₀^{−m} s − n b₀) ds ,   (2.83)

so that the continuous approximation of the signal at scale m can be generated by summing a sequence of scaling functions at that scale, factored by the approximation coefficients:

x_m(s) = ∑_{n=−∞}^{+∞} S_{m,n} φ_{m,n}(s) .   (2.84)

This is an approximated, smoothed version of the original signal. The original signal itself can be rebuilt following the reconstruction representation of the inverse wavelet transform:

x_{m−1}(s) = x_m(s) + d_m(s) .   (2.85)

The reconstruction has no redundancy, because of the orthonormality of the wavelets. The term d_m(s) is constituted by the detail coefficients at scale m:

d_m(s) = ∑_{n=−∞}^{+∞} T_{m,n} ψ_{m,n}(s) .   (2.86)

The result is that a signal can be represented by combining the approximation coefficients and the series expansion of the details:

x(s) = ∑_{n=−∞}^{+∞} S_{M,n} φ_{M,n}(s) + ∑_{m=−∞}^{M} ∑_{n=−∞}^{+∞} T_{m,n} ψ_{m,n}(s) ,   (2.87)

where M is the index of the chosen scale. For a signal of finite length N_x, M = log₂ N_x is the maximum number of scales which can be investigated with the dyadic grid discretization. In summary, the wavelet transform is a band-pass filter, which makes the components within a predefined and finite frequency range fall into the detail coefficients at each scale (Addison, 2002). Namely, at each scale the original signal is increasingly cleansed of its higher-frequency components, by means of two complementary filters: a low-pass filter and a high-pass one. From the numerical point of view, the wavelet pyramidal algorithm (Mallat, 1989) sequentially decomposes a signal trajectory into a series of approximated versions of the profile (lower-frequency scales) and details (higher-frequency scales), iterating the procedure at every decomposition level.

Figure 2.6 Schematic diagram of the algorithm for the wavelet transform filtering.

Passing through the filters, the original signal is split into two parts: the approximation (which retains the high-scale, low-frequency part of the signal) and the detail (which summarizes the high-frequency, low-scale part). In this way the original signal can be studied at different resolution scales, or denoised and detrended in a meaningful manner.
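One step of this filter-bank scheme can be sketched with the Haar filter pair, the simplest choice. The function names and the test signal are illustrative assumptions; real applications would typically rely on a wavelet library rather than on this hand-rolled version.

```python
# Sketch: one step of the pyramidal algorithm with Haar filters. The low-pass
# branch gives the approximation, the high-pass branch the detail, each with
# dyadic down-sampling; the inverse step rebuilds the signal exactly.
import numpy as np

def haar_step(x):
    # x must have even length; pairwise sums and differences, scaled by
    # 1/sqrt(2) so that the transform is orthonormal.
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass + down-sampling
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass + down-sampling
    return approx, detail

def haar_inverse(approx, detail):
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)    # up-sample and recombine
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a1, d1 = haar_step(x)          # scale m = 1: half-length approximation/detail
a2, d2 = haar_step(a1)         # scale m = 2: only the approximation is split

# Perfect reconstruction, as in x_{m-1}(s) = x_m(s) + d_m(s):
x_rec = haar_inverse(haar_inverse(a2, d2), d1)
assert np.allclose(x_rec, x)
```

The detail vectors d1 and d2 isolate progressively lower-frequency fluctuations, so denoising can be sketched by zeroing (or thresholding) the finest-scale details before reconstruction.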

Figure 2.7 Wavelet signal filtering: (a) down-sampling associated with the signal wavelet filtering and (b) up-sampling associated with the signal reconstruction from approximations and details.


Note that, when the signal is convolved with the low-pass filter (moving the filter along the signal step by step over the discretized domain), a dyadic down-sampling is applied: the signal is down-sampled by a factor of 2, generating the approximation. The signal is also convolved with the high-pass filter and down-sampled to form the detail (Figure 2.7). The down-sampling retains only every second element, so that the approximation and the detail at scale m+1 are half the size of the signal at scale m. In the signal reconstruction from scale m+1 back to scale m, the filtering process is reversed: the larger-scale components (approximation and details) are fed back through the filters, which up-sample and recombine them, rebuilding the original signal. In mathematical terms this is an inverse wavelet transform. Different types of wavelets are available for signal transformation: the Mexican hat wavelet (the second derivative of a Gaussian distribution function); the Haar wavelet (the most effective for the representation of discontinuities); the Daubechies wavelets (the most frequently used in texture analysis; Salari and Ling, 1997); etc. What can be inferred from the literature is that the selection of the most appropriate wavelet depends on the case at hand, but the use of wavelets with the following properties is suggested (Ruttiman et al., 1998):
• they determine limited phase distortion;
• they maintain a faithful localization in the domain;
• they de-correlate the signal in a manner sensitive to both smooth features and discontinuities.
The choice of the proper decomposition scale is also case-dependent.
However, some general methodologies to select the most relevant scales are available in the literature, such as the comparison, between different resolution scales, of statistical indices derived from the signals and the respective approximations (moments, entropy, skewness, kurtosis, etc.; Addison, 2002).
2.2.1.1 Bi-dimensional wavelet transform
In many applications (e.g., image analysis) the dataset is a 2D matrix in the domain of the variables s ∈ ℝ² (Mallat, 1989). The wavelet transform can be used either to compress the data in a meaningful manner, or to perform a multiresolution characterization of the matrix. In both cases, two-dimensional wavelet transforms are required. They can be generated by the tensor product of their one-dimensional orthonormal counterparts (Addison, 2002), using the same scaling procedure as in the one-dimensional case on both the rows and the columns of the data matrix.


Figure 2.8 Schematic diagram of the matrix manipulation to decompose the 2D array on a bi-dimensional grid through wavelet transform.

Two-dimensional scaling and wavelet functions can be defined as:
• 2D scaling function: φ(s) = φ(s₁) φ(s₂) ;   (2.88)
• 2D horizontal wavelet (in the sense of the rows): ψʰ(s) = φ(s₁) ψ(s₂) ;   (2.89)
• 2D vertical wavelet (in the sense of the columns): ψᵛ(s) = ψ(s₁) φ(s₂) ;   (2.90)
• 2D diagonal wavelet: ψᵈ(s) = ψ(s₁) ψ(s₂) ;   (2.91)


where s₁ and s₂ are elements of the 2D domain defined by all s ∈ ℝ² (e.g., the spatial coordinates of images). Accordingly, the multiresolution decomposition can be expressed as:

S_{m+1,(n₁,n₂)} = (1/2) ∑_{k₁} ∑_{k₂} c_{k₁} c_{k₂} S_{m,(2n₁+k₁, 2n₂+k₂)}
Tʰ_{m+1,(n₁,n₂)} = (1/2) ∑_{k₁} ∑_{k₂} b_{k₁} c_{k₂} S_{m,(2n₁+k₁, 2n₂+k₂)}
Tᵛ_{m+1,(n₁,n₂)} = (1/2) ∑_{k₁} ∑_{k₂} c_{k₁} b_{k₂} S_{m,(2n₁+k₁, 2n₂+k₂)}
Tᵈ_{m+1,(n₁,n₂)} = (1/2) ∑_{k₁} ∑_{k₂} b_{k₁} b_{k₂} S_{m,(2n₁+k₁, 2n₂+k₂)}   (2.92)

where the c_k and b_k are the scaling (low-pass) and wavelet (high-pass) filter coefficients, k₁ and k₂ are filter indices, and n₁ and n₂ are location indices. The general idea of a 2D wavelet decomposition is shown in Figure 2.8. After the first decomposition, the original data matrix X₀ is split into four distinct sub-matrices: an approximation S₁, a horizontal detail T₁ʰ, a vertical detail T₁ᵛ, and a diagonal detail T₁ᵈ. At the next decomposition scale the details are left untouched, and the next iteration decomposes only the approximation S₁. The transformation at scale m = 2 decomposes S₁ into a new approximation S₂ and the details T₂ʰ, T₂ᵛ and T₂ᵈ. This procedure can be iterated M times for a (2ᴹ×2ᴹ) matrix, the matrices S_m, T_mʰ, T_mᵛ and T_mᵈ being down-sampled to (2^{M−m}×2^{M−m}). Once more, the original matrix can be reconstructed as:

X₀ = X_M + ∑_{m=1}^{M} ( D_mʰ + D_mᵛ + D_mᵈ ) ,   (2.93)

where the matrix X_M is the smoothed version of the original matrix at the largest scale index M, while D_mʰ, D_mᵛ and D_mᵈ are the reconstructions of the details from the coefficients in T_mʰ, T_mᵛ and T_mᵈ, respectively.
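One level of the decomposition of equation (2.92) can be sketched with the Haar choice c = (1, 1)/√2 and b = (1, −1)/√2, for which the four sums reduce to pairwise sums and differences along rows and columns. The test matrix is illustrative, and the horizontal/vertical labelling follows the index order of equation (2.92) (conventions vary between authors).

```python
# Sketch: one level of the 2D wavelet decomposition of eq. (2.92), Haar case.
# Filtering along the first index (with c or b), then along the second,
# yields the approximation S and the three details T^h, T^v, T^d.
import numpy as np

def haar_2d_step(X):
    X = np.asarray(X, dtype=float)
    lo = (X[0::2, :] + X[1::2, :]) / np.sqrt(2.0)   # c along the first index
    hi = (X[0::2, :] - X[1::2, :]) / np.sqrt(2.0)   # b along the first index
    S  = (lo[:, 0::2] + lo[:, 1::2]) / np.sqrt(2.0)  # c_{k1} c_{k2}: S_{m+1}
    Th = (hi[:, 0::2] + hi[:, 1::2]) / np.sqrt(2.0)  # b_{k1} c_{k2}: T^h
    Tv = (lo[:, 0::2] - lo[:, 1::2]) / np.sqrt(2.0)  # c_{k1} b_{k2}: T^v
    Td = (hi[:, 0::2] - hi[:, 1::2]) / np.sqrt(2.0)  # b_{k1} b_{k2}: T^d
    return S, Th, Tv, Td

X0 = np.arange(64, dtype=float).reshape(8, 8)        # a (2^M x 2^M) matrix, M = 3
S1, T1h, T1v, T1d = haar_2d_step(X0)
assert S1.shape == (4, 4)                            # down-sampled to 2^{M-1}

# The orthonormal transform preserves the energy of the matrix:
total = (S1**2).sum() + (T1h**2).sum() + (T1v**2).sum() + (T1d**2).sum()
assert np.isclose((X0**2).sum(), total)
```

Iterating haar_2d_step on S1, S2, … reproduces the pyramid of Figure 2.8; keeping the details at each level then allows the reconstruction of equation (2.93).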

Chapter 3

Industrial process for the production of resins by batch polymerization

In this Chapter, an industrial batch process for the production of resins is presented as a typical example of a batch polymerization process for the production of high value added goods. The process is discussed by presenting both the methodology for the production of the resins (i.e., the recipe) and the plant in which the production is carried out. The main challenges of batch polymerization processes are then briefly analyzed from the operational and organizational points of view. Finally, it is shown how process and quality data are collected, how quality and process are usually monitored in the industrial practice, and how the recipe can be adjusted to pursue a given target. Accordingly, the benefits of implementing soft sensors for the online estimation of quality and for the real time prediction of the batch duration are explained.

3.1 The industrial production plant and the operating recipe

Batch processing is used to manufacture high value added goods, such as specialty chemicals and biochemicals, products for microelectronics, and pharmaceuticals. With respect to their continuous counterparts, batch processes are easier to set up and often require only limited fundamental knowledge of the underlying process mechanisms. In principle, the operation of a batch process is easy, because the processing recipe usually evolves through a series of elementary steps (e.g., charge; heat-up/cool; mix; react; discharge) that can be carried out even without supervision, if the production facility is outfitted with a fairly large degree of automation. Important features of batch processes are: i) their flexibility; ii) the possibility of manufacturing several different products in the same production facility; and iii) the fact that a consistently high and reproducible quality of the final product can be achieved by properly adjusting the operating recipe, in spite of changes in the raw materials and in the state of the equipment or of the utilities, even with a limited degree of automation. To ensure consistency and productivity, batch plants often need the manual intervention of the operating personnel to correct the operating recipe or to adjust the conditions of the reaction environment. For the purpose of quality control, information about product quality is required, but it is usually obtained from a
small number of laboratory analyses of product samples taken from the reactor. The lack of online measurements of product quality delays the detection of quality shifts and makes it difficult to counteract them by adjusting the operating recipe. Therefore, the quality control strategy for a batch process often reduces to the online control of some key process variables that are available online, and to a midcourse correction policy to compensate for the shifts detected in the product quality measured offline. All these characteristics of batch processes can be found in the case study considered in this Thesis: an industrial batch polymerization process for the production of resins. These products are high molecular weight polyester resins, produced by batch catalytic polycondensation of carboxylic acids and alcohols. Several different resins are produced in the same reactor in different production campaigns. In the following sections, the case of two resins is considered. To protect the confidentiality of the data, the resins under study are called resin A and resin B. The production of resins A and B is carried out by running batches through prescribed sequences of operating steps, most of which are triggered manually by the operators. The switching from one operating step to the subsequent one is determined by the current quality of the resin, which is assessed through the combined values of two indicators, namely the resin acidity number (NA) and the resin viscosity (μ). The process recipe evolves through an operating policy that accounts for:
• the initial load of raw materials, catalyst and additives into the reactor;
• the mixing and heating-up to a preset temperature;
• the reaction with simultaneous separation of water;
• the vacuum phases;
• the discharge of the final product.
Since midcourse corrections of the recipe are always required, fresh raw materials and catalyst are added to the reaction environment to adjust the batch and to counteract deviations from nominal conditions.

3.1.1 Resin A

Resin A is a polyester resin manufactured via batch polycondensation between a diol (D1) and a dioic acid (DA1). Besides the desired product, the polycondensation reaction leads to the formation of water, which must be removed from the reaction environment to promote the forward reaction. The typical sequence of the operating steps for the production of resin A runs as follows. Cleaning of the equipment and lines is done when a different resin has been produced in the preceding batch. Then, the reactants, additives and catalyst are loaded into the reactor. The charge of liquid D1 is automated, while DA1 is charged manually as a solid. Since DA1 is a product of fermentation, its quality may vary markedly; minor changes may also be experienced in
the quality of fresh D1. None of these quality changes in the raw materials can be detected in advance. During the reactor loading, the mixing and heating systems are switched on, and heat-up continues until the reactor reaches the set-point temperature (202 °C).

[Flow diagram: load to the reactor (diol D1, dioic acid DA1, catalyst, anti-oxidizing promoter) → mixing and heating → water separation → vacuum phase, with corrections (D1 or DA1, catalyst) until the quality is in spec → final raise of the reactor temperature → stop batch → reactor discharge → final product]

Figure 3.1 Schematic representation of the single-stage production process of resin A (the broken arrows point to the fresh materials fed to the process or to the products discharged from the reactor, while the full arrows indicate temporal transitions between the different operations of the production recipe).

The rising temperature in the reactor activates the polycondensation reaction; hence, water is produced and must be removed to promote the forward reaction. Water is generated as a vapor phase that leaves the reactor. In the early stages of the batch, this vapor phase contains significant amounts of D1, which must therefore be recovered and recycled for further processing. Accordingly, the vapor phase leaving the reactor is sequentially processed in the following ways: i) by differential condensation in the packed column, in such a way as to
recover liquid D1 and recycle it back to the reactor; ii) by total condensation in the condenser; iii) by washing and contact condensation in the scrubber. Vacuum needs to be applied during the course of a batch to adjust the viscosity and the molecular weight distribution of the resin. Furthermore, to ensure that the final product quality is on specification, the operating recipe always requires at least two additions of fresh raw materials and catalyst during the course of a batch. The first addition is made before vacuum is applied for the first time. Therefore, when fresh materials and catalyst are charged into the reactor again, vacuum must be broken and then resumed. When the resin quality fails to approach the target values in the expected amount of time, further amounts of fresh material are charged. Following the operators' jargon, these supplementary additions are known as "corrections" to a batch. Corrections are the way the operators act online to compensate for any unmeasured disturbance affecting a batch, and more than one third of the batches undergo corrections. When the end of the batch is approaching, the reactor temperature is increased to 220-230 °C. The batch is stopped and the product is discharged when the resin reaches the desired quality targets in terms of the quality indices. The duration of the batches for the production of resin A varies between 30 and 65 h.

3.1.2 Resin B

Resin B is a high molecular weight polyester resin produced by catalytic polycondensation of carboxylic acids CA1 and CA2 and alcohols D1 and D2, with additives and catalysts. Resin B is produced in the same production facility as resin A. A scheme of the production process is shown in Figure 3.2.

[Flow diagram: first load to the reactor (carboxylic acids CA1+CA2, poly-alcohols D1+D2, catalyst Ct1, anti-oxidizing agent AO) → stage 1 → second load to the reactor (carboxylic alcohol C3, poly-alcohol D3, catalyst Ct2) → stage 2 → reactor discharge → final product]

Figure 3.2 Schematic representation of the production process of resin B.

The production process of resin B is similar to that of resin A. It differs in that two distinct production stages are present. Raw materials, catalyst and additives are initially loaded into the reactor. When the charge is completed, production Stage 1 is started, and mixing, heating-up and separation via the distillation column are performed. The objective of Stage 1 is to partially react the fresh materials until a pre-polymer with loosely specified characteristics is obtained. As soon as Stage 1 is completed, the pre-polymer is cooled down, and new ingredients are loaded into the reactor. Then, production Stage 2 begins and the pre-polymer is further processed (through heating, water separation, and vacuum) using a fresh catalyst. In this way, the material is processed until it reaches the quality specifications. At that point Stage 2 also terminates, the final product is discharged, and the processing equipment is ready for a new batch (cleaning of the equipment may be necessary).

3.1.3 P&ID of the production facility

Resins A and B are produced in a plant whose process and instrumentation diagram (P&ID) is shown in Figure 3.3.

Figure 3.3 Process and instrumentation diagram of the batch polymerization facility.

The most important pieces of equipment of the plant are:
• a stirred tank reactor (volume 12 m3) provided with external and internal coils;
• a packed distillation column (packing height 3 m);
• a water-cooled condenser;
• a vacuum pump;
• a scrubber;
• thermal conditioning systems with utilities such as steam, cold water and dowtherm oil.
The polymerization reaction is carried out in the reactor, which is heated through an external coil. To promote the forward reaction, the water of polycondensation has to be removed from the reaction environment. The packed distillation column (which is run in dry mode for the production of the resins under study) separates the water by partially condensing the heavier compounds and recycling them to the reactor. Downstream of the column, the water separation is sequentially completed in ancillary equipment: a water-cooled condenser and a scrubber. A vacuum pump allows the final part of the batch to be operated under vacuum, which is needed to obtain a narrower molecular weight distribution of the product and to guarantee safer operation.

3.2 Data acquisition

Data from the plant are managed by a supervision system, Movicon 9.1®, a supervisory control and data acquisition (SCADA) system developed by Progea S.r.l. (www.progea.com). The supervision system communicates with three S5 Siemens programmable logic controllers (PLCs; DK3964 communication driver), two S7 Siemens PLCs (via OPC, object linking and embedding for process control), and 30 Eurotherm regulators (via Modbus remote terminal unit, RTU, drivers). The information from the plant is recorded in a structured query language (SQL) database management system that is easily consultable by the plant operators. This system allows the operators both to execute the fundamental operations (e.g., activating the mixing system, switching on the vacuum pump, etc.) and to perform regulatory actions, such as manipulating the set-points of the controlled variables.

3.2.1 Monitoring of the process variables

Numerous hardware sensors and control loops are present throughout the plant. In particular, the measurements of 34 variables (Table 3.1) are routinely collected by online sensors and recorded by the process computers every 30 s. These measurements (such as temperatures and pressures) include process values (PV), setpoints (SP) and valve openings (VO) in different sections of the plant, and different sensors may be present for the measurement of the same variable. Some process variables were discarded because they proved to be irrelevant from a statistical point of view, while some others are inappropriate from an engineering point of view. The selection of the most appropriate subset of process variables for model calibration is case-specific and is discussed in the following Chapters.


Table 3.1 Process variables measured online by the monitoring system, and numbering of the respective sensors in the plant flow-sheet.

sensor number | process variable monitored online
1 | mixing rate (%)
2 | mixing rate (rpm)
3 | mixing rate SP
4 | vacuum line temperature (°C)
5 | inlet dowtherm temperature (sensor 1) (°C)
6 | outlet dowtherm temperature (°C)
7 | reactor temperature (sensor 1) (°C)
8 | dummy
9 | column head temperature (sensor 1) (°C)
10 | valve V25 temperature (°C)
11 | scrubber top temperature (°C)
12 | inlet water temperature (°C)
13 | column bottom temperature (°C)
14 | scrubber bottom temperature (°C)
15 | reactor temperature (sensor 2) (°C)
16 | condenser inlet temperature (°C)
17 | valve V14 temperature (°C)
18 | valve V15 temperature (°C)
19 | reactor differential pressure (mbar)
20 | dummy
21 | column top temperature PV (sensor 2) (°C)
22 | column top temperature SP (°C)
23 | V42 way-1 VO (%)
24 | inlet dowtherm temperature PV (sensor 2) (°C)
25 | inlet dowtherm temperature SP (°C)
26 | V42 way-2 VO (%)
27 | reactor temperature PV (sensor 2) (°C)
28 | reactor temperature SP (°C)
29 | dummy
30 | valve V25 temperature PV (°C)
31 | valve V25 temperature SP (°C)
32 | valve V42 VO (%)
33 | reactor vacuum PV (mbar)
34 | reactor vacuum SP (mbar)

However, all the process variable signals are affected by noise, missing values and outliers, possibly caused by maintenance operations, by unintended interruptions of the sensor connections, or by sensor faults. The setpoints are manipulated manually by the operators, which results in different time profiles of the controlled variables (Figures 3.4a and 3.4c). Moreover, the complex sequence of operations and the unpredictable number of corrections cause uneven batch lengths, as well as shifts and variations in the time trajectories of the process variables (Figures 3.4a and 3.4b).
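A minimal pre-treatment of such signals can be sketched as follows. This is a generic moving-median/MAD approach shown only for illustration; the function name, window width and threshold are assumptions, not the pre-processing actually applied to the plant data:

```python
import numpy as np

def clean_signal(x, window=11, n_mad=5.0):
    """Replace spikes with a local median and interpolate missing values.
    `window` and `n_mad` are illustrative tuning choices."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    # rolling median (NaN-tolerant) as a robust local baseline
    med = np.array([np.nanmedian(x[max(0, i - half):i + half + 1])
                    for i in range(len(x))])
    resid = x - med
    mad = np.nanmedian(np.abs(resid)) + 1e-12    # robust spread estimate
    spikes = np.abs(resid) > n_mad * 1.4826 * mad
    y = np.where(spikes, med, x)                 # outliers -> local median
    # linear interpolation over missing samples (NaNs)
    nans = np.isnan(y)
    y[nans] = np.interp(np.flatnonzero(nans), np.flatnonzero(~nans), y[~nans])
    return y

T = np.full(200, 200.0)      # synthetic temperature signal, 30 s sampling
T[50] = 400.0                # outlier (e.g., a sensor fault)
T[120] = np.nan              # missing value (e.g., interrupted connection)
T_clean = clean_signal(T)    # spike replaced, gap interpolated
```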

Figure 3.4 Time trajectories of the process variables: (a) reactor temperature (different batches); (b) reactor relative pressure (different batches); (c) dowtherm oil temperature setpoint (different batches); (d) correlation between the time profiles of the reactor and the dowtherm oil temperatures for batch #4.

Finally, the high sampling frequency results in strongly auto-correlated data, and some variables are strongly cross-correlated (e.g., the dowtherm oil temperature and the reactor temperature profiles, Figure 3.4d). All these features make it impossible to monitor the process by inspecting the time profiles of single process variables.
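These correlations are easy to quantify. The sketch below uses illustrative synthetic signals (not plant data) to compute the lag-1 autocorrelation of a temperature-like trajectory and its cross-correlation with a second, nearly collinear trajectory:

```python
import numpy as np

def corr(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

t = np.linspace(0.0, 10.0, 500)                 # synthetic time axis
T_reactor = 200.0 + 20.0 * np.sin(t)            # reactor temperature (°C)
T_oil = 205.0 + 20.0 * np.sin(t + 0.05)         # dowtherm oil temperature (°C)

r_auto = corr(T_reactor[1:], T_reactor[:-1])    # lag-1 autocorrelation
r_cross = corr(T_reactor, T_oil)                # cross-correlation
# both coefficients are close to 1: successive samples and the two
# temperature profiles carry largely redundant information, which is
# why latent-variable methods (PCA/PLS) are preferred over univariate charts
```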

3.3 Empirical monitoring of the product quality

The product quality is defined in terms of two indices: the resin viscosity (μ) and the resin acidity number (NA). However, realtime measurements of the product quality are not available. Rather, product samples are taken manually, quite infrequently and irregularly (i.e., one sample every 1.5-2 h, depending on the operators' availability and on the actual evolution of the batch) and sent to the laboratory for analysis; the full analysis takes about 20 min. Furthermore, the quality measurements are not available for the entire duration of the batch, because product sampling is initiated only 8-10 h after a batch starts. Typically, 15-25 quality
measurements per batch are available; the accuracy of the laboratory assay is about 10% of the reading. From an operational perspective, offline quality measurements suffer from two drawbacks. First, they are expensive: whenever a quality assessment is needed, a sample is taken (manually) and sent to the laboratory for analysis, so that specific personnel must be dedicated to sample collection and analysis (the number of samplings/measurements for a single product may easily exceed 10,000/year in a typical industrial scenario). Typically, a company would try to save on the personnel-related expenses by reducing the number of samples to be analyzed. Secondly, quality measurements are delayed: the analysis results are available at best 20 min after a sample has been taken.

Figure 3.5 A typical quality monitoring chart used industrially for resin B production during Stage 2. Acidity number is reported as the abscissa (decreasing values from left to right), and viscosity as the ordinate. Non-standard units are used for both quality indicators. Circles indicate a sample for which quality measurements are available from the lab. The measured values of acidity number and viscosity should always fall within the broken bounds. Realtime recipe adjustments are needed when a sample falls outside the bounds. Time increases (nonlinearly) from the lower-left corner to the upper-right one.

The evolution of the product quality is monitored by using empirically derived monitoring charts. For example, a typical chart for the quality monitoring of Stage 2 in the production of resin B looks like the one shown in Figure 3.5. If the sample quality is found to lie outside the broken bounds, the operators must adjust the production recipe according to a given procedure. Note that only a few samples are taken to monitor Stage 2 in the case of Figure 3.5, despite the fact that this stage lasted as long as 32.5 h.

More timely information on product quality evolution would be highly desirable, because the production recipe could be adjusted more promptly.

3.4 Challenges for the statistical monitoring of product quality

The net result of the quite complex and mostly manually driven series of batch stages is that, although the end-point quality of a resin usually falls within a very narrow range, the "internal" variability of the batches is very large. Indeed, there are several sources of variability within a batch, most of which cannot be eliminated:
• the different state of the pieces of equipment (hot/cold);
• the optional cleaning of the lines and of the pieces of equipment;
• the variable state of the utilities (e.g., the heating system serves several reactors of the same production facility at the same time, and the duty of the heating furnace is fixed; this results in different durations of the heat-up period from one batch to another);
• the fact that only the charge of liquid raw materials is automated, while the solid raw materials are loaded manually by the operators, so that weighing errors and contaminations can be experienced;
• the quality of the raw materials, which may vary from batch to batch because of different levels of impurities, different suppliers, loss of catalyst activity, etc.;
• the midcourse corrections, which are subject to the laboratory measurement delay and to the operators' judgment;
• the switching from one operating step to the subsequent one, which is triggered manually by the operators depending on the actual values of the quality indices;
• the fact that most operations are performed manually by the operators, depending on their availability, experience and judgment (e.g., the set points of the controlled process variables are manipulated manually).
Most of this variability reflects itself in the trajectories of the process measurements, and eventually in the product quality and in the total batch duration. Therefore, it is hard for the management to appropriately schedule both the production policy and the use of the equipment when several batches are to be processed in series or in parallel.
All these factors make quality estimation and batch duration prediction a challenge.

3.5 Automated quality monitoring through soft sensors

A monitoring strategy based on offline laboratory assays is not an efficient approach to guarantee high quality products and reproducible operation. Because quality measurements are available quite infrequently, the switching from an operating stage to the following one may be substantially delayed, resulting in poor monitoring of the product quality and in an increased batch duration. More timely information on the product quality evolution
would be highly desirable, because the production recipe could be adjusted more promptly. Delays in the recipe adjustments may result in a significant increase of the processing time and in a potential loss of end-point quality. The performance of a batch process could be improved if accurate and frequent information on the product quality and on the batch duration were available. Software sensors are powerful tools for this task. They reconstruct online an estimate of the "primary" quality variables from the measurements of some "secondary" process variables (typically, temperatures, flow rates, pressures, valve openings), by using a model that relates the secondary variables to the primary ones. Therefore, the design of a soft sensor for the online estimation of μ and NA is considered, with the objective of making frequent and accurate estimates of the product quality indicators available online, so as to avoid off-spec products and to obtain well-timed adjustments of the recipe. Moreover, the monitoring scheme can be endowed with a realtime system for the prediction of the duration of the batches, with the purpose of assisting the production planning and organization, the scheduling of the equipment use, and the coordination of the operating labour resources. Summing up, to improve the process operation and to reduce the measurement-related costs, a realtime monitoring system is sought that allows:
• the online estimation of the instantaneous values of the quality indicators, in such a way as to promptly counteract any deviation from the desired quality profile by adjusting the processing recipe in real time;
• the prediction of the duration of the batches and of the respective operating stages, in such a way as to reduce the number of quality measurements required to assess online the termination of a stage, and to allow the scheduling of the use of the different pieces of equipment for different productions in the same facility.
The first part of the Thesis considers the design and implementation of the abovementioned soft sensors in a real-world industrial batch polymerization process for the production of resins.

Chapter 4

Soft sensors for the realtime quality estimation in batch processes

This Chapter¹ describes how PLS models can be used for the realtime estimation of the product quality in a batch polymerization process. In particular, PLS-based soft sensors are designed, and their performance is evaluated and enhanced using multi-phase PLS modelling. Furthermore, information about the time evolution of the process is included into the models using both a lagged-variables technique and a moving-average technique. These strategies are compared, and the benefits in estimation accuracy and precision are shown. Finally, it is explained how the reliability of the model is checked through statistical indices, and how the causes of soft sensor faults can be diagnosed in real time.

4.1 Quality estimation in resin A using PLS models

The quality monitoring approach that has been developed relies on the PLS regression technique (Geladi and Kowalski, 1986; Wise and Gallagher, 1996). For the production of resin A, the available dataset includes measurements of the process variables and of the quality variables from 33 batches (16 months of operation of the plant facility described in Chapter 3). This dataset was split into two subsets: 27 batches constitute the reference (i.e., calibration) dataset, while the remaining 6 batches represent the validation dataset. The reference process data are collected into a three-way matrix X (I×J×Ki). Each of the J = 23 columns of this matrix contains one measured process variable, while each row corresponds to one of the I = 27 reference batches; time occupies the third dimension, and Ki is the total number of recordings taken for each of the J process measurements during batch i. As was already mentioned, the duration of the generic batch i is not fixed, and this makes Ki change from batch to batch (typically Ki = 4000-8000 samples). The process variables included into the model (Table 4.1) were selected by discarding some variables that are not relevant either from the engineering point of view (e.g., valve V43 temperature) or from the statistical

¹ Portions of this Chapter have been published in Facco et al. (2007), Facco et al. (2008a), Facco et al. (2009a), and Faggian et al. (2009).

point of view (constant setpoints). The generic element of matrix X is denoted with the symbol xi,j,ki.

Table 4.1 Subset of the measured process variables included into the PLS models, and their position j in the process data matrix X.

matrix column | online monitored variable
1 | mixing rate (%)
2 | mixing rate
3 | vacuum line temperature (°C)
4 | inlet dowtherm temperature (sensor 1) (°C)
5 | outlet dowtherm temperature (°C)
6 | reactor temperature (sensor 1) (°C)
7 | column top temperature PV (sensor 1) (°C)
8 | scrubber top temperature (°C)
9 | inlet water temperature (°C)
10 | column bottom temperature (°C)
11 | scrubber bottom temperature (°C)
12 | reactor temperature (sensor 2) (°C)
13 | condenser inlet temperature (°C)
14 | valve V14 temperature (°C)
15 | valve V15 temperature (°C)
16 | reactor differential pressure (mbar)
17 | column top temperature PV (sensor 2) (°C)
18 | V42 way-1 VO (%)
19 | inlet dowtherm temperature PV (sensor 2) (°C)
20 | V42 way-2 VO (%)
21 | reactor temperature PV (sensor 2) (°C)
22 | valve V42 VO (%)
23 | reactor vacuum PV (mbar)
The arrangement of the three-way Y (I×Q×Hi) matrix is similar; however, only Q = 2 columns are present, which correspond to the two quality variables to be estimated (NA and μ). The third dimension of Y is scanned unevenly and with a much lower frequency than that of the X matrix (i.e., Hi << Ki). As a consequence, when static estimators were designed, the Xi matrices were pruned in such a way as to eliminate all the rows that do not correspond to a time instant at which a quality measurement is available. Note, however, that this pruning is needed only during the PLS model calibration phase. Once the model has been designed, it can be interrogated any time a process measurement becomes available, regardless of whether a quality measurement is available or not, thus obtaining NA and μ estimates at the same frequency as the process measurements (i.e., every 30 s). Finally, note that two distinct Y matrices were considered, one for NA and one for μ.
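The pruning described above can be sketched as follows. This is a schematic numpy illustration with synthetic data; the dictionary layout, names and dimensions are assumptions for illustration, not the actual plant database format:

```python
import numpy as np

def build_calibration_set(batches):
    """Keep only the rows of each Xi that correspond to a time instant
    with an available lab measurement, then stack all batches."""
    X_rows, y_rows = [], []
    for b in batches:
        X_rows.append(b["X"][b["idx"], :])    # prune Xi to quality instants
        y_rows.append(b["y"])
    return np.vstack(X_rows), np.concatenate(y_rows)

rng = np.random.default_rng(0)
batches = [
    {"X": rng.normal(size=(600, 23)),         # Ki process samples, J = 23
     "y": rng.normal(size=20),                # Hi = 20 lab measurements
     "idx": np.sort(rng.choice(600, size=20, replace=False))}
    for _ in range(3)                          # 3 batches, for brevity
]
X_cal, y_cal = build_calibration_set(batches)  # one row per lab sample
```

Each row of X_cal is now paired with one quality value, which is the arrangement a static PLS estimator needs for calibration; online, the model is interrogated on every new process row instead.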

4.1.1 Single-phase PLS model

As discussed previously, the operating recipe for the production of the resin results in a complex series of operations, most of which are subject to the operators' manual intervention. Therefore, also owing to the intrinsically nonlinear nature of the process, it is quite unlikely that the correlation structure between the variables remains the same during the whole duration of a batch. In turn, this means that a single linear PLS model may not be able to provide an accurate prediction of the quality variables along the whole duration of a batch. To check this
conjecture and to provide a term for comparison, a single PLS model with 5 LVs for the estimation of μ and NA was built from the reference dataset as a first attempt. The number of latent variables was chosen in such a way as to minimize the estimation error on the validation dataset. Typical validation results are reported in Figure 4.1, where the acidity number and the viscosity predicted by this model are compared to the measured values.

Figure 4.1 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using a single PLS model. The vertical bars represent the laboratory assay accuracy.

Although the frequency at which the quality estimations are made available is much higher than the frequency of the lab measurements (which can improve the monitoring of the process), the estimation accuracy is not satisfactory. The use of nonlinear transformations on the process variables or of a nonlinear inner relationship in the PLS algorithm did not improve the results significantly.
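A single-response PLS model of this kind can be calibrated with the standard NIPALS algorithm. The following is a bare-bones sketch of textbook NIPALS PLS1 on synthetic data, not the code actually used for the soft sensor:

```python
import numpy as np

def pls1_nipals(X, y, n_lv):
    """Return the PLS1 regression coefficients b (y_hat = X @ b).
    X (n x J) and y (n,) are assumed centered (and typically scaled)."""
    Xr, yr = X.copy(), y.astype(float).copy()
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)            # weight vector
        t = Xr @ w                        # score vector
        p = Xr.T @ t / (t @ t)            # X loading
        q = (yr @ t) / (t @ t)            # y loading (scalar for PLS1)
        Xr -= np.outer(t, p)              # deflate X
        yr -= q * t                       # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.inv(P.T @ W) @ Q

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
X -= X.mean(axis=0)                        # centering
b_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ b_true                             # noiseless linear quality index
b = pls1_nipals(X, y, n_lv=5)              # with all LVs the fit is exact
```

In practice the number of latent variables is chosen, as in the text, by minimizing the estimation error on a validation dataset rather than by using all components.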

4.1.2 Multi-phase PLS model

One approach to overcome the nonlinearity problem (i.e., a changing correlation structure among the variables) is to divide a batch into different phases, and to develop a linear PLS submodel for each of these phases (Kourti, 2003; Zhao et al., 2006). In this case, a criterion also needs to be found to detect a phase change online, so as to dictate the switching from one submodel to the subsequent one. In the process under study, designing a different PLS model for each operating step (multistage PLS model, as referred to in the literature: Ündey et al., 2002; Lu and Gao, 2005a; Camacho and Picò, 2008a) is not a viable solution, not only because the number of operating steps is large, but also because too few quality measurements are usually available within a single operating step to design the relevant PLS submodel. Furthermore, the actual number of operating steps in a batch is not known a priori, being dependent on the number of corrections that the batch will be subject to.

An alternative approach is to check whether different operating steps in a batch share the same correlation structure among the measurements. If this is the case, the same PLS submodel can be used to represent these operating steps; such “shared” operating steps constitute the same estimation phase. Following this approach, we may end up with a number of estimation phases that is (possibly much) lower than the number of operating steps. To detect how many estimation phases can be recognized in the reference batches, a single PLS model is built from the reference dataset and the scores on the first two LVs are plotted one against the other. In Figure 4.2a, each point represents the batch state at a time instant when quality samples are available, and this is repeated for each of the reference batches.

Figure 4.2 Scores plot on the first and second latent variables for a single PLS model: (a) whole reference dataset and (b) validation batch #3. The boxes indicate the approximate boundaries of the estimation phase regions.

It can be seen that the score points are mainly clustered into three distinct regions of the score plane. A closer inspection of the score points related to each single batch (Figure 4.2b) revealed that all batches are characterized by a similar pattern in the “movement” of the score points: at the beginning of the operation, a score point is located at the left of the score plane (“Phase 1” cluster), then it moves to the center of the plane (“Phase 2” cluster) as time progresses, and finally it shifts to the right of the plane (“Phase 3” cluster) towards the end of the batch. The correlation structure between variables is more similar for points within a cluster than for points between clusters. Otherwise stated, each cluster represents an estimation phase, and can be envisioned as a series of operating steps that maintain the same correlation structure among the variables. Therefore, one distinct PLS submodel can be developed for each estimation phase to predict the quality variables from the process ones within that phase. The resulting quality estimator is called a three-phase PLS (TP-PLS) estimator. Note that clusters could also be identified without using process knowledge. To this purpose, clustering techniques based on PCA and PLS can be an effective way to obtain automatic cluster detection (Lu and Gao, 2005). In the presence of auto-correlated and cyclic process data, like the ones encountered in the process under study, the clustering algorithm proposed by Beaver et al. (2007) can also prove useful. Note, however, that the number of clusters must be kept as small as possible, because if too few quality measurements are available within a cluster, it may be impossible to design the relevant PLS submodel.
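The automatic cluster detection mentioned above can be sketched with a minimal k-means on the score plane. The score points below are synthetic stand-ins for the three phase clusters of Figure 4.2 (not the real batch scores), and the quantile-based seeding of the centers along LV1, which mirrors the left-to-right phase progression, is an assumption of this sketch rather than the cited algorithms.

```python
import numpy as np

def kmeans_phases(scores, n_phases=3, n_iter=50):
    """Minimal k-means on (n_samples, 2) score points; a simple stand-in
    for the clustering methods cited in the text (Lu and Gao, 2005;
    Beaver et al., 2007). Centers are seeded at evenly spaced quantiles
    along LV1, following the phase progression seen in Figure 4.2."""
    order = np.argsort(scores[:, 0])
    idx = order[np.linspace(0, len(order) - 1, n_phases).astype(int)]
    centers = scores[idx].astype(float)
    for _ in range(n_iter):
        # assign each score point to the nearest center
        dist = np.linalg.norm(scores[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # move each center to the mean of its cluster
        for c in range(n_phases):
            if np.any(labels == c):
                centers[c] = scores[labels == c].mean(axis=0)
    return labels, centers

# Synthetic score plane mimicking the three clusters of Figure 4.2
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal([-5.0, 0.0], 0.5, (30, 2)),
                 rng.normal([ 0.0, 0.0], 0.5, (30, 2)),
                 rng.normal([ 5.0, 0.0], 0.5, (30, 2))])
labels, centers = kmeans_phases(pts)
```

Each recovered label then plays the role of one estimation phase, for which a distinct PLS submodel would be calibrated.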

Figure 4.3 TP-PLS estimator: reliability of the Phase 2 submodel in the estimation of the acidity number for validation batch #3: (a) scores plot for the first two latent variables, (b) squared prediction error plot, and (c) comparison of the estimated and measured values. The dashed lines in (a) and (b) indicate 95 % confidence limits. To improve readability, most of the samples have not been plotted in (a) and (b).

A key issue in the development of such a multi-phase estimator is finding a proper criterion to switch from one PLS submodel to the subsequent one (Camacho and Picò, 2006a and 2006b; Lu and Gao, 2004a, 2004b, 2004c, 2005a, 2005b and 2006). Switching between submodels means being able to recognize in real time that the correlation structure of the data is changing. It was observed that, due to the large inter-batch variability, “time” is not a good indicator to assess phase switching for the process under study. Therefore, submodel switching was linked not to time, but to events: certain events occur in all batches and mark a change in the correlation structure, although they may occur at a different time from batch to batch. Analysis of the process and quality data for all the reference batches revealed that the switch from Phase 1 to Phase 2 occurs the first time vacuum is applied to the reactor, while Phase 3 begins as soon as the final rise of temperature takes place. Following this approach, not only does the number of submodels to be developed remain sufficiently low, but clearly detectable events can also be recognized during a batch to trigger the switching between submodels. It should be stressed that each submodel is representative only of the phase it refers to. Figure 4.3 clarifies this issue with respect to the Phase 2 submodel. The scores plot of Figure 4.3a refers to a typical validation batch, and shows the similarity of each sample to the Phase 2 samples of the reference set of batches. Only during Phase 2 do the validation scores fall within the 95 % confidence ellipse of the Phase 2 submodel. When the process measurements start to be recorded (Phase 1), the score points all lie well outside the confidence ellipse; they enter into, and stay within, the ellipse during Phase 2, while during Phase 3 they tend to move again outside the ellipse, to a different region of the score plot. The squared prediction error plot in Figure 4.3b shows that the Phase 2 submodel is not reliable as a quality estimator during Phase 1 or Phase 3. In fact, the estimation of the acidity number is accurate only during Phase 2, while during Phases 1 and 3 the estimations are affected by severe errors (Figure 4.3c).
Therefore, care must be taken to identify the proper switching criterion and to detect the switching events online: anticipated or delayed detection may lead the soft sensor to provide unreliable quality estimates.
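The event-based switching logic described above can be sketched as a simple selection function; the two boolean flags are illustrative placeholders for the actual plant signals (vacuum application, final temperature rise), which are not named in the text.

```python
def active_phase(vacuum_applied_so_far, final_temp_rise_started):
    """Event-based submodel selection sketched from the text: Phase 2
    begins the first time vacuum is applied to the reactor, and Phase 3
    as soon as the final temperature rise takes place. The arguments are
    hypothetical flags, not real plant tags."""
    if final_temp_rise_started:
        return 3
    if vacuum_applied_so_far:
        return 2
    return 1

# Early in the batch no event has occurred yet, so Phase 1 is active
phase = active_phase(vacuum_applied_so_far=False, final_temp_rise_started=False)
```

At each sampling instant the returned phase index selects which PLS submodel produces the quality estimate.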

Figure 4.4 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the TP-PLS model. The vertical bars represent the laboratory assay accuracy.


The typical estimation performance of the TP-PLS soft sensor in a validation batch is shown in Figure 4.4. It can be seen that the performance is greatly improved with respect to that of the single PLS model. The estimation accuracy is generally within the accuracy of the laboratory analysis. Yet, some noise in the estimation is present (for example, in the estimation of the acidity number during Phase 1, and in the estimation of viscosity during Phase 3). Furthermore, the viscosity estimation displays a somewhat erratic behavior during Phase 2. In the next section, two different approaches will be considered to further improve the estimation performance.

4.2 Including time information to improve the estimation performance

The variable-wise unfolding of the three-way X and Y matrices (Wold et al., 1998) has the advantage of being very simple to carry out, because it can be applied in a straightforward way to sets of batches that have different time durations, without the need of synchronizing the batch lengths. The price to pay for this simplicity is that the “time footprint” of the data is lost, because the order in which the rows of the two-way X and Y matrices are assembled following a variable-wise unfolding is unimportant for the design of a PLS estimator. Otherwise stated, the TP-PLS model is inherently static, and this may affect the estimation performance, given the fact that a batch process is inherently dynamic. To account for the process dynamics, two different techniques were evaluated, namely the augmentation of the process data matrix with lagged values, and the use of averaged values instead of point values in the X matrix.

4.2.1 Improving soft sensor performance through lagged process variables

A PLS model on variable-wise unfolded data is inherently static. To take into account the dynamic behavior of a batch process, the use of dynamic PLS models has been suggested (Ku et al., 1995; Chen and Liu, 2002; Sharmin et al., 2006). Following this approach, the process data matrix is augmented with lagged values of the process variables at past sampling instants. To keep the column dimension of the XVWU process data matrix reasonably small, lagged values of only the three most important variables, as identified by the VIP method (Chong and Jun, 2005), were considered at three past time instants (Figure 4.5). These variables are the column top temperature (variables #7 and #17), the column bottom temperature (variable #10), and the reactor temperature (variables #6, #12, and #21). To the


purpose of minimizing the number of variables retained in the model, only variables #7, #10, and #21 were considered in the lagged models.


Figure 4.5 The most important variables in the PLS projection method according to the VIP index.

By trial and error, it was found that a good performance of the estimator could be obtained by considering the current measurement value plus the values lagged by 1 h (120 time instants), 3 h (360 time instants) and 5 h (600 time instants) for the selected process variables. Information on the most appropriate lag values can also be obtained by studying the autocorrelation and cross-correlation structure of the process and quality measurements. By using this approach, it was found that, depending on the process variable and on the estimation phase, the most appropriate lags range from 300 to 900 time instants, which is consistent with the values considered in our simplified approach. Note that the variety of lags existing for different process variables indicates the variety of time scales that may exist in the process dynamics. To account for the effect of different time scales, a more rigorous approach could be taken using a multiresolution PLS approach (Bakshi, 1998); however, this was found to be unnecessary in the present application. The reference process data matrix was therefore augmented by including 3 × 3 = 9 additional columns:

\mathbf{X}_\mathrm{L} = \left[\, \mathbf{X}_\mathrm{VWU} \;\; \mathbf{X}^\mathrm{D} \,\right] , \qquad (4.3)

where:


\mathbf{X}^\mathrm{D} = \begin{bmatrix} \mathbf{X}_{1}^\mathrm{D} \\ \mathbf{X}_{2}^\mathrm{D} \\ \vdots \\ \mathbf{X}_{i}^\mathrm{D} \\ \vdots \\ \mathbf{X}_{I}^\mathrm{D} \end{bmatrix} , \qquad (4.4)

and

\mathbf{X}_{i}^\mathrm{D} = \left[\, \mathbf{x}_{i,7}^{-120} \;\; \mathbf{x}_{i,7}^{-360} \;\; \mathbf{x}_{i,7}^{-600} \;\; \mathbf{x}_{i,10}^{-120} \;\; \mathbf{x}_{i,10}^{-360} \;\; \mathbf{x}_{i,10}^{-600} \;\; \mathbf{x}_{i,21}^{-120} \;\; \mathbf{x}_{i,21}^{-360} \;\; \mathbf{x}_{i,21}^{-600} \,\right] , \qquad (4.5)

where \mathbf{x}_{i,j}^{-\Delta K} is the time trajectory of the jth variable for the ith batch, lagged by ΔK time instants. This approach introduces process variables that are more collinear with the quality ones. As a result, the variability in XL is more representative of the variability in Y, and the estimation capability is improved. It was verified that, also in this case, each batch can be segmented into three estimation phases. The resulting soft sensor is called a lagged three-phase PLS (LTP-PLS) estimator, in which 5, 4 and 3 LVs were chosen for Phases 1, 2 and 3, respectively. The reduced number of LVs in Phases 2 and 3 prevents overfitting problems, particularly when the signal-to-noise ratio is low.
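The augmentation of Eqs. (4.3)-(4.5) can be sketched in NumPy for a single batch. The start-of-batch padding with the initial value is an assumption of this sketch (the text does not specify how the first ΔK samples are handled), and the toy dimensions are illustrative.

```python
import numpy as np

def augment_with_lags(X_vwu, var_idx=(7, 10, 21), lags=(120, 360, 600)):
    """Build X_L = [X_VWU  X^D] of Eqs. (4.3)-(4.5) for one batch:
    for each selected variable j (1-based, as in the text) and each lag
    dK, append the variable's trajectory shifted back by dK samples.
    The first dK entries are padded with the initial value, an assumed
    boundary rule."""
    lagged = []
    for j in var_idx:
        x = X_vwu[:, j - 1]
        for dK in lags:
            col = np.empty_like(x)
            col[:dK] = x[0]        # pad: no lagged value exists yet
            col[dK:] = x[:-dK]     # x_j lagged by dK time instants
            lagged.append(col)
    return np.column_stack([X_vwu] + lagged)

# Toy batch: 1000 samples of 22 process variables
X = np.arange(1000 * 22, dtype=float).reshape(1000, 22)
XL = augment_with_lags(X)   # 22 original + 3 x 3 = 9 lagged columns
```

Row k of XL thus pairs the current measurements with the selected variables' values 1 h, 3 h and 5 h earlier, which is what makes the PLS estimator dynamic.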

Figure 4.6 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the LTP-PLS model. The vertical bars represent the laboratory assay accuracy.

Figure 4.6 shows the estimation results for this model on a typical validation batch. It can be seen that including information on the batch dynamics, through the use of lagged measurements, suppresses most of the noise that was apparent in the TP-PLS estimations. A slight improvement is also obtained in the accuracy of the viscosity estimation during Phase 2, although the estimated values of this quality indicator in this phase still seem to suffer from some inaccuracy. It should be noted, however, that quality estimation during Phase 2 is inherently difficult, because all the corrections to the operating recipe take place during this phase, which is therefore subject to a much larger inter-batch variability than the other phases.

4.2.2 Improving soft sensor performance through moving-average process data

An alternative way to account, although indirectly, for dynamics in the process data is to build the process data matrix with averaged values of the measurements, rather than with the current process measurement values. Modifications to the standard PLS algorithm that consider moving windows on weighted past process measurement values have already been proposed for use within recursive and adaptive process control strategies (Wold, 1994; Dayal and MacGregor, 1997b; Rännar et al., 1998; Qin, 1998; Wang et al., 2003). However, a different approach was taken here. The PLS algorithm itself was not altered; what was altered instead is the process measurement matrix: each entry in a column of the process data matrix X_BWU represents the average value \bar{x}_{i,j,k} of the relevant process measurement j, in a certain batch i, within a window including the previous ΔK′ time samples. Namely, the value included in the process data matrix at any time instant is the average of the last ΔK′ samples:

\mathbf{X}_\mathrm{BWU} = \begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \\ \vdots \\ \mathbf{X}_{i} \\ \vdots \\ \mathbf{X}_{I} \end{bmatrix} , \qquad (4.6)

where:

\mathbf{X}_{i} = \begin{bmatrix} \bar{x}_{i,1,1000} & \bar{x}_{i,2,1000} & \cdots & \bar{x}_{i,J,1000} \\ \bar{x}_{i,1,1001} & \bar{x}_{i,2,1001} & \cdots & \bar{x}_{i,J,1001} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{x}_{i,1,K_i} & \bar{x}_{i,2,K_i} & \cdots & \bar{x}_{i,J,K_i} \end{bmatrix} \qquad (4.7)

and

\bar{x}_{i,j,k} = \frac{\sum_{r=k-\Delta K'}^{k} x_{i,j,r}}{\Delta K'} . \qquad (4.8)


The moving averages are used not only to dampen measurement noise (as done, for example, by Kamohara et al., 2004), but also to smooth out short-term fluctuations (process noise) while, at the same time, preserving the capability to highlight long-term trends. Smoothing process noise was necessary because, when a correction takes place in a batch, most process variables change abruptly (e.g., when vacuum is broken, most process variables undergo a step-wise change), while bulk properties (like μ and NA) are practically insensitive to such abrupt changes. Therefore, the performance of a purely static estimator may be disrupted when these events occur. A similar effect is found when, due to poor controller tuning, some process variables tend to cycle (typically, the reactor temperature), while at the same time this cycling does not affect the product quality properties. The length of the moving window was set by trial and error to 900 time samples (7.5 h). The wide time window also allows the variability within most of the first part of the batch, when no quality measurements are taken, to be incorporated. Furthermore, the size of the time window (about 10-15% of the entire batch) underlines the importance of including long-term past-history information in the data. In fact, other moving-average strategies were explored:
• a weighted moving average approach, in which the input data are calculated as:

\bar{x}_{i,j,k} = \frac{x_{i,j,k} + \lambda x_{i,j,k-1}}{1 + \lambda} \; ; \qquad (4.9)

• an exponentially weighted moving average technique, where the input data are:

\bar{x}_{i,j,k} = \lambda x_{i,j,k} + (1 - \lambda)\,\bar{x}_{i,j,k-1} , \qquad (4.10)

and λ is a forgetting factor that gives less weight to the past time instants. These strategies, which weight the past measurements less, proved less effective for estimation than the plain moving average, confirming the importance of incorporating long-term “memory” into the soft sensor. The resulting estimator is a moving-average three-phase PLS (MATP-PLS) soft sensor, built on 5 LVs for every phase; its performance is illustrated in Figure 4.7. As expected, the estimated profiles of the quality variables are smoother than with the other models. An improvement is apparent in the accuracy of the estimated viscosity profile during Phase 2.
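The long-memory effect can be illustrated by contrasting the two schemes on a synthetic step change (mimicking, e.g., vacuum being broken). The window length is the 900-sample value of the text; the forgetting factor is an assumed illustrative value; and the window mean below averages the ΔK′+1 samples it spans, a minor deviation from the ΔK′ denominator of Eq. (4.8).

```python
import numpy as np

def moving_average(x, dK=900):
    """Backward moving average of Eq. (4.8): mean of the current sample
    and the previous dK samples, truncated at the start of the batch."""
    return np.array([x[max(0, k - dK):k + 1].mean() for k in range(len(x))])

def ewma(x, lam=0.3):
    """Exponentially weighted moving average of Eq. (4.10); lam is the
    forgetting factor (an assumed value, not tuned on the plant data)."""
    out = np.empty(len(x))
    out[0] = x[0]
    for k in range(1, len(x)):
        out[k] = lam * x[k] + (1 - lam) * out[k - 1]
    return out

# Step change at sample 1000: the long moving average barely reacts at
# first (long-term memory), while the EWMA tracks the step within tens
# of samples, i.e. it quickly forgets the past.
x = np.concatenate([np.zeros(1000), np.ones(1000)])
ma, ew = moving_average(x), ewma(x)
```

This is the behavior reported in the text: the long window damps abrupt step-wise changes that do not affect the bulk properties, whereas the schemes of Eqs. (4.9)-(4.10) follow them almost immediately.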


Figure 4.7 Prediction of (a) the acidity number and (b) the viscosity for validation batch #4 using the MATP-PLS model. The vertical bars represent the laboratory assay accuracy.

Finally, it should be stressed that more sophisticated nonlinear methods (e.g., the multiresolution methods proposed by Kosanovic and Piovoso, 1997, and by Maulud et al., 2006) were also investigated, but they provided results similar to those of the moving-average strategy, though with a heavier computational burden.

4.3 Comparison of the estimation performances

Table 4.2 allows for a quantitative comparison of the three three-phase estimators that were designed: the static one, the “lagged” dynamic one, and the “averaged” dynamic one. It is clear that including some form of time information in the X matrix greatly increases the amount of variance that can be explained on the quality data, and using averaged measurements (MATP-PLS estimator) appears to be better than using lagged measurements (LTP-PLS estimator). Note that the amount of variance captured in the Y matrix during Phase 3 is relatively small for all estimators and both quality variables. This is due to the fact that, when the end of the batch is approaching, the process measurement profiles flatten considerably and the signal-to-noise ratio decreases, making the process variables much less effective predictors of the quality variables. It is also interesting to note that in the LTP-PLS estimator the variance captured in the Y matrix increases considerably in Phases 2 and 3 with respect to the TP-PLS model, despite a smaller number of retained LVs and a larger number of process variables, which causes the captured variance of the X matrix to decrease. This indicates that the lagged measurements do bring “new” valuable information for the prediction of quality, information that was not present in the original X matrix; although this new information contained in the “lagged” XL matrix cannot be captured to a large extent, it is nevertheless much more predictive of the quality matrix.


Table 4.2 Explained variance of the TP-PLS, LTP-PLS and MATP-PLS estimators on the process and quality variables for both acidity number and viscosity (calibration dataset). All values in %.

                 TP-PLS model               LTP-PLS model              MATP-PLS model
             NA est.      μ est.        NA est.      μ est.        NA est.      μ est.
  Phase    on X  on Y   on X  on Y    on X  on Y   on X  on Y    on X  on Y   on X  on Y
    1     62.00 88.50  63.74 86.57   66.12 96.47  67.47 93.14   70.77 95.56  71.74 94.52
    2     67.38 82.97  67.47 78.46   57.11 88.92  57.14 83.66   67.42 91.13  68.63 85.04
    3     73.78 59.84  72.56 52.93   61.68 67.47  61.01 55.59   74.21 72.26  75.57 61.90

During each phase of the generic validation batch i, the estimation accuracy on the quality variable q can be evaluated in terms of the mean relative prediction error MRPE_{i,q}:

\mathrm{MRPE}_{i,q} = \frac{\sum_{h=1}^{n_\mathrm{sample}} \sqrt{\left( \frac{y_{i,q,h} - \hat{y}_{i,q,h}}{y_{i,q,h}} \right)^{2}}}{n_\mathrm{sample}} \cdot 100 , \qquad (4.11)

where y_{i,q,h} is the measured value of quality variable q at the hth sampling instant of that phase, ^ indicates an estimated value, and n_sample is the total number of quality samples in the phase. This error can be averaged over all the validation batches, to get an MRPE value for each estimated quality variable during each of the estimation phases.
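Eq. (4.11) amounts to the mean absolute relative error, in percent; a minimal sketch with made-up quality samples (not the real lab assays):

```python
import numpy as np

def mrpe(y_meas, y_est):
    """Mean relative prediction error of Eq. (4.11), in percent: the
    square root of the squared relative error, averaged over the
    n_sample quality samples of a phase."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_est = np.asarray(y_est, dtype=float)
    return np.mean(np.sqrt(((y_meas - y_est) / y_meas) ** 2)) * 100.0

# Made-up acidity-number samples within one estimation phase
y_lab = [40.0, 30.0, 20.0, 10.0]
y_soft = [38.0, 30.0, 21.0, 9.5]
err = mrpe(y_lab, y_soft)   # relative errors 5%, 0%, 5%, 5% -> 3.75%
```

Averaging the per-batch values of this index over all validation batches gives the phase-wise MRPE bars of Figure 4.8.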

Figure 4.8 Comparison between the estimation accuracy of the three-phase PLS models in terms of the average mean relative prediction error (MRPE) on the validation datasets for (a) the acidity number and (b) the viscosity. The dashed lines represent the laboratory analysis accuracy.

In Figure 4.8, the MRPE on the acidity number and on the viscosity for the three soft sensors is shown. Although the explained variance on the viscosity during Phase 3 is lower than in the other phases, the predictions are very accurate. It can be seen that, although all three soft sensors provide an estimation accuracy generally within that of the laboratory analysis, the MATP-PLS model shows a superior overall performance.

4.3.1 Reliability of the estimations

The estimations of the quality indices cannot be trusted blindly, because occasionally the soft sensor may provide wrong estimations. Using diagrams similar to those reported in Figures 4.3a and 4.3b, the reliability of an estimate provided by any of the estimators can be assessed online during each estimation phase. Indeed, the reliability of the estimation can be evaluated by comparing the instantaneous value of the scores, of T2, or of the SPE with the respective confidence limits. If all the monitored statistical indices are within their respective 95% limits, the estimation is considered to be reliable (Figure 4.9), with a 5% uncertainty.


Figure 4.9 SPE residuals control chart with 95% confidence limits for the online assessment of the reliability of the viscosity estimation using the MATP-PLS model in validation batch #4.

On the contrary, if one of the statistical indices exceeds its limit, the estimation loses reliability; the farther the statistics are from the limits, the lower the reliability. For example, the estimated viscosity profile during validation batch #5 is shown in Figure 4.10a. While the batch is being run, the quality measurements are of course not available to assess the reliability of the estimation. However, by complementing the estimation results (Figure 4.10a, full line) with an SPE-residuals plot (Figure 4.10b), one can detect online when the estimated quality values are not reliable (a T2 plot can additionally be used, providing similar results).
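The online SPE check can be sketched as follows; the loadings matrix P, the scaling parameters and the 95% SPE limit are hypothetical model quantities standing in for those of the relevant phase submodel, and orthonormal loadings are assumed for the residual computation.

```python
import numpy as np

def spe_check(x_new, P, mean, std, spe_lim_95):
    """Reliability check of Section 4.3.1: scale the new process sample,
    project it onto the submodel loadings P (J x A), and flag the quality
    estimate as reliable only if the squared prediction error of the
    residual is within the 95% confidence limit."""
    x = (np.asarray(x_new, dtype=float) - mean) / std
    scores = x @ P                 # projection onto the latent space
    residual = x - scores @ P.T    # part of x the submodel cannot explain
    spe = float(residual @ residual)
    return spe, spe <= spe_lim_95

# Hypothetical 3-variable, 1-LV submodel with orthonormal loadings
P = np.array([[1.0], [0.0], [0.0]])
spe, reliable = spe_check([2.0, 0.1, 0.0], P, mean=0.0, std=1.0,
                          spe_lim_95=0.05)
```

Plotted against time, the returned SPE values reproduce a control chart like Figure 4.9, with the limit marking where the estimate stops being trustworthy.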


Figure 4.10 Online assessment of the reliability of the viscosity prediction using the MATP-PLS estimator in validation batch #5: (a) viscosity profile and (b) SPE residuals control chart with 95 % confidence limits.

This example shows that, at the beginning of batch #5 and for the entire Phase 1 and Phase 2, the estimations are reliable. During Phase 3, instead, the estimator starts to fail around time instant #5450, providing unreliable estimations. The alarm on the SPE residuals, which exceed the confidence limit (dashed line in Figure 4.10b), points out this lack of reliability.

4.3.2 Diagnosis of soft sensor faults

When an alarm on the Hotelling T2 statistic or on the SPE residual statistic indicates an unreliable estimation, an in-depth analysis can be carried out to understand why the soft sensor fails to give reliable estimations. The causes of the mistaken estimations can be found by inspecting the contribution of each variable to the relevant statistic. Since the T2 statistic and the SPE residuals are cumulative values to which each process variable contributes, the variables giving the highest contribution to the alarm are the main candidates for disclosing the root cause of the soft sensor malfunction. Accordingly, to get information about a soft sensor fault, the contribution plots can be inspected; in this way, the variables that are most affected by the root cause of the malfunction can be identified. First, the contributions to the statistic exceeding its limit have to be studied, to identify which variable contributes most to the alarm. Secondly, the time trajectory of that variable's contribution should be compared with the respective 95 % confidence limits, fixed by studying the variance of the reference dataset, to identify the time instant at which the cause of the soft sensor fault appears.
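The contribution analysis can be sketched as the signed per-variable residual (the quantity plotted in Figure 4.11a), whose sum of squares is the SPE; the loadings and the sample below are hypothetical, and orthonormal loadings are assumed.

```python
import numpy as np

def residual_contributions(x_new, P, mean, std):
    """Signed contribution of each scaled process variable to the model
    residual E (Section 4.3.2). The SPE statistic is the sum of the
    squared contributions, so the largest-magnitude entries flag the
    candidate root cause of a soft sensor fault."""
    x = (np.asarray(x_new, dtype=float) - mean) / std
    return x - (x @ P) @ P.T

# Hypothetical 3-variable, 1-LV submodel: the third variable carries an
# abnormal residual and would be singled out in the contribution plot
P = np.array([[1.0], [0.0], [0.0]])
e = residual_contributions([2.0, 0.1, 3.0], P, mean=0.0, std=1.0)
```

Tracking one entry of this vector in time, against its reference-set confidence limits, reproduces the single-variable chart of Figure 4.11b.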


Figure 4.11 Online identification of the causes of unreliability of the viscosity estimation using the MATP-PLS estimator in validation batch #5: (a) instantaneous contributions of all the variables to the residual, with the respective 95% confidence limits, and (b) contribution of variable #14 to the residual over time.

For instance, in the case of batch #5, where the soft sensor gives reliable and very accurate estimations during Phases 1 and 2 while the estimation loses reliability during Phase 3 (around time instant #5450), variable #14 shows a pronounced contribution to the residual compared to its limit and to the other variables (Figure 4.11a). On a relative basis, variable #14 is the one with the highest contribution when compared with its respective 95% limit, which is a function of the current estimation phase and is kept constant within it. The same result is shown in Figure 4.11b, where this single variable is monitored in time, and where the time instant at which the cause of the malfunction appears can be detected.

4.4 Soft sensor for the estimation of quality in resin B

In Chapter 2 it was highlighted that the production of resin B is carried out through two different stages:
• Stage 1, which produces a pre-polymer with loosely specified characteristics;
• Stage 2, which completes resin B, giving the desired end-point quality to the product.
Stage 1 has a short duration, and only a few quality samples are available. For this reason, it is important to determine when the pre-polymer is in-spec and when to stop Stage 1. However, it is not possible to build a relevant model for the estimation of quality in Stage 1, due to the scarce number of available quality lab assays. During Stage 2, the pre-polymer is transformed into the end product. Interventions on the recipe are carried out by the operators during this stage, in response to the quality measurements coming from the lab. A soft sensor estimating the product quality in real time from the process measurements was designed.


Chapter 4

In the next section, the quality estimator for resin B is presented, and its performance is discussed.

4.4.1 Estimation of the quality indicators

As also done for resin A, two moving-average PLS regression models were designed: one for the estimation of the acidity number, and one for the estimation of the viscosity of the reacting mass. The total number of batches available for this study was 36 (19 months of plant operation). Of these batches, 27 were designated as the calibration dataset; the remaining 9 batches constituted the validation dataset. Reference data were collected in a 3D matrix X (I×J×Ki), where J = 19 is the number of process measurements that were eventually retained. Also in this case, batch alignment proved unsuccessful. Therefore, the process variable trajectories being quite dissimilar in this stage, the three-way process data matrices were variable-wise unfolded. Moreover, VWU preserves the nonlinearity between the predictor space and the response space (Kourti, 2003), and is preferred when the correlation structure of a process is roughly constant (Camacho et al., 2008a).


Figure 4.12 Stage 2 scores plane in the first latent variable for the calibration dataset (reference is made to the time instants when viscosity measurements are available). The squares indicate how a single batch within the dataset projects onto this plane as time progresses from the beginning [1] to the end [15] of the stage. Dashed lines indicate the approximate locations of the clusters.

To compensate for these drawbacks, the same approach used in the quality estimation of resin A was adopted, i.e. the production stage was split into different estimation phases, and distinct PLS submodels were designed for each estimation phase. To determine the number of estimation phases, a simple approach proved satisfactory: plotting the X-scores vs. the Y-scores of the first latent variable for the whole calibration dataset (Figure 4.12) clearly shows that two clusters are present in the score plane, each cluster representing an estimation phase. Therefore, two submodels were

Soft sensors for the realtime quality estimation in batch processes


built to estimate NA (or μ) within Stage 2. Note that cluster analysis (Lu and Gao, 2005a; Beaver et al., 2007) could have been used for an automatic detection of the clusters in the scores space. However, it should also be noted that the number of clusters must be kept as small as possible, because if too few quality measurements are available within a cluster, it may be impossible to design the relevant PLS submodel. We observed that “time” is really not a good indicator to assess phase switching in this process. Run-to-run variability is extremely large (for example, Stage 2 length ranges from 27.5 to 48.9 hours), and the switching time shows a large variability, too. Therefore, submodel switching was linked not to time, but to events (as in the previous case study): there are certain processing events that do occur in all batches and change the correlation structure between the variables, although they occur at a different time from one batch to another (a similar approach has been used recently by Doan and Srinivasan, 2008). The occurrence of these events (which can be easily detected online) dictates phase switching. We believe that this approach is more general than what could be obtained if time were used to designate submodel switching. We found that the switching event was the same both for the NA-model and for the μ-model, and is related to a change of pressure in the reactor that is part of the production recipe during Stage 2. To attenuate the effect of measurement and process noise, and to provide the PLS model (which is inherently static) with “memory”, the moving-window approach was used. The process measurements included in the X matrix were averaged over a moving time window of 900 past time instants (7.5 h), this width having been determined in such a way as to minimize the mean relative prediction error on the validation dataset.
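The moving-window averaging described above can be illustrated with a short sketch (an illustrative implementation, assuming a fixed sampling interval; all names are hypothetical):

```python
import numpy as np

def moving_average_window(X, width):
    """Average each process variable over the past `width` samples.

    X     : (K, J) matrix of process measurements over time
    width : window length (e.g. 900 samples = 7.5 h at 2 samples/min)
    Row k of the result is the mean of rows max(0, k-width+1)..k,
    so early samples are averaged over whatever history exists.
    """
    K, J = X.shape
    out = np.empty_like(X, dtype=float)
    csum = np.vstack([np.zeros((1, J)), np.cumsum(X, axis=0)])
    for k in range(K):
        lo = max(0, k - width + 1)
        out[k] = (csum[k + 1] - csum[lo]) / (k + 1 - lo)
    return out

# A constant signal is unchanged; a step is smoothed
X = np.array([[0.0], [0.0], [4.0], [4.0]])
Xs = moving_average_window(X, width=2)
assert np.allclose(Xs[:, 0], [0.0, 0.0, 2.0, 4.0])
```

The cumulative-sum trick keeps the cost linear in the number of samples, which matters for online use at two samples per minute.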
Not only did this provide a significant smoothing of the estimated quality profiles, but it also increased the amount of predictive information included in the X matrix, which made the quality estimation more accurate. As observed also by Ku et al. (1995) in a different context, cross-validation proved ineffective for the determination of the number of latent variables (LVs) to be retained in the submodels. Therefore, this number was determined by minimizing the estimation error on the validation dataset. As far as the estimation of NA is concerned, 6 and 3 LVs were used for the Phase 1 submodel and the Phase 2 submodel, respectively; for the estimation of μ, 2 LVs were used during Phase 1, and 3 during Phase 2. Typical estimation results in a validation batch are shown in Figure 4.13, where quality estimations are compared to lab measurements for both the acidity number (Figure 4.13a) and the viscosity (Figure 4.13b). It can be seen that the estimated profiles of the quality indicators are smooth (noise-free), the estimation accuracy is within the accuracy of the laboratory measurements, and the estimation frequency is much higher than that of the laboratory measurements. These results are then displayed in an “industrial” monitoring chart (Figure 4.14), where quality estimations (solid line) are compared to lab measurements (dots). This is a further proof that the estimations compare well with the actual measurements, and indeed can


be used as surrogate measurements to guide the operators throughout the application of the processing recipe.


Figure 4.13 Comparison of estimated vs. measured quality variables: (a) acidity number and (b) viscosity, in a typical validation batch.


Figure 4.14 Comparison between lab measurements (circles with measurement uncertainty) and realtime estimation (solid line) of the resin quality in an industrial monitoring chart (validation batch) during Stage 2. Acidity number is reported as the abscissa (decreasing values from left to right), and viscosity as the ordinate. Non-standard units are used. The measured values of acidity number and viscosity should always fall within the bounds (broken lines). Time increases (nonlinearly) from the lower-left corner to the upper-right one.


Table 4.3 summarizes the results for the bi-phase PLS model (BP-PLS) and the moving-average BP-PLS model (MABP-PLS). The improvement of performance brought by the inclusion of dynamic information is apparent, and the accuracy of the online estimation proves to be close to that of the lab assays. It should be highlighted that, although the average relative prediction error on the acidity number of the validation batches during Phase 2 is ~20%, this error is tolerable, because the absolute value of NA is very small, so that such a relative error corresponds to a very low absolute error.

Table 4.3 Comparison of the mean relative estimation errors (%) of different models for the quality estimation during Stage 2 of validation batches for the production of resin B. The optimal number of retained latent variables is shown for every model.

Model       Phase 1, NA (%)   Phase 1, μ (%)   Phase 2, NA (%)   Phase 2, μ (%)
BP-PLS      17.2 (4 LV)       10.6 (2 LV)      30.0 (5 LV)       14.1 (2 LV)
MABP-PLS    15.3 (2 LV)       11.1 (4 LV)      20.1 (2 LV)       12.3 (2 LV)

Moreover, it should be remarked that lab measurements are spaced (roughly) by 2 h, whereas quality estimations are made available at the same frequency as process measurements (about two per minute). Therefore, if the soft sensor is employed in real time, recipe adjustments can be carried out much more promptly, the chances that product quality drifts outside the acceptable bounds are minimized, and the length of the batch can be shortened. The actual implementation of this soft sensor also makes it possible to significantly reduce the number of samples to be taken and analyzed during a batch, which contributes to cutting the lab-related expenses and allows the operators to be redirected to more qualifying duties. As for the realtime implementation of the soft sensor, online detection of the switching instant from the available process measurements was a key issue to guarantee a good performance of the sensor. Standard digital filters were used to protect the soft sensor from measurement noise and spurious events that might disrupt its performance by erroneously triggering a phase switch.
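One simple way to protect the phase-switch trigger from noise spikes, in the spirit of the digital filtering mentioned above, is a persistence (debounce) check; the following is a hypothetical sketch, not the filter actually used in the plant:

```python
def debounced_switch(signal, threshold, hold=5):
    """Trigger the phase switch only after the switching condition
    (signal above threshold, e.g. a sustained pressure change) has
    persisted for `hold` consecutive samples, so that isolated noise
    spikes cannot cause a spurious submodel change.
    Returns the index of the trigger instant, or None if no switch."""
    count = 0
    for k, v in enumerate(signal):
        count = count + 1 if v > threshold else 0
        if count >= hold:
            return k
    return None

# A single spike does not trigger; a sustained change does
sig = [0, 0, 9, 0, 0, 5, 5, 5, 5, 5, 5]
assert debounced_switch(sig, threshold=1, hold=5) == 9
```

The `hold` parameter trades detection delay against robustness: a longer hold delays the submodel switch by a few sampling intervals but rejects longer disturbances.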

4.5 Concluding remarks

Partial least squares regression has proved to be a reliable tool for the online estimation of the product quality properties in industrial batch polymerization processes. The process under study (i.e. the manufacturing of resins) was characterized by a large number of available process measurements, uneven batch duration, a small number of quality variable measurements with uneven sampling of (and lag on) these variables, and a complex, almost entirely non-reproducible sequence of processing steps. It was shown that the product quality can be estimated in real time from the available process measurements with an accuracy similar to


that of the lab instrumentation, but at a much higher frequency, with no delay, and with no need for dedicated personnel. The frequency at which the quality estimations are made available is 2 min⁻¹, i.e. 240 times the lab measurement frequency. To compensate for the nonlinear nature of the input/output mapping and the changing correlation structure between variables, a segmentation of the batches into a limited number of estimation phases was carried out by highlighting different clusters of score points in the scores plot of the reference dataset. Within each of these phases, linear PLS submodels were shown to provide accurate quality estimations. Switching from one submodel to another was triggered by clearly detectable landmark events occurring in the process. Inclusion of time information into the process data matrix was shown to substantially improve the estimation accuracy. Namely, augmenting the process measurement matrix with lagged measurements dampened most of the noise on the estimated values of the quality variables. Furthermore, providing the soft sensor with a “memory” through a moving-window approach was highly beneficial to increase the estimation accuracy, without introducing any significant complication in the structure of the soft sensor. Averaging the process measurement values over a moving window of fixed length provided valuable information on the batch evolution that proved to be useful both to suppress measurement noise and to attenuate process noise, especially during the phases with a high degree of inter-batch variability (i.e. when corrections and additions of fresh raw material take place). One of the advantages of the resulting moving-average multiphase PLS estimator is that it is very easy to implement, because it does not require modifying the structure of the PLS algorithm.
Furthermore, using averaged measurements is an easy way to handle noise spikes or temporarily missing values of the process measurements. However, care must be taken in selecting the length of the moving window, because too wide a window may delay the appearance of out-of-threshold values in the T2 or SPE residual control charts. Therefore, the product quality can be estimated in real time and, to compensate for quality drifts, adjustments to the production recipe can be carried out very promptly. This minimizes the risk of obtaining off-spec products and reduces the overall processing time. Furthermore, the number of product samples that need to be taken and analyzed in the lab can be drastically reduced. Indeed, product samples can be taken only when the product is deemed to be close to specification, which contributes to cutting the lab-related expenses and allows the operators to be redirected to more qualifying duties. In summary, realtime knowledge of the product quality can significantly improve the operation of a batch and cut the expenses related to sample handling and analysis.

Chapter 5
Realtime prediction of batch length

Typically, a monitoring system in the production of specialty chemicals has to tackle the challenging issue of understanding the “maturity” of a product in real time. In fact, for several specialty productions it is not possible to know in advance either the total batch length or the length of any processing stage within the batch, since the batch duration and the stage durations are determined by several different causes (such as variable quality of the raw materials, uneven quality and quantity of midcourse corrections, delayed timing of the operations on the plant), most of which cannot be known in advance. On the other hand, the realtime estimation of the total length of the batch would be very useful for production planning and scheduling of equipment use, as well as to coordinate the operating labour resources (Marjanovic et al., 2006). In this Chapter¹, the realtime prediction of the batch length and of the duration of the production stages within a batch is addressed for the case study of the production of resins. Multivariate statistical techniques based on the projection onto latent structures (Wold et al., 2001) are used for this purpose. Reference is made to the fed-batch process where resins A and B are produced. It is shown that, by appropriately tailoring some existing techniques, information can be extracted from the available process measurements that can significantly improve the overall performance of the process, offering an insight into the time development of a batch production.

5.1 Design of an evolving PLS model for the prediction of batch length

In batch processes, the number of quality measurements within a processing stage is often too small to allow designing a PLS model for the online estimation of the product quality within that stage. To monitor the evolution of the batch and of the respective stages, the batch length τ or the length of any stage can be predicted instead. The possibility of knowing the value of τ in advance would allow the operators to perform timely interventions on the plant and to reduce

¹ Portions of this Chapter have been published in Facco et al. (2008a) and in Faggian et al. (2009).



the number of samples needed for laboratory analysis, because samples would only be taken starting from the time when the stage is expected to terminate. Thus, if the performance of the estimator is adequate, only one or two samples may be sufficient to detect the stage termination time.

Figure 5.1 Scheme of the evolving PLS model procedure to perform the online prediction of the batch and stage length.

By inspecting the process variable profiles, it was noted that, whatever the product being processed, during the time evolution of a single stage the profile of each process variable displays similar trends in all batches, and only the stage length seems to discriminate one batch from another. Therefore, because a certain degree of similarity was apparent among the early stages of


all the batches, a multi-way PLS model (Nomikos and MacGregor, 1994) using BWU was developed to provide realtime estimation of the stage length. Alignment techniques based on the indicator variable approach (Kourti, 2003; Ündey et al., 2003) proved unsuccessful for synchronizing the process variable trajectories. This is probably due to the fact that the process variable trajectories are correlated with time in a highly nonlinear way. Therefore, a simpler approach was taken: the values of the process measurements collected at times exceeding a threshold value t* were simply disregarded. As for the number of process measurements to be included in the X matrix, an engineering analysis suggested discarding a small subset of the available measurements, i.e. those that either were expected to provide no contribution to the batch dynamics or had a markedly non-smooth profile throughout the batch. Following these indications, constant set-points and on-off variables were eliminated from the dataset. This resulted in X being unfolded into an (I×J·k) two-way matrix XBWU, where J is the number of process measurements that were eventually retained, and k is the number of time instants used to estimate τ, with k = 1,…,τ* and τ* the number of samples corresponding to the time horizon t*. The response matrix Y reduced to a column vector containing the lengths of the I batches. Before further processing, the XBWU and Y matrices were auto-scaled. The realtime estimation of τ was accomplished by designing a set of time-evolving PLS models to be used within each batch (Wold et al., 1996; Westerhuis et al., 1998; Louwerse and Smilde, 2000; Ramaker et al., 2005). Each model refers to the time instant k ≤ τ* at which a process measurement becomes available, and uses the process variable values from time instant 1 up to time instant k to estimate τ, as represented in Figure 5.1. Therefore, the dimension of XBWU increases as time progresses.
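The batch-wise unfolding with a growing time horizon described above can be sketched as follows (a minimal illustration assuming the aligned batch data are stored in a three-way array; names are hypothetical):

```python
import numpy as np

def unfold_bwu_up_to(batches, k):
    """Batch-wise unfold the first k time instants of each batch:
    each row concatenates the J measurements at instants 1..k,
    giving an (I, J*k) matrix that grows as the batch evolves.
    One evolving PLS model would be fitted per value of k.

    batches : (I, K, J) array of aligned process data, with K >= k
    """
    I, K, J = batches.shape
    return batches[:, :k, :].reshape(I, k * J)

# Three batches, four time instants, two variables
data = np.arange(3 * 4 * 2, dtype=float).reshape(3, 4, 2)
X2 = unfold_bwu_up_to(data, k=2)
assert X2.shape == (3, 4)                     # J*k = 2*2 columns
assert np.allclose(X2[0], data[0, :2].ravel())
```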
However, note that this has a negligible effect on the calculation time. The goodness of prediction is checked through the time-averaged absolute error TAAEi of the stage length prediction in a batch i over the whole estimation window t*, which is defined as:

TAAEi = (1/t*) ∫₀^t* |εi(t)| dt ,    (5.1)

where:

εi(t) = τ̂i(t) − τi    (5.2)

is the instantaneous error of estimation of the stage length in batch i, τ̂i(t) is the value of the stage length in the same batch as estimated at time t (i.e., at time instant k), and τi is the actual length of the stage in the same batch. Equation (5.1) is evaluated with a finite-difference approximation. Further averaging the value of TAAEi over all the validation batches provides the value of the overall average absolute error AAE for the whole validation dataset.
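A finite-difference evaluation of Eq. (5.1) amounts to averaging the absolute prediction error over the sampling instants of the estimation window; a minimal sketch (with hypothetical names):

```python
import numpy as np

def taae(tau_hat, tau_actual):
    """Time-averaged absolute error of the stage-length prediction:
    a rectangular finite-difference approximation of Eq. (5.1),
    TAAE_i = (1/t*) * integral_0^t* |tau_hat_i(t) - tau_i| dt,
    taken over the samples in the estimation window.

    tau_hat    : sequence of predicted stage lengths, one per instant
    tau_actual : actual stage length of the batch (scalar)
    """
    eps = np.abs(np.asarray(tau_hat, dtype=float) - tau_actual)
    return eps.mean()

# A constant 1 h bias over the whole window gives TAAE = 1 h
pred = [16.0] * 100
assert abs(taae(pred, 15.0) - 1.0) < 1e-12
```

Averaging `taae` over all validation batches would then give the overall AAE used in the text.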


The number of latent variables to be retained in the evolving PLS model can be determined by minimizing the value of AAE.

5.2 Prediction of batch length in the production of resin B

In the production of resin B, the total duration of a batch is highly variable (from 50 to 80 h) and seemingly unpredictable, as are the lengths of Stage 1 and Stage 2. Therefore, it is hard for the management to appropriately schedule the use of the equipment when several batches are to be processed in series or in parallel. Evolving PLS models were built to predict the duration of Stages 1 and 2. In particular, I = 27 batches were used as the reference dataset, while 9 batches were used for the validation step. Finally, J = 19 process measurements were selected for the purpose of the stage and batch duration prediction.

5.2.1 Prediction of Stage 1 length

The length of Stage 1 in the available dataset ranged between 14.2 h and 26.1 h, i.e. the length variability (~12 h) was about 1.5 times the length of an operator’s shift window (8 h). The threshold length was set equal to the shortest Stage 1 length among all the available batches, namely t* = 14.2 h (i.e., τ* = 1700 time instants). One latent variable was retained in the model to minimize AAE. This single latent variable was not able to capture much of the variability in the XBWU space of the calibration dataset. In fact, for any batch of the calibration dataset, only 23 to 26 % of the XBWU variance was explained by the first latent variable, which indicates that only a small fraction of the information embedded in the measured process variable profiles is actually correlated with the length of the stage. Correspondingly, about 50 to 70 % of the variance in Y was explained.

Nevertheless, the length prediction was quite satisfactory, because the value of AAE was calculated as 196 time instants (~1.6 h), i.e. ~8 % of the average length of Stage 1 and ~14 % of the variability in the length. Figure 5.2 shows that, with reference to the average of the nine validation batches, the batch-averaged instantaneous absolute error of estimation is relatively low (~2.1 h) at the very beginning of the operation; it soon decreases down to ~1.8 h; then, starting from k ≅ 600 time instants, it further decreases steadily and reaches a minimum of ~1.3 h at the end of the estimation window.




Figure 5.2 Resin B, Stage 1: batch-averaged instantaneous absolute error of the Stage 1 length estimation in the validation dataset, as a function of time within the estimation window.

This is an indication that it is the information progressively collected during the evolution of the batch that proves useful for the estimation of τ. This issue is further clarified in Figure 5.3, which refers to a single validation batch. When incremental information is used to build matrix XBWU (i.e., when the column dimension of XBWU grows with time; evolving model), the estimation of τ is smooth and steadily improves after 600 time instants. However, if only instantaneous information is used to build the predictor matrix Xk (i.e., at time k, Xk is made only with the measurements taken at k; local model), the estimation of τ is much more erratic; it would be hard to have the process operators trust such an estimation (note that the actual length, which is also indicated in Figure 5.3, is obviously not known while the batch is being run). The information of Figure 5.3 can be complemented online with the plots of the Hotelling T2 and squared prediction error statistics, which provide an indication on whether the estimation is reliable or not. To appreciate how variable the validation results are, Table 5.1 provides the time-averaged results for each of the validation batches. Note that the overall AAE of the validation dataset can be calculated as the average of the curve shown in Figure 5.2, or as the average of the data reported in Table 5.1. From a practical perspective, the results shown in Figure 5.3 would be implemented in a slightly different way: the projected time of the day at which Stage 1 is expected to terminate would be shown on the operators’ display at selected time instants. About 1.6 h before the stage is expected to terminate, the operators can take one product sample and send it to the lab for analysis. Thus, the number of product samples that need to be analyzed can be minimized, which reduces the operator-related costs. Furthermore, it is possible to know in advance


whether the sample should be taken during the current shift or during the next one, which has a favourable impact on the workload organization.

Table 5.1 Resin B, Stage 1: time-averaged absolute estimation error (TAAEi) of the Stage 1 length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       17.4                2.2
2       16.1                1.2
3       16.9                0.3
4       16.4                0.8
5       18.9                0.6
6       22.6                0.6
7       24.1                4.8
8       21.3                3.2
9       23.3                1.1


Figure 5.3 Resin B, Stage 1: time evolution of the estimated length of Stage 1 in one validation batch for two different arrangements of the process data matrix X. Broken line: incremental information is used to build XBWU (evolving model); solid line: only instantaneous information is used to build the matrix Xk (local models). The actual length of the stage is also indicated (thin solid line), although it can only be known at the end of the stage.

It is interesting to note that only a subset of the variables included in the XBWU matrix provides a significant contribution to the estimation of τ. To appreciate this, the variable importance in the projection index (VIP; Chong and Jun, 2005) can be calculated for each process variable at each time instant. The results of this “dynamic” VIP analysis are reported in Figure 5.4. Process variables with VIP > 1 are considered “important” for the estimation of τ.
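The VIP index of Chong and Jun (2005) can be computed from the PLS weight matrix and the Y-variance explained by each latent variable; the following is a compact illustrative sketch (hypothetical names, not the thesis implementation):

```python
import numpy as np

def vip(W, ssy):
    """Variable Importance in the Projection.

    W   : (J, A) PLS weight matrix (one column per latent variable)
    ssy : (A,) Y-variance explained by each latent variable
    Returns the J VIP scores; variables with VIP > 1 are deemed
    "important" for the prediction. By construction the mean of
    the squared VIP scores is 1, which motivates that threshold.
    """
    J, A = W.shape
    Wn = W / np.linalg.norm(W, axis=0)       # normalize each weight vector
    return np.sqrt(J * (Wn ** 2 @ ssy) / ssy.sum())

# With a single LV, VIP reduces to sqrt(J) * |normalized weight|
W = np.array([[3.0], [4.0], [0.0]])
v = vip(W, np.array([1.0]))
assert np.allclose(v, [np.sqrt(3) * 0.6, np.sqrt(3) * 0.8, 0.0])
```

The "dynamic" analysis in the text would simply repeat this computation for the evolving model fitted at each time instant k.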



Figure 5.4 Resin B, Stage 1: profile of the VIP index over the Stage 1 estimation window length (1700 time instants) for (a) all process measurements (the numbers above the curves indicate the process variable number designation) and (b) the five most “important” measurements.

Five process variables show a value of VIP consistently larger than 1 throughout the whole estimation window. These variables are the reactor temperature (variable #16), the outlet temperature of the heating oil (variable #5), the setpoint for the inlet temperature of the heating oil (#15), the inlet temperature of the heating oil (#14), and the setpoint for the reactor temperature (#17). This suggests that the most important variables for the estimation of τ are those associated with the thermal behaviour of the reactor, which is consistent with what one would expect from engineering judgment. Furthermore, starting from k ≅ 400 time instants (i.e., 3.3 h), the VIP index keeps increasing for all of these variables, indicating that the “thermal behaviour” and the stage length get more and more correlated after that time. Then, after k ≅ 1500 time instants (~12.5 h), the VIP decreases for these variables, meaning that the correlation between thermal behaviour and stage length starts vanishing after that time. This indicates that the “temperature footprint” of the process is almost completely traced about twelve hours after the process has been started, which is consistent with the fact that the profiles of most process variables start flattening ~12 h after the beginning of the batch.

5.2.2 Prediction of Stage 2 length

Production planning is difficult for this process, because the length of a batch is not known a priori and changes a lot from batch to batch (e.g., the range of variability of the Stage 2 length is as large as three operator’s shift windows). Estimating the stage length in advance is very important to schedule the use of the equipment in the subsequent batches, and to plan the operating labour requirements. For these reasons, a soft sensor was designed to estimate the stage length in realtime. The approach was the same used for the realtime prediction of the Stage 1 length (evolving PLS model), and τ* = 3000 time instants was set (all symbols now refer to Stage 2). The results obtained were satisfactory, as shown in Figure 5.5a.


Figure 5.5 Resin B, Stage 2: (a) batch-averaged instantaneous absolute error of the τ estimation in the validation dataset as a function of time within the estimation window (evolving model); and (b) profile of the VIP index over the estimation window length for all process measurements (the numbers above the curves indicate the process variable number designation).

Table 5.2 Resin B, Stage 2: time-averaged absolute estimation error (TAAEi) of the Stage 2 length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       42.0                2.1
2       51.6                6.9
3       42.0                2.3
4       44.7                4.8
5       35.6                9.6
6       47.4                3.3
7       38.0                2.3
8       31.6                3.3
9       43.9                2.2

Figure 5.5b shows that no process variable is really “dominant” during Stage 2 as far as the stage length estimation is concerned. Almost all variables provide some kind of contribution to the estimation of τ, and the variables related to the “temperature footprint” of the reactor (e.g., variables #5, 14, 15, 16, and 17) are among the least important. On average, the estimation error on the validation dataset is larger than in Stage 1 (AAE = 490 time instants, i.e. ~4.1 h). However, this is only ~11 % of the average length of Stage 2, and ~20 % of the variability in the length, which is well below the length of an operator’s shift. Note that it takes only ~250 time instants (~2 h) to obtain a satisfactory estimation of the overall length of the stage; after ~1500 time instants from the beginning of


the stage, the average absolute estimation error further decreases by about 1 h. Table 5.2 provides time-averaged results for each of the validation batches.

5.3 Prediction of batch length in the production of resin A


An evolving PLS soft sensor for the prediction of the batch length was designed also for resin A. The batches considered in this case are the same as those used in Chapter 4 to design the quality estimator: I = 27 calibration batches were considered, while the number of validation batches is 5. The predictor variables are J = 21. The total length of the batches ranges between 40 and 60 h. The prediction of the batch length is of interest when accomplished in the first part of the batch, i.e. when the quality measurements are not yet available and it is not possible to design a soft sensor for the estimation of the quality indices. For this reason, the threshold value was set at t* = 8.3 h (i.e., τ* = 1000 time instants).


Figure 5.6 Resin A, start of the batch: batch-averaged instantaneous absolute error of the total length prediction in the validation dataset as a function of time within the estimation window.

One latent variable was retained in the model to minimize AAE also in this case. Although a single LV is not able to capture much of the variability of XBWU (~17 %), the prediction of the batch length proves satisfactory, because the value of AAE is 3.23 h, i.e. ~6.5 % of the average batch length and ~16.7 % of the variability in the length. This error is much lower than the length of one operator’s shift window. Furthermore, although the batch-averaged instantaneous absolute error of estimation in the validation batches (Figure 5.6) is relatively high at the beginning of the operation (~6 h), it decreases down to ~1.1 h at the time


t*, because the information accumulated after 8 h of the batch becomes more correlated with the total processing time τ. Table 5.3 provides the time-averaged results for each of the validation batches.

Table 5.3 Resin A, start of the batch: time-averaged absolute estimation error (TAAEi) of the total batch length for each of the validation batches.

Batch   Actual length (h)   TAAEi (h)
1       53.3                12.4
2       50.2                1.9
3       52.3                0.4
4       50.4                0.5
5       52.8                1.0

The time trajectories of the VIP index for each of the process variables highlight that the most relevant predictor variables are the reactor temperature, the temperature of the heating oil, and the top and bottom temperatures of the distillation column. Consequently, the most important variables for the prediction of the total batch length are associated with the thermal behaviour of the reactor and with the performance of the distillation column.

5.4 Concluding remarks
In this Chapter it was shown through an industrial case study how, by combining engineering judgment and mathematical modeling, multivariate statistical techniques can be exploited to assist the realtime monitoring of product quality and to deliver helpful information for effective production planning in the semi-batch processing of specialty chemicals. An evolving PLS modeling approach was exploited for the prediction of the duration of the batch. Namely, it was shown that, by incrementally using the information gathered during the evolution of the batch, a sound estimation of the length of the batch (or of any processing stage within the batch) can be obtained in realtime, with an average error that is at most as large as 17 % of the inherent batch-to-batch variability. This information is particularly useful in batch processing, as it allows one to schedule manual interventions, to optimize the manpower in terms of shifts and roles, to forecast the production time, and to plan the most convenient utilization of the plant equipment. The statistical analysis of the most significant process variables that contribute to determining the length of the batch confirmed that the initial heat-up stage is crucial for the development of the entire batch.

Chapter 6 Industrial implementation of a soft sensor prototype
Chapter 6 describes the "physical" implementation of prototypes of the soft sensors designed in the previous Chapters. In particular, it is explained how the virtual sensor technology is integrated into the supervision system of an industrial polymerization process for the production of resins. Three prototypes have been designed and implemented to work online: i) a three-phase moving-average PLS soft sensor for the realtime estimation of the quality indices of resin A; ii) a soft sensor for resin B that predicts the length of production Stage 1 in real time by means of an evolving PLS model; and iii) a soft sensor for resin B that estimates the quality indices of the resin through a bi-phase moving-average PLS model during production Stage 2.

6.1 Industrial supervision system The supervision system adopted in the industrial production facility under study is MoviconTM 9.1, which has a typical SCADA scheme (see §3.2). The central core of the supervision system interfaces to different modules through a client-server technology (Figure 6.1).

Figure 6.1 Structure of the communication system between the plant, the server of the supervision system, and the communication interface that visualizes the plant status and manages the recipes and the required interventions.

The central core of the supervision system manages automatically the following items:

• visualization of the state of the equipment;
• alarms;
• recording of all the operations carried out in the production facility;
• collection and visualization of the process variables;
• interfaces for the communication with the operating personnel;
• collection and visualization of the laboratory measurements of the quality indices;
• communication with the control devices.
The supervision system is based on a Visual Basic Sax platform, which supervises the "animation" of the clients, the correct functioning of the clients, and the communication with the online hardware sensors and with the controllers. Accordingly, the supervision system allows one to perform all the interventions required for the safety and the productivity of the manufacturing through OPCs (OLE, i.e. object linking and embedding, for process control) operating in real time. In fact, data are acquired both from the PLC controllers and from the hardware sensors to collect the process variables measured online. All the acquired data are recorded in SQL databases that are easily consultable by the operating personnel. Consequently, the operators can observe and modify the current production through a system of views and queries. Furthermore, a supervision server ensures the direct communication between the supervision system and both the PLCs and the regulators, while some client personal computers are present in the production facility. The supervision system interfaces to these clients through networking variables available at different levels. A local area network (LAN) with an Ethernet structure guarantees the communication between servers and clients and conveys all the process information to the recording system.

6.2 Implementation of the soft sensor
In Chapters 4 and 5, three different types of soft sensors have been designed for the polymerization processes for the production of resins A and B. In particular, the following soft sensors were implemented:
• a three-phase moving-average PLS soft sensor (called Prototype A) for the online estimation of the quality indices, which provides both the estimations of NA and μ and a measure of the reliability of the estimations for the entire duration of the batches for the production of resin A;
• an evolving PLS soft sensor (called Prototype B1) for the realtime prediction of the length of the operating Stage 1 of the production of resin B;
• a bi-phase moving-average PLS soft sensor (called Prototype B2) for the online estimation of the quality indices in Stage 2 of the production of resin B.

These virtual sensors have been working online at a prototype level in the production facility for the manufacturing of the abovementioned resins. These soft sensors are mathematical models based on PLS that regress the relevant characteristics (quality indices or batch/stage length) from the process variables that are available online (measured by hardware sensors). However, the "physical" implementation of a soft sensor requires more than the mere estimation or prediction task, because there is the need of:
• interfacing with the process;
• interfacing with the operators;
• acquiring the process variable measurements;
• delivering the output values, i.e. the estimation of the quality indices or the prediction of the batch/stage duration, together with the reliability of the estimations/predictions.
The models built for the online implementation refer to the entire dataset of the available past batches, including the set of validation batches considered in Chapters 4 and 5.

6.2.1 Architecture of the soft sensor
The soft sensors for the online estimation of the quality indices and for the prediction of the batch/stage duration were developed in the MatlabTM computing environment (www.mathworks.com). MatlabTM is also provided with a specific package for the multivariate statistical tools, namely the PLS_ToolboxTM (www.eigenvectorresearch.com). The online implementation implies a complex series of operations, which are represented in Figure 6.2. The sequence goes through the following steps:
• the first time the supervision system interrogates the routine for a batch, the "memory" of the soft sensor is erased and the virtual sensor starts to process the data;
• the soft sensor acquires the array of the 34 process variables measured online, and stores them in such a way as to create the "memory" of the soft sensor;
• some flags are applied to recognize both the current stage of the batch for the duration prediction, and the current phase for the estimation of the quality;
• the process variables are then properly selected and pre-treated. For example, the subset of the process variables most relevant to the purpose of quality estimation (or to the purpose of batch length prediction) is selected. Then, the relevant unfolding procedure is performed (BWU in the case of the length prediction and VWU in the case of the quality estimation), and the variables are scaled and mean-centred. When needed, the moving-average filter is applied to smooth the process and the measurement noise, to dampen the effect of outliers, and to make the soft sensor less sensitive to short interruptions;
• at this point, data are ready to be treated by the soft sensors, and are fed into either the quality estimator or the length predictor;

• the outputs of the online quality estimator (i.e. acidity number and viscosity) and of the realtime duration predictor (i.e. date and time of the end of the batch/stage) are provided to the supervision system with the respective reliability indices. After that, they can be displayed through the communication interface.
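The sequence of operations above can be sketched as follows. This is a simplified, hypothetical Python rendering of one interrogation step (the actual prototypes are MatlabTM codes; names such as `keep`, `W`, `P`, `Q` and the direct use of the PLS weights as a projection are illustrative assumptions, not the plant implementation):

```python
import numpy as np

class SoftSensor:
    """Minimal sketch of one interrogation of the quality estimator."""

    def __init__(self, model, ma_window=10):
        self.model = model
        self.ma_window = ma_window
        self.memory = []                      # growing "memory" of the batch

    def step(self, x_new):
        self.memory.append(np.asarray(x_new, dtype=float))
        X = np.vstack(self.memory)
        # moving-average filter to smooth process and measurement noise
        w = min(self.ma_window, len(self.memory))
        x_ma = X[-w:].mean(axis=0)
        m = self.model
        # variable selection and autoscaling with calibration statistics
        x = (x_ma[m["keep"]] - m["mean"]) / m["std"]
        t = x @ m["W"]                        # scores (simplified projection)
        y_hat = t @ m["Q"]                    # estimated quality indices
        T2 = float(np.sum(t**2 / m["lv_var"]))        # Hotelling statistic
        SPE = float(np.sum((x - t @ m["P"].T) ** 2))  # residual statistic
        reliable = int(T2 <= m["T2_lim"] and SPE <= m["SPE_lim"])
        return y_hat, T2, SPE, reliable

# toy model: 34 process variables, 5 selected, 2 latent variables
rng = np.random.default_rng(0)
model = {"keep": np.arange(5), "mean": np.zeros(5), "std": np.ones(5),
         "W": rng.standard_normal((5, 2)), "P": rng.standard_normal((5, 2)),
         "Q": rng.standard_normal((2, 2)), "lv_var": np.array([2.0, 1.0]),
         "T2_lim": 9.2, "SPE_lim": 15.0}
sensor = SoftSensor(model)
for k in range(12):                           # 12 interrogations, 30 s apart
    y_hat, T2, SPE, reliable = sensor.step(rng.standard_normal(34))
print(y_hat.shape, reliable in (0, 1))
```

The reliability flag returned at each step corresponds to the 0/1 alarms described later in this Chapter: the estimate is declared unreliable as soon as either monitoring statistic overshoots its limit.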

Figure 6.2 Scheme of the sequence of operations performed by the soft sensor implemented online in the batch polymerization process for the manufacturing of resins.

The soft sensors are embedded into the supervision system (see also Figure 6.1), which commands the sequence to be repeated every 30 s.

6.2.2 MatlabTM codes of the soft sensors
The implemented soft sensors are built as MatlabTM codes. In principle, the "architecture" of both implemented soft sensors has the same structure, based on three MatlabTM codes (.m files) and the relevant models (.mat files). The codes are:
• an initialization code for erasing the "memory" of the soft sensor and for beginning the treatment of the data;

• a code for the management of both inputs and outputs, and for the administration of the alarms for the lack of reliability;
• a code performing the phase/stage detection and switching, the data pre-treatment (e.g., calculation of moving averages and lagged variables; variable selection; data unfolding), and the online estimation of the quality or the realtime prediction of the stage duration.
In the following sections, details are given on Prototype A and on Prototypes B1 and B2.
6.2.2.1 Prototype A
The soft sensor for resin A is a three-phase moving-average PLS virtual sensor for the online estimation of the quality indices throughout the entire duration of the batch. The soft sensor is constituted by the files:
• InizializzazioneModelloSIRCA.m;
• OnlineSensor1.m;
• SIRCAproject1v1.m;
• Modello.mat.
The file Modello.mat is a cell array that contains the three-phase moving-average PLS model. In particular, one model is present for each estimation phase and for each of the two quality indices. The first code, InizializzazioneModelloSIRCA.m, is a function for the initialization of the soft sensor. It is called by the supervision system when a batch of resin A starts. This code aims at:
• erasing the "memory" of the soft sensor;
• giving the output istante.
The variable istante is a counter of the number of times the soft sensor is interrogated. This function is called at the beginning of the batch by the operating personnel through the "Start" button of the graphical interface of Figure 6.3. The soft sensor is then provided with some flags on the process variables to verify whether the initial state of the manufacturing system fits the required conditions for the soft sensor to begin working. Afterwards, the function OnlineSensor1.m is called. This code requires two inputs: the array of the 34 process variables measured online, and the value of istante. First of all, this code allows for the recording of the process variables (i.e. it starts the "memory" of the soft sensor). Then, the alarms of the estimation reliability are computed. In general, this function manages the inputs and outputs. The outputs are:
• instantaneous estimations of the acidity number and viscosity;
• instantaneous values of the Hotelling statistics with the respective limits for both the quality indices;

• instantaneous values of the SPE residuals with the respective limits for both the quality indices;
• alarms of reliability for the estimations and the predictions, which provide an easy and fast representation of the Hotelling and residual statistics and of their limits.

Figure 6.3 Graphical user interface for the operating personnel to start the soft sensors when the resin A or B is manufactured.

The estimation process is performed by calling the function SIRCAproject1v1.m, through the following steps:
• automatic detection of the current estimation phase;
• data pre-treatment (i.e. computing the moving-average variables, performing variable selection, and unfolding the input data);
• online estimation of the quality indices through the relevant model embedded into the file Modello.mat.

No output is available for the first 999 sampling instants (as described in Chapter 4), but when the variable istante reaches the value of 1000, the quality estimator starts to deliver the outputs, which are specifically:
• the quality indices, i.e. acidity number and viscosity;
• the Hotelling statistics and the residual statistics of the abovementioned estimations, with the respective limits;
• the reliability of the estimations. The reliability index summarizes the Hotelling statistics and the SPE residuals. There are two reliability indices, one for NA and one for μ. A reliability index assumes the value 1 if the estimation is deemed to be reliable, and

assumes the value 0 if it is deemed to be unreliable, because the actual value of at least one relevant statistic (i.e. the Hotelling statistic or the SPE residuals) overshoots the respective limit;
• the input matrix of the model, updated in the function records;
• the number of the current phase, identified through the flags imposed on the process variables.
The interrogation of the model is stopped by the plant personnel through the "Stop" button of the interface of Figure 6.3.
6.2.2.2 Prototypes B1 and B2
The soft sensor for resin B performs in series the realtime prediction of the Stage 1 duration in the first 1700 sampling instants with Prototype B1, and the online estimation of the quality during Stage 2 with Prototype B2. The soft sensor is constituted by the files:
• Inizializzazione.m;
• InOut.m;
• realtimeSensor.m;
• modelTau.mat;
• softsensor.mat.
The files Inizializzazione.m, InOut.m, and realtimeSensor.m are common to both the quality estimator and the duration predictor. The file Inizializzazione.m is the initialization procedure, started by the operators when the manufacturing of resin B begins. This function resets the "memory" of the soft sensor and detects the first instant for the interrogation of the models through the variable istante. No input is required by this function. Similarly to the case of resin A, a function InOut.m is responsible for the communication of both the estimator and the predictor with the supervision system. This function requires two inputs: the 34 process variables measured online, and the variable istante. The outputs are: during Stage 1, the prediction of the end point of the stage (date and time) and the reliability of the prediction; during Stage 2, the estimated acidity number and viscosity, and the reliability alarms together with the Hotelling and residual statistics and the respective limits. The function InOut.m calls the soft sensors, i.e. the function realtimeSensor.m, which carries out:
• identification of and switching between the different production stages and the different estimation phases by means of some flags on the input process variables;
• pre-treatment of the incoming data (i.e.: variable selection; BWU of the data for the duration predictor and VWU for the quality estimator; calculation of the moving average);
• prediction of the stage duration during Stage 1, interrogating the model modelTau.mat;

• estimation of the quality during the two phases of Stage 2, interrogating the model softsensor.mat.
The models of Prototypes B1 and B2 are recorded in the .mat files. The model of Prototype B1 is stored in the file modelTau.mat, while the file softsensor.mat is a cell array with the model of Prototype B2. It should be highlighted that the cell-array structure of the model of Prototype B2 (one model for each estimation phase and for each quality index) is sufficiently parsimonious, because the "physical" dimension of the file is less than 200 kB, while the dimension of the model of Prototype B1 (one model for every interrogation instant) can reach nearly 2 GB when the instantaneous models are stored in different cell-array structures. Therefore, the implementation of Prototype B1 requires "shrinking" the model from both the structural and the algorithmic point of view. To reduce the computational burden, the rate of interrogation of the duration predictor was decreased by a factor of 10 (1 interrogation every 5 min instead of every 30 s). This implies that the number of models to be stored in the modelTau.mat file also decreases by a factor of 10 (from 1700 to 170). Further improvements can be achieved if the structure of the model is organized in a more profitable way. For example, the instantaneous models that constitute the evolving PLS model can be reorganized in such a way that they are stored in a single cell-array structure, where the model parameters of the different instantaneous models (e.g., the loading matrices, the confidence limits, etc.) are grouped in the same vector for all the models. Then, at a given sampling instant, the parameters of the appropriate instantaneous model are extracted from the all-encompassing vector of the evolving PLS model, so that the correct model is used at the right moment. In this way the dimension of the file modelTau.mat can be reduced to about 30 MB.
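The decimated storage scheme just described can be sketched as follows (Python dictionaries stand in for the actual MatlabTM cell arrays; the stored parameter names are illustrative):

```python
import numpy as np

RATE = 10        # 1 interrogation every 5 min instead of every 30 s

def build_store(n_instants=1700, n_vars=21, n_lv=1):
    """Keep only 1 out of every RATE instantaneous models of the
    evolving PLS predictor, grouped in a single structure."""
    rng = np.random.default_rng(0)
    store = {}
    for k in range(0, n_instants, RATE):
        store[k] = {"loadings": rng.standard_normal((n_vars, n_lv)),
                    "T2_lim": 9.2, "SPE_lim": 15.0}  # illustrative parameters
    return store

def model_at(store, instant):
    """Instantaneous model valid at `instant`: the last stored one
    at or before that sampling instant."""
    return store[(instant // RATE) * RATE]

store = build_store()
print(len(store))                        # 170 stored models instead of 1700
```

Grouping all instantaneous parameters in one structure and indexing by the sampling instant is what reduces the model file from the gigabyte to the tens-of-megabytes range.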

Chapter 7 Surface characterization through multiresolution and multivariate image analysis
In this Chapter1 the issue of characterizing the microscopic features of the surface of a high value-added product is addressed using novel techniques for multiresolution and multivariate image analysis. For the case of a photolithography process for the production of integrated circuits, it is shown that, after applying a multiresolution filter to denoise an image of the product, it is possible to monitor the features of the product surface through multivariate statistical techniques. New multivariate techniques are proposed for the systematic monitoring of the roughness and of the surface shape. In particular, a two-level "nested" PCA model is used for surface roughness monitoring, while a new strategy based on "spatial moving windows" PCA is proposed to analyze the shape of the patterned surface. The proposed approach generates fast, meaningful and reliable information, identifying abnormalities on the surface of a device and localizing the defects in a sensitive fashion.

7.1 Photolithography process and inspection tools
Integrated circuits (ICs) are recognized as some of the most complex manufactured products and some of the most versatile devices. The fabrication of an IC requires a complex infrastructure of materials supply, waste treatment, logistics, and automation to support the entire process. Specifically, semiconductor production technology develops through an extensive series of photographical, mechanical, and chemical steps in an extremely clean environment, achieved by ultra-precision engineered equipment (Helbert and Daou, 2001). Typical processing loops, which may recur several times, comprise some or all of the following phases (Figure 7.1a): oxidation; photoresist application; exposure to light; development of the resist; etching; and photoresist removal.

1 Portions of this Chapter have been published in Facco et al. (2008b), Facco et al. (2008c) and Facco et al. (2009b).

Figure 7.1 (a) Simplified processing sequence for semiconductor manufacturing and (b) simplified graphical representation of the most important quality parameters of an edge.

In detail, the abovementioned procedure can be outlined as follows. First of all, the wafer, a thin crystalline slice cut from a semiconductor ingot (e.g., silicon), is heated to drive off the moisture from the surface, and cleaned. Then, the wafer is kept in a high-temperature

environment until an oxide layer grows on the substrate. After the addition of adhesion promoters, a thin (and as uniform as possible) layer of photoresist is applied by spin-coating, a high-speed centrifugal whirling process. Essentially, the photoresist is a polymer mixed with light-sensitive compounds. Through light exposure, the desired pattern (often determined by a mask) can be impressed on the surface by illuminating selected portions of the resist. If the light weakens the polymer, as in the so-called "positive" photoresist, the exposed resist becomes a less chemically stable aggregate and is more easily removed during the following stages. Conversely, a "negative" photoresist is strengthened by light and becomes resistant to solvents. The chemical change triggered by the light during the photolithography step allows the resist to be removed by a solution called developer. The resulting shape of the device surface should be the one shown in Figure 7.1a, alternating zones in which the photoresist is present, the so-called edges, and zones (which will be indicated as valleys) in which the oxidized substrate is no longer protected and is completely free from the resist (Figure 7.1b). After the development, a hard baking is performed to give a stronger structure to the residual resist, and only then, during the etching, is the part of the surface that is not protected by the resist engraved. A chemical agent (a liquid or a plasma) removes the oxide layer to prepare the surface for the following phases. Finally, the remaining resist is removed and the surface is ready for the diffusion of dopants on the part of the surface where the oxide barrier is not present. The doping induces the formation of ions that create regions with different electrical properties. During each production loop, it is crucial to meet stringent requirements in terms of quality uniformity and consistency.
At each stage, several inspections and measurements are performed, but only a few samples from different lots and some pieces of the processing equipment are monitored. Since photolithography is performed several times on the same device and, even after it is completed, a defective device can still be reprocessed, it is common industrial practice to perform quality inspections after the photolithography step (the so-called after-development inspections). Usually, a CD-SEM is adopted to measure some significant features of the semiconductor surface. The SEM (scanning electron microscope) images are used mainly for metrology purposes. In other words, the common inspection tools measure the critical dimension (CD) or, with the most advanced instrumentation, the edge height, the side wall angle (SWA), or the line edge roughness (LER) (Figure 7.1b). Recently, the possibility of measuring and reducing the edge wall roughness using deep UV (ultra-violet) light scatterometry has been discussed (Yaakobovitz et al., 2007). However, in order to detect, distinguish and classify critical features of the manufactured device, more sensitive and reliable tools are required by the new generation of products (Guldi, 2004). For instance, there are a number of defects affecting the final product quality and performance (Figure 7.2) that

cannot be identified in terms of CD measurements, but which could be remedied if detected in time.

Figure 7.2 Ideal shape of an edge and possible microscopic defects on the edge shape and surface.

Thus, an automatic system for the frequent and accurate quality monitoring of a photolithographed surface would be highly attractive to increase the yield and the consistency of the fabrication program.

7.2 Image analysis through multiresolution and multivariate statistical techniques
Images are 2D maps summarizing the characteristics of a 3D scene. In this research, industrial SEM images representing the product surface after photolithography with a positive photoresist have been used as a case study. These images are grey-scale functions of light intensities and can be used to extract meaningful information about the quality of the product, its regularity and conformity to the requirements, and the types and location of defects. The scale of the images is such that 1 pixel corresponds to about 9.8 nm. In general, an image is a collection of well identified characteristics. As such, it is intrinsically a multivariate system, being a wide collection of pixels where each pixel is highly correlated to its neighbors. In addition, in a surface image the apparent variability comprises both the actual roughness of the surface (which is an actual product feature) and the signal disturbance (which corresponds to noise to be removed). Thus, a number of tasks need to be taken into account by a monitoring system in order to "use" an image for quality control. Figure 7.3 illustrates the general architecture of the proposed monitoring system.

Figure 7.3 Sketch of the semiconductor monitoring system through (wavelet) image filtering and multivariate statistical techniques.

First of all, a reference model is defined by selecting a suitable reference image. As the quality problem involves different scales of resolution, a multiresolution approach is needed to filter the image and denoise the signal. Subsequently, multivariate statistical techniques are used to exploit the information content of the filtered image, to formulate the monitoring model, and to build the monitoring charts for the product quality inspection. Three quality features are described and monitored through the proposed monitoring system: the line edge roughness, the surface roughness, and the shape of an edge trans-section profile. In the following subsections, the main properties and the mathematical foundations of the multiresolution-multivariate monitoring system will be discussed.

7.2.1 Image multiresolution denoising
An image is always affected by disturbances, e.g., the random fluctuation of the pixel light intensity. In general, multivariate statistical techniques can discard the non-systematic part of a signal, distinguishing the meaningful variability from the random one. Unfortunately, in this case the noise is somehow blurred with the roughness of the surface (which is a structural part of the device and defines the quality properties one is interested in). Therefore, a pre-treatment of the image is needed to remove the noise without discarding the structure of the surface roughness. The problem of the dual nature of the noise is faced following a multiresolution approach, which examines the different scales of the image through wavelet decomposition (details on wavelets can be found in Kosanovich and Piovoso, 1997, and Addison, 2002). Specifically, a scale-dependent smoothing of the image is performed by subtracting the unwanted part of the noise. In fact, the smoother version iM of the image io in the domain of the pixel space s ∈ ℜ2 is the approximation at scale M of the original image:

i_M(s) = Σ_{n=-∞}^{+∞} S_{M,n} φ_{M,n}(s) + Σ_{m=-∞}^{M} Σ_{n=-∞}^{+∞} T_{m,n}^{denoised} ψ_{m,n}(s) ,        (7.1)

where the first term on the right-hand side is the summation of the products between the approximation coefficients S_{M,n} at the Mth scale and the selected father wavelet φ (lower frequencies of the signal), and the second term is the summation of the products between the detail coefficients T_{m,n}^{denoised} and the mother wavelet ψ (higher frequencies of the signal). Some detail coefficients of the higher frequencies (scales m over a prescribed limit M1) are redundant and, as a consequence, can be removed:

T_{m,n}^{denoised} = { 0        for m ≥ M1
                     { T_{m,n}  for m < M1 ,        (7.2)

Only the lower resolutions are retained, because of their relevance to the purpose of roughness monitoring. So, after being convolved through the wavelet decomposition at different scales of resolution, the image is reconstructed by merging together all the significant scales. Different types of wavelets were tested for the denoising of the photolithographed surface, and the Daubechies wavelet with 8 scaling coefficients was eventually selected. The use of the Daubechies wavelet is suggested in several studies (e.g., Salari and Ling, 1995), especially for segmentation and texture analysis problems. Indeed, the Daubechies 8 wavelet seems to respond very well to the requirements discussed in Ruttiman et al. (1998), as it introduces very limited phase distortion, maintains a faithful localization in the spatial domain, and decorrelates the signal in a sensitive manner for both the smooth features and the discontinuities of the image. Once a reference wavelet has been identified, the issue is to find the best smoothing scale for roughness monitoring. We found that useful indications can be obtained by evaluating the correlation coefficients between the side-wall roughness lines at different light intensity levels. A procedure inspired by the work of Patsis et al. (2003) was followed to detect the side-wall roughness. The locus of minimum light intensity in the valleys (represented by the blue lines in Figure 7.4) is first recognized, and then one moves upwards along any of the two edge walls (upper wall and lower wall2), detecting different light intensity levels at preset thresholds. In this way, the "topological lines" at light intensity thresholds of 0.5, 0.6, 0.7 and 0.8 are identified (respectively the green, yellow, red, and violet lines in Figure 7.4). Stated differently, these lines represent the pixel locations along an edge side wall where the light

2 The definition of upper and lower walls simply refers to the order in which they appear in our images, i.e. for one edge the upper and lower wall are respectively the first and the second wall when moving in a top-down direction. In principle, they do not have any systematic (physical) difference; however, from a statistical point of view, we observed that they "belong" to different categories (perhaps because the light hits them from a slightly different angle) and therefore we decided to distinguish them.

intensity assumes a certain value (iso-intensity lines). Therefore, according to this procedure, a spatial location along the edge wall is identified through a certain light intensity level.
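The extraction of the iso-intensity topological lines can be sketched as follows. A synthetic side-wall image is used in place of the SEM data; the linear intensity ramp and the noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_rows, n_cols = 40, 200
# synthetic edge wall: light intensity rises from the valley floor with a
# column-dependent offset playing the role of the side-wall roughness
offset = np.cumsum(rng.standard_normal(n_cols))
offset = 5.0 + 3.0 * (offset - offset.min()) / np.ptp(offset)
rows = np.arange(n_rows, dtype=float)[:, None]
wall = np.clip((rows - offset[None, :]) / 20.0, 0.0, 1.0)
noisy = np.clip(wall + 0.15 * rng.standard_normal(wall.shape), 0.0, 1.0)

def topo_line(img, thr):
    """Row index at which each pixel column first reaches intensity `thr`
    (one iso-intensity 'topological line' along the edge wall)."""
    return np.argmax(img >= thr, axis=0)

# the four topological lines and their 4x4 correlation matrix (as in Table 7.1)
lines = np.vstack([topo_line(noisy, t) for t in (0.5, 0.6, 0.7, 0.8)])
R = np.corrcoef(lines.astype(float))
print(R.shape)
```

Computing this correlation matrix on the raw image and on its wavelet reconstructions is the basis of the scale-selection criterion discussed next.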

Figure 7.4 Magnified section of an edge image: detection of topological levels at different light intensities for the identification of the side-wall roughness on a reconstructed Daubechies-8 1st level approximation.

The correlation coefficients between the different topological lines are computed for the original image and for the reconstructions at the 1st and at the 2nd approximation levels.

Table 7.1 Correlation coefficients between different topological lines in the original image and in the 1st and 2nd level Daubechies 8 approximation reconstructions.

                 original image                1st level approximation       2nd level approximation
threshold   0.5     0.6     0.7     0.8   |   0.5     0.6     0.7     0.8   |   0.5     0.6     0.7     0.8
0.5         1       0.5974  0.3881  0.3316|   1       0.7506  0.7675  0.5167|   1       0.8086  0.7971  0.7784
0.6         0.5974  1       0.5030  0.4573|   0.7506  1       0.7164  0.6321|   0.8086  1       0.8337  0.7787
0.7         0.3881  0.5030  1       0.6179|   0.7676  0.7164  1       0.6761|   0.7971  0.8337  1       0.8889
0.8         0.3316  0.4573  0.6179  1     |   0.5167  0.6321  0.6761  1     |   0.7784  0.7787  0.8889  1

As can be seen in Table 7.1, the correlation coefficients between the topological lines of the original image are always low (even for neighboring lines). This means that the noise corrupting the image introduces an artificial decorrelation among pixels. Note, however, that the reconstruction at the 1st approximation level exhibits a substantially different situation: high


Chapter 7

correlation is shown between neighboring lines, and a significantly lower correlation between distant lines. This behavior is related to a certain level of roughness, which determines a random shaping of the edge wall (hence, of the topological lines). If the filtering level is increased further, as in the reconstruction at the 2nd approximation level, the random shaping of the lines is excessively reduced, i.e., the ability to capture the roughness is lost. In fact, the correlation coefficient between distant thresholds (0.5 and 0.8) is almost the same as that between neighboring thresholds. This means that an excessive smoothing of the signal has removed both the noise and the structural roughness from the original image. Thus, the first-level decomposition was chosen.
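As an illustration of how coefficients like those in Table 7.1 can be computed, the sketch below builds four synthetic topological lines that share a common roughness trend plus increasing noise, and evaluates their correlation matrix. The data are synthetic, not the thesis measurements:

```python
import numpy as np

rng = np.random.default_rng(0)
n_el = 200                                       # edge length in pixels
trend = np.cumsum(rng.normal(0.0, 0.5, n_el))    # shared "roughness" trend

# One topological line per threshold; the noise level grows with the threshold
lines = np.vstack([trend + rng.normal(0.0, s, n_el)
                   for s in (0.2, 0.4, 0.6, 0.8)])

# 4x4 correlation matrix, analogous to one block of Table 7.1
R = np.corrcoef(lines)
```

With heavier noise (as in the raw image), the off-diagonal entries of `R` drop, reproducing the artificial decorrelation discussed above.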

Figure 7.5 (a) Original image and (b) filtered image, with an example of trans-section profiles on the same pixel column for (c) the original image and (d) the filtered one.

After normalization (i.e., rescaling the black-and-white light intensities to the range between 0 and 1) and wavelet filtering, the resulting smoothing of the original image is shown in Figures 7.5a and 7.5b. The trans-section profiles of the original image along a given pixel column (Figure 7.5c) show a confused trend (the noise is almost indistinguishable from the underlying pattern). Once the image has been filtered and cleansed of the high-frequency


components, a clearer and more recognizable pattern emerges (Figure 7.5d), to which multivariate statistical techniques can be applied.
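The normalize-then-filter step can be illustrated as below. For simplicity the sketch uses a Haar wavelet (whose low-pass filter is just a pairwise average) instead of the Daubechies-8 wavelet actually adopted in the thesis; the logic — decompose, discard the detail (high-frequency) coefficients, reconstruct from the approximation — is the same:

```python
import numpy as np

def normalize(img):
    """Min-max rescaling of the grayscale intensities to the range [0, 1]."""
    img = img.astype(float)
    return (img - img.min()) / (img.max() - img.min())

def haar_approximation(img, levels=1):
    """Keep only the approximation (low-pass) coefficients of a separable
    Haar wavelet decomposition and reconstruct: the detail coefficients
    (high-frequency noise) are implicitly set to zero. Image dimensions
    must be divisible by 2**levels."""
    out = img.astype(float)
    for _ in range(levels):
        # low-pass along rows, then columns (average of adjacent pixels)
        out = 0.5 * (out[0::2, :] + out[1::2, :])
        out = 0.5 * (out[:, 0::2] + out[:, 1::2])
    for _ in range(levels):
        # inverse transform with zeroed details = block replication
        out = np.repeat(np.repeat(out, 2, axis=0), 2, axis=1)
    return out
```

A longer filter such as Daubechies-8 would replace the pairwise average with a 16-tap convolution, yielding a smoother approximation, but the retain-and-reconstruct scheme is unchanged.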

7.2.2 Multivariate statistical surface monitoring methods
Multivariate monitoring tools are needed to analyze and interpret the multivariate nature of images. Whenever a measurement is technically impossible, or an analysis requires simultaneous access to multiple characteristics, a multivariate statistical monitoring system may prove significantly more powerful and effective than common metrology tools. Different multivariate statistical schemes were adopted through modified PCA approaches; an extended treatment of PCA methods can be found in the books by Jackson (1991) and by Geladi and Grahn (1996). The basis of a multivariate monitoring technique is its capability of summarizing the plurality of quality clues embodied in an image into a limited number of statistical indices. Usually these are an indicator of the mean trend (the Hotelling T2 statistic, or the scores t) and an indicator of the adequacy of the model representation (the SPE statistic). As stated before, the objective is to develop an image-based strategy for the monitoring of LER, surface roughness and edge shape.
7.2.2.1 LER monitoring
Line edge roughness is one of the typical reference parameters in the after-development

inspection of a photolithographed device (Yaakobovitz et al., 2007). In fact, it plainly affects both the subsequent production phases and the performance of the final product. To monitor the LER of a single edge, information must be provided on how the light intensity is distributed in the upper and lower walls of the edge. As mentioned in the previous section, the topological lines at nlevels = 4 light intensity levels were identified. Two PCA models (Wold et al., 1987) were developed, one for the upper wall and one for the lower wall. To build each model, a reference [nel × nlevels] data matrix was used, where nel is the edge length (in pixels). Each column of this matrix represents the pixel locations, along an edge trans-section (identified by the relevant row number), where the light intensity assumes the value of 0.5, 0.6, 0.7 or 0.8. Stated otherwise, each column represents the topological line where the light intensity assumes a specified value.
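A minimal sketch of how such a PCA model can be calibrated and interrogated is given below. This is our own SVD-based implementation with simplified T2/SPE formulas; the data layout follows the text, but the numbers would come from the reference edge images:

```python
import numpy as np

def fit_pca(X, n_comp=2):
    """Calibrate a PCA model on the reference [n_el x n_levels] matrix:
    mean-center, take the SVD, keep the first n_comp loadings."""
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    P = Vt[:n_comp].T                          # loading vectors
    lam = s[:n_comp] ** 2 / (X.shape[0] - 1)   # score variances
    return mu, P, lam

def t2_spe(x, mu, P, lam):
    """Project one edge trans-section onto the model and return the
    Hotelling T2 statistic and the squared prediction error (SPE)."""
    t = (x - mu) @ P                 # scores
    e = (x - mu) - t @ P.T           # residual in the original space
    return float(t**2 @ (1.0 / lam)), float(e @ e)
```

A trans-section whose T2 exceeds its confidence limit deviates in the mean of the topological lines; a high SPE flags a break in their correlation structure, exactly the two alarm types discussed next.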


Figure 7.6 Inspection of the upper and lower walls of an edge by the LER monitoring model. Two latent variables were used in the PCA model.

For each trans-section of an edge, the PCA model combines the nlevels variables into a measure of the mean and variance of the LER, and produces one point in the scores space representing the edge quality on that trans-section. When a new edge is inspected, a trans-section is considered (thick black line in the upper part of Figure 7.6), and the relevant topological-level data are projected onto the reference model. Differences in the mean of the topological lines at a given light intensity (e.g., due to the presence of side-wall bumps) or in their variance (e.g., due to the presence of large feet or spikes) are highlighted by the T2 (or scores) and the SPE monitoring charts, respectively. By proceeding this way for each trans-section, the whole edge length can be scanned and inspected. This strategy allows the imperfections on the edge side walls to be detected and localized, and these can then be related to malfunctions or drifts in the production machinery. The confidence levels for the T2 and SPE thresholds can be selected in such a way as to guarantee good sensitivity to faults, while at the same time keeping the number of false alarms small.
7.2.2.2 Surface roughness monitoring
Semiconductor surface roughness is also a very important parameter, as it can deliver

substantial information concerning the accuracy of the erosion during photolithography, the presence of resist residuals on the substrate, etc. The challenge in monitoring it is that the patterned surface of an after-photolithography photoresist shows uneven characteristics at different positions. In fact, the zones at lower light intensity (valleys) have remarkably larger roughness than the zones at higher light intensity (edges), due to irregular erosion by the light. Therefore, not only the edge surface but also the valley surface


must be analyzed. This also implies that one must be able to automatically distinguish between edges and valleys along the whole surface of the device. A single model of the entire semiconductor surface cannot simultaneously capture the uneven variance structure at different locations of the surface. The surface could be segmented into classes through k-means clustering on the light intensity; an unsupervised PCA discriminant analysis is instead proposed here to distinguish the edges from the valleys. As a result, a specific monitoring model is built for each of the two surface configurations (edge or valley). A “nested” PCA monitoring system was designed for this purpose, based on a two-stage procedure: the outer level carries out the unsupervised discriminant analysis; the inner level is the actual monitoring model, in which two PCA submodels are enclosed, one for the edges and one for the valleys. Thus, in the sequential surface scanning procedure, the outer level provides an automatic switch from one submodel to the other, and the correct submodel is interrogated in the inner level for the monitoring step.

Figure 7.7 Edge and valley surface roughness monitoring using the nested PCA model. Two latent variables were used.
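The outer/inner switching logic of the nested scheme can be sketched as follows. This is a simplified illustration with hypothetical names: the outer level classifies a row through a first-score threshold (as described in the text), and each inner submodel is represented here by any callable that returns its monitoring statistics:

```python
import numpy as np

def classify_row(row, mu, p1, t1_threshold):
    """Outer level: project an image row on the first loading p1 of the
    discriminant PCA model; the first score t1 decides edge vs. valley.
    (The sign convention of p1 is arbitrary; here we assume that edges,
    being the brighter rows, score above the threshold.)"""
    t1 = (row - mu) @ p1
    return 'edge' if t1 > t1_threshold else 'valley'

def monitor_row(row, submodels, mu, p1, t1_threshold):
    """Inner level: interrogate only the submodel selected by the outer
    level; the other submodel stays in stand-by."""
    label = classify_row(row, mu, p1, t1_threshold)
    return label, submodels[label](row)
```

In the real system each entry of `submodels` would be a calibrated PCA submodel returning the T2 and SPE statistics for the row.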

Each submodel works on either an [niwE × nel] or an [niwV × nel] matrix, where niwE and niwV are the reference edge and valley image widths (in pixels), respectively. The reference images for edges and valleys are built by considering a statistically sound number of on-quality edges and valleys (in fact, it is sufficient that statistically sound numbers niwE and niwV of on-quality edge and valley image rows are collected and assembled into two reference images). The jth column of each matrix contains the values of the light intensities on the image trans-section located at pixel 0 ≤ j ≤ nel along the image (i.e., edge) length. The PCA submodels reduce the


dimension of the problem (nel variables) to a point in a 2-D scores space that characterizes most of the information content and the variability of the original space. As illustrated in Figure 7.7, a new picture is scanned row by row; any new row is identified as belonging either to an edge or to a valley, and is then transformed by using the appropriate PCA submodel. The value of the first score t1 was found to be a useful indicator for classifying a row; a threshold value for t1 was determined by analysis of the reference images. Each submodel allows the surface roughness to be monitored by analyzing the T2 and SPE indices. Excessively high values of the Hotelling T2 statistic signal an abnormal mean light intensity along the whole length of an edge or valley (e.g., due to uneven distribution of the photoresist coating). Excessively high values of the SPE statistic indicate an abnormal variability of the light intensity (e.g., due to the presence of holes or of photoresist residuals). Note that any abnormality can be precisely located by analyzing the contribution provided by each of the nel pixels to the altered value of the relevant statistic. Therefore, the information that becomes available through this surface monitoring system complements and extends the information provided by the LER monitoring system.
7.2.2.3 Edge shape monitoring
A more comprehensive and meaningful monitoring approach would attempt to examine the

overall edge shape and to compare it to a required standard. A methodology to achieve this goal is suggested here. The proposed approach inspects a single edge shape by scanning the edge image along the pixel columns. The objective is to compare different trans-section profiles of an edge (as characterized in terms of light intensity) with a reference profile, while proceeding along the edge length direction. There are two main difficulties in monitoring the edge shape. First of all, the data are clearly nonlinear and non-normally distributed along an image column trans-section, whereas the calibration of a PCA model requires linear and normally distributed historical reference information. Secondly, it is difficult to retain the spatial information when using a PCA model since, ideally, every pixel should be considered both for its position and for its spatial relation with its neighbors (Bharati et al., 2004). The strategy proposed to overcome these issues is to consider each pixel together with its nearest neighbors located in the same trans-section profile within a predefined spatial moving window (Figure 7.8), within which the correlation between the neighbors (i.e., the local edge shape) is maintained and the nonlinearity of the profile is negligible. Thus, a moving window of appropriate size (Δnmw) is defined in the space of the pixels along the edge profile trans-section (whose overall width is ntsw pixels). The moving window captures the correlation between pixels located in a limited region, within which it is reasonable to suppose a strong correlation between neighbors. For this reason, Δnmw = 5 pixels was chosen; at larger distances the correlation between upper and lower pixels within the same window vanishes.


Figure 7.8 Schematic of the main concepts underlying the monitoring method for the shape analysis of one edge.

In order to smooth out nano-variations that may perturb the comparison between the local shapes of different edge profiles, a segment is taken along the edge length direction (a segment size of Δnsegm = 5 pixels was chosen for this study, which corresponds to ~49 nm; at larger distances the correlation between right and left pixels was not significantly high). This segment is assumed to be the smallest width over which a profile shape is analyzed; this also allows for a significant reduction in the computational burden. Therefore, each segment represents the image of Δnsegm consecutive edge profiles, and the shape of an edge is analyzed by monitoring the segments that edge is made of. Note that a segment can be represented by a [ntsw × Δnsegm] matrix, whose entries are the light intensities of each pixel belonging to the segment. To define the optimal reference for a segment, a number Nimage of segments that conform to the required quality standards are collected from different edges and different images. The whole set of reference data is arranged in a 3-D matrix (Figure 7.9), whose dimension is [Nimage × Δnsegm × ntsw]; according to this arrangement, the Nimage images are stored as horizontal slices piled one on top of the other. The spatial moving window multi-way PCA model is then defined as follows. The spatial window moves pixel by pixel along the third dimension of the matrix. For any position of the window, a subset of the 3-D matrix is defined (Figure 7.9a). This subset is unfolded “image-wise”, by cutting the submatrix into Δnmw vertical slices along the trans-section width dimension and putting the slices side by side, according to the multi-way procedure developed by Nomikos and MacGregor (1994; Figure 7.9b). This results in a [Nimage × (Δnmw ⋅ Δnsegm)] 2-D matrix that can be processed through PCA. A column of this matrix represents how the


light intensity at a given position in the segment and at a given position along the edge profile varies between different images. On the whole, (ntsw − Δnmw + 1) 2-D reference matrices are obtained, this number being equal to the number of positions that the spatial window can take along the edge trans-section width. For each of these matrices, threshold values on the T2 and SPE statistics can be determined that allow the shape of the edge to be monitored within the corresponding spatial window.
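The window-by-window unfolding described above can be sketched as follows. This is our own reshape-based implementation; the exact slice ordering inside the unfolded matrix may differ from the thesis's convention, but the dimensions are those given in the text:

```python
import numpy as np

def window_matrices(data, dn_mw):
    """From the [N_image x dn_segm x n_tsw] reference stack, build one
    [N_image x (dn_mw * dn_segm)] 2-D matrix per position of the spatial
    window along the trans-section width (n_tsw - dn_mw + 1 positions in
    all), each ready to be processed through PCA."""
    n_img, dn_segm, n_tsw = data.shape
    return [data[:, :, k:k + dn_mw].reshape(n_img, dn_segm * dn_mw)
            for k in range(n_tsw - dn_mw + 1)]
```

Each returned matrix would then be mean-centered and decomposed by PCA to obtain the window-specific T2 and SPE thresholds.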

Figure 7.9 (a) Reference data arrangement in a 3-D matrix with a spatial moving window. (b) Image-wise unfolding of the 3-D matrix on the moving window.

For any segment of a test edge whose shape needs to be monitored, the T2 statistic (or the scores) in a given edge window summarizes the mean edge shape in that window. Therefore, large T2 values indicate that the local edge shape is altered with respect to the average. Large SPE values indicate changes in the correlation structure between the profiles of a segment; if the T2 statistic is within its limit, this signals a local variation of the edge roughness.


7.3 Case study: monitoring results
The compliance of the surface product quality with the quality requirements after a photolithography process is evaluated in terms of LER, surface roughness and edge shape by applying the techniques described in the previous section. Hence, a multiple monitoring system is developed to perform the monitoring of all the abovementioned features through multiscale and multivariate image analysis. The monitoring results are presented in the following.

7.3.1 LER monitoring system
The LER monitoring strategy goes through the following steps: i) an edge is selected in the de-noised test image; ii) the four light intensity levels are identified on the edge; and iii) the edge is monitored through the scores plot (and/or the T2 plot) and the SPE plot. The scores plot surveys the conformity of the mean side-wall roughness of the edge. Any point in this plot is designated by a number, which represents the column position (along the edge length) to which that point refers. The location of the point in the scores plot represents how the topological lines are distributed along the edge side wall at that column position. Non-conformities are shown as points located outside the confidence ellipse. The SPE plot points to irregularities in terms of excessive variance of the edge side-wall roughness, and to changes in the correlation between the topological lines (i.e., in the “parallelism” of the topological lines). The column location along the edge length is represented on the x-axis of this plot. Non-conformities are shown as points located above the confidence threshold. In Figure 7.10, the scores plot (Figure 7.10a) and the SPE plot (Figure 7.10b) for the upper side wall of an edge are shown. The SPE plot shows four outliers within the first 30 pixel columns. An off-line visual inspection of the upper side wall (Figure 7.10c) indeed confirms that a bump is localized around column 30 (note that the scores plot, too, provides a “mild” alarm for column 30). Furthermore, Figure 7.10c shows that several side-wall irregularities are present in the first 30 columns, i.e., a non-parallelism of the topological lines exists in this section of the edge length. This confirms the indications provided by the SPE plot. Note, however, that inspecting the side-wall image is much more time consuming and does not provide a precise and unambiguous indication of the pixel column where a non-conformity is present.
Conversely, the SPE chart is very quick to analyze, the response is very localized and unambiguously points to a column where a defect is deemed to be present. The scores plot is somewhat less responsive than the SPE plot, but nevertheless the information provided by the two monitoring diagrams can complement each other.



Figure 7.10 Line edge roughness monitoring: analysis of the upper side wall for one selected edge. (a) Scores plot (the numbers within the squares represent the column position along the edge length); (b) SPE plot; and (c) magnified section of the edge upper wall image.

Strong outliers are also identified by the SPE plot at pixel columns 99 and 106. These indicate a very large variability of the topological lines, or a non-parallelism between the lines. The irregularity at column 99 is also clearly detected by the scores plot. Visual inspection of the upper side wall image (Figure 7.10c) confirms that two large feet are present around pixel column 100.

7.3.2 Surface roughness monitoring system The surface roughness monitoring strategy goes through the following steps: i) a row is selected on the de-noised test image following a sequential scanning from the top to the


bottom of the image; ii) the row is categorized as belonging either to an edge or to a valley; iii) the edge/valley is inspected through the relevant scores plot (and/or the T2 plot) and the SPE residuals plot, which highlight potential anomalies on the surface; and iv) an analysis of the contributions of each pixel to the T2 statistic and to the residuals e, which build up the SPE statistic (Wise and Gallagher, 1996; Nomikos, 1996), allows the imperfections to be precisely localized. Figure 7.11 shows the e contributions to the residuals of all the pixels along the length of a row within the same edge considered in Figure 7.10. If the e contribution of a pixel exceeds one of the confidence limits (Conlin et al., 2000), this is an indication that a surface imperfection is localized on that pixel. Following this rationale, several imperfections are detected in the contributions plot of Figure 7.11 around pixel columns 10 to 40, around column 100, and around column 135. Note that irregularities in the same locations were highlighted by the LER monitoring strategy, too.


Figure 7.11 Surface roughness monitoring: typical trend of the contributions to the residuals for the localization of defects on an image row. (The dashed lines represent the confidence limits).
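The contribution analysis can be sketched as follows. This is a minimal numpy illustration; the per-pixel limits below are a normal-approximation simplification of the limits cited in the text:

```python
import numpy as np

def spe_contributions(x, mu, P):
    """Per-pixel contributions e_j to the SPE of one image row: the residual
    left after projecting the mean-centered row onto the PCA loadings P.
    SPE is the sum of the squared contributions, so a contribution that
    exceeds its confidence limit localizes the defect on that pixel."""
    xc = x - mu
    return xc - (xc @ P) @ P.T

def contribution_limits(E_ref, z=1.96):
    """Approximate 95% confidence limits, one per pixel, computed from the
    reference residuals E_ref (rows = reference samples) under a rough
    normality assumption."""
    return z * E_ref.std(axis=0)
```

Plotting `spe_contributions` for a test row against `contribution_limits` reproduces the kind of chart shown in Figure 7.11.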

The agreement between the results of LER monitoring and of surface roughness monitoring supports the reliability of both approaches, which can be used simultaneously in a robust way. In fact, the topological lines at different light intensity thresholds (obtained by cutting the image with a plane parallel to the image itself) are highly correlated with the light intensities along the trans-sections of the edge surface (obtained by cutting the image with a plane orthogonal to it). As an example, Table 7.2 shows the correlation coefficients between the upper side-wall topological levels of an edge and the trans-section profiles along the edge length from row 5 up to row 11. The high correlation coefficients emphasize the similarity


between the trajectories of the LER lines and the profile of the light intensity on a row of pixels along the edge length. This confirms that the LER can be observed accurately through the surface roughness in terms of light intensity.

Table 7.2 Correlation coefficients between the topological lines at different light intensity thresholds and the light intensity along the trans-section of an edge.

threshold   row 5   row 6   row 7   row 8   row 9   row 10   row 11
0.5         0.432   0.731   0.700   0.663   0.525   0.466    0.216
0.6         0.245   0.769   0.866   0.887   0.752   0.631    0.205
0.7         0.138   0.579   0.728   0.863   0.841   0.744    0.285
0.8         0.098   0.372   0.516   0.735   0.829   0.795    0.381

Note that the surface roughness analysis can also be used to spot other defects (which usually cannot be identified in any practical way). For instance, Figure 7.12a shows the contributions plot for a different pixel row of the image (the row was categorized as a valley). It can be seen that around pixel columns 25, 50, 100, 160 and 195 the contributions to the residuals exceed the confidence limits. This means that some defects are present in terms of excessive variance of the light intensity (which could be associated with the presence of holes or of photoresist residuals).


Figure 7.12 Surface roughness monitoring: identification of defects in a valley from the analysis of (a) the contributions to the residuals and (b) the profile of light intensity along the valley length.

However, only a few of these defects would unambiguously be highlighted if an off-line inspection of the light intensity profile were carried out on the same row (Figure 7.12b).


7.3.3 Edge shape monitoring system
The edge shape monitoring strategy goes through the following steps: i) a segment is selected in the de-noised image of an edge; ii) the trans-section profiles in this segment are aligned to the reference one; iii) the segment is sequentially scanned by the spatial moving window along the whole trans-section width; and iv) the shape conformity to the desired standard is surveyed using the ti plots and the SPE plot, along all the positions that the moving window can take on the trans-section width. The confidence limits on the ti charts and on the SPE chart are not the same along the entire width, but are defined for each spatial window. In particular, the confidence limits of the SPE chart take into account the different variability between valleys and edges: higher limits are set for the valleys because of their more marked shape variability, and lower limits are set for the edges, whose shape is expected to be only slightly variable. Figure 7.13a shows (thin solid lines) the five light intensity (trans-section) profiles of an edge segment that conforms to the required quality standards. These profiles are compared to the average light intensity profile of the reference segments (thick broken line). All test profiles are aligned along the left-hand rising branch of the reference profile. According to a rough visual inspection of these profiles, the inspected segment seems to conform to the quality standards. However, it is not completely clear whether the shape differences at the center and at the borders of the trans-section profiles are to be regarded as “regular” or not. A more rigorous and automatic monitoring of the edge shape can be carried out by analyzing the t1 plot (Figure 7.13b) and the SPE plot (Figure 7.13c). It can be seen that no violations of the confidence limits are detected along the trans-section width (which is scanned by the moving window).
Hence, Figures 7.13b and 7.13c unambiguously designate the tested edge as an on-quality one. A similar comparison is presented in Figure 7.13d for a non-conforming segment. Again, although the analysis of Figure 7.13d alone is not sufficient to assess the quality of the trans-section profile, the analysis of the t1 plot (Figure 7.13e) and of the SPE plot (Figure 7.13f) unambiguously points to the locations (along the trans-section width) where the non-conformity is present. Note that this test segment is categorized as an “on-quality” one in the valleys, although Figure 7.13d shows that there is a significant difference between the borders of the test profiles and those of the reference ones. However, the spatial moving window approach effectively allows the difference in the roughness structure between edges (smoother structure) and valleys (coarser structure) to be identified.


Figure 7.13 Edge shape monitoring for an on-quality segment (left column plots) and for an off-quality segment (right column plots). (a) Light intensity profiles on one segment of a test edge conforming to the required quality standards; (b) first score as a function of the moving window location for the same conforming segment; (c) SPE statistic as a function of the moving window location for the same conforming segment; (d) light intensity profiles on one segment of a test edge not conforming to the required standards; (e) first score as a function of the moving window location for the same non-conforming segment; (f) SPE statistic as a function of the moving window location for the same non-conforming segment. The dashed lines represent the confidence limits.


7.4 The EDGE3 monitoring interface
The above procedures for the quality monitoring of a photolithography process have been implemented in Matlab® so as to build an interface for assessing the manufacturing quality and detecting potential anomalies in a user-friendly way. The EDGE3 interface is designed in such a way that the user can switch among the three monitoring strategies (LER, surface roughness, edge shape). Once the property of interest is chosen, the interface goes through a number of automatic steps without any human intervention.

Figure 7.14 Graphical interface for the surface roughness monitoring code. The sequential scanning procedure surveys the test image (upper right image) through a nested PCA model: the discriminant analysis (scores diagram in the upper left part) between edges and valleys, and the monitoring models with T2 and SPE alarms and the contribution of each pixel to the alarms, for diagnosis and localization.

For instance, in the case of LER monitoring one edge is randomly selected from a test image and, after aligning the profile to the reference image, the four light intensity lines are automatically identified and the monitoring procedure is carried out through the scores plots and residual plots of both edge walls. In the case of surface roughness monitoring, a similar approach is taken. An example of the interface window for surface monitoring is shown in Figure 7.14.


The scanning procedure surveys the test surface sequentially (row by row, as highlighted by the red line on the image in the upper right side of the interface), projecting the pixels of the row under investigation onto a 2-D scores plot (the upper left plot of Figure 7.14). This scores diagram identifies the normal surface condition within the dashed ellipse, and performs an unsupervised discriminant analysis between edges (projected inside the blue dotted ellipse) and valleys (projected inside the green dotted ellipse). This is the outer level of the “nested” PCA method described in the previous sections. The lower part of the interface represents the monitoring stage of the nested PCA procedure. Two monitoring models are present: the model for the valleys (the four diagrams on the right) and the model for the edges (the four diagrams on the left). When evaluating the roughness of the edges, only the edge model is interrogated and the valley model is in stand-by, and vice versa. The switching between the models is managed by the outer level of the nested PCA scheme, while the inner level performs the monitoring of the surface through an alarm signal on the T2 and SPE statistics (the penultimate row of diagrams). If the Hotelling statistic and the SPE residuals of the test samples are within the limits identified by the reference set, the signal remains at the value 0; otherwise, the alarm signal takes the value 1; the alarm signal takes the value -1 when the model is in stand-by. The variance of the contribution to the statistical indices for every pixel of the reference data is computed, and the relevant 95% confidence limits are calculated for the reference morphology (last row of diagrams in the interface). Finally, in the case of shape monitoring, the soft sensor randomly selects a limited portion of the test image.
The test sample is then compared to the reference, and the shape of the edge cross-section profile is evaluated through the score (ti) and SPE residual monitoring charts. Alarm indicators appear if the shape of the test edge is recognized as off-quality.
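The inner-level monitoring logic described above can be illustrated with a minimal sketch: a PCA model is fitted on reference data, the T2 and SPE statistics of each test sample are compared with limits derived from the reference set, and a three-valued signal is returned (-1 for stand-by, 0 for in control, 1 for alarm). This is not the thesis code: empirical 95% percentile limits stand in for the actual confidence-limit formulas, and the function names are hypothetical.

```python
import numpy as np

def fit_pca_monitor(X_ref, n_pc):
    """Fit a PCA monitoring model on mean-centred reference data."""
    mu = X_ref.mean(axis=0)
    Xc = X_ref - mu
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pc].T                                 # loading matrix
    lam = (s[:n_pc] ** 2) / (len(X_ref) - 1)        # score variances
    # reference statistics used to set empirical 95% limits
    T = Xc @ P
    t2_ref = np.sum(T ** 2 / lam, axis=1)
    spe_ref = np.sum((Xc - T @ P.T) ** 2, axis=1)
    return dict(mu=mu, P=P, lam=lam,
                t2_lim=np.percentile(t2_ref, 95),
                spe_lim=np.percentile(spe_ref, 95))

def alarm_signal(model, x, active=True):
    """Return -1 if the model is in stand-by, 0 if in control, 1 if alarmed."""
    if not active:
        return -1
    xc = x - model["mu"]
    t = xc @ model["P"]
    t2 = np.sum(t ** 2 / model["lam"])              # Hotelling T2
    spe = np.sum((xc - model["P"] @ t) ** 2)        # squared prediction error
    return int(t2 > model["t2_lim"] or spe > model["spe_lim"])
```

In the nested scheme, the outer-level clustering would decide which of the two fitted models (edges or valleys) receives `active=True` for the current pixel.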

7.5 Concluding remarks One of the most important steps in integrated circuit fabrication is photolithography, because of its economic and operational impact on manufacturing scheduling. Most of the monitoring effort in photolithography is focused on the measurement of the most important physical parameters of a photolithographed device, such as CD, LER, or SWA, and the common inspection tools are optical instruments. An image, however, retains information that largely exceeds mere metrology, and which is useful to identify the complex nature of the manufactured product quality. Through image analysis, and in particular through advanced image processing techniques, it is possible to access, without human intervention, several meaningful clues that help to better understand the manufacturing process, to identify critical situations (for product quality and process progress), and to counteract the problems.

Surface characterization through multiresolution and multivariate image analysis


In particular, in this Chapter a monitoring system for the after-development quality evaluation was proposed. The approach, based on a combination of multiresolution techniques and multivariate statistics, can deal with the hidden characteristics of a device, performing all the tasks of the instrumental sensors while additionally capturing features of the product surface that are commonly inaccessible. Since quality is multivariate in nature, multivariate statistical techniques were exploited to extract the information embedded in the image. Although the signals were corrupted by noise and alterations, which affect different scales of resolution, only the relevant scales were considered through a multiscale treatment by means of the wavelet decomposition. The result is a multiscale and multivariate monitoring framework that inspects the quality of the photolithographed device through the analysis of a SEM image. The proposed monitoring system was shown to be an effective tool for the assessment of product quality. Beyond the measurement of the critical physical parameters, the image analysis system performs a full scan of the surface, identifying and localizing defects and anomalies, and detecting process drifts in advance.
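The multiscale treatment can be sketched as follows: a 2-D wavelet decomposition splits the image into an approximation and detail sub-images, and only the coarse approximation is retained, discarding the fine scales where noise concentrates. The Haar filter and the two-level depth below are illustrative assumptions; the thesis's actual wavelet family and scale selection may differ.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar wavelet transform.
    Returns the approximation (LL) and detail (LH, HL, HH) sub-images;
    the image must have even dimensions."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # rows: low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # rows: high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def multiscale_filter(img, levels=2):
    """Keep only the coarse approximation by repeatedly extracting LL,
    suppressing fine-scale noise while retaining the surface morphology."""
    for _ in range(levels):
        img, *_ = haar_dwt2(img)
    return img
```

In a full scheme the retained scales would then be fed to the multivariate (PCA-based) monitoring stage instead of the raw pixels.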

Conclusions and perspectives Although batch processes are relatively simple to set up and to carry out, with a limited degree of automation and with only partial knowledge of the underlying mechanisms of the process, it is always difficult to ensure a consistently high and reproducible quality of the product. The instrumentation commonly used in industrial practice rarely provides real-time measurements of the product quality. Further complications arise from the multivariate nature of quality, which depends on a series of physical, operational, and even subjective parameters. Although not easily accessible, information on quality is embedded in the values of the process variables, usually collected by process computers and stored in historical databases. Multivariate statistical methods allow the dimensionality of the problem to be reduced to a latent subspace that explains the relevant part of the variability of product quality, while dealing with noisy, redundant and highly correlated variables, as well as with outliers and missing values. The aim of this Thesis was the development of multivariate statistical techniques for quality monitoring in the batch manufacturing of high value added products. Two classes of products were considered: products whose quality is determined by chemical/physical characteristics, and products whose surface properties define quality. The main scientific contributions of the PhD project have been: • the development of a strategy to design software sensors for the real-time estimation of product quality in batch processes; • the non-conventional application of latent variable methods for the prediction of the length of batch processes; • the development of an innovative methodology for the multiresolution and multivariate monitoring of quality from images of a manufactured product. Soft sensors for the online estimation of product quality in batch polymerization processes were developed and implemented online in an industrial case study.
These soft sensors are based on partial least squares, which regresses product quality on the process measurements that are available online. The accuracy of the estimation proved to be similar to that of the laboratory measurements, but the estimator can be interrogated at high frequency (on the order of s-1), namely hundreds of times faster than the lab assays (on the order of h-1). Furthermore, the estimates are available in real time, without the delay that is typical of laboratory measurements. To compensate for data nonlinearities and for changes in the correlation structure between variables, the adopted procedure splits a batch into a limited number of estimation phases. Within each of these phases, linear PLS submodels were shown to provide accurate quality estimates, and the switching from one submodel to the following one is triggered by process events that can be easily detected from the process
variables. The key characteristic of the proposed soft sensor is the inclusion of dynamic information, either through lagged measurements (i.e., the addition of past values of some relevant process variables to the reference dataset) or by incorporating “time memory” through a moving-average approach. Including time information proved to be a highly favorable way to improve the estimation accuracy. Furthermore, averaging the process measurement values on a moving window of fixed length attenuates process noise, dampens spikes, and compensates for the effect of temporarily missing values without introducing significant computational difficulties. Caution was suggested in the selection of the moving window width, because too wide a window may delay the appearance of alarms on the reliability of the estimation. From the operational point of view, this system can help the operating personnel to promptly detect drifts in product quality, and can suggest timely adjustments of the processing recipe, minimizing off-specification production. Moreover, the number of quality samplings can be reduced drastically, with gains in terms of overall processing time, lab-related costs, and manpower organization. A new soft sensor was also developed to assist the online monitoring of product quality in batch processes and to deliver helpful information for effective production planning: a soft sensor for the real-time prediction of the batch length. This monitoring strategy uses a time-evolving PLS modeling approach, which exploits the incremental information collected during a batch to forecast the length of the batch or the length of any of the production stages. Very satisfactory prediction accuracy was obtained: the prediction error is much lower than both the variability of the batch length and the length of one operator shift.
The initial part of the batch was confirmed to be of crucial importance for the batch length, because the initial conditions of the equipment, the state of the raw materials and the heat-up of the reactor usually have a strong influence on the batch performance. The information on the batch duration allows a better scheduling of the interventions on the plant, the optimization of the manpower both in terms of shifts and roles, and the organization of a convenient utilization of the plant equipment. The effectiveness of the soft sensors for the estimation of quality and for the prediction of the batch duration was tested by implementing the above-mentioned techniques online in an industrial batch production of resins by polymerization. Finally, multivariate statistical techniques were exploited also in the field of image analysis. In industrial practice, inspections of the product through image analysis are often mere measurements of the most important physical parameters, conveniently enhanced through filtering techniques. Moreover, these measurements are derived from non-systematic inspections. However, a great deal of useful information is stored in images, which allows for a systematic identification of the complex nature of the manufactured product quality. A fully automatic system for the real-time monitoring of high value added products from images was developed. By exploiting multiresolution and multivariate image analysis, this monitoring
system was tested on the case study of the after-photolithography characterization of a semiconductor surface in the manufacturing of integrated circuits. Advanced multivariate image analysis techniques extracted the traces that the process always leaves on the product, helping both the identification of critical situations in the process and the neutralization of the problems. The proposed approach is based on a preliminary multiresolution filtering of the image through wavelet decomposition. Then, a monitoring scheme was developed to perform the parallel analysis of the surface roughness and the surface shape of the product. For example, it was shown that the surface roughness can be explored through a “nested” principal component analysis, a two-level methodology in which the outer-level PCA performs a cluster analysis to discriminate different parts of the inspected surface, while the inner-level PCA monitors the surface roughness. The shape of the surface pattern was analyzed through a “spatial moving window” PCA approach, which retains information on spatial characteristics, taking into account both nonlinearities and the different structural features of the inspected surface. To sum up, this system can help detect and identify several hidden characteristics of the product, which are commonly inaccessible, in a fully automatic fashion and without human intervention. Furthermore, this tool proved to be fast, sensitive, reliable, and unambiguous, performing a full scan of the product surface for the precise localization of defects and anomalies, or the detection of process drifts. In conclusion, although the proposed methodologies were tested on particular case studies, they demonstrated great potential and generality.
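A minimal sketch of the spatial moving-window idea: a PCA model is calibrated on overlapping windows extracted from an in-control reference profile, and the SPE of each window of a test profile localizes where the shape deviates. The 1-D profile, the window width and the sinusoidal demo data are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def spatial_mw_pca(ref_profile, test_profile, w=16, n_pc=2):
    """Fit a PCA model on overlapping windows of a good reference profile,
    then compute the SPE of every window of the test profile: a high SPE
    flags a local shape deviation at that spatial position."""
    Xr = np.lib.stride_tricks.sliding_window_view(ref_profile, w).astype(float)
    mu = Xr.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xr - mu, full_matrices=False)
    P = Vt[:n_pc].T                               # local shape loadings
    Xt = np.lib.stride_tricks.sliding_window_view(test_profile, w) - mu
    E = Xt - (Xt @ P) @ P.T                       # residuals per window
    return np.sum(E ** 2, axis=1)                 # SPE vs. spatial position

# demo: a spike on an otherwise in-control sinusoidal profile
x = np.linspace(0, 8 * np.pi, 400)
test = np.sin(x)
test[200] += 5.0                                  # local shape defect
spe = spatial_mw_pca(np.sin(x), test, w=16, n_pc=2)
```

The position of the SPE peak points to the defective region, which is how the full scan can localize anomalies rather than merely detect them.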
For this reason, it is possible to extend their application to different fields of research and different industrial applications (e.g., food engineering, the pharmaceutical industry, biotechnology) or to different scales of investigation, from the macro-scale to the nano-scale. In particular, future work should address some issues that are still open for investigation. First of all, the monitoring performance of dynamic/spatial multivariate statistical methods could be improved if the inclusion of dynamic/spatial information were tailored to the current state of the system under study. This means that the lagged-variable strategy could be greatly improved if the selection of the variables and the choice of the time lags for every lagged variable fitted the current state of the process. The moving-window strategies could also be improved if the size of the window could be modified during the batch to better describe the variability of the system, enlarging the window in the operating stages where low variability is experienced, and shrinking it where the system shows larger variability. The examination of the variance-covariance structure of the data could be highly beneficial to this purpose. Further research is also needed to address the adaptation of the models to the changing nature of production processes, because multivariate statistical models are assumed to be time-invariant. Although recursive strategies for batch processes are available in the literature, they usually assume that the best reference data to build an adaptive model is
constituted by the most recent available data. However, there are many industrial situations where the batches that are most similar to the current one are not necessarily the nearest in time. An adaptation scheme is required to tailor the model update to the running batch, in the sense that the model calibration can be customized to the incoming batch. The model update could be managed by an artificial intelligence, which selects from a library of past batches the reference batches best suited for monitoring purposes. This decision can be based on the idea of similarity, possibly by exploiting the information content available at the beginning of a batch, thus making it possible to evaluate whether and how to update the model from the very beginning of a new production batch.

References Addison, P. S. (2002). The illustrated wavelet transform handbook. IOP Publishing, London (U.K.). Aguado, D., A. Ferrer, A. Seco, and J. Ferrer (2006). Comparison of different predictive models for nutrient estimation in a sequencing batch reactor for wastewater treatment. Chemom. Intell. Lab. Sys., 84, 75-81. Apetrei, C., I. M. Apetrei, I. Nevares, M. del Alamo, V. Parra, M. L. Rodrìguez-Méndez, J. A. De Saja (2007). Using an e-tongue based on voltammetric electrodes to discriminate among red wines aged in oak barrels or aged using alternative methods. Correlation between electrochemical signals and analytical parameters. Electrochimica Acta, 52, 2588-2594. Arvisenet, G., L. Billy, P. Poinot, E. Vigneau, D. Bertrand, and C. Prost (2008). Effect of apple particle state on the release of volatile compounds in a new artificial mouth device. J. Agric. Food Chem., 56, 3245-3253. Baffi, G., E. B. Martin, and A. J. Morris (1999a). Non-linear projection to latent structures revisited: the quadratic PLS algorithm. Computers Chem. Eng., 23, 395-411. Baffi, G., E. B. Martin, and A. J. Morris (1999b). Non-linear projection to latent structures revisited (the neural network PLS algorithm). Computers Chem. Eng., 23, 1293-1307. Bakshi, B. R., (1998). Multiscale PCA with application to multivariate statistical process monitoring. AIChE J., 44, 1596-1610. Bakshi, B. R., M. N. Nounou, P. K. Goel, and X. Shen (2001). Multiscale bayesian rectification of data from linear steady-state and dynamic systems without accurate models. Ind. Eng. Chem. Res., 40, 261-274. Bartolacci, G., P. J. Pelletier, J. J. Tessier, C. Duchesne, P. A. Bossè, J. Fournier (2006). Application of numerical image analysis to process diagnosis and physical parameter measurement in mineral processes - Part I: Flotation control based on froth textural characteristics. Minerals Eng., 19, 734-747. Beaver, S., A. Palazoglu and J. A. Romagnoli (2007). 
Cluster analysis for autocorrelated and cyclic chemical process data. Ind. Eng. Chem. Res., 46, 3610-3622. Bharati, M. H., J. F. MacGregor and W. Tropper (2003). Softwood lumber grading through on-line multivariate image analysis techniques. Ind. Eng. Chem. Res., 42, 5345-5353. Bharati, M. H., J. J. Liu, and J. F. MacGregor (2004). Image texture analysis: methods and comparisons. Chemom. Intell. Lab. Sys., 72, 57-71.

144

References

Blais, P., M. Micheals and J. N. Helbert (2001). Issues and trends affecting lithography tool selection strategy. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.). Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems: effect of inequality of variance in one way classification. The Annals of Mathematical Statistics, 25, 290-302. Brauner, N., and M. Shacham (2000). Considering precision of data in reduction of dimensionality and PCA. Computers Chem. Eng., 24, 2603-2611. Brosnan, T., and D. W. Sun (2004). Improving quality inspection of food products by computer vision - a review. J. Food Eng., 61, 3–16. Burnham, A. J., J. F. MacGregor, and R. Viveros (1999). Latent variable multivariate regression modeling. Chemom. Intell. Lab. Sys., 48, 167-180. Camacho, J. and J. Picò (2006a). Multi-phase principal component analysis for batch processes modeling. Chemom. Intell. Lab. Sys., 81, 127-136. Camacho, J. and J. Picò (2006b). Online monitoring of batch processes using multi-phase principal component analysis. J. Process Control, 16, 1021-1035. Camacho, J., J. Picò and A. Ferrer (2008a). Bilinear modeling of batch processes. Part 1: theorical discussion. J. Chemom., 22, 299-308. Camacho, J., J. Picò and A. Ferrer (2008b). Multiphase analysis framework for handling batch process data. J. Chemom., 22, 632-643. Capron, X., B. Walczak, O. E. de Noord, and D.L. Massart (2005). Selection and weighting of samples in multivariate regression model updating. Chemom. Intell. Lab. Sys., 76, 205214. Chang, H., J. Chen and Y. P. Ho (2006). Batch process monitoring by wavelet transform based fractal encoding. Ind. Eng. Chem. Res., 45, 3864-3879. Chen, J. and K. Liu (2002). On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem. Eng. Sci., 57, 63-75. Chen, G., T. J. MacAvoy, and M. Piovoso (1998). 
A multivariate statistical controller for online quality improvement. J. Process Control, 8, 139-149. Chen, F. Z., and X. Z. Wang (2000). Discovery of operational spaces from process data for production of multiple grades of products. Ind. Eng. Chem. Res., 39, 2378-2383. Chiang, L. H., and L. F. Colegrove (2007). Industrial implementation of on-line multivariate quality control. Chemom. Intell. Lab. Sys., 88, 143-153. Chiang, L. H., E. L. Russel, and R. D. Braatz (2001). Fault detection and diagnosis in industrial systems. Springer, London (U.K.). Choi, S. W., E. B. Martin, A. J. Morris, and I. B. Lee (2006). Adaptive multivariate statistical process control for monitoring time-varying processes. Ind. Eng. Chem. Res., 45, 31083118.

References

145

Choi, S. W., and I. B. Lee (2005). Multiblock PLS-based localized process diagnosis. J. Process Control, 15, 295-306. Chong, I. G., and C. H. Jun (2005). Performance of some variable selection methods when multicollinearity is present. Chemom. Intell. Lab. Sys., 78, 103-112. Chu, Y. H., Y. H. Lee, and C. Han (2004). Improved quality estimation and knowledge extraction in a batch process by bootstrapping-based generalized variable selection. Ind. Eng. Chem. Res., 43, 2680-2690. Çinar, A., S. J. Parulekar, C. Ündey, and G. Birol (2003). Batch fermentation modeling, monitoring, and control. Marcel Dekker Inc., New York (U.S.A.). Clément, A., M. Dorais, and M. Verno (2008). Multivariate approach to the measurement of tomato maturity and gustatory attributes and their rapid assessment by vis-NIR spectroscopy. J. Agri. Food Chem., 56, 1538–1544. Conlin, A. K., E. B. Martin and A. J. Morris (2000). Confidence limits for contribution plots. J. Chemom., 14, 725-736. Costantoudis, V., G. P. Patsis, A. Tserepi, and E. Gogolides (2003). Quantification of lineedge roughness of photoresist. II. Scaling and fractal analysis and the best roughness descriptors. J. Vac. Sci. Technol. B, 21, 3, 1019-1026. Dayal, B. S., and J. F. MacGregor (1997a). Improved PLS algorithms. J. Chemom., 11, 73-85. Dayal, B. S., and J. F. MacGregor (1997b). Recursive exponentially weighted PLS and its applications to adaptive control and prediction. J. Process Control, 7, 169-179. de Jong, S.(1993). An alternative approach to partial least squares regression. Chemom. Intell. Lab. Sys., 18, 251-263. Doan, X. T. and R. Srinivasan (2008). Online monitoring of multi-phase batch processes using phase-based multivariate statistical process control. Computers Chem. Eng., 32, 230-243. Dokucu, M. T., and F. J. Doyle III (2008). Batch-to-batch control of characteristic points on the PSD in experimental emulsion polymerization. AIChE J., 54, 3171-3187. Dokken, K. M., and L. C. Davis (2007). 
Infrared imaging of sunflower and maize root anatomy. J. Agric. Food Chem., 55, 10517–10530. Donarski, J. A., S. A. Jones, and A. J. Charlton (2008). Application of cryoprobe 1H nuclear magnetic resonance spectroscopy and multivariate analysis for the verification of Corsican honey. J. Agric. Food Chem., 56, 5451–5456. Doymaz, F., A. Palazoglu, and J. A. Romagnoli (2003). Orthogonal nonlinear partial leastsquares regression. Ind. Eng. Chem. Res., 42, 5836-5849. Du, C. J., and D. W. Sun (2004). Recent developments in the applications of image processing techniques for food quality evaluation. Trends in Food Science & Technology, 15, 230-249.

146

References

Du, C. J., D. W. Sun (2008). Multi-classification of pizza using computer vision and support vector machine. J. Food Eng., 86, 234-242. Durante, C., M. Cocchi, M. Grandi, A. Marchetti, R. Bro (2006). Application of N-PLS to gaschromatographic and sensory data of traditional balsamic vinegars of Modena. Chemom. Intell. Lab. Sys., 83, 54-65. Eastment, H. T., and W. J. Krzanowski (1982). Cross-validatory of the number of components from a principal component analysis. Technometrics, 24, 73-77. Edgar, T. F., S. W. Butler, W. J. Campbell, C. Pfeiffer, C. Bode, S. B. Hwang, K. S. Balakrishnan and J. Hahn (2000). Automatic control in microelectronics manufacturing: practices, challenges, and possibilities. Automatica, 36, 1567-1603. Edgar, T. E (2004). Control and operations: when does controllability equal profitability. Computers Chem. Eng., 29, 41-49. El Chemali, C., J. Freudemberg, M. Hankinson and J. J. Bendik (2004). Run-to-run critical dimension and sidewall angle lithography control using the PROLITH simulator. IEEE Trans. Semiconductor Manuf., 17, 3, 388-401. ElMasry, G., N. Wang, C. Vigneault, J. Qiao, and A. ElSayed (2008). Early detection of apple bruises on different background colors using hyperspectral imaging. LWT, 41, 337-345. Eriksson, L., E. Johansson, N. Kettaneh-Wold and S. Wold (2001). Multi- and megavariate data analysis principles and applications. Umetrics Academy, Umeå (Sweden). Facco, P., (2005). Monitoring a semi-continuous polymerization process using multivariate statistical methods (in Italian). Tesi di Laurea, DIPIC, Università di Padova (Italy). Facco, P., M. Olivi, C. Rebuscini, F. Bezzo and M. Barolo (2007). Multivariate Statistical Estimation of Product Quality in the Industrial Batch Production of a Resin. In: Proc. DYCOPS 2007 – 8th IFAC Symposium on Dynamics and Control of Process Systems, (B. Foss and J. Alvarez, Eds.), Cancun (Mexico), June 6-8, vol. 2, 93-98. Facco, P., A. Faggian, F. Doplicher, F. Bezzo and M. Barolo (2008a). 
Virtual sensors can reduce lab analysis requirements in the industrial production of specialty chemicals. In: Proc. EMCC5 – 5th Chemical Engineering Conference for Collaborative Research in Eastern Mediterranean Countries, Cetraro (Italy), May 24-29, 178-181. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008b). “Monitoraggio multivariato e multiscala di processi di fotolitografia per la produzione di semiconduttori”. In: Proc. Congresso Gr.I.C.U 2008: Ingegneria Chimica, le nuove sfide, LaCastella (KR), September 14-17, 1383-1388. Facco, P., F. Bezzo, J. A. Romagnoli and M. Barolo (2008c). “Using digital images for fast, reliable, and automatic characterization of surface quality: a case study on the manufacturing of semiconductors”. In: Workshop on nanomaterials production, characterization and industrial applications, December 3, Milano (Italia).

References

147

Facco, P., F. Doplicher, F. Bezzo and M. Barolo (2009a). “Moving-average PLS soft sensor for online product quality estimation in an industrial batch polymerization process”. J. Process Control, in press. doi:10.1016/j.jprocont.2008.05.002 Facco, P., R. Mukherjee, F. Bezzo, M. Barolo and J. A. Romagnoli (2009b). “Monitoring Roughness and edge shape on semiconductors through multiresolution and multivariate image analysis”. AIChE J., in press. Faggian, A., P. Facco, F. Bezzo and M. Barolo (2009). “Multivariate statistical real-time monitoring of an industrial fed-batch process for the production of specialty chemical”. Chem. Eng. Res. Des., in press. doi:10.1016/j.cherd.2008.08.019 Flores-Cerrillo, J., and J. F. MacGregor (2002). Control of particle size distributions in emulsion semibatch polymerization using mid-course correction policies. Ind. Eng. Chem. Res., 41, 1805-1814. Flores-Cerrillo, J., and J. F. MacGregor (2003). Within-batch and batch-to-batch inferentialadaptive control of semibatch reactors: a partial least squares approach. Ind. Eng. Chem. Res., 42, 3334-3345. Flores-Cerrillo, J., and John F. MacGregor (2004). Control of batch product quality by trajectory manipulation using latent variable models. J. Process Control, 14, 539–553. Fransson, M., and S. Folestad (2006). Real-time alignment of batch process data using COW for on-line process monitoring. Chemom. Intell. Lab. Sys., 84, 56-61. García-Muñoz, S., T. Kourti, J. F. MacGregor, A. G. Mateos and G. Murphy (2003). Troubleshooting of an industrial batch process using multivariate methods. Ind. Eng. Chem. Res., 42, 3592-3601. Garcia-Muñoz, S., T. Kourti, and J. F. MacGregor (2004). Model Predictive Monitoring for Batch Processes. Ind. Eng. Chem. Res., 43, 5929-5941. Geladi, P., (1995). Sampling and local models for multivariate image analysis. Mikrochim. Acta, 120, 211-230. Geladi, P. and H. Grahn (1996). Multivariate Image Analysis, John Wiley & Sons, Inc., New York (U.S.A.). Geladi, P. and R. 
Kowalski (1986). Partial least squares regression: a tutorial. Anal. Chim. Acta, 185, 1-17 Giordani, D. S., A. F. Siqueira, M. L. C. P. Silva, P. C. Oliveira, and H. F. de Castro (2008). Identification of the biodiesel source using an electronic nose. Energy & Fuels, 22, 2743–2747. Giri, S., J. R. Idle, C. Chen, T. M. Zabriskie, K. W. Krausz, and F. J. Gonzalez (2006). A Metabolomic Approach to the metabolism of the areca nut alkaloids arecoline and arecaidine in the mouse. Chem. Res. Toxicol., 19, 818-827.

148

References

Guldi, R. L. (2004). Inline defect reduction from a historical perspective and its implication for future integrated circuits manufacturing. IEEE Trans. Semiconductor Manuf., 17, 4, 629-639. Gunther, J. C., J. S. Conner, and D. E. Seborg (2009). Process monitoring and quality variable prediction utilizing PLS in industrial fed-batch cell culture. J. Process Control, in press. doi:10.1016/j.jprocont.2008.11.007

Härdle, W., and L. Simar (2007). Applied multivariate statistical analysis (2nd ed.). Springer, New Jork (USA). Hare, L. (2003). SPC: from chaos to wiping the floor. Quality progress. Available at: http://www.asq.org/pub/qualityprogress/past/0703/58spc0703.html

[accessed on October 1st, 2008]. Harrison, L., P. Dastidar, H. Eskola, R. Järvenpää, H. Pertovaara, T. LuukkaalaP. L. Kellokumpu-Lehtinen, S. Soimakallio (2008). Texture analysis on MRI images of nonHodgkin lymphoma. Computers in Biology and Medicine, 38, 519-524. Helbert, J. N., and T. Daou (2001). Resist technology – Design, processing and applications. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.)/ William Andrew Publishing, LLC, Norwich, New York (U.S.A.). Höskuldsson, A., (1988). PLS regression methods. J. Chemom., 2, 211-228. Höskuldsson, A., (2001). Variable and subset selection in PLS regression. Chemom. Intell. Lab. Sys., 55, 23-38. Hwang, D. H., and C. Han (1999). Real-time monitoring for a process with multiple operating modes. Control Eng. Practice, 7, 891-902. Jackson, J. E. (1991). A user’s guide to principal components. John Wiley & Sons Inc., New York (U.S.A.). Jackson, J. E., and G. S. Mudholkar (1979). Control procedures for residuals associated with principal component analysis. Technometrics, 21, 341-349. Johnson, R. A., and D. W. Wichern (2007). Applied multivariate statistical analysis (6th ed.). Pearson International Edition, Upper Saddle River (USA). Kaistha, N., and C. F. Moore (2001). Extraction of event times in batch profiles for time synchronization and quality predictions. Ind. Eng. Chem. Res., 40, 252-260. Kamohara, H., A. Takinami, M. Takeda, M. Kano, S. Hasebe and I. Hasimoto (2004). Product quality estimation and operating condition monitoring for industrial ethylene fractionator. J. Chem. Eng. Japan, 37, 422-428. Kano, M., and Y. Nakagawa (2008). Data-based process monitoring, process control and quality improvement: recent developments and applications in steel industry. Computers Chem. Eng., 32, 12-24.

References

149

Kano, M., N. Showchaiya, S. Hasebe, and I. Hashimoto (2003). Inferential control of distillation composition: selection of model and control configuration. Control Eng. Practice, 11, 927-933. Kassidas, A., J. F. MacGregor, and P. A. Taylor (1999). Synchronization of batch trajectories using dynamic time warping. AIChE J., 44, 864-876. Khan, A. A., J. R. Moyne, and D. M. Tilbury (2008). Virtual metrology and feedback control for semiconductor manufacturing process using recursive partial least squares. J. Process Control, 18, 961-974. Kim, C., and C. H. Choi (2007). Image covariance-based subspace method for face recognition. Pattern Recognition, 40, 1592-1604. Kim, M., Y. H. Lee, I. S. Han, and C. Han (2005). Clustering-based hybrid soft sensor for industrial polypropylene process with grade changeover operation. Ind. Eng. Chem. Res., 44, 334-342. Kirdar, A. O., K. D. Green, and A. S. Rathore (2008). Application of multivariate data analysis for identification and successful resolution of a root cause for a bioprocessing application, Biotechnol. Prog., 24, 720-726. Knight, S., R. Dixon, R. L. Jones, E. K. Lin, N. G. Orji, R. Silver, J. S. Villarrubia, A. E. Vladár and W. Wu (2006). Advanced metrology needs for nanoelectric lithography. C. R. Physique, 7, 931-941. Komulainen, T., M. Sourander, S. L. Jämsä-Jounela (2004). An online application of dynamic PLS to a dearomatization process. Computers Chem. Eng., 28, 2611-2619. Kosanovich, K. A., M. J. Piovoso (1997). PCA of wavelet transformed process data for monitoring. Intelligent Data Analysis, 1, 85-99. Kourti, T. (2003). Mulivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom., 17, 93-109. Kourti, T. (2005). Application of latent variable methods to process control and multivariate statistical process control in industry. Int. J. Adapt. Control Signal Process., 19, 213246. Kourti, T. and J. F. MacGregor (1995). 
Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemom. Intell. Lab. Sys., 28, 3-21. Kresta, J. V. , J. F. MacGregor and T. E. Marlin (1991). Multivariate statistical monitoring of process operating performance. Canadian J. Chem. Eng., 69, 35-47. Kresta, J. V., T. E. Marlin, and J. F. MacGregor (1994). Development of inferential process models using PLS. Computers Chem. Eng.,18, 597-611. Kruger, U., Y. Zhou and G. W. Irwing (2004). Improved principal component monitoring of large scale processes. J. Process Control, 14, 879-888. Ku, W., R. H. Storer and C. Georgakis (1995). Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Sys., 30, 179-196.

150

References

Lee, F. (2001). Lithography process monitoring and defect detection. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.). Lee, J. H., and A. W. Dorsey (2004). Monitoring of batch processes through state-space models. AIChE J., 50, 1198-1210. Lee, J. H., A. W. Dorsey, and S. Russell (2004). Inferential product quality control of a multistage batch plant. AIChE J., 50, 136-148. Lee, Y. H., M. Kim, Y. H. Chu, and C. Han (2005a). Adaptive multivariate regression modeling based on model performance assessment. Chemom. Intell. Lab. Sys., 78, 6373. Lee, D. S., J. M. Park, and P. A. Vanrolleghem (2005b). Adaptive multiscale principal component analysis for on-line monitoring of a sequencing batch reactor. J. Biotech., 116, 195-210. Lee, . Y., S. S. Shah, C. C. Zimmer, G. Liu, and A. Revzin (2008a). Use of photolithography to encode cell adhesive domains into protein microarrays. Langmuir, 24, 2232-2239. Lee, A. C., K. Shedden, G. R. Rosania, and G. M. Crippen (2008b). Data mining the NCI60 to predict generalized cytotoxicity. J. Chem. Inf. Model., 48, 1379–1388. Lennox, B., G. A. Montague, H. G. Hiden, G. Kornfeld, P. R. Goulding (2001). Process monitoring of an industrial fed-batch fermentation. Biotechnology and Bioengineering, 74, 125-135. Li, B., J. Morris, E. B. Martin (2002). Model selection for partial least squares regression. Chemom. Intell. Lab. Sys., 64, 79-89. Li, W. and S. J. Qin (2001). Consistent dynamic PCA based on errors-in-variables subspace identification. J. Process Control, 11, 661-678. Lieftucht, D., U. Kruger, L. Xie, T. Littler, Q. Chen, and S. Q. Wang (2006). Statistical monitoring of dynamic multivariate processes – Part 2. Identifying fault magnitude and signature. Ind. Eng. Chem. Res., 45, 1677-1688. Lin, B., B. Recke, J. K. H. Knudsen, and S. B. Jørgensen (2007). A systematic approach for soft sensor development. Computers Chem. Eng., 31,419-425. 
Lindgren, F., and S. Rännar (1998). Alternative partial least-squares (PLS) algorithms. Perspectives in Drug Discovery and Design, 12/13/14, 105-113.
Liu, J. J., and J. F. MacGregor (2005). Modeling and optimization of product appearance: application to injection-molded plastic panels. Ind. Eng. Chem. Res., 44, 4687-4696.
Liu, J. J., M. H. Bharati, K. G. Dunn, and J. F. MacGregor (2005). Automatic masking in multivariate image analysis using support vector machines. Chemom. Intell. Lab. Sys., 79, 42-54.
Liu, J. J., D. Kim, and C. Han (2007a). Use of wavelet packet transform in characterization of surface quality. Ind. Eng. Chem. Res., 46, 5152-5158.


Liu, J. J., J. F. MacGregor, C. Duchesne, and G. Bartolacci (2007b). Flotation froth monitoring using multiresolutional multivariate image analysis. Minerals Engineering, 18, 65-76.
Liu, J., Q. Li, J. Dong, J. Chen, and G. Gu (2008). Multivariate modeling of aging in bottled lager beer by principal component analysis and multiple regression methods. J. Agric. Food Chem., 56, 7106-7112.
Ljung, L. (1999). System identification. Theory for the user (2nd ed.). Prentice Hall, Upper Saddle River (U.S.A.).
Louwerse, D. J., and A. K. Smilde (2000). Multivariate statistical process control of batch processes based on three-way models. Chem. Eng. Sci., 55, 1225-1235.
Lu, N., F. Gao, and F. Wang (2004a). Sub-PCA modeling and on-line monitoring strategy for batch processes. AIChE J., 50, 255-259.
Lu, N., and F. Gao (2005a). Stage-based process analysis and quality prediction for batch processes. Ind. Eng. Chem. Res., 44, 3547-3555.
Lu, N., and F. Gao (2006). Stage-based online quality control for batch processes. Ind. Eng. Chem. Res., 45, 2272-2280.
Lu, N., Y. Yang, F. Gao, and F. Wang (2004b). Multirate dynamic inferential modeling for multivariable processes. Chem. Eng. Sci., 59, 855-864.
Lu, N., Y. Yang, F. Gao, and F. Wang (2004c). PCA-based modeling and on-line monitoring strategy for uneven length batch process. Ind. Eng. Chem. Res., 43, 3343-3352.
Lu, N., Y. Yao, F. Gao, and F. Wang (2005b). Two-dimensional dynamic PCA for batch process monitoring. AIChE J., 51, 3300-3304.
MacGregor, J. F., T. E. Marlin, J. Kresta, and B. Skagerberg (1991). Multivariate statistical methods in process analysis and control. In: Chemical process control (Y. Arkun and W. H. Ray, Eds.), CACHE, Austin; AIChE, New York (U.S.A.).
Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674-693.
Marjanovic, O., B. Lennox, D. Sandoz, K. Smith, and M. Crofts (2006). Real-time monitoring of an industrial batch process. Computers Chem. Eng., 30, 1476-1481.
Marín, S., M. Vinaixa, J. Brezmes, E. Llobet, X. Vilanova, X. Correig, A. J. Ramos, and V. Sanchis (2007). Use of a MS-electronic nose for prediction of early fungal spoilage of bakery products. International Journal of Food Microbiology, 114, 10-16.
Maulud, A., D. Wang, and J. A. Romagnoli (2006). A multi-scale orthogonal nonlinear strategy for multi-variate statistical process monitoring. J. Process Control, 16, 671-683.
Misra, M., H. H. Yue, S. J. Qin, and C. Ling (2002). Multivariate process monitoring and fault diagnosis by multi-scale PCA. Computers Chem. Eng., 26, 1281-1293.


Montgomery, D. C. (2005). Introduction to statistical quality control (5th ed.). John Wiley & Sons Inc., Danvers (U.S.A.).
Montgomery, D. C., and G. C. Runger (2003). Applied statistics and probability for engineers (3rd ed.). John Wiley & Sons Inc., Danvers (U.S.A.).
Mosteller, F., and D. L. Wallace (1963). Inference in an authorship problem. J. Amer. Statist. Assoc., 58, 275-309.
Neogi, D., and C. E. Schlags (1998). Multivariate statistical analysis of an emulsion batch process. Ind. Eng. Chem. Res., 37, 3971-3979.
Nomikos, P. (1996). Detection and diagnosis of abnormal batch operations based on multiway principal component analysis. ISA Trans., 35, 259-266.
Nomikos, P., and J. F. MacGregor (1994). Monitoring batch processes using multiway principal component analysis. AIChE J., 40, 1361-1375.
Nomikos, P., and J. F. MacGregor (1995a). Multivariate SPC charts for monitoring batch processes. Technometrics, 37, 41-59.
Nomikos, P., and J. F. MacGregor (1995b). Multi-way partial least squares in monitoring batch processes. Chemom. Intell. Lab. Sys., 30, 97-108.
Patsis, G. P., V. Constantoudis, A. Tserepi, E. Gogolides, and G. Grozev (2003). Quantification of line-edge roughness of photoresists. I. A comparison between off-line and on-line analysis of top-down scanning electron microscopy images. J. Vac. Sci. Technol. B, 21, 1008-1018.
Qiao, J., M. O. Ngadi, N. Wang, C. Gariépy, and S. O. Prasher (2007). Pork quality and marbling level assessment using a hyperspectral imaging system. J. Food Eng., 83, 10-16.
Qin, S. J. (1998). Recursive PLS algorithms for adaptive data modeling. Computers Chem. Eng., 22, 503-514.
Qin, S. J., and R. Dunia (2000). Determining the number of principal components for best reconstruction. J. Process Control, 10, 245-250.
Quevedo, R., L. G. Carlos, J. M. Aguilera, and L. Cadoche (2002). Description of food surfaces and microstructural changes using fractal image texture analysis. J. Food Eng., 53, 361-371.
Ramaker, H. J., E. N. M. van Sprang, S. P. Gurden, J. A. Westerhuis, and A. K. Smilde (2002). Improved monitoring of batch processes by incorporating external information. J. Process Control, 12, 569-576.
Ramaker, H. J., E. N. M. van Sprang, J. A. Westerhuis, and A. K. Smilde (2005). Fault detection properties of global, local and time evolving models for batch process monitoring. J. Process Control, 15, 799-805.
Rännar, S., J. F. MacGregor, and S. Wold (1998). Adaptive batch monitoring using hierarchical PCA. Chemom. Intell. Lab. Sys., 41, 73-81.


Rao, A. R. (1996). Future directions in industrial machine vision: a case study of semiconductor manufacturing applications. Image and Vision Computing, 14, 3-19.
Reis, M. S., P. M. Saraiva, and B. R. Bakshi (2008). Multiscale statistical process control using wavelet packets. AIChE J., 54, 2366-2378.
Romagnoli, J. A., and A. Palazoglu (2006). Introduction to process control. Taylor & Francis, Boca Raton (FL, U.S.A.).
Russell, S. A., P. Kesavan, J. H. Lee, and B. A. Ogunnaike (1998). Recursive data-based prediction and control of batch product quality. AIChE J., 44, 2442-2458.
Ruttimann, U. E., M. Unser, R. R. Rawlings, D. Rio, N. F. Ramsey, V. S. Mattay, D. W. Hommer, J. A. Frank, and D. R. Weinberger (1998). Statistical analysis of functional MRI data in the wavelet domain. IEEE Trans. Med. Imag., 17, 142-154.
Salari, E., and Z. Ling (1995). Texture segmentation using hierarchical wavelet decomposition. Pattern Rec., 28, 1818-1824.
Schievano, E., G. Pasini, G. Cozzi, and S. Mammi (2008). Identification of the production chain of Asiago d’Allevo cheese by nuclear magnetic resonance spectroscopy and principal component analysis. J. Agric. Food Chem., 56, 7208-7214.
Seborg, D. E., T. F. Edgar, and D. A. Mellichamp (2004). Process dynamics and control (2nd ed.). John Wiley & Sons Inc., New York (U.S.A.).
Sharmin, R., U. Sundararaj, S. Shah, L. V. Griend, and Y. J. Sun (2006). Inferential sensor for estimation of polymer quality parameter: industrial application of a PLS-based soft sensor for a LDPE plant. Chem. Eng. Sci., 61, 6372-6384.
Shewhart, W. A. (1931). Statistical method from the viewpoint of quality control. The Graduate School of Agriculture, Washington DC (U.S.A.). [Reprinted in 1986 by Dover Publishing, Toronto (Canada).]
Shao, R., F. Jia, E. B. Martin, and A. J. Morris (1999). Wavelets and non-linear principal components analysis for process monitoring. Control Eng. Practice, 7, 865-879.
Shi, R., and J. F. MacGregor (2000). Modeling of dynamic systems using latent variable and subspace methods. J. Chemom., 14, 423-439.
Škrbić, B., and A. Onjia (2007). Multivariate analyses of microelement contents in wheat cultivated in Serbia (2002). Food Control, 18, 338-345.
Srinivasan, R., and M. Qian (2005). Off-line temporal signal comparison using singular points augmented time warping. Ind. Eng. Chem. Res., 44, 4697-4716.
Srinivasan, R., and M. Qian (2007). Online temporal signal comparison using singular points augmented time warping. Ind. Eng. Chem. Res., 46, 4531-4548.
Szatvanyi, G., C. Duchesne, and G. Bartolacci (2006). Multivariate image analysis of flames for product quality and combustion control in rotary kilns. Ind. Eng. Chem. Res., 45, 4706-4715.


Tan, Y., L. Shi, W. Tong, and C. Wang (2005). Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data. Nucleic Acids Research, 33, 56-65.
Teppola, P., and P. Minkkinen (2000). Wavelet–PLS regression models for both exploratory data analysis and process monitoring. J. Chemom., 14, 383-399.
Tessier, J., C. Duchesne, and G. Bartolacci (2007). A machine vision approach to on-line estimation of run of mine ore composition on conveyor belts. Minerals Eng., 20, 1129-1144.
Tessier, J., C. Duchesne, C. Gauthier, and G. Dufour (2006). Estimation of alumina content of anode cover materials using multivariate image analysis techniques. Chem. Eng. Sci., 63, 1370-1380.
Treasure, R. J., U. Kruger, and J. E. Cooper (2004). Dynamic multivariate statistical process control using subspace identification. J. Process Control, 14, 279-292.
Trendafilova, I., M. P. Cartmell, and W. Ostachowicz (2008). Vibration-based damage detection in an aircraft wing scaled model using principal component analysis and pattern recognition. Journal of Sound and Vibration, 313, 560-566.
Übeyli, D. E. (2007). Implementing automated diagnostic systems for breast cancer detection. Expert Systems with Applications, 33, 1054-1062.
Ündey, C., and A. Çınar (2002). Statistical process monitoring of multistage, multiphase batch processes. IEEE Control Systems Magazine, 22, 40-52.
Ündey, C., B. A. Williams, and A. Çınar (2002). Monitoring batch pharmaceutical fermentations: data synchronization, landmark alignment and real-time monitoring. 15th IFAC World Congress, July 22-26, Barcelona (Spain).
Ündey, C., S. Ertunç, and A. Çınar (2003a). Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Ind. Eng. Chem. Res., 42, 4645-4658.
Ündey, C., E. Tatara, and A. Çınar (2003b). Real-time batch process supervision by integrated knowledge-based systems and multivariate statistical methods. Engineering Applications of Artificial Intelligence, 16, 555-566.
Ündey, C., E. Tatara, and A. Çınar (2004). Intelligent real-time performance monitoring and quality prediction for batch/fed-batch cultivations. J. Biotech., 108, 61-77.
Valle, S., W. Li, and S. J. Qin (1999). Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods. Ind. Eng. Chem. Res., 38, 4389-4401.
Viggiani, L., and M. A. Castiglione Morelli (2008). Characterization of wines by nuclear magnetic resonance: a work study on wines from the Basilicata region in Italy. J. Agric. Food Chem., 56, 8273-8279.


Viñas, R., A. Eff-Darwich, V. Soler, M. C. Martín-Luis, M. L. Quesada, and J. de la Nuez (2007). Processing of radon time series in underground environments: implications for volcanic surveillance in the island of Tenerife, Canary Islands, Spain. Radiation Measurements, 42, 101-115.
Waits, C. M., B. Morgan, M. Kastantin, and R. Ghodssi (2005). Microfabrication of 3D silicon MEMS structures using grey-scale lithography and deep reactive ion etching. Sensors and Actuators A, 119, 245-253.
Waldo, W. (2001). Techniques and tools for optical lithography. In: Handbook of VLSI microlithography. Second edition. Principles, technology and application. Noyes Publications, Park Ridge, New Jersey (U.S.A.).
Wang, X., U. Kruger, and B. Lennox (2003). Recursive partial least squares algorithms for monitoring complex industrial processes. Control Eng. Practice, 11, 613-632.
Warne, K., G. Prasad, S. Rezvani, and L. Maguire (2004). Statistical and computational intelligence techniques for inferential model development: a comparative evaluation and a novel proposition for fusion. Engineering Applications of Artificial Intelligence, 17, 871-885.
Westerhuis, J. A., S. P. Gurden, and A. K. Smilde (2000). Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Sys., 51, 95-114.
Westerhuis, J. A., T. Kourti, and J. F. MacGregor (1998). Analysis of multiblock and hierarchical PCA and PLS models. J. Chemom., 12, 301-321.
Westerhuis, J. A., T. Kourti, and J. F. MacGregor (1999). Comparing alternative approaches for multivariate statistical analysis of batch process data. J. Chemom., 13, 397-413.
Whelehan, O. P., M. E. Earll, E. Johansson, M. Toft, and L. Eriksson (2006). Detection of ovarian cancer using chemometric analysis of proteomic profiles. Chemom. Intell. Lab. Sys., 84, 82-87.
Wise, B. M., and N. B. Gallagher (1996). The process chemometrics approach to process monitoring and fault detection. J. Process Control, 6, 329-348.
Wold, S. (1978). Cross-validatory estimation of number of components in factor and principal components models. Technometrics, 20, 397-405.
Wold, S. (1994). Exponentially weighted moving principal components analysis and projection on latent structures. Chemom. Intell. Lab. Sys., 23, 149-161.
Wold, S., P. Geladi, K. Esbensen, and J. Öhman (1987). Multi-way principal components and PLS-analysis. J. Chemom., 1, 47-56.
Wold, S., N. Kettaneh, H. Fridén, and A. Holmberg (1998). Modeling and diagnostics of batch processes and analogous kinetics experiments. Chemom. Intell. Lab. Sys., 44, 331-340.
Wold, S., N. Kettaneh-Wold, and B. Skagerberg (1989). Non-linear PLS modelling. Chemom. Intell. Lab. Sys., 7, 53-65.
Wold, S., N. Kettaneh, and K. Tjessem (1996). Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection. J. Chemom., 10, 463-482.
Wold, S., M. Sjöström, and L. Eriksson (2001). PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Sys., 58, 109-130.
Xie, L., U. Kruger, D. Lieftucht, T. Littler, Q. Chen, and S. Q. Wang (2006). Statistical monitoring of dynamic multivariate processes – Part 1. Modelling autocorrelation and crosscorrelation. Ind. Eng. Chem. Res., 45, 1659-1676.
Xu, L., J. H. Jiang, H. L. Wu, G. L. Shen, and R. Q. Yu (2007). Variable-weighted PLS. Chemom. Intell. Lab. Sys., 85, 140-143.
Yaakobovitz, B., Y. Cohen, and Y. Tsur (2007). Line edge roughness detection using deep UV light scatterometry. Microelectronic Eng., 84, 619-625.
Yabuki, Y., T. Nagasawa, and J. F. MacGregor (2002). Industrial experiences with product quality control in semi-batch processes. Computers Chem. Eng., 26, 205-212.
Yabuki, Y., and J. F. MacGregor (1997). Product quality control in semibatch reactors using midcourse correction policies. Ind. Eng. Chem. Res., 36, 1268-1275.
Yao, Y., and F. Gao (2009). Phase and transition based batch process modeling and online monitoring. J. Process Control, in press. doi:10.1016/j.jprocont.2008.11.001
Yoon, S., and J. F. MacGregor (2004). Principal-component analysis of multiscale data for process monitoring and fault diagnosis. AIChE J., 50, 2891-2903.
Yu, H., and J. F. MacGregor (2003). Multivariate image analysis and regression for prediction of coating content and distribution in the production of snack foods. Chemom. Intell. Lab. Sys., 67, 125-144.
Yu, H., J. F. MacGregor, G. Haarsma, and W. Bourg (2003). Digital imaging for online monitoring and control of industrial snack food processes. Ind. Eng. Chem. Res., 42, 3036-3044.
Yue, H. H., S. J. Qin, R. J. Markle, C. Nauert, and M. Gatto (2000). Fault detection of plasma etcher using optical emission spectra. IEEE Trans. Semicond. Manufact., 13, 374-385.
Zamprogna, E., M. Barolo, and D. E. Seborg (2004). Estimating product composition profiles in batch distillation via partial least squares regression. Control Eng. Practice, 12, 917-929.
Zhai, H. L., X. G. Chen, and Z. D. Hu (2006). A new approach for the identification of important variables. Chemom. Intell. Lab. Sys., 80, 130-135.
Zhang, Y., and M. Dudzic (2006). Industrial application of multivariate SPC to continuous caster start-up operations for breakout prevention. Control Eng. Practice, 14, 1357-1375.
Zhang, H., and B. Lennox (2004). Integrated condition monitoring and control of fed-batch fermentation processes. J. Process Control, 14, 41-50.


Zhang, Q., K. Poolla, and C. J. Spanos (2007). Across wafer critical dimension uniformity enhancement through lithography and etch process sequence: concept, approach, modeling, and experiment. IEEE Trans. Semicond. Manufact., 20, 488-505.
Zhang, Q., K. Poolla, and C. J. Spanos (2008). One step forward from run-to-run critical dimension control: across-wafer level critical dimension control through lithography and etch process. J. Process Control, 18, 937-945.
Zhao, C., F. Wang, F. Gao, N. Lu, and M. Jia (2007a). Adaptive monitoring method for batch processes based on phase dissimilarity updating with limited modeling data. Ind. Eng. Chem. Res., 46, 4943-4953.
Zhao, C., F. Wang, N. Lu, and M. Jia (2007b). Stage-based soft-transition multiple PCA modeling and on-line monitoring strategy for batch processes. J. Process Control, 17, 728-741.
Zhao, C., F. Wang, Z. Mao, N. Lu, and M. Jia (2008a). Improved batch process monitoring and quality prediction based on multiphase statistical analysis. Ind. Eng. Chem. Res., 47, 835-849.
Zhao, C., F. Wang, Z. Mao, N. Lu, and M. Jia (2008b). Quality prediction based on phase-specific average trajectory for batch processes. AIChE J., 54, 693-705.
Zhao, S. J., J. Zhang, and Y. M. Xu (2006a). Performance monitoring of process with multiple operating modes through multiple PLS models. J. Process Control, 16, 763-772.
Zhao, S. J., J. Zhang, Y. M. Xu, and Z. H. Xiong (2006b). Nonlinear projection to latent structures method and its applications. Ind. Eng. Chem. Res., 45, 3843-3852.

Web sites
http://www.asq.org/index.html/ [accessed on October 1st, 2008]
http://www.eigenvectorresearch.com/ [accessed on January 5th, 2009]
http://www.mathworks.com/ [accessed on January 5th, 2009]
http://www.progea.com/ [accessed on January 5th, 2009]
http://www.sirca.com/ [accessed on December 31st, 2008]

Acknowledgements

The author would like to express his gratitude to the people and institutions that provided technical and intellectual support for this project: Prof. Massimiliano Barolo and Dr. Fabrizio Bezzo of the Department of Chemical Engineering Principles and Practice (DIPIC, Università di Padova, Italy), and Prof. J. A. Romagnoli of the “Gordon A. and Mary Cain” Department of Chemical Engineering (Louisiana State University, Baton Rouge, Louisiana, U.S.A.), for their invaluable guidance and essential advice. Furthermore, the author gratefully acknowledges SIRCA S.p.A. for the financial support and Fondazione “Ing. Aldo Gini” for the scholarship.

… and now the personal thanks! Thank you, Mum and Dad: you are a true blessing. I will never be able to thank you enough, and my gruff manners certainly do not repay you as you deserve. Your availability and devotion to me are proverbial. I truly believe that one day there will be a saying: “as helpful as Luciana and Adriano”. The world is a warm embrace because you are in it!

Claudia, you are a star I stole from the sky! You and your family have given me incredible enthusiasm and zest for life. If only I were able to, I would place all the happiness in the world in your sweet hands, I would grant your life every grace, I would fulfil your every wish. What I can do is give you all my Love. Forever!

Thanks to my true friends, because an evening at dinner with them is always a priceless gift.

Thanks to Max and Fabrizio, not only for the opportunities for growth they gave me, for what they taught me, and for how much they helped me, but also for the sandwiches eaten together at lunchtime, the CAPE-luganega, the great good humour, the humanity, the punctuality, …

Thanks also to Federico. Who would have thought I would find a new friend in the half hours spent chatting about the variance of the covariance matrix or the image analysis of mushrooms? Priceless.

And thanks to all my office mates, for the few words said in jest that perhaps lightened our day. And a thank-you to “my” undergraduate students, who are always friendly and brilliant.