J. CHEM. SOC. FARADAY TRANS., 1991, 87(17), 2811-2820


Chemometric Optimization of the Ruthenium Carbonyl Catalysed Cyclization of 2-Nitrostilbene to 2-Phenylindole

Corrado Crotti and Sergio Cenini*
Dipartimento di Chimica Inorganica e Metallorganica and CNR Center, Via Venezian 21, 20133 Milano, Italy

Roberto Todeschini
Dipartimento di Chimica-Fisica e Elettrochimica, Via Golgi 19, 20133 Milano, Italy

Stefano Tollari
Dipartimento di Chimica Organica e Industriale, Via Venezian 21, 20133 Milano, Italy

A chemometric optimization of the Ru3(CO)12 catalysed deoxygenation of 2-nitrostilbene to 2-phenylindole was carried out. The effects of temperature, CO pressure, amounts of catalyst and substrate on conversion and selectivity were examined by factorial design/response surface methods. The conversion was found to increase on increasing the temperature and decreasing the CO pressure; it assumed a minimum value for medium amounts of catalyst and was almost independent of the amount of substrate. These results were also confirmed using a learning system and were used to develop a mechanism for the reaction. The data suggest two different mechanisms: one based on a Ru(CO)5 catalysed process and another one based on a Ru3(CO)12 catalysed process, which are first and zero order with respect to the substrate, respectively.

The interest in transition-metal-catalysed deoxygenation by carbon monoxide of aromatic nitro-compounds has grown enormously in recent years, particularly for the production of carbonylated compounds such as isocyanates, carbamates and ureas.1 Several other products are also available from this kind of reaction, and some of them are also of industrial interest. Some years ago, we reported the catalytic synthesis of indoles by deoxygenation of 2-nitrostyrenes (ref. 2) catalysed by metal carbonyls [reaction (I)].

The only detectable side products were the corresponding amines, 2-aminostilbenes (5 < yield (%) < 33). This synthesis appears to be more convenient than the palladium-catalysed intramolecular amination of alkenes,3 used by Hegedus as the first step in the synthesis of ergot alkaloids.4 The potential interest in a highly selective synthetic route to indoles prompted us to undertake a study to optimize reaction (I). It is well known that statistical univariate techniques are often not adequately efficient to obtain useful information in systems of complex data. Therefore, to obtain a definitive result of an optimization study using a relatively low number of reaction runs, we started with a chemometric approach based on a 'response surface' method; in addition, to check the effectiveness of this method, we carried out another optimization study using a different principle, a 'learning system', which was based on a classification-tree method.

Coincidentally, we were also interested in understanding the intimate mechanism of this reaction and the results from the optimization study gave us some useful suggestions about the hypothetical mechanism.

Experimental

Chemicals, Procedures and Instruments

Carbon monoxide and nitrogen were high-purity grade. Toluene and THF were distilled, dried, degassed with nitrogen before use and stored under a nitrogen atmosphere. 2-Nitrostilbene was prepared by using a modified Wittig reaction;5 NMR spectroscopy and capillary GC showed that the (E)-isomer was ≥99% pure. Ru3(CO)12 was prepared by a literature method.6 To purify Ru3(CO)12 from some ruthenium metal, the carbonyl was dissolved in hot toluene and the resulting solution was filtered through Celite 577. The reactions under high pressure were conducted in a glass vessel inside a stainless-steel autoclave; the air in the autoclave was replaced with nitrogen by a freeze-pump-thaw procedure, before the introduction of carbon monoxide at the desired pressure. The autoclave was heated in a thermoregulated oil bath (±1 °C) and magnetic stirring was applied to both the catalytic solution and the heating oil. At the end of the reactions, the autoclave was cooled by an ice bath, and then 'blown off'.

IR spectra were obtained using a Perkin-Elmer 781 infrared spectrophotometer coupled with a Perkin-Elmer System 10 data station.

Catalytic Reactions and Analyses of the Products

An example of the catalytic reactions is described here. Ru3(CO)12 (4.20 mg, 0.00657 mmol) and 2-nitrostilbene (40.0 mg, 0.178 mmol) were weighed into the glass liner and toluene (3.00 cm3) was added. The glass liner was put inside the autoclave and carbon monoxide was admitted at 50 bar. The autoclave was dipped in the pre-heated oil bath at 220 °C and the temperature was adjusted immediately to 210 °C, to compensate for the cooling due to the dipping of the cold mass of the autoclave. The reaction was carried out for 25 min after immersion in the oil bath. At the end of the reaction, IR spectroscopy showed that the only carbonyl complexes present were Ru3(CO)12 and Ru(CO)5.

Quantitative analyses were carried out on a Carlo Erba HRGC Fractovap 4160 capillary gas chromatograph equipped with an 'on-column' injector, which was coupled with a Perkin-Elmer LCI-100 Integrator. The column was a PS-255, 0.32 mm i.d., 25 m length, 5 µm thickness of the stationary phase; the internal standard was hexamethylbenzene and the response factors for each compound (2-nitrostilbene, 2-aminostilbene and 2-phenylindole) with respect to the internal standard were calculated at four different concentrations, which were averaged using 12 injections for each concentration. The response factors so obtained had a 2σ < 2% with respect to the average value [2σ < 0.02 f]. The analyses of the reactions were carried out using averages based on five injections. The average value had 2σ < 3%. Accurate analysis is absolutely necessary to obtain meaningful results from the factorial design and response surface calculations. GC-MS analyses were carried out on a Hewlett Packard 5890 capillary gas chromatograph coupled with a VG 7070 mass spectrometer.

Theoretical Methods and Software

To evaluate the significance of the role played by the different variables, a complete factorial design was performed initially (four variables, two levels and 16 runs). Factorial design is a class of experimental designs that are generally very economical; i.e. they offer a large amount of useful information from a small number of experiments. To assess the data quantitatively, a multivariate chemometric approach based on the response surface (RS) was used.7 RS techniques are normally used to find the best experimental conditions in process parameter optimization problems and are an alternative (and usually better) procedure to the one-variable-at-a-time (OVAT) method of optimization. In fact, OVAT adjustments often fail to reach the optimum, whereas the multivariate methods, which are mathematical and statistical techniques that allow one to scan the experimental domain efficiently and rationally, may give more reliable results.

In this investigation, we sought optimization of the functional dependence between a chemical result y (% yield) and the experimental factors x1, x2, ... (pressure, temperature, catalyst amount and substrate amount), i.e. y = f(P, T, c_cat, c_sub). For this purpose the full second-order polynomial models are particularly versatile for use as empirical models in many systems over a limited domain of variables. To obtain a local representation of the RS near the maximal and minimal values, the fitted equation will be of the form

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_{11} x_1^2 + b_{22} x_2^2 + b_{12} x_1 x_2 \qquad (1)$$

The interpretation of the fitted equation is not always straightforward when more than two independent variables are taken into account, since it is difficult to visualize the features of a surface by simply inspecting the equation. Geometrical interpretation of the RS is obtained by canonical analysis,7 which reveals the essential features by shifting the origin and rotating the axes, which transform the estimated polynomial model into a simpler one, i.e. reducing eqn. (1) into a canonical form.

In practice, the original factorial design was enlarged in a central composite design (four variables, five levels and 32 runs). The yield (%) was fixed as the y-dependent variable, and a multivariate linear regression was carried out on the complete quadratic forms of the considered variables (see below). To build each map, we used the calculated regression coefficients, varying only two independent variables and keeping the others at constant values.

The solutions obtained were compared with those obtained by using a learning system. In this case, the dependent variable is used just to define different classes and to classify the runs. The learning system produces a 'tree classification model' based on the considered variables.8 The learning system we used is a member of the family of classification-tree methods also called the TDIDT (top down induction of decision trees) family.9 The basic algorithm for constructing a decision tree is as follows:

If a node of the tree contains examples of a single class (possibly together with a few examples of other classes, the 'pruning factor') then it finishes with a 'leaf' labelled with the class. On the basis of the data-set examples, in each node the program checks the information carried on the possible partitions from the variables and chooses the most informative variable to perform a partition of the tree into subsets (i.e. new nodes) according to the values of the selected variable. The information content in each node is evaluated by the Shannon's entropy definition:10

$$\mathrm{Inf} = -\sum_i p_i \ln p_i$$

where p_i is the probability of class i in the considered node; the probability is evaluated as the relative frequency n_i/n, for the n_i examples belonging to class i out of the n total examples in the node. So, the information content of a node containing only examples of a single class is zero.

The factorial design was achieved by using the statistical package PCS;11 the RS analysis by using the package REGFAC,12 and the classification tree solutions by using the package ASSISTANT PROFESSIONAL.
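As an illustration of how a model of the form of eqn. (1) can be fitted and then reduced to canonical form, the following is a minimal Python sketch using NumPy. It is not the REGFAC package; the nine-point two-variable design, the coefficient values and the function names are hypothetical, chosen only so that the fit can be checked against known values.

```python
import numpy as np

def quadratic_design_matrix(x1, x2):
    """Columns 1, x1, x2, x1^2, x2^2, x1*x2 of the two-variable second-order model."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

def fit_quadratic(x1, x2, y):
    """Least-squares estimates of (b0, b1, b2, b11, b22, b12)."""
    X = quadratic_design_matrix(x1, x2)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coeffs

def canonical_analysis(coeffs):
    """Stationary point of the fitted surface and eigenvalues of its quadratic part."""
    b0, b1, b2, b11, b22, b12 = coeffs
    b = np.array([b1, b2])
    # Symmetric matrix of second-order terms: y = b0 + b.x + x.B.x
    B = np.array([[b11, b12 / 2.0], [b12 / 2.0, b22]])
    x_s = -0.5 * np.linalg.solve(B, b)     # point where the gradient vanishes
    eigenvalues = np.linalg.eigvalsh(B)    # signs classify max / min / saddle
    return x_s, eigenvalues

# Hypothetical central composite design in two coded variables
a = np.sqrt(2.0)
x1 = np.array([-1, 1, -1, 1, -a, a, 0, 0, 0])
x2 = np.array([-1, -1, 1, 1, 0, 0, -a, a, 0])
true_coeffs = np.array([50.0, 10.0, -8.0, 3.0, -2.0, 4.0])  # made-up values
y = quadratic_design_matrix(x1, x2) @ true_coeffs            # noise-free responses

coeffs = fit_quadratic(x1, x2, y)
x_s, eigenvalues = canonical_analysis(coeffs)
print("fitted coefficients:", coeffs)
print("stationary point:", x_s, "eigenvalues:", eigenvalues)
```

Eigenvalues that are all negative indicate a maximum, all positive a minimum, and mixed signs a saddle; with real, noisy data the same functions apply unchanged, and the four-variable case only needs the additional columns in the design matrix.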

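The node information defined by the Shannon entropy, and its use for choosing the partitioning variable, can be sketched in Python as follows. This is a generic TDIDT-style illustration, not the ASSISTANT PROFESSIONAL implementation; the function names, the use of natural logarithms and the example class counts are assumptions.

```python
import math

def node_information(class_counts):
    """Shannon information of a node: -sum(p_i * ln p_i), with p_i = n_i / n."""
    n = sum(class_counts)
    info = 0.0
    for n_i in class_counts:
        if n_i > 0:                      # empty classes contribute nothing
            p_i = n_i / n
            info -= p_i * math.log(p_i)
    return info

def information_after_split(child_counts):
    """Size-weighted mean information of the child nodes produced by a partition."""
    total = sum(sum(counts) for counts in child_counts)
    return sum(sum(counts) / total * node_information(counts)
               for counts in child_counts)

def most_informative(parent_counts, candidate_splits):
    """Return the candidate variable with the largest information gain."""
    parent = node_information(parent_counts)
    gains = {name: parent - information_after_split(children)
             for name, children in candidate_splits.items()}
    return max(gains, key=gains.get), gains

# Hypothetical node holding 8 'high conversion' and 8 'low conversion' runs
parent = [8, 8]
splits = {
    "T": [[7, 1], [1, 7]],       # splitting on T separates the classes well
    "c_sub": [[4, 4], [4, 4]],   # splitting on c_sub separates nothing
}
best, gains = most_informative(parent, splits)
print(best, gains)
```

Changing the base of the logarithm only rescales all information values by a constant, so the variable selected for the partition is unaffected.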

Results

The goal of the first step was the optimization of the reaction by factorial design/RS methods. Among all the possible independent variables, we decided to consider four: temperature (°C) (T); CO pressure (bar) (P), the value set at the beginning of the reaction at room temperature; catalyst amount (mg) (c_cat); substrate amount (mg) (c_sub). To consider more variables would involve a huge number of reactions; indeed, it is not possible to carry out a more economical fractional factorial design11 because of higher-order interactions and it is very likely these variables are closely related, producing synergistic effects. On the other hand, a lot of useful information can be obtained from the study of these four variables; for instance, c_cat and c_sub determine the catalytic ratio. So we decided to keep the reaction conditions constant with Ru3(CO)12 as catalyst, 2-nitrostilbene as substrate, toluene (3 cm3) as solvent and 25 min reaction time.

Two dependent variables were studied:

The percentage conversion (C), calculated as percentage of the substrate reacted during the run and indicating the activity of our catalytic system.

The percentage selectivity (S), calculated as the molar percentage of the produced 2-phenylindole with respect to the reacted nitrostilbene and indicating the selectivity of the catalytic system.

Table 1 lists the values used for the four independent variables and the experimental results for the two dependent variables: runs 1-16 were performed in replicate form (a and b) to calculate system error; runs 17-23 were added to obtain the central composite design, with the necessary replications. To facilitate the subsequent calculations (both for the factorial design and for the RS analysis), the experimental independent variables (T, P, c_cat and c_sub) were not used as such, but were translated into coded variables (T′, P′, c′_cat and c′_sub) by the correspondence reported in Table 2.
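The translation into coded variables is a simple centring and scaling. A minimal Python sketch, using the centre and step values given by the correspondences of Table 2; the dictionary and function names are illustrative only.

```python
# Centre and step of each variable, taken from the correspondences in Table 2
CODING = {
    "T": (210.0, 10.0),      # temperature / deg C
    "P": (50.0, 20.0),       # CO pressure / bar
    "c_cat": (4.2, 1.4),     # catalyst amount / mg
    "c_sub": (60.0, 20.0),   # substrate amount / mg
}

def to_coded(name, value):
    """Actual experimental value -> coded (dimensionless) value."""
    centre, step = CODING[name]
    return (value - centre) / step

def to_actual(name, coded):
    """Coded value -> actual experimental value."""
    centre, step = CODING[name]
    return centre + coded * step

# Run 1 of Table 1: every variable at its low factorial level
run_1 = {"T": 200.0, "P": 30.0, "c_cat": 2.8, "c_sub": 40.0}
coded_run_1 = {name: to_coded(name, value) for name, value in run_1.items()}
print(coded_run_1)

# Axial temperature of the central composite design (run 19)
print(to_coded("T", 227.0))
```

With this coding the two factorial levels of every variable map to -1 and +1, so effects and regression coefficients of different variables become directly comparable.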
The results of the factorial design for y = C are given in Table 3 with the main and mixed effects and the calculated and tabular F values (at the level of significance of 0.01). The reported i-effects are the average variations (positive or negative) of C obtained by switching the i-variable from the low to the high value. It can be seen that all four independent variables have an effect on C greater than experimental error. By contrast, the results of factorial design for y = S shown in Table 4 are much less significant; for this reason, we decided to carry out the central composite design only for y = C.

Table 1 Values used for the four independent variables and experimental results for the two dependent variables

run   T/°C   P/bar   c_cat/mg   c_sub/mg   C (%)   S (%)
1a    200    30      2.8        40         38      70
1b    200    30      2.8        40         41      64
2a    220    30      2.8        40         86      75
2b    220    30      2.8        40         89      74
3a    200    70      2.8        40         10      62
3b    200    70      2.8        40         10      57
4a    220    70      2.8        40         35      76
4b    220    70      2.8        40         30      84
5a    200    30      5.6        40         75      73
5b    200    30      5.6        40         75      75
6a    220    30      5.6        40         99      71
6b    220    30      5.6        40         99      72
7a    200    70      5.6        40         20      68
7b    200    70      5.6        40         15      73
8a    220    70      5.6        40         81      69
8b    220    70      5.6        40         80      76
9a    200    30      2.8        80         39      68
9b    200    30      2.8        80         34      79
10a   220    30      2.8        80         78      71
10b   220    30      2.8        80         72      71
11a   200    70      2.8        80         15      71
11b   200    70      2.8        80         17      57
12a   220    70      2.8        80         28      78
12b   220    70      2.8        80         22      88
13a   200    30      5.6        80         64      75
13b   200    30      5.6        80         58      76
14a   220    30      5.6        80         85      74
14b   220    30      5.6        80         84      75
15a   200    70      5.6        80         9       56
15b   200    70      5.6        80         9       64
16a   220    70      5.6        80         62      77
16b   220    70      5.6        80         67      80
17a   210    50      4.2        40         45      68
17b   210    50      4.2        40         49      51
17c   210    50      4.2        40         50      58
17d   210    50      4.2        40         47      65
18a   193    50      4.2        40         28      40
18b   193    50      4.2        40         38      36
19a   227    50      4.2        40         77      72
19b   227    50      4.2        40         78      67
20a   210    85      4.2        40         14      41
20b   210    85      4.2        40         11      30
21a   210    15      4.2        40         62      56
21b   210    15      4.2        40         52      54
22a   210    50      6.6        40         92      71
22b   210    50      6.6        40         91      65
23a   210    50      1.8        40         46      63
23b   210    50      1.8        40         54      67

Ru3(CO)12 as catalyst; 2-nitrostilbene as substrate; toluene (3 cm3) as solvent; 25 min reaction time.
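The main and interaction effects on the conversion reported in Table 3 can be recomputed from the replicated runs 1-16 of Table 1. The following is a minimal Python sketch, assuming NumPy and the usual contrast definition of an effect (mean response at the high level minus mean response at the low level); it is not the PCS package, and the variable names are illustrative.

```python
import itertools
import numpy as np

# Conversion C (%) for runs 1-16 of Table 1, replicates a and b
C = np.array([
    [38, 41], [86, 89], [10, 10], [35, 30],
    [75, 75], [99, 99], [20, 15], [81, 80],
    [39, 34], [78, 72], [15, 17], [28, 22],
    [64, 58], [85, 84], [9, 9], [62, 67],
], dtype=float)
run_means = C.mean(axis=1)

factors = ["T", "P", "c_cat", "c_sub"]
# Coded levels in the run order of Table 1: T alternates every run,
# P every 2 runs, c_cat every 4 runs and c_sub every 8 runs.
levels = np.array([[1 if (run >> k) & 1 else -1 for k in range(4)]
                   for run in range(16)])

def effect(*names):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    sign = np.prod([levels[:, factors.index(n)] for n in names], axis=0)
    return float(sign @ run_means) / (len(run_means) / 2)

# All 15 main effects and interactions, keyed as in Table 3 (e.g. "T/P/c_cat")
effects = {}
for order in range(1, len(factors) + 1):
    for names in itertools.combinations(factors, order):
        effects["/".join(names)] = effect(*names)

for name, value in effects.items():
    print(f"{name:18s} {value:7.2f}")
```

The values obtained agree with the rounded i-effects of Table 3, including the large T/P/c_cat interaction.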

Table 2 Correspondence between the actual experimental values and the design coded values

experimental variable   coded variable   correspondence
T                       T′               T′ = (T - 210)/10
P                       P′               P′ = (P - 50)/20
c_cat                   c′_cat           c′_cat = (c_cat - 4.2)/1.4
c_sub                   c′_sub           c′_sub = (c_sub - 60)/20

Table 3 Results of the factorial design for y = C

variables and interactions   i-effects I (%)   F_i      level of significance
T                            36                1338.2   0.01
P                            -38               1525.0   0.01
c_cat                        21                472.0    0.01
c_sub                        -9                81.8     0.01
T/P                          2                 4.0
T/c_cat                      6                 40.0     0.01
T/c_sub                      -4                15.5     0.01
P/c_cat                      1                 0.9
P/c_sub                      2                 5.8      0.05
c_cat/c_sub                  -5                22.5     0.01
T/P/c_cat                    16                259.8    0.01
T/P/c_sub                    -1                1.9
T/c_cat/c_sub                2                 3.1
P/c_cat/c_sub                -1                1.4
T/P/c_cat/c_sub              -0.2              0.04

Tabular F value at the level of significance of 0.01 = 8.6. Calculated variance S² of the system = 7.5.

Table 4 Results of the factorial design for y = S

variables and interactions   i-effects I (%)   F_i    level of significance
T                            7.6               21.4   0.01
P                            -1.7              1.1
c_cat                        0.6               0.1
c_sub                        1.4               0.7
T/P                          7.4               20.2   0.01
T/c_cat                      -3.4              4.1
T/c_sub                      0.8               0.2
P/c_cat                      -1.7              1.1
P/c_sub                      -0.4              0.1
c_cat/c_sub                  -1.4              0.7
T/P/c_cat                    -1.4              0.7
T/P/c_sub                    2.7               2.7
T/c_cat/c_sub                3.4               4.4
P/c_cat/c_sub                -1.6              1.0
T/P/c_cat/c_sub              0.8               0.2

Tabular F value at the level of significance of 0.01 = 8.6. Calculated variance S² of the system = 21.8.

Moreover, we must point out that the most striking effects are observed for T′, P′ and c′_cat, and that all three variables are strictly correlated (the interaction effect I(T/P/c_cat) is very high); therefore we chose to expand the experimental domain of only T′, P′ and c′_cat. At this point, we ran the RS calculations twice. In the first calculation only three independent variables were taken into account and in the second calculation all four variables (although with only two values for c_sub) were considered. The differences between the equations obtained in the two cases were minimal; thus we report only the results obtained for four independent variables. Table 5(a) lists the calculated regression coefficients that were used to obtain the response surfaces (Fig. 1-4). The validity of the obtained response surfaces as a model, from an optimization point of view, was checked by applying the learning system, which gave as a result the classification tree in Fig. 5 (Table 6). Another application of the learning system was made on another compound present in traces (