CLIN.CHEM.38/9, 1706-1711 (1992)

Multivariate Statistical Modeling: Alternative Approach to Test Evaluation, Applied to Counting Reticulocytes by Flow Cytometry

W. P. Oosterhuis,1,4 A. H. Zwinderman,2 T. A. Modderman,1 R. B. Dinkelaar,1 and H. J. van der Helm3

1 Drechtsteden Hospital, Dordrecht, The Netherlands.
2 Department of Medical Statistics, University of Leiden.
3 Department of Clinical Epidemiology and Biostatistics, University of Amsterdam.
4 Address for correspondence: Department of Clinical Chemistry, Ziekenhuis Canisius-Wilhelmina, Weg door Jonkerbos 100, 6532 SZ, Nijmegen, The Netherlands.
Received April 3, 1991; accepted February 5, 1992.

1706 CLINICAL CHEMISTRY, Vol. 38, No. 9, 1992

We present a statistical path analysis model for the evaluation of two tests in the absence of a “gold standard” method. This model is applied to the evaluation of flow-cytometric and visual reticulocyte counting by using as the comparison method a combination of three hematological measurements: hemoglobin concentration (HGB), mean cellular volume (MCV), and erythrocyte density width (EDW). We assumed that, in general, a higher reticulocyte count is associated with a lower HGB value and with greater values for MCV and EDW. Applying this assumption and the statistical model, we demonstrated that flow cytometry was superior to visual reticulocyte counting in the low-value range studied. The path analysis model is potentially applicable in other cases where two tests are to be compared and no gold standard is available.

Additional Keyphrases: hemoglobin · erythrocytes

We compared the traditional visual method for reticulocyte counting with a new method based on flow cytometry with fluorescent dyes that selectively bind to intracellular RNA in reticulocytes but not in mature erythrocytes. Because in reticulocyte counting there is no “gold standard” method, we had no clear way to determine which of the two methods was the more accurate, especially in the normal to low range. However, patients with an abnormal reticulocyte count usually have abnormal results for other hematological tests too. We used these other hematological data to try to construct a substitute gold standard. Here we report our evaluation of this approach.

We addressed this problem in a more general approach: how to compare two competing methods to measure a clinical variable when the true value of the clinical variable is unknown. This situation occurs not only in reticulocyte counting but is encountered very often in clinical research, e.g., in other hematological determinations, for which the quality of standards in measurements of erythrocytes and leukocytes is not comparable with the quality of standards in pure chemical tests. When the true value is unknown, usually the old (or standard) method is used as a comparison benchmark, as we did with the visual reticulocyte counting method in our study.

The general situation is illustrated in Figure 1, with special reference to our reticulocyte counting problem. There are two competing methods (V = visual count and F = flow-cytometric count) to measure a clinical variable (the reticulocyte count), whose true value is denoted by G. In reality G is unknown, and V and F are therefore compared with the derived marker (related measurement) P. In our case, P stands for one or more hematological test results related to the reticulocyte count [and in the following examples, P stands for the hemoglobin concentration (HGB)].5 Thus, one reaches conclusions based on the relations of V and F with P, while actually one is interested in the relations of V and F with G. Here we will specify the general conditions under which the approach of using a derived measure to evaluate two competing methods to measure a clinical variable is valid. We will apply the model to the evaluation of both reticulocyte counting methods.

Materials and Methods

Statistical Modeling: Univariate Analysis

Before going into the problem of specifying the conditions under which a derived marker can be used as a reference against which two competing methods can be evaluated, let us first elaborate on the evaluation criteria themselves. The ideal situation is when a method (e.g., method V, which yields results V) yields the same measurements as the true value (G): i.e., G = V for all possible values of V and G. Obviously, this ideal situation will not occur very often in practice. Usually, there is random error involved in V, and sometimes also systematic error. When this is the case, the relation between G and V becomes $V = \alpha + \beta G + \varepsilon$, where $\alpha$ is the systematic error in the general level, $\beta$ is the systematic proportional error of V, and $\varepsilon$ is random error. Discarding the possibility of a nonlinear relation between G and V, whatever the values of $\alpha$, $\beta$, and $\varepsilon$, the necessary condition for V to be a good method to measure G is a high correlation between V and G. Therefore, we will use the correlation as a criterion to distinguish between two competing methods to measure G: the method that correlates best with G is the best method to measure G. There are several flaws in using the correlation as the only criterion (1): a high correlation may not be a sufficient condition, but for the present we let it serve this purpose. Statistically, the correlation represents the common variance between G and V; the greater the common variance, the better G is measured by V.

5 Nonstandard abbreviations: HGB, hemoglobin concentration; MCV, mean cellular volume; and EDW, erythrocyte density width.
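To make the role of the correlation criterion concrete, the following minimal sketch simulates the relation $V = \alpha + \beta G + \varepsilon$ and shows that the correlation of V with G is unaffected by the systematic terms but falls as the random error grows. The distribution of G, the parameter values, and the helper names `pearson` and `simulate_method` are assumptions chosen for illustration only; they are not taken from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_method(g, alpha, beta, noise_sd, rng):
    """Return V = alpha + beta * G + eps, with eps ~ N(0, noise_sd^2)."""
    return [alpha + beta * gi + rng.gauss(0.0, noise_sd) for gi in g]


rng = random.Random(0)
# Hypothetical "true" values G (illustrative distribution, not study data).
g = [rng.gauss(1.5, 0.5) for _ in range(5000)]

# Systematic error only: the correlation with G is not reduced.
v_systematic = simulate_method(g, alpha=0.3, beta=1.2, noise_sd=0.0, rng=rng)
# Same systematic error plus a small or a large random error.
v_small_noise = simulate_method(g, alpha=0.3, beta=1.2, noise_sd=0.2, rng=rng)
v_large_noise = simulate_method(g, alpha=0.3, beta=1.2, noise_sd=1.0, rng=rng)

r_systematic = pearson(g, v_systematic)
r_small = pearson(g, v_small_noise)
r_large = pearson(g, v_large_noise)
print(round(r_systematic, 3), round(r_small, 3), round(r_large, 3))
```

The sketch also illustrates one of the flaws mentioned in the text: a method with a large constant or proportional bias can still correlate perfectly with G, so a high correlation alone does not establish agreement.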

Fig. 1. Diagram showing the correlation coefficients (R) and the interrelations between the data used: V, visual reticulocyte count; F, flow-cytometric reticulocyte count; G, hypothetical (latent) “gold standard” method results; P, routine hematological data

To evaluate V and F, we need the correlations of each method’s results with the true value G: $R_{GV}$ and $R_{GF}$, respectively. However, because G is unknown, we instead have the correlations of V and F with the derived marker P: $R_{PV}$ and $R_{PF}$, respectively. For convenience without loss of generality, we assume that G, P, V, and F have been normalized such that their means are zero and their variances equal unity. In that case the correlation between two variables is equal to their covariance: hence $R_{GV} = \frac{1}{n}\sum_{i=1}^{n} G_i V_i$, where n is the number of measurements. We assume that there exists a linear relation between the unobserved true value G and the derived marker P:

$$P_i = \beta G_i + \varepsilon_i, \quad i = 1, \ldots, n \tag{1}$$

where $\beta$ is the unknown slope of the regression function between G and P, and $\varepsilon$ the usual error term that is independent of G. Let us study the correlation between P and V, $R_{PV}$:

$$R_{PV} = \frac{1}{n}\sum_{i=1}^{n} P_i V_i = \frac{1}{n}\sum_{i=1}^{n} (\beta G_i + \varepsilon_i) V_i = \beta\,\frac{1}{n}\sum_{i=1}^{n} G_i V_i + \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i V_i \tag{2}$$

When P and G are normalized, $\beta$ equals $R_{GP}$; hence:

$$R_{PV} = R_{GP} R_{GV} + S_{\varepsilon V} \tag{3}$$

likewise for F:

$$R_{PF} = R_{GP} R_{GF} + S_{\varepsilon F} \tag{4}$$

where $S_{\varepsilon V}$ and $S_{\varepsilon F}$ are the covariances between the error term $\varepsilon$ in equation 1 and V and F, respectively.

The correlations of V and F with P can now be expressed as functions of the correlations of V, F, and P with G:

$$\frac{R_{PF}}{R_{PV}} = \frac{R_{GP} R_{GF} + S_{\varepsilon F}}{R_{GP} R_{GV} + S_{\varepsilon V}} \tag{5}$$

Depending on the assumptions concerning $R_{GP}$, $S_{\varepsilon V}$, and $S_{\varepsilon F}$, different conclusions can be drawn with respect to the ratio in equation 5. We discuss four cases.

Case A. Assume there is a perfect correlation between G and P. Consequently, the variance of $\varepsilon$ is zero, as are the covariances of $\varepsilon$ with V and F: $S_{\varepsilon V} = S_{\varepsilon F} = 0$. Therefore, $R_{PF}/R_{PV} = R_{GF}/R_{GV}$, and the method that correlates best with P also correlates best with G. In our case: if there is a perfect correlation between the true reticulocyte number and the HGB, then the reticulocyte counting method that correlates best with the HGB would be the superior method.

Case B. Assume there is no correlation between G and P: $R_{GP} = 0$. Hence, the influence of G on the ratio $R_{PF}/R_{PV}$ is unknown, and no conclusion can be drawn with respect to which method is best in measuring G. In our case: if there is no correlation between any hematological variable (or any other measurable clinical variable) and the reticulocyte number, then we cannot compare the two methods by using this statistical method or any other method.

Case C. Assume that the correlation between G and P is imperfect, $0 < R_{GP} < 1$, but that the error term $\varepsilon$ is uncorrelated with V and F: $S_{\varepsilon V} = S_{\varepsilon F} = 0$. Equation 5 then again reduces to $R_{PF}/R_{PV} = R_{GF}/R_{GV}$. [...] error. Therefore, the correlation between the counting methods and HGB can be attributed completely to the correlation between HGB and the true reticulocyte number. In that case the method that correlates best with the HGB is superior.

Case D. Again assume that the correlation between G and P is imperfect, $0 < R_{GP} < 1$, but now allow the covariances $S_{\varepsilon V}$ and $S_{\varepsilon F}$ to differ from zero. If $R_{PF}/R_{PV} > 1$, then F correlates better with P than V does. Under which condition will the ratio $R_{GF}/R_{GV}$ also be > 1? Because $R_{PF} > R_{PV}$, equations 3 and 4 give

$$R_{GF} R_{GP} + S_{\varepsilon F} > R_{GV} R_{GP} + S_{\varepsilon V}$$

$$R_{GF} - R_{GV} > \frac{S_{\varepsilon V} - S_{\varepsilon F}}{R_{GP}}$$

then, if $S_{\varepsilon V} > S_{\varepsilon F}$, $R_{GF} > R_{GV}$. When $R_{PF} > R_{PV}$, then a sufficient condition that $R_{GF} > R_{GV}$ is that $S_{\varepsilon V} > S_{\varepsilon F}$. In words, when F correlates better than V with P, F will also correlate better with G than V does when the covariance between F and the error term of the relationship of P and G is smaller than the covariance of V with the error term of the relationship of P and G.

In our case, the idea is that the visual count and the flow-cytometric count correlate with HGB only because they measure the true reticulocyte number (albeit imperfectly) and because the true reticulocyte number correlates with the HGB. If the visual count or the flow-cytometric count or both have an additional correlation with the HGB that is independent of the true reticulocyte number (i.e., $S_{\varepsilon V} > 0$, $S_{\varepsilon F} > 0$), then we can draw conclusions only if we assume that the method that correlates best with HGB has the lowest additional covariance with HGB that is independent of the true reticulocyte number. This condition may seem irrelevant; usually, the two competing methods will correlate to a certain extent with G and the remainder will be pure random error, and the random error will not be correlated with P. However, there may be instances where this might not be so, e.g., in our case if the number of platelets is included as a derived marker and some platelets happen to be counted as reticulocytes in one of the counting methods. Visual counting methods are subject to observer variation, which would be random error in relation to the true value G. However, when the same observer counts the number of reticulocytes, the error in the relation between P and G might be correlated with V. Hence, the correlation of P with V would be partly due to their mutual relation with G, but there would be additional correlation attributable to their mutual relation with the same observer. If despite this additional correlation, F correlates better with P than V does, it is safe to conclude that F also correlates better than V with G.

Combining Derived Markers
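Before several markers are combined, the single-marker decomposition $R_{PV} = R_{GP} R_{GV} + S_{\varepsilon V}$ (and its counterpart for F) can be checked numerically. The sketch below is a simulation under stated assumptions, not a re-analysis of the study data: the latent value G, the weights used to generate P, V, and F, and the helper names `standardize` and `mean_product` are all hypothetical. V is deliberately given a small share of the error term of P, mimicking the shared-observer situation of case D.

```python
import math
import random


def standardize(x):
    """Scale to mean 0 and variance 1 (dividing by n, as in the text)."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / sd for a in x]


def mean_product(x, y):
    """(1/n) * sum(x_i * y_i); the correlation when both are standardized."""
    return sum(a * b for a, b in zip(x, y)) / len(x)


rng = random.Random(1)
n = 20000
g_raw = [rng.gauss(0, 1) for _ in range(n)]  # latent "true" value G
e_raw = [rng.gauss(0, 1) for _ in range(n)]  # part of P unrelated to G

# Hypothetical data-generating weights, chosen only for illustration.
p_raw = [0.6 * g + 0.8 * e for g, e in zip(g_raw, e_raw)]
f_raw = [0.9 * g + math.sqrt(0.19) * rng.gauss(0, 1) for g in g_raw]
# V picks up a little of P's error term (e.g., a shared observer effect).
v_raw = [0.6 * g + 0.1 * e + math.sqrt(0.63) * rng.gauss(0, 1)
         for g, e in zip(g_raw, e_raw)]

G, P, V, F = (standardize(x) for x in (g_raw, p_raw, v_raw, f_raw))

r_gp, r_gv, r_gf = mean_product(G, P), mean_product(G, V), mean_product(G, F)
r_pv, r_pf = mean_product(P, V), mean_product(P, F)

# Equation 1 with beta = R_GP: eps_i = P_i - beta * G_i.
eps = [p - r_gp * g for p, g in zip(P, G)]
s_ev, s_ef = mean_product(eps, V), mean_product(eps, F)

print(f"R_PV = {r_pv:.3f}, R_GP*R_GV + S_eV = {r_gp * r_gv + s_ev:.3f}")
print(f"R_PF = {r_pf:.3f}, R_GP*R_GF + S_eF = {r_gp * r_gf + s_ef:.3f}")
print(f"S_eV = {s_ev:.3f}, S_eF = {s_ef:.3f}")
print(f"R_GV = {r_gv:.3f}, R_GF = {r_gf:.3f}")
```

With these weights F correlates better with P than V does even though V carries the larger error covariance, so the sufficient condition of case D is met and F also correlates better with G. In real data G is unobserved, so $S_{\varepsilon V}$ and $S_{\varepsilon F}$ cannot be computed this way and must be argued from the measurement procedure.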

Until now we have used only one derived marker (HGB in our example). Often several candidates can potentially be used as a derived marker to evaluate two competing methods. In our analysis, several other hematological variables besides HGB could correlate with the reticulocyte number, e.g., erythrocyte count or mean cellular volume (MCV). There are many possibilities for combining several derived markers, but we will discuss only three methods.

The simplest combination method is to assume additivity and to sum the available derived markers. Let $P_1, \ldots, P_m$ be m derived markers; then $P_c = P_1 + P_2 + \cdots + P_m$ can also serve as a derived marker. All results of the previous section also apply to $P_c$. In combining $P_1, \ldots, P_m$, one should take care that the individual correlations of $P_1, \ldots, P_m$ with V and F have the same sign.

A more sophisticated technique can be used if unidimensionality is also assumed: $P_1, \ldots, P_m$ can then be combined by means of factor analysis (2). Each of the individual derived markers correlates with G; therefore, G can be postulated as the common factor of all derived markers, and this common variance can be identified by means of principal factor analysis. In fact, principal factor analysis computes factors as weighted combinations of the individual derived markers: $P_f = \beta_1 P_1 + \beta_2 P_2 + \cdots + \beta_m P_m$, where $P_f$ is the constructed factor and $\beta_1, \ldots, \beta_m$ are weights that are computed in such a way that the common variance explained by $P_f$ in $P_1, \ldots, P_m$ is maximized. All results of the previous section also apply to $P_f$.

The third method is to assume additivity but not unidimensionality and to combine $P_1, \ldots, P_m$ by means of multiple-regression analysis; the multiple correlations of V and F with $P_1, \ldots, P_m$ are used for the evaluation of V and F. This entails that for V and F separately weighted combinations of $P_1, \ldots, P_m$ are taken: for V, $P_V = \beta_{1V} P_1 + \beta_{2V} P_2 + \cdots + \beta_{mV} P_m$; for F, $P_F = \beta_{1F} P_1 + \beta_{2F} P_2 + \cdots + \beta_{mF} P_m$. In this case the results of the previous section do not quite hold. Because $\beta_{iV}$ does not equal $\beta_{iF}$ (i = 1, ..., m), $P_V$ will in general be unequal to $P_F$. Consequently, V and F are evaluated against (slightly) different standards. For instance, it is possible that V and F have the same multiple correlations with $P_1, \ldots, P_m$, but that V correlates highly with $P_1$ and F correlates highly with $P_2$. However, when such a result is not found in univariate analysis, application of multiple-regression analysis seems valid and convincing.

Comparison of Two Reticulocyte Counting Methods

Patients. Blood was collected by venipuncture into EDTA-containing evacuated blood-collection tubes. Samples from 147 outpatients were chosen without conscious bias from all samples offered to our hematological laboratory. Patients with thrombocyte counts [...] $450 \times 10^{12}$/L or leukocyte counts [...] $15 \times 10^{9}$/L were excluded on the basis of suspected abnormal erythrocyte production rate.

Visual reticulocyte counting. Reticulocytes were counted visually by microscopically examining smears stained with Brilliant Cresyl Blue as described elsewhere (3). Reticulocyte percentages were obtained by examining 1000 erythrocytes in accordance with the proposed guideline of the National Committee for Clinical Laboratory Standards (4).

Flow cytometry. We added 5 µL of whole blood to 1 mL of Retic-Count® reagent (Becton Dickinson Immunocytometry Systems, Inc., Mountain View, CA) and incubated the samples at room temperature for 1 h in the dark. We used an Ortho Spectrum III flow cytometer with the Consort 30 program (Becton Dickinson) for sample analysis. The fluorescence histogram of the stained sample was examined and the middle marker was set to exclude cell autofluorescence. The two end markers were set at 0 and 250 channel numbers. The latter excluded strong fluorescence of any small mononuclear cells falling within the gate.

Derived markers. Selection of the set of derived markers is critical and should be based on a hypothesized correlation of the selected markers and the quantity to be measured (G). In the evaluation of the reticulocyte counting methods, we selected HGB, erythrocyte density width (EDW), and MCV. The expected correlation of the reticulocyte count with these variables was based on the physiological mechanism of stimulation of erythropoiesis when HGB is depressed; a high number of reticulocytes and young erythrocytes will increase the MCV and EDW because these cells have a high average cell volume. There are no compelling reasons to assume that either the visual or the flow-cytometric enumeration method is related to HGB, MCV, or EDW other than through the correlation of the (true) reticulocyte number with these derived markers. Hence, we assume that this is a case C situation (see above).

Statistical analysis. Pearson correlation coefficients, principal factor analysis, and multiple-regression analysis were performed with the SPSS statistical program. Differences between correlation coefficients were tested for significance with Hotelling’s test for paired correlation coefficients (5).
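The analysis steps named here can be sketched in a few lines. The example below sums standardized markers into one combined marker (the simplest of the three combination methods) and then compares the two paired correlations with Hotelling's t statistic. It rests on several assumptions: the text does not print the test formula, so the standard form of Hotelling's t for two correlations that share one variable, with n − 3 degrees of freedom, is assumed; the simulated data, the generating weights, and the function names are hypothetical; and HGB enters with a negative sign only because the text assumes that a higher reticulocyte count accompanies a lower HGB.

```python
import math
import random


def standardize(x):
    """Scale a sequence to mean 0 and variance 1 (dividing by n)."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / sd for a in x]


def pearson(x, y):
    """Pearson correlation coefficient."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def summed_marker(markers, signs):
    """Sum standardized markers after aligning their signs (+1 or -1)."""
    z = [standardize(m) for m in markers]
    return [sum(s * col[i] for s, col in zip(signs, z))
            for i in range(len(markers[0]))]


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Assumed form of Hotelling's t for H0: rho_xy == rho_xz, where y and z
    are measured on the same n subjects. Returns (t, degrees of freedom)."""
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))
    return t, n - 3


# Simulated data; every weight below is a hypothetical illustration value.
rng = random.Random(2)
n = 2000
g = [rng.gauss(0, 1) for _ in range(n)]               # latent true count
hgb = [-0.5 * a + rng.gauss(0, 0.87) for a in g]      # falls as G rises
mcv = [0.4 * a + rng.gauss(0, 0.92) for a in g]
edw = [0.4 * a + rng.gauss(0, 0.92) for a in g]
flow = [0.9 * a + rng.gauss(0, 0.44) for a in g]      # method F
visual = [0.6 * a + rng.gauss(0, 0.80) for a in g]    # method V

p_c = summed_marker([hgb, mcv, edw], signs=[-1, 1, 1])
r_pf, r_pv = pearson(p_c, flow), pearson(p_c, visual)
r_vf = pearson(visual, flow)
t_stat, df = hotelling_t(r_pf, r_pv, r_vf, n)
print(f"R_PF = {r_pf:.3f}, R_PV = {r_pv:.3f}, t = {t_stat:.2f}, df = {df}")

# Same formula with n = 103 and r(V, F) = 0.60 as reported in the Results;
# the correlations 0.50 and 0.30 with the marker are hypothetical.
t_check, df_check = hotelling_t(0.50, 0.30, 0.60, 103)
```

The resulting t would be referred to a t distribution with the returned degrees of freedom; the p-value computation is left out to keep the sketch free of external libraries.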

Results

Reticulocyte count. We evaluated 147 patients’ samples. The actual sample size was n = 103 after exclusion of 44 samples due to an abnormal platelet or leukocyte count. In the samples studied, no high reticulocyte counts were found, the counts ranging from 0.2% to 3.7% (visual count) and from 0.2% to 3.0% (flow cytometry). The mean number of reticulocytes was 1.5% (SD 0.9%) according to the visual counting method and 1.3% (SD 0.6%) according to the flow-cytometric method. Figure 2 shows a plot of the visual reticulocyte count vs the flow-cytometric count. The deviations between the flow-cytometric and the visual counts ranged from 0 to 2.1% (mean 0.6%). The correlation is not very high (r = 0.60, P