Ultramicroscopy 90 (2002) 273–289

Optimal experimental design of STEM measurement of atom column positions

S. Van Aert a,*, A.J. den Dekker a, D. Van Dyck b, A. van den Bos a

a Department of Applied Physics, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands
b Department of Physics, University of Antwerp (RUCA), Groenenborgerlaan 171, 2020 Antwerp, Belgium

Received 21 November 2000; received in revised form 6 September 2001

Abstract

A quantitative measure is proposed to evaluate and optimize the design of a high-resolution scanning transmission electron microscopy (STEM) experiment. The proposed measure is related to the measurement of atom column positions. Specifically, it is based on the statistical precision with which the positions of atom columns can be estimated. The optimal design, that is, the combination of tunable microscope parameters for which the precision is highest, is derived for different types of atom columns. The proposed measure is also used to find out if an annular detector is preferable to an axial one and if a $C_s$-corrector pays off in quantitative STEM experiments. In addition, the optimal settings of the STEM are compared to the Scherzer conditions for incoherent imaging and their dependence on the type of object is investigated. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 07.05.Fb; 61.16.Bg; 02.50.-r; 43.50.+y

Keywords: Scanning transmission electron microscopy (STEM); Electron microscope design and characterization; Data processing/image processing

*Corresponding author. Tel.: +31-15-2781823; fax: +31-15-2784263. E-mail address: [email protected] (S. Van Aert).

1. Introduction

For many years, it has been standard practice to evaluate the performance of STEM imaging modes qualitatively, that is, in terms of direct visual interpretability. The performance criteria used are resolution and contrast. For example, when axial bright-field coherent STEM is compared to annular dark-field incoherent STEM, the latter imaging mode is preferred. The basic ideas underlying this preference are the higher resolution for incoherent imaging than for coherent imaging [1] and the higher contrast in dark-field images than in bright-field images [2]. In annular dark-field incoherent STEM, visual interpretation of the images is optimal if the Scherzer conditions [3] for incoherent imaging are adopted [4]. As demonstrated in [5], the resolution can be improved further if the main lobe of the probe is narrowed. Visual interpretability is then reduced, however, as a result of a considerable rise of the sidelobes of the probe.

Two important aspects are absent in these widely used performance criteria. First, the

0304-3991/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0304-3991(01)00152-8


S. Van Aert et al. / Ultramicroscopy 90 (2002) 273–289

electron–object interaction is not taken into account. Second, the dose efficiency, which is defined as the fraction of the primary electron dose that is detected, is left out of consideration. Higher resolution and contrast are often obtained at the expense of dose efficiency, which leads to a decrease in signal-to-noise ratio (SNR). For example, the incoherence in annular dark-field incoherent STEM is only attained by using an annular dark-field detector with a geometry much larger than the objective aperture [6]. Its corresponding higher resolution, by adopting the Scherzer conditions for incoherent imaging, is thus obtained at the expense of dose efficiency. Another example is the following. It is well known that in bright-field images, decreasing the size of an axial detector leads to higher contrast, but also to a deterioration of the SNR. To compensate for such a decrease in SNR, longer recording times are necessary, which in turn increase the disturbing influence of specimen drift.

In our opinion, the ultimate goal of STEM is not providing optimal visual interpretability, but quantitative structure determination instead. The reason for this is that the positions of atom columns should be known within sub-ångström precision in order to understand the physical and chemical properties of materials such as superconductors, nanoparticles, quantum transistors, etc. [7]. The images are then to be considered as data planes from which the structural information has to be measured quantitatively. This can be done as follows: one has a model for the object and for the imaging process, including electron–object interaction, microscope transfer and image detection. This model describes the expectations of the intensity observations [8] and it contains parameters that have to be measured experimentally.
The quantitative structure determination is done by ﬁtting the model to the experimental data by use of criteria of goodness of ﬁt, such as, least squares, least absolute values or maximum likelihood. Thus, quantitative structure determination becomes a statistical parameter estimation problem. Ultimately, the structural parameters, such as the positions of the atom columns, have to be measured as precisely as possible. However, this precision will always be limited by the presence of

noise. Given the parametric model and knowledge about the statistics of the observations, use of the concept of Fisher information [9] allows us to derive an expression for the highest attainable precision with which the positions of the atom columns can be estimated. This expression, which is called the Cramér–Rao Lower Bound (CRLB), is a function of the object parameters, microscope parameters, and dose efficiency. Therefore, it may be used as an alternative performance measure in the optimization of the design of a STEM experiment for a given object. The optimal design is the set of microscope parameters resulting in the highest overall attainable precision. In practice, the design is optimized by minimizing a scalar measure of the CRLB. An overview of the concept of the CRLB and its use can be found in [9]. In [10], an example can be found in which the CRLB is computed in order to optimize the design of a quantitative HREM experiment.

The microscope parameters considered are related to the probe, the image recording and the detector configuration. The probe is described by the objective aperture radius, the defocus, the spherical aberration constant, the electron wavelength, the width of the source image, and the reduced brightness of the source. The parameters describing the image recording are the field of view (FOV), that is, the area scanned on the specimen, the probe sampling distances and the recording time. The detector configuration is described by either the inner radius of an annular detector or the outer radius of an axial detector.

The paper is organized as follows. The parametric model for the expectations of the intensity observations is described in Section 2. In Section 3, the joint probability density function of the observations is introduced. From this, the CRLB will be derived in Section 4. In Section 5, the CRLB is used to evaluate and optimize the design of a STEM experiment.
It is assumed that the experimental design is limited by specimen drift.

2. Parametric model for the intensity observations

A parametric model, describing the expectations of the intensity observations recorded by the STEM, is needed in order to derive an expression for the CRLB. In this section, such a model will be derived by use of the simplified channeling theory [11–13] and of the fact that the scattering is dominated by the tightly bound 1s-type state of the atom columns. The source image will also be taken into account. The model contains both microscope parameters, such as objective aperture radius, defocus, spherical aberration constant, detector radius, width of the source image, and object parameters, such as the positions of the atom columns, the energy of the 1s-states and the object thickness.

2.1. The exit wave

According to the simplified channeling theory, described in [11–13], an expression can be derived for the exit wave $\psi(\mathbf{r}, z)$ of an object consisting of $N$ atom columns. This is a complex wave function in the plane at the exit face of the object, resulting from the interaction of the probe with the object:

$$\psi(\mathbf{r}, z) = p(\mathbf{r} - \mathbf{r}_0) + \sum_{n=1}^{N} c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n)\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right], \tag{1}$$

where $\mathbf{r} = (x, y)^T$ is a two-dimensional vector in this plane. The symbol $T$ denotes taking the transpose. The specimen thickness is $z$, $E_0$ is the incident electron energy, $k$ is the inverse electron wavelength, and $\mathbf{r}_0 = (x_k, y_l)^T$ is the position of the STEM probe, described by the function $p(\mathbf{r} - \mathbf{r}_0)$. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the 1s-state of the column at position $\boldsymbol{\beta}_n = (\beta_{xn}, \beta_{yn})^T$. In accordance with the 1s-state of an atom, it is the lowest energy bound state with energy $E_{1s,n}$. In Eq. (1), it is assumed that the dynamical motion of the electron in a column can be expressed primarily in terms of this tightly bound 1s-state. The excitation coefficients $c_n$ can be found from

$$c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = \int \phi^{*}_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, p(\mathbf{r} - \mathbf{r}_0)\, d\mathbf{r}, \tag{2}$$

where '*' denotes taking the complex conjugate. Since the 1s-state is a real function and since the probe is assumed to be radially symmetric, so that $p(-\mathbf{r}) = p(\mathbf{r})$, Eq. (2) can be written as a convolution product:

$$c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = p(\mathbf{r}_0 - \boldsymbol{\beta}_n) * \phi_{1s,n}(\mathbf{r}_0 - \boldsymbol{\beta}_n). \tag{3}$$

If the convolution theorem is used, Eq. (3) may be written as

$$c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}_0 - \boldsymbol{\beta}_n} P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) = \int P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) \exp\left( i 2\pi \mathbf{g} \cdot (\mathbf{r}_0 - \boldsymbol{\beta}_n) \right) d\mathbf{g}, \tag{4}$$

where $P(\mathbf{g})$ is the Fourier transform of the probe $p(\mathbf{r})$, $\Phi_{1s,n}(\mathbf{g})$ is the Fourier transform of the 1s-state $\phi_{1s,n}(\mathbf{r})$, $\mathbf{g}$ is a two-dimensional spatial frequency vector in reciprocal space, and '$\cdot$' denotes the scalar product. In this paper, the two-dimensional Fourier transform $F(\mathbf{g})$ of an arbitrary function $f(\mathbf{r})$ is defined as

$$F(\mathbf{g}) = \mathcal{F}_{\mathbf{r} \to \mathbf{g}} f(\mathbf{r}) = \int f(\mathbf{r}) \exp(-i 2\pi \mathbf{g} \cdot \mathbf{r})\, d\mathbf{r}. \tag{5}$$

Consequently, the inverse Fourier transform is defined as

$$f(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} F(\mathbf{g}) = \int F(\mathbf{g}) \exp(i 2\pi \mathbf{g} \cdot \mathbf{r})\, d\mathbf{g}. \tag{6}$$

For radially symmetric 1s and probe functions, Eq. (4) can be written as

$$c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|) = 2\pi \int_0^{\infty} P(g)\, \Phi_{1s,n}(g)\, J_0(2\pi g |\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, g\, dg. \tag{7}$$

This is an elementary result of the theory of Bessel functions, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind and

$$|\mathbf{r}_0 - \boldsymbol{\beta}_n| = \sqrt{(x_k - \beta_{xn})^2 + (y_l - \beta_{yn})^2} \tag{8}$$

is the distance from the probe to the atom column position.

The illuminating STEM probe $p(\mathbf{r})$ is the inverse Fourier transform of the transfer function of the objective lens $P(\mathbf{g})$:

$$p(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} P(\mathbf{g}). \tag{9}$$

The transfer function $P(\mathbf{g})$ is radially symmetric and given by

$$P(\mathbf{g}) = P(g) = A(g) \exp(-i\chi(g)), \tag{10}$$

where $g = |\mathbf{g}|$ is the spatial frequency. The circular aperture function $A(g)$ is defined as

$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap}, \\ 0 & \text{if } g > g_{ap}, \end{cases} \tag{11}$$

with $g_{ap}$ being the aperture radius. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and given by

$$\chi(g) = \pi \varepsilon \lambda g^2 + \tfrac{1}{2} \pi C_s \lambda^3 g^4, \tag{12}$$

with $C_s$ being the spherical aberration constant, $\varepsilon$ the defocus, and $\lambda$ the electron wavelength.

According to [14], the 1s-state function is well approximated by a two-dimensional quadratically normalized (radially symmetric) Gaussian function:

$$\phi_{1s}(r) = \frac{1}{a\sqrt{2\pi}} \exp\left( -\frac{r^2}{4a^2} \right), \tag{13}$$

where $r = |\mathbf{r}|$ and $a$ represents the column-dependent width. This width is directly related to the energy of the 1s-state. The Fourier transform $\Phi_{1s}(g)$ of Eq. (13), that is, the Fourier spectrum of the object, is given by

$$\Phi_{1s}(g) = \frac{1}{c\sqrt{2\pi}} \exp\left( -\frac{g^2}{4c^2} \right) \tag{14}$$

with

$$c = \frac{1}{4\pi a}, \tag{15}$$

the width of the Fourier transformed 1s-state.
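As a rough numerical illustration of Eqs. (7) and (10)–(15), the excitation coefficient can be evaluated by direct quadrature. The sketch below is not the authors' implementation: the function names, the midpoint integration rule, the integral representation used for $J_0$, and the sign convention $\exp(-i\chi)$ in the transfer function are assumptions made for this example. All lengths are taken in ångström.

```python
import cmath
import math


def chi(g, defocus, cs, wavelength):
    """Aberration phase shift of Eq. (12); all lengths in angstrom."""
    return (math.pi * defocus * wavelength * g**2
            + 0.5 * math.pi * cs * wavelength**3 * g**4)


def transfer(g, g_ap, defocus, cs, wavelength):
    """Transfer function of Eqs. (10)-(11); the exp(-i chi) sign is assumed."""
    if g > g_ap:
        return 0j
    return cmath.exp(-1j * chi(g, defocus, cs, wavelength))


def phi_1s_hat(g, a):
    """Fourier transform of the Gaussian 1s-state, Eqs. (14)-(15)."""
    c = 1.0 / (4.0 * math.pi * a)
    return math.exp(-g**2 / (4.0 * c**2)) / (c * math.sqrt(2.0 * math.pi))


def bessel_j0(x, steps=200):
    """J0(x) = (1/pi) * integral over [0, pi] of cos(x sin t), midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi


def excitation(rho, a, g_ap, defocus, cs, wavelength, steps=400):
    """Excitation coefficient c_n(rho) of Eq. (7), midpoint rule on [0, g_ap]."""
    h = g_ap / steps
    total = 0j
    for i in range(steps):
        g = (i + 0.5) * h
        total += (transfer(g, g_ap, defocus, cs, wavelength)
                  * phi_1s_hat(g, a)
                  * bessel_j0(2.0 * math.pi * g * rho) * g)
    return 2.0 * math.pi * total * h


# Illustrative call: a = 0.34 A, wide aperture, no aberrations,
# wavelength of roughly 0.0197 A (300 keV electrons).
c_center = excitation(0.0, 0.34, 2.0, 0.0, 0.0, 0.0197)
```

With a wide aperture and no aberrations the probe approaches a delta function, so the value at the column centre should approach $\phi_{1s}(0) = 1/(a\sqrt{2\pi})$, which gives a simple sanity check of the quadrature.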

2.2. The image intensity distribution

From the exit wave, which is given in Eq. (1), the total detected intensity in the Fourier detector plane of a STEM can be derived [15] as

$$I_{ps}(\mathbf{r}_0) = \int |\Psi(\mathbf{g}, z)|^2 D(\mathbf{g})\, d\mathbf{g}, \tag{16}$$

where $\Psi(\mathbf{g}, z)$ is the two-dimensional Fourier transform of the exit wave $\psi(\mathbf{r}, z)$, and $D(\mathbf{g})$ is the detector function, which is equal to one in the detected field and equal to zero elsewhere. The two-dimensional Fourier transform of the exit wave can be derived by combining Eqs. (1) and (5):

$$\Psi(\mathbf{g}, z) = P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_0) + \sum_{n=1}^{N} c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n)\, \Phi_{1s,n}(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right]. \tag{17}$$

Thus far, it has been assumed that the source can be modeled as a point. Therefore, the subscript 'ps' in Eq. (16) refers to point source. Elaborating on the ideas given in [16], it follows that the finite size of the source image can be taken into account by a two-dimensional convolution of the intensity distribution $I_{ps}(\mathbf{r}_0)$ with the intensity distribution of the source image $S(\mathbf{r})$:

$$I(k, l) = I(\mathbf{r}_0) = I_{ps}(\mathbf{r}_0) * S(\mathbf{r}_0). \tag{18}$$

The effect of the source image is thus an additional blurring. The indices $(k, l)$ correspond to the probe at position $\mathbf{r}_0 = (x_k, y_l)^T$. A realistic form for the intensity distribution of the source image is Gaussian [16]. The function $S(\mathbf{r})$ is thus a two-dimensional normalized Gaussian distribution:

$$S(\mathbf{r}) = S(r) = \frac{1}{2\pi s^2} \exp\left( -\frac{r^2}{2 s^2} \right) \tag{19}$$

with $s$ representing the width corresponding to the radius containing 39% of the total intensity of $S$.

Up to now, no assumptions have been made about the shape or size of the detector. From now on, however, the detector is assumed to be radially symmetric. Mathematically this means that $D(\mathbf{g}) = D(g)$. Eq. (18) can be split up into three terms:

$$I(k, l) = I_0 + I_1(k, l) + I_2(k, l). \tag{20}$$
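The 39% figure quoted for the source width in Eq. (19) can be checked by integrating the normalized Gaussian over a disc of radius $s$. This is a short verification added for the reader, not part of the original derivation:

```latex
\int_0^{s} \frac{1}{2\pi s^2} \exp\!\left(-\frac{r^2}{2 s^2}\right) 2\pi r \, dr
  = \left[ -\exp\!\left(-\frac{r^2}{2 s^2}\right) \right]_0^{s}
  = 1 - e^{-1/2} \approx 0.393 .
```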

The zeroth order term $I_0$ corresponds to a non-interacting probe, the first order term $I_1(k, l)$ to the interference between the probe and the 1s-state and the second order term $I_2(k, l)$ to the interference between different 1s-states. The zeroth order term $I_0$ is given by

$$I_0 = \left[ \int |P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_0)|^2 D(\mathbf{g})\, d\mathbf{g} \right] * S(\mathbf{r}_0). \tag{21}$$

It describes a constant background intensity, resulting from the non-interacting electrons collected by the detector. This equation can be written as

$$I_0 = 2\pi \int A^2(g)\, D(g)\, g\, dg \tag{22}$$

by substitution of Eq. (10) and using the fact that $D(\mathbf{g})$ is radially symmetric. Due to the definition of the aperture function, given in Eq. (11), the following equality may be used:

$$A^2(g) = A(g). \tag{23}$$

Therefore, Eq. (22) becomes

$$I_0 = 2\pi \int A(g)\, D(g)\, g\, dg. \tag{24}$$

The first order term $I_1(k, l)$ corresponds to the interference between the incident probe $p(\mathbf{r} - \mathbf{r}_0)$ and the 1s-state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$:

$$I_1(k, l) = \left[ \sum_{n=1}^{N} 2\, \mathrm{Re} \left\{ c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right] 2\pi \int P^{*}(g)\, \Phi_{1s,n}(g)\, J_0(2\pi g |\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, D(g)\, g\, dg \right\} \right] * S(\mathbf{r}_0). \tag{25}$$

This is a linear term in the sense that the contributions of different atom columns are added. The second order term $I_2(k, l)$ describes the mutual interference between different 1s-states $\phi_{1s,n}$ and $\phi_{1s,m}$:

$$I_2(k, l) = \left[ \sum_{n=1}^{N} \sum_{m=1}^{N} c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, c^{*}_m(|\mathbf{r}_0 - \boldsymbol{\beta}_m|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right] \left[ \exp\left( +i\pi \frac{E_{1s,m}}{E_0} k z \right) - 1 \right] 2\pi \int \Phi_{1s,n}(g)\, \Phi_{1s,m}(g)\, J_0(2\pi g\, d_{A_n A_m})\, g\, D(g)\, dg \right] * S(\mathbf{r}_0), \tag{26}$$

where

$$d_{A_n A_m} = |\boldsymbol{\beta}_n - \boldsymbol{\beta}_m| = \sqrt{(\beta_{xn} - \beta_{xm})^2 + (\beta_{yn} - \beta_{ym})^2} \tag{27}$$

is the distance between the atom columns at positions $\boldsymbol{\beta}_n$ and $\boldsymbol{\beta}_m$. It is only the last term $I_2(k, l)$ of Eq. (20) that remains if the detector is annular and has an inner radius greater than or equal to the aperture radius. Then,

$$P(g)\, D(g) = 0 \tag{28}$$

or, equivalently,

$$A(g)\, D(g) = 0. \tag{29}$$

Therefore, Eq. (26) describes the model for annular dark-field STEM.

2.3. The image recording

In a STEM, the illuminating probe scans the specimen in a raster fashion. The image is thus recorded as a function of the probe position $\mathbf{r}_0 = (x_k, y_l)^T$. Without loss of generality, the image magnification is ignored. Therefore, the probe position $\mathbf{r}_0 = (x_k, y_l)^T$ directly corresponds to an image pixel at the same position. The recording device is characterized as consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the probe sampling distances in the $x$ and $y$ directions, respectively. Pixel $(k, l)$ corresponds to position $(x_k, y_l)^T = (x_1 + (k - 1)\Delta x,\; y_1 + (l - 1)\Delta y)^T$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1, y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is chosen to be centered about $(0, 0)^T$. Assuming a recording time $\tau$ for one pixel and a probe current $J$, one calculates the number of electrons per probe position:

$$\frac{J\tau}{e} \tag{30}$$

with $e = 1.6 \times 10^{-19}$ C the electron charge. The recording time $\tau$ for one pixel is the ratio of the frame time $t$, that is, the recording time for the whole FOV, to the total number of pixels $KL$:

$$\tau = \frac{t}{KL}. \tag{31}$$

The primary electron dose $D$ is thus given by

$$D = KL\, \frac{J\tau}{e}. \tag{32}$$

The probe current $J$ is given by [17]

$$J = \frac{B_r E_0 \pi^2 d_{I50}^2 \alpha^2}{4e} \tag{33}$$

with $B_r$ being the reduced brightness of the source, $E_0$ the beam energy, $d_{I50}$ the diameter of the source image containing 50% of the current and $\alpha$, which is equal to $g_{ap}\lambda$, the aperture angle. From Eq. (19), it follows that

$$d_{I50} = 2\sqrt{-2 \ln 0.5}\; s. \tag{34}$$

As a consequence of the detector shape and size in STEM, only the electrons within a selected part of the diffraction pattern are used to produce the image. Mathematically, this is expressed in Eq. (16). The expected number of detected electrons per pixel position $(k, l)$ equals [18]

$$\frac{J\tau}{e} f_{kl}, \tag{35}$$

where $f_{kl}$ denotes the fraction of electrons recorded by the detector ($f_{kl} < 1$). This fraction can be expressed as

$$f_{kl} = \frac{I(k, l)}{I_{D=1}}, \tag{36}$$

with $I_{D=1}$ being the constant intensity obtained if the detector function $D(\mathbf{g})$ is uniform:

$$I_{D=1} = 2\pi \int A(g)\, g\, dg, \tag{37}$$

following from straightforward calculations, using Eqs. (20)–(26). The total number of detected electrons to form the image is now equal to

$$D_{det} = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}\, \frac{J\tau}{e}. \tag{38}$$

Then, the dose efficiency DE, which is defined as the fraction of the primary electron dose that is detected, becomes

$$DE = \frac{D_{det}}{D} = \frac{\sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}}{KL}. \tag{39}$$

The result of Eq. (35) defines the expectation values of the intensity observations recorded by the detector, which is needed to derive the joint probability density function in the next section.
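The dose bookkeeping of Eqs. (30), (33), (34) and (39) reduces to a few lines of arithmetic. The sketch below uses hypothetical function names and illustrative input values that are not taken from the paper; the beam energy enters as the accelerating voltage $E_0/e$, and SI units are assumed throughout.

```python
import math

E_CHARGE = 1.6e-19  # electron charge in coulomb, as quoted in the text


def source_diameter_50(s):
    """Eq. (34): diameter containing 50% of a 2-D Gaussian of width s."""
    return 2.0 * math.sqrt(-2.0 * math.log(0.5)) * s


def probe_current(b_r, voltage, d_i50, alpha):
    """Eq. (33), with E0/e written as the accelerating voltage in volt."""
    return b_r * voltage * math.pi**2 * d_i50**2 * alpha**2 / 4.0


def electrons_per_pixel(current, dwell_time):
    """Eq. (30): number of primary electrons per probe position."""
    return current * dwell_time / E_CHARGE


def dose_efficiency(fractions):
    """Eq. (39): mean detected fraction f_kl over all K*L pixels."""
    flat = [f for row in fractions for f in row]
    return sum(flat) / len(flat)


# Hypothetical illustration values, chosen only to exercise the formulas.
b_r = 2.0e7              # reduced brightness in A m^-2 sr^-1 V^-1
voltage = 300e3          # accelerating voltage in volt
wavelength = 1.97e-12    # electron wavelength at 300 kV in metre
alpha = 10e-3            # aperture angle in rad (assumed)
d_i50 = 0.54 * wavelength / alpha   # an assumed source-image diameter in metre
current = probe_current(b_r, voltage, d_i50, alpha)
```

Because $d_{I50}$ is chosen here inversely proportional to $\alpha$, the product $d_{I50}\alpha$ and hence the current do not depend on the aperture angle in this particular illustration.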

3. The joint probability density function of the observations

In any STEM experiment, the observations will "contain errors". These errors have to be specified, which is the subject of this section. Generally, sets of observations made under the same conditions nevertheless differ from experiment to experiment. The usual way to describe this behavior is to model the observations as stochastic variables. The reason is that there is no viable alternative and that it has been found to work. Stochastic variables are defined by probability density functions [19]. In a STEM experiment the observations are counting results, for which the probability density function can be modeled as a Poisson distribution. In what follows, the underlined characters represent stochastic variables.

Consider a set of $KL$ stochastic observations $\{\underline{w}_{kl},\; k = 1, \ldots, K;\; l = 1, \ldots, L\}$, where the index $kl$ corresponds to the pixel at the position $(x_k, y_l)^T$. In what follows, the vector $\underline{\mathbf{w}}$ will represent the $KL \times 1$ column vector of these observations. The observations are assumed to be statistically independent and have a Poisson distribution. The probability that the observation $\underline{w}_{kl}$ is equal to $\omega_{kl}$ is given by [20]

$$\frac{\lambda_{kl}^{\omega_{kl}}}{\omega_{kl}!} \exp(-\lambda_{kl}), \tag{40}$$

where the parameter $\lambda_{kl}$ is equal to the expectation $E[\underline{w}_{kl}]$. The expectation values $E[\underline{w}_{kl}]$ are given in Eq. (35):

$$E[\underline{w}_{kl}] = \frac{J\tau}{e} f_{kl}. \tag{41}$$

The probability $f_w(\boldsymbol{\omega}; \boldsymbol{\beta})$ that a set of observations is equal to $\{\omega_{kl},\; k = 1, \ldots, K;\; l = 1, \ldots, L\}$ is the product of all probabilities described by Eq. (40) since the observations are assumed to be statistically independent:

$$f_w(\boldsymbol{\omega}; \boldsymbol{\beta}) = \prod_{k=1}^{K} \prod_{l=1}^{L} \frac{\lambda_{kl}^{\omega_{kl}}}{\omega_{kl}!} \exp(-\lambda_{kl}), \tag{42}$$

where the elements of $\boldsymbol{\omega}$ correspond with those of $\underline{\mathbf{w}}$ and the column vector $\boldsymbol{\beta}$ contains the unknown parameters. In our case $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T$ contains the coordinates in the x- and y-direction of the $N$ atom column positions. The function described in Eq. (42) is called the joint probability density function of the observations. The joint probability density function, which has been derived here, will now be used for the computation of the CRLB.
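The logarithm of the joint probability density function of Eq. (42) is what a maximum likelihood fit would maximize. A minimal sketch follows, assuming the expectations of Eq. (41) are supplied as a flat list; the function names are hypothetical and `math.lgamma` is used for $\ln \omega_{kl}!$.

```python
import math


def expected_counts(fractions, electrons_per_pixel):
    """Eq. (41): lambda_kl = (J tau / e) * f_kl for each pixel."""
    return [electrons_per_pixel * f for f in fractions]


def poisson_log_likelihood(counts, expectations):
    """Logarithm of Eq. (42) for independent Poisson-distributed pixels."""
    total = 0.0
    for omega, lam in zip(counts, expectations):
        # ln(lambda^omega / omega! * exp(-lambda))
        total += omega * math.log(lam) - lam - math.lgamma(omega + 1)
    return total


# Single-pixel check: P(omega = 2 | lambda = 3) = 3^2 exp(-3) / 2!
p_single = math.exp(poisson_log_likelihood([2], [3.0]))
```

Working with the logarithm turns the product over pixels into a sum, which avoids numerical underflow for images with many pixels.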

4. The Cramér–Rao Lower Bound

In this section, the CRLB is discussed. The CRLB is a lower bound on the variance of any unbiased estimator. What does this mean? Suppose that an experimentalist wants to measure the position parameters $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T$ of a set of $N$ atom columns quantitatively. For this purpose, one can use many estimators (estimation methods) such as least squares, least absolute values or maximum likelihood estimators. The precision of an estimator is represented by the variance or by its square root, the standard deviation. Generally, different estimators will have different precisions. It can be shown, however, that the variance of unbiased estimators will never be lower than the CRLB. Fortunately, there exists a class of estimators (including the maximum likelihood estimator) that achieves this bound at least asymptotically, that is, for the number of observations going to infinity. For the details of this lower bound we refer to [9].

4.1. Fisher information

The CRLB follows from the concept of Fisher information. The Fisher information matrix $F_\beta$ for estimation of the position parameters of a set of $N$ atom columns $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T = (\beta_1, \ldots, \beta_{2N})^T$ is defined as

$$F_\beta = -E\left[ \frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^T} \right], \tag{43}$$

where $f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})$ is the joint probability density function of the observations described in Eq. (42) and

$$\frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^T} \tag{44}$$

is the $2N \times 2N$ Hessian matrix of $\ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})$ defined by its $(p, q)$th element

$$\frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \beta_p\, \partial \beta_q}, \tag{45}$$

where $\beta_p$ and $\beta_q$ correspond to the $p$th and $q$th element of the vector $\boldsymbol{\beta}$, respectively.

4.2. Cramér–Rao inequality

Suppose that $\hat{\boldsymbol{\beta}} = (\hat{\beta}_{x1}, \hat{\beta}_{y1}, \ldots, \hat{\beta}_{xN}, \hat{\beta}_{yN})^T = (\hat{\beta}_1, \ldots, \hat{\beta}_{2N})^T$ is an unbiased estimator of $\boldsymbol{\beta}$. The Cramér–Rao inequality then states that [21]

$$\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}}) \ge F_\beta^{-1}, \tag{46}$$

where $\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}})$ is the $2N \times 2N$ variance–covariance matrix of the estimator $\hat{\boldsymbol{\beta}}$, defined by its $(p, q)$th element $\mathrm{cov}(\hat{\beta}_p, \hat{\beta}_q)$. Its diagonal elements are thus the variances of the elements of $\hat{\boldsymbol{\beta}}$. The matrix $F_\beta^{-1}$ is called the Cramér–Rao lower bound on the variance of $\hat{\boldsymbol{\beta}}$. The Cramér–Rao inequality (46) expresses that the difference between the left-hand and right-hand member is positive semi-definite. A property of a positive semi-definite matrix is that its diagonal elements cannot be negative. This means that the diagonal elements of $\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}})$ will always be larger than or equal to the corresponding diagonal elements of the inverse of the Fisher information matrix. Therefore, the diagonal elements of $F_\beta^{-1}$ define lower bounds on the variances of the elements of $\hat{\boldsymbol{\beta}}$:

$$\mathrm{Var}(\hat{\beta}_p) \ge F_\beta^{-1}(p, p), \tag{47}$$

where $p = 1, \ldots, 2N$ and $F_\beta^{-1}(p, p)$ is the $(p, p)$th element of the inverse of the Fisher information matrix. The elements $F_\beta(p, q)$ may be calculated explicitly using Eqs. (41)–(45):

$$\begin{aligned}
F_\beta(p, q) &= -E\left[ \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{\partial^2}{\partial \beta_p\, \partial \beta_q} \left( \underline{w}_{kl} \ln\left( \frac{J\tau}{e} f_{kl} \right) - \frac{J\tau}{e} f_{kl} \right) \right] \\
&= -\sum_{k=1}^{K} \sum_{l=1}^{L} \left\{ E[\underline{w}_{kl}]\, \frac{\partial}{\partial \beta_p} \left( \frac{1}{f_{kl}} \frac{\partial f_{kl}}{\partial \beta_q} \right) - \frac{J\tau}{e} \frac{\partial^2 f_{kl}}{\partial \beta_p\, \partial \beta_q} \right\} \\
&= \frac{J\tau}{e} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{f_{kl}} \frac{\partial f_{kl}}{\partial \beta_p} \frac{\partial f_{kl}}{\partial \beta_q}.
\end{aligned} \tag{48}$$

The derivative of $f_{kl}(\boldsymbol{\beta})$ with respect to $\boldsymbol{\beta}$ may be calculated from the parametric model of the intensity observations described in Section 2.

In this section, it has been shown how from the parametric model of the intensity observations described in Section 2 and the joint probability density function described in Section 3, the elements of the Fisher information matrix $F_\beta$ of Eq. (48) may be calculated explicitly. From the latter, the CRLB may be computed. The diagonal elements of the CRLB give a lower bound on the variance of any unbiased estimator of the x- and y-coordinates of a set of $N$ atom columns. The CRLB is a function of microscope and object parameters. In the following section, this lower bound will be used to study the dependence of the precision on the microscope parameters for different objects.
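To make the last line of Eq. (48) concrete, the sketch below evaluates the $2 \times 2$ Fisher matrix for a single column and inverts it. The detected fraction $f_{kl}$ is replaced by a hypothetical stand-in (a Gaussian peak on a constant background) rather than the channeling model of Section 2, and all names and parameter values are assumptions made for illustration.

```python
import math


def detected_fraction(x, y, bx, by, width=0.5, peak=0.2, background=0.01):
    """Hypothetical stand-in for f_kl: Gaussian peak on a constant background."""
    r2 = (x - bx) ** 2 + (y - by) ** 2
    return background + peak * math.exp(-r2 / (2.0 * width**2))


def fraction_gradient(x, y, bx, by, width=0.5, peak=0.2):
    """Analytic derivatives of the stand-in model with respect to (bx, by)."""
    r2 = (x - bx) ** 2 + (y - by) ** 2
    g = peak * math.exp(-r2 / (2.0 * width**2)) / width**2
    return g * (x - bx), g * (y - by)


def fisher_matrix(bx, by, electrons_per_pixel, n_pixels=41, pixel_size=0.1):
    """2x2 Fisher matrix, last line of Eq. (48), FOV centred on (0, 0)."""
    half = (n_pixels - 1) / 2.0
    f_xx = f_xy = f_yy = 0.0
    for k in range(n_pixels):
        for l in range(n_pixels):
            x = (k - half) * pixel_size
            y = (l - half) * pixel_size
            f = detected_fraction(x, y, bx, by)
            dx, dy = fraction_gradient(x, y, bx, by)
            f_xx += dx * dx / f
            f_xy += dx * dy / f
            f_yy += dy * dy / f
    n = electrons_per_pixel  # plays the role of J * tau / e
    return [[n * f_xx, n * f_xy], [n * f_xy, n * f_yy]]


def crlb_diagonal(fisher):
    """Diagonal of the inverse of a 2x2 matrix: the bounds of Eq. (47)."""
    (a, b), (c, d) = fisher
    det = a * d - b * c
    return d / det, a / det


var_x, var_y = crlb_diagonal(fisher_matrix(0.0, 0.0, 100.0))
```

For a column at the centre of a symmetric field of view, the two bounds coincide and the off-diagonal Fisher element vanishes, which is a convenient check on any implementation.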

5. Experimental design

The CRLB, which is discussed in Section 4, will be used to evaluate and optimize the experimental design of a quantitative STEM experiment. First, it will be stated which optimality criterion is chosen given the purpose of this paper. The optimal experimental design will be determined by minimizing a scalar measure $\sigma^2_{CR}$ of the CRLB as a function of the microscope parameters. Second, an overview of the microscope parameters will be given. Some of them are tunable, while others are fixed. Third, the results of the numerical evaluation of the dependence of $\sigma^2_{CR}$ on these parameters will be discussed. Fourth, an interpretation of these results will be given. Fifth, the resulting optimal design will be compared with the design that is assumed to be optimal in the conventional approach using performance criteria that are based on resolution and contrast.

5.1. Optimality criterion

The purpose of this paper is to optimize the design of a STEM so as to measure the atom column positions as precisely as possible. The precision with which the atom column coordinates can be measured is represented by the diagonal elements of the CRLB, that is, the right-hand side members of inequalities (47). However, simultaneous minimization of the diagonal elements of the CRLB as a function of the microscope parameters is in most cases impossible. A decrease of a particular diagonal element often has the unfavorable side effect of an increase of others. Therefore, the optimal experimental design has to be determined by minimizing a scalar measure of the CRLB. Several measures can be found in literature [22,23]. Among these optimality criteria the obvious choice given the purpose of this paper is the A-optimality criterion. The A-optimality criterion is defined by the scalar measure $\sigma^2_{CR}$:

$$\sigma^2_{CR} = \mathrm{trace}\; F_\beta^{-1}, \tag{49}$$

that is, the trace of $F_\beta^{-1}$, or equivalently, the sum of the lower bounds on the variances of the estimators of the atom column coordinates. Minimization of $\sigma^2_{CR}$ as a function of the microscope parameters results in the optimal experimental design.

5.2. Microscope parameters

An overview of the microscope parameters, which enter the parametric model for the STEM intensity observations, described in Section 2, is given here. For simplicity, some of these parameters will be kept constant in the evaluation and optimization of the experimental design.


The parameters describing the detector configuration are related to the detector function $D(\mathbf{g})$. In principle, the detector can have any shape or size. However, in the present work we will confine ourselves to the more common ones, which are annular and axial detectors. The inner or outer radius $g_{det}$ of the annular or axial detector, respectively, is tunable.

The parameters describing the probe are: the defocus $\varepsilon$, the spherical aberration constant $C_s$, the objective aperture radius $g_{ap}$, the electron wavelength $\lambda$, the width of the source image $s$, and the reduced brightness $B_r$. In the evaluation, $\lambda$ and $B_r$ have been kept constant.

The parameters describing the image recording are the FOV, the probe sampling distances or, equivalently, the pixel sizes, $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $\tau$. The FOV is chosen fixed, but large enough so as to guarantee that the tails of the probe are collected. The pixel sizes $\Delta x$ and $\Delta y$ are kept constant. It can be shown that the precision will generally improve with smaller pixel sizes, with all other parameters kept constant. However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel SNR decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in [10,24]. It is directly clear from Eqs. (33), (48) and (49) that the precision will increase proportionally to the recording time $\tau$ and the reduced brightness $B_r$. In what follows, the recording time $\tau$ is kept constant, presuming that specimen drift sets practical limits on the exposure time. As already mentioned, $B_r$ has been kept constant too.
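The stated dependence on recording time and brightness can be checked numerically: the Fisher matrix of Eq. (48) scales linearly with the number of electrons per pixel, so the criterion $\sigma^2_{CR}$ of Eq. (49) scales inversely with it. The sketch below again uses a hypothetical Gaussian-on-background stand-in for $f_{kl}$ and illustrative parameter values, not the model or settings of the paper.

```python
import math


def sigma2_cr(electrons_per_pixel, n_pixels=41, pixel_size=0.1,
              width=0.5, peak=0.2, background=0.01):
    """Trace of the inverse 2x2 Fisher matrix, Eqs. (48)-(49), toy model.

    The column sits at the centre of the field of view; f_kl is a
    hypothetical Gaussian peak on a constant background.
    """
    half = (n_pixels - 1) / 2.0
    f_xx = f_xy = f_yy = 0.0
    for k in range(n_pixels):
        for l in range(n_pixels):
            x = (k - half) * pixel_size
            y = (l - half) * pixel_size
            bump = peak * math.exp(-(x * x + y * y) / (2.0 * width**2))
            f = background + bump
            dx = bump * x / width**2
            dy = bump * y / width**2
            f_xx += dx * dx / f
            f_xy += dx * dy / f
            f_yy += dy * dy / f
    n = electrons_per_pixel  # plays the role of J * tau / e
    det = n * n * (f_xx * f_yy - f_xy * f_xy)
    # trace of the inverse of [[a, b], [b, d]] equals (a + d) / det
    return n * (f_xx + f_yy) / det


# Doubling J * tau / e (longer dwell time or brighter source) halves the bound.
ratio = sigma2_cr(200.0) / sigma2_cr(100.0)
```

An optimal design in the sense of Eq. (49) would be found by evaluating such a function over the tunable parameters and keeping the minimizer; here only the dose scaling is illustrated.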

5.3. Numerical results

In this subsection, the results of the numerical evaluation of the dependence of $\sigma^2_{CR}$ on the microscope parameters will be discussed. The cases for which the parametric model consists of an isolated atom column and of two atom columns are considered in the first and second part, respectively. It will be discussed how the optimal experimental design of an isolated atom column is influenced by a neighboring atom column.

5.3.1. Isolated atom column

In this part, the experimental design is evaluated and optimized for the special case of a parametric model consisting of an isolated atom column. Therefore, $\sigma^2_{CR}$, which is defined in Eq. (49), is equal to the sum of the lowest variances with which the x- and y-coordinate of the atom column position can be estimated. The FOV is chosen centered about the atom column, because in that case it may be shown that the variances on the x- and y-coordinate are equal.

To begin with, it is assumed that the width of the source image is determined by the aperture angle, following the relation

$$d_{I50} = \frac{0.54\lambda}{\alpha}, \tag{50}$$

where $0.54\lambda/\alpha$ is equal to the diameter of the diffraction-error disc containing 50% of the total intensity. In this way, the contribution of the source image to the total probe size is small. Then, meeting Eq. (50), it follows from Eq. (33) that the probe current is constant and equal to $J \approx 10^{-18} B_r$ [17].

Next, the dependence of the precision on the aperture radius is studied. From the evaluation of $\sigma^2_{CR}$, it is found that the optimal aperture radius is mainly determined by the object and that it is the same for annular or axial detectors. The optimal aperture radius is proportional to the width of the column's 1s-state $\Phi_{1s}(g)$ described by Eq. (14). Fig. 1 compares the optimal aperture radius (for $C_s = 0.5$ mm and at optimal defocus) with the width $c = 1/(4\pi a)$ of the 1s-state $\Phi_{1s}(g)$. The optimal aperture radii are plotted as a function of $(d^2/Z + 0.276B)^{-1/2}$, since this is proportional to the width $c$, as shown in [25]. For a given atom column, $d$ represents the interatomic distance, $Z$ the atomic number and $B$ the Debye–Waller factor. The widths $a$ of the corresponding 1s-states $\phi_{1s}(r)$ in real space are given in Table 1. It is clear from Fig. 1 that the influence of the object on the optimal aperture radius is substantial. In contrast to what one may expect, the resulting probe in the optimal design is not as small as possible. It is even

S. Van Aert et al. / Ultramicroscopy 90 (2002) 273–289

larger than the 1s-function $\phi_{1s}(r)$, as shown in Fig. 2 for a silicon and a gold column in the [1 0 0] direction. Furthermore, an increase of the spherical aberration constant results in a decrease of the optimal aperture radius and vice versa. This effect is especially important for heavy atom columns, such as a gold column in the [1 0 0] direction, where the optimal aperture radius is equal to 0.75 Å$^{-1}$ for $C_s = 0$ mm and to 0.5 Å$^{-1}$ for $C_s = 0.5$ mm. For a lighter atom column, such as silicon in the [1 0 0] direction, the optimal aperture radii for $C_s = 0$ and 0.5 mm are the same and equal to 0.28 Å$^{-1}$.

Then, a comparison between an annular and an axial detector is presented. In Fig. 3, $\sigma^2_{CR}$ is evaluated for a Si [1 0 0] column as a function of the detector-to-aperture radius, for an annular as well as an axial detector. The aperture radius and the defocus are set to their optimal values, 0.28 Å$^{-1}$ and $-8$ nm, respectively. All other parameters are held fixed (Tables 2 and 3). Three relations can be derived from this figure. First, for an annular detector, the optimal design is obtained when the detector radius equals the optimal aperture radius. Second, for an axial detector, the optimal detector radius is slightly smaller than the optimal aperture radius. Third, an annular detector may result in higher precisions than an axial detector, when operating at the optimal conditions.

[Figure 1: optimal aperture radius and width of the 1s-state (Å$^{-1}$) versus $(d^2/Z + 0.276B)^{-1/2}$ (Å$^{-1}$) for Si [1 0 0], Si [1 1 0], Sr [1 0 0], Sn [1 0 0], Cu [1 0 0] and Au [1 0 0].]
Fig. 1. Comparison of the optimal aperture radius (for $C_s = 0.5$ mm) with the width of the 1s-state $\Phi_{1s}(g)$, which is proportional to $(d^2/Z + 0.276B)^{-1/2}$. The Debye–Waller factor $B$ is set to 0.6 Å$^2$. The optimal aperture radius increases with the width of the 1s-state. Therefore, it varies strongly for different atom columns.

[Figure 2: amplitude versus spatial coordinate (Å); left panel Si [1 0 0] with its probe, right panel Au [1 0 0] with its probe.]
Fig. 2. The left and right figures represent the 1s-state $\phi_{1s}(r)$ for a Si [1 0 0] column and a Au [1 0 0] column, respectively. The amplitudes $|p(r)|$ of their associated optimal probes (for $C_s = 0.5$ mm) are also shown.

[Figure 3: $\sigma^2_{CR}$ (Å$^2$) versus $g_{det}/g_{ap}$ for an annular and an axial detector.]
Fig. 3. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), computed as a function of the detector-to-aperture radius for a Si [1 0 0] column, using an annular and an axial detector. The aperture radius and the defocus are set to their optimal values, 0.28 Å$^{-1}$ and $-8$ nm, respectively. The detector radius is variable. All other parameters are fixed (see Tables 2 and 3).

[Figure 4: $\sigma^2_{CR}$ (Å$^2$) versus defocus (Å) for $C_s = 0$ mm and $C_s = 0.5$ mm.]
Fig. 4. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), computed as a function of defocus for a Si [1 0 0] column, using an annular detector (for $C_s = 0$ and 0.5 mm). The aperture and detector radius are set to their optimal value 0.28 Å$^{-1}$. All other parameters are fixed (see Tables 2 and 3).

Table 1
Width of the 1s-state and its energy (Debye–Waller factor = 0.6 Å$^2$), interatomic distance, atomic number, and intercolumn distance for different atom columns

Column        $a$ (Å)   $E_{1s}$ (eV)   $d$ (Å)   $Z$   $d_{AA}$ (Å)
Si [1 0 0]    0.34      20.2            5.43      14    1.92
Si [1 1 0]    0.27      37.4            3.84      14    1.36
Sr [1 0 0]    0.22      57.3            6.08      38    4.03
Sn [1 0 0]    0.20      69.8            6.49      50    2.29
Cu [1 0 0]    0.18      78.3            3.62      29    2.56
Au [1 0 0]    0.13      210.8           4.08      79    2.88

Table 2
Initial microscope parameter values

$E_0$ (keV)   $C_s$ (mm)   $K$   $L$   $\Delta x$ (pm)   $\Delta y$ (pm)   $t$ (s)              $B_r$ (A/(m$^2$ sr V))
300           0.5          99    99    20                20                $8 \times 10^{-8}$   $2 \times 10^{7}$

Table 3
Parameters of an isolated atom column

$\beta_x$ (Å)   $\beta_y$ (Å)   $z$ (Å)
0               0               $E_0/(E_{1s} k)$

Subsequently, the dependence of the precision on the defocus is evaluated. In Fig. 4, $\sigma^2_{CR}$ is plotted for a Si [1 0 0] column as a function of the defocus for an annular detector, whereas in Fig. 5 this is done for an axial detector. These figures show that the optimal defocus is the same for an annular and an axial detector, for a given spherical aberration constant. Furthermore, the optimal defocus depends strongly on the spherical aberration constant. An empirical relation has been found between the optimal defocus $\varepsilon$, the optimal aperture radius $g_{ap}$ and the spherical aberration constant $C_s$:

$$\varepsilon \approx -\tfrac{1}{2} C_s \lambda^2 g_{ap}^2. \qquad (51)$$
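As a numerical sanity check on Eq. (51), the following Python sketch evaluates the defocus for two of the aperture radii quoted in this paper and the largest residual phase shift inside the aperture. It assumes the common form $\chi(g) = \pi\varepsilon\lambda g^2 + \tfrac{1}{2}\pi C_s\lambda^3 g^4$ for the phase shift (Eq. (12) is not reproduced here, so this form, the sign convention, and all function names are assumptions), and it computes the relativistic electron wavelength from the 300 keV energy of Table 2.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
M0 = 9.1093837015e-31   # electron rest mass (kg)
Q = 1.602176634e-19     # elementary charge (C)
C = 2.99792458e8        # speed of light (m/s)


def wavelength_angstrom(e0_kev):
    """Relativistic electron wavelength in angstrom for an energy in keV."""
    ev = e0_kev * 1e3 * Q
    momentum = math.sqrt(2.0 * M0 * ev * (1.0 + ev / (2.0 * M0 * C ** 2)))
    return H / momentum * 1e10


def optimal_defocus(cs_mm, lam, g_ap):
    """Eq. (51): defocus in angstrom (1 mm = 1e7 angstrom)."""
    cs = cs_mm * 1e7
    return -0.5 * cs * lam ** 2 * g_ap ** 2


def phase_shift(g, eps, cs_mm, lam):
    """Assumed aberration phase chi(g) in radians (not taken from Eq. (12))."""
    cs = cs_mm * 1e7
    return math.pi * eps * lam * g ** 2 + 0.5 * math.pi * cs * lam ** 3 * g ** 4


lam = wavelength_angstrom(300.0)
for label, g_ap in [("Si [1 0 0]", 0.28), ("Sr [1 0 0]", 0.40)]:
    eps = optimal_defocus(0.5, lam, g_ap)
    # Scan the aperture on a fine grid and record the largest |chi|.
    worst = max(abs(phase_shift(i * g_ap / 200, eps, 0.5, lam)) for i in range(201))
    print(f"{label}: defocus = {eps / 10:.1f} nm, max |chi| = {worst:.3f} rad")
```

Under these assumptions the sketch gives a defocus of about $-7.6$ nm for $g_{ap} = 0.28$ Å$^{-1}$ and about $-15.5$ nm for $g_{ap} = 0.40$ Å$^{-1}$, close to the rounded values quoted in the text, and the phase shift vanishes at the aperture edge by construction.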

When the defocus satisfies Eq. (51), the transfer function is nearly equal to one over the whole angular range of the objective aperture. The optimal transfer function for a Si [1 0 0] column is presented in Fig. 6, where the vertical line represents the optimal aperture radius and $C_s$ is set to 0.5 mm. Eq. (51) is derived from Eq. (12) by setting the phase shift $\chi(g)$ exactly to zero for $g = g_{ap}$.

[Figure 5: $\sigma^2_{CR}$ (Å$^2$) versus defocus (Å) for $C_s = 0$ mm and $C_s = 0.5$ mm.]
Fig. 5. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), computed as a function of defocus for a Si [1 0 0] column, using an axial detector (for $C_s = 0$ and 0.5 mm). The aperture and detector radius are set to their optimal values, 0.28 and 0.22 Å$^{-1}$, respectively. All other parameters are fixed (see Tables 2 and 3).

Then, the potential merit of $C_s$-correctors in quantitative STEM applications is studied. Figs. 7 and 8 show the ratio $\sigma^2_{CR}/\sigma^2_{CR}(C_s = 0\ \mathrm{mm})$ as a function of the spherical aberration constant, for an annular as well as an axial detector. This is done for a Si [1 0 0] column, as well as for a Au [1 0 0] column. The aperture radius has been set to the value that is optimal for $C_s = 0.5$ mm. From Fig. 7, it follows that the precision that can be gained by reducing $C_s$ is only marginal for a Si

[1 0 0] column, whereas it follows from Fig. 8 that such a reduction would be more likely to pay off for a Au [1 0 0] column. For an annular detector, the precision, expressed in terms of the variance, at $C_s = 1$ mm is about 1.008 and 1.752 times worse than the precision at $C_s = 0$ mm for a Si [1 0 0] and a Au [1 0 0] column, respectively. In terms of the standard deviation, which is defined as the square root of the variance, these factors are 1.004 and 1.324, respectively. This result may be explained by the fact that the optimal aperture setting is strongly dependent on the atom column. The optimal aperture radius for a Au [1 0 0] column is much larger than for a Si [1 0 0] column. Because spherical aberration is observable only for non-paraxial rays, correction is only necessary for objective lenses working with larger apertures. In other applications, a $C_s$-corrector may be more worthwhile: a larger pole-piece gap may be possible, allowing greater access for X-ray detectors and sample holders for in situ experiments, or a larger probe current is possible in a probe of a given size, which is of importance in microanalysis.

[Figure 6: real and imaginary part of the transfer function versus spatial frequency (Å$^{-1}$).]
Fig. 6. Transfer function for a spherical aberration constant of 0.5 mm and defocus of $-8$ nm. The vertical line represents the aperture radius, which is equal to 0.28 Å$^{-1}$.

[Figure 7: $\sigma^2_{CR}/\sigma^2_{CR}(C_s = 0\ \mathrm{mm})$ versus $C_s$ (mm) for an annular and an axial detector.]
Fig. 7. The ratio $\sigma^2_{CR}/\sigma^2_{CR}(C_s = 0\ \mathrm{mm})$ computed as a function of the spherical aberration constant, for a Si [1 0 0] column, using an annular as well as an axial detector. The aperture and detector radius are set to 0.28 Å$^{-1}$. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

[Figure 8: $\sigma^2_{CR}/\sigma^2_{CR}(C_s = 0\ \mathrm{mm})$ versus $C_s$ (mm) for an annular and an axial detector.]
Fig. 8. The ratio $\sigma^2_{CR}/\sigma^2_{CR}(C_s = 0\ \mathrm{mm})$ computed as a function of the spherical aberration constant, for a Au [1 0 0] column, using an annular as well as an axial detector. The aperture and detector radius are set to 0.50 Å$^{-1}$. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

The optimal experimental settings described in this part are derived for single isolated atom columns. One should keep in mind that the precision with which the position of a single isolated column can be estimated is a good performance measure as long as neighboring columns are clearly separated in the image. In this case, the precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. However, images of atom columns taken under experimental settings that are optimal for isolated atom columns may show strong overlap for realistic materials, for example, for a Si [1 0 0] crystal. Then, the precision with which the position of an atom column can be estimated is affected unfavorably by the presence of neighboring columns [24]. To find out if the optimal experimental design changes in the case of neighboring atom columns, the parametric model for the intensity distribution derived in Section 2 will be extended from one to two atom columns. Then, another interesting question that can be answered is if an increase of the diameter of the source image, accompanied by larger probe currents, results in higher precisions. Thus far, the diameter of the source image has been determined by the diameter of the diffraction-error disc, following Eq. (50). Increasing this diameter results in a higher precision in the case of a single isolated atom column. The 'optimum' value of $s$, and therefore of $d_{I50}$, would be infinity. However, this is not a realistic value, since neighboring columns will then strongly overlap. This overlap will be taken into account in the following part.
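The object dependence of the optimal aperture radius reported in this part can be checked against Table 1 with a few lines of Python. The sketch below is only an illustration: it takes the widths $a$, interatomic distances $d$ and atomic numbers $Z$ from Table 1, computes $c = 1/(4\pi a)$ and the quantity $(d^2/Z + 0.276B)^{-1/2}$ used as abscissa in Fig. 1, and prints their ratio. The dictionary layout and function names are hypothetical.

```python
import math

B = 0.6  # Debye-Waller factor in angstrom^2, as stated for Table 1

# column label: (a in angstrom, d in angstrom, Z), copied from Table 1
COLUMNS = {
    "Si [1 0 0]": (0.34, 5.43, 14),
    "Si [1 1 0]": (0.27, 3.84, 14),
    "Sr [1 0 0]": (0.22, 6.08, 38),
    "Sn [1 0 0]": (0.20, 6.49, 50),
    "Cu [1 0 0]": (0.18, 3.62, 29),
    "Au [1 0 0]": (0.13, 4.08, 79),
}


def abscissa(d, z, b=B):
    """(d^2/Z + 0.276 B)^(-1/2) in 1/angstrom, the x-axis of Fig. 1."""
    return (d * d / z + 0.276 * b) ** -0.5


def width_c(a):
    """Width c = 1/(4 pi a) of the 1s-state in reciprocal space."""
    return 1.0 / (4.0 * math.pi * a)


ratios = {}
for name, (a, d, z) in COLUMNS.items():
    ratios[name] = width_c(a) / abscissa(d, z)
    print(f"{name}: c = {width_c(a):.3f}, abscissa = {abscissa(d, z):.3f}, "
          f"ratio = {ratios[name]:.3f}")
```

For the six columns of Table 1 the ratio stays within a band of roughly 0.33 to 0.40, which is consistent with the approximate proportionality stated in the text; both quantities increase from Si [1 0 0] to Au [1 0 0].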

5.3.2. Two neighboring atom columns

In this part, the experimental design is evaluated and optimized for the special case of a parametric model consisting of two neighboring atom columns. The atom column types are similar to those considered in Section 5.3.1 and the intercolumn distances are given in Table 1. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), is now equal to the sum of the lowest variances with which the x- and y-coordinates of the two atom column positions can be estimated. The FOV is chosen centered about the atom column positions, whose coordinates are given in Table 4. In this case, it may be shown that the variances on the x-coordinates as well as the variances on the y-coordinates are equal.

Table 4
Parameters of two neighboring atom columns

$\beta_{x1}$ (Å)   $\beta_{y1}$ (Å)   $\beta_{x2}$ (Å)   $\beta_{y2}$ (Å)   $z$ (Å)
$-d_{AA}/2$        0                  $d_{AA}/2$         0                  $E_0/(E_{1s} k)$

First, it has been assumed that the diameter of the source image is given by Eq. (50). From the evaluation of the criterion $\sigma^2_{CR}$, it follows that the optimal design for two neighboring atom columns is almost equal to the one for an isolated atom column. This means that neighboring columns of some atom types may show strong overlap in images taken under the optimal conditions. Compared to the results given in Section 5.3.1, changes in the optimal aperture radius are only in the order of 5 percent. The optimal detector radius is still equal to or slightly smaller than the optimal aperture radius for annular and axial detectors, respectively. The attainable precision is again higher for annular than for axial detectors. Furthermore, the optimal defocus is still given by Eq. (51) and a $C_s$-corrector is more likely to pay off for heavy than for light atom columns, although the precision that can be gained is only marginal.

Second, the diameter of the source image has been taken variable. In practice, this is possible by adjusting the settings of the condenser lenses, allowing the demagnification of the source to be continuously varied. It is well known that an increase of the diameter of the source image is accompanied by two side effects: an increase of the source size and of the probe current, having an unfavorable and a favorable effect on the precision, respectively [6]. The potential merit of increasing the diameter of the source image is studied by the evaluation of the criterion $\sigma^2_{CR}$. For annular detectors, it has been found that the optimal diameter of the source image is about twice as large as the one defined in Eq. (50) for the atom column types and distances given in Table 1. For example, Fig. 9 shows the computed criterion $\sigma^2_{CR}$ as a function of the ratio $d_{I50}/(0.54\lambda/\alpha)$ for two neighboring Si [1 1 0] columns. The aperture radius has been set to 0.36 Å$^{-1}$, being optimal for $C_s = 0.5$ mm, and the radius of the annular detector is set equal to the aperture radius. For axial detectors, it has been found that the optimal diameter of the source image is about equal to the one defined in Eq. (50) for the atom column types and distances given in Table 1. Furthermore, it has to be noticed that the evaluation of the diameter of the source image has no effect on the optimal settings of the other microscope parameters.
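To get a feeling for which column types overlap under the optimal conditions, one can compare the diameter of the diffraction-error disc, $0.54\lambda/\alpha$, with the intercolumn distance $d_{AA}$ of Table 1. The Python sketch below is a crude indicator only and is not the criterion used in this paper. It assumes the small-angle relation $\alpha = \lambda g_{ap}$, so that the disc diameter becomes $0.54/g_{ap}$, and it uses the optimal aperture radii quoted in this paper for $C_s = 0.5$ mm; the names are hypothetical.

```python
# column label: (optimal aperture radius in 1/angstrom, d_AA in angstrom)
CASES = {
    "Si [1 0 0]": (0.28, 1.92),
    "Si [1 1 0]": (0.36, 1.36),
    "Sr [1 0 0]": (0.40, 4.03),
    "Au [1 0 0]": (0.50, 2.88),
}


def diffraction_disc_diameter(g_ap):
    """0.54*lambda/alpha with alpha = lambda*g_ap (assumed); lambda cancels."""
    return 0.54 / g_ap


# Ratio of the disc diameter to the intercolumn distance: values near or
# above one suggest that the images of neighboring columns overlap.
overlap = {
    name: diffraction_disc_diameter(g_ap) / d_aa
    for name, (g_ap, d_aa) in CASES.items()
}

for name, ratio in overlap.items():
    print(f"{name}: disc diameter / d_AA = {ratio:.2f}")
```

With these numbers the ratio is close to one for the silicon columns and well below one half for Sr [1 0 0] and Au [1 0 0], which agrees qualitatively with the remark that a Si [1 0 0] crystal shows strong overlap.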
Finally, it has to be mentioned that thermal diffuse scattering is not taken into account in the parametric model for the image intensity distribution used in this paper. This is justified by the fact that the inner radius of the annular detector must not be taken too large in a quantitative experiment, where high dose efficiency is important in order to provide high precision. In this case, the 1s-states dominate the scattering. In the following subsection, an intuitive interpretation of the described results is given.

[Figure 9: $\sigma^2_{CR}$ (Å$^2$) versus $d_{I50}/(0.54\lambda/\alpha)$.]
Fig. 9. The criterion $\sigma^2_{CR}$ computed as a function of the ratio $d_{I50}/(0.54\lambda/\alpha)$ for two neighboring Si [1 1 0] columns for an annular detector. The aperture and detector radius are set to their optimal value 0.36 Å$^{-1}$. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 4).

5.4. Interpretation of the results

Proportionality relations for $\sigma^2_{CR}$ for dark-field and bright-field images consisting of an isolated atom column have been derived. In dark-field imaging, where the non-interacting electrons are eliminated from detection, it has been found that

$$\sigma^2_{CR} \sim \frac{d_{50}^2}{D_{int.}}, \qquad (52)$$

where $D_{int.}$ is equal to the total number of interacting electrons and $d_{50}$ represents the radius of the image intensity distribution of the interacting electrons containing 50% of its total intensity. In particular, relation (52) holds for annular detectors having an inner radius greater than or equal to the aperture radius. In bright-field imaging, where the non-interacting electrons contribute to the background intensity in the image, it has been found that

$$\sigma^2_{CR} \sim \frac{D_{n.int.}\, d_{50}^4}{D_{int.}^2\, K L\, \Delta x\, \Delta y}, \qquad (53)$$

where $D_{n.int.}$ is equal to the total number of non-interacting electrons. Relation (53) holds for axial detectors as well as for annular detectors having an inner radius smaller than the aperture radius. The validity of these proportionality relations is illustrated in Figs. 10 and 11, where their left-hand and right-hand sides are shown as a function of the aperture radius for a Sr [1 0 0] column, using an annular and an axial detector, respectively.

[Figure 10: $\sigma^2_{CR}$ (Å$^2$) and $d_{50}^2/D_{int.}$ versus aperture radius (Å$^{-1}$).]
Fig. 10. The left-hand and right-hand side members of Eq. (52) computed as a function of the aperture radius for a Sr [1 0 0] column, using an annular detector. The radius of the annular detector is equal to the aperture radius. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

[Figure 11: $\sigma^2_{CR}$ (Å$^2$) and $(D_{n.int.} d_{50}^4)/(D_{int.}^2 K L \Delta x \Delta y)$ versus aperture radius (Å$^{-1}$).]
Fig. 11. The left-hand and right-hand side members of Eq. (53) computed as a function of the aperture radius for a Sr [1 0 0] column, using an axial detector. The radius of the axial detector is equal to the aperture radius. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

These relations allow us to get deeper insight into the numerical results derived in Section 5.3. They show that, in order to obtain a higher precision, one has to balance the width of the image intensity distribution and the number of interacting and non-interacting electrons. For example, at the optimal design, the optimal aperture radius $g_{ap}$ strongly depends on the atom column. Furthermore, an annular detector is to be used of which the inner radius $g_{det}$ is equal to $g_{ap}$. How can this be explained? On the one hand, $d_{50}$ will become smaller if the probe becomes smaller, that is, if the aperture radius increases. However, the decrease of $d_{50}$ will become less important if the probe width is about equal to the width of the 1s-state. This is due to the fact that $d_{50}$ is mainly determined by the excitation of the 1s-state, described by Eq. (3). On the other hand, the accompanying increase of the detector radius results in an enormous loss of interacting electrons. As a consequence, the optimal design balances the excitation of the atom column-dependent 1s-state and the loss of electrons in the detector. In this way, the dependence of the optimal aperture radius on the atom column can be explained.

5.5. Comparison with conventional approach

In the conventional approach, which is based on direct visual interpretability, the Scherzer conditions for incoherent imaging are usually applied [3,4]:

$$g_{ap} = \frac{1}{\lambda}\left(\frac{4\lambda}{C_s}\right)^{1/4}, \qquad \varepsilon = -(C_s\lambda)^{1/2}. \qquad (54)$$
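The Scherzer conditions of Eq. (54) are easy to evaluate numerically. The following Python sketch does so for the microscope settings of Table 2 (300 keV, $C_s = 0.5$ mm). The relativistic wavelength formula is standard; the function names are hypothetical, and the negative sign of the defocus follows the convention assumed for Eq. (51).

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
M0 = 9.1093837015e-31   # electron rest mass (kg)
Q = 1.602176634e-19     # elementary charge (C)
C = 2.99792458e8        # speed of light (m/s)


def wavelength_angstrom(e0_kev):
    """Relativistic electron wavelength in angstrom for an energy in keV."""
    ev = e0_kev * 1e3 * Q
    momentum = math.sqrt(2.0 * M0 * ev * (1.0 + ev / (2.0 * M0 * C ** 2)))
    return H / momentum * 1e10


def scherzer_incoherent(cs_mm, lam):
    """Eq. (54): aperture radius (1/angstrom) and defocus (angstrom)."""
    cs = cs_mm * 1e7  # 1 mm = 1e7 angstrom
    g_ap = (4.0 * lam / cs) ** 0.25 / lam
    eps = -math.sqrt(cs * lam)
    return g_ap, eps


lam = wavelength_angstrom(300.0)
g_ap, eps = scherzer_incoherent(0.5, lam)
print(f"lambda = {lam:.5f} angstrom")
print(f"g_ap   = {g_ap:.3f} 1/angstrom")
print(f"eps    = {eps / 10:.1f} nm")
```

With this wavelength the sketch gives an aperture radius of about 0.57 Å$^{-1}$ and a defocus of about $-31$ nm, within a few percent of the Scherzer values listed in Table 5; the small difference presumably stems from the wavelength value used. Note that inserting the Scherzer aperture radius into Eq. (51) reproduces the Scherzer defocus exactly.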

Table 5 compares these Scherzer conditions and the optimal conditions for a Sr [1 0 0] column. In the column of the Scherzer conditions, the value of $g_{det}$ has been taken two times larger than $g_{ap}$, which is representative for a typical Crewe detector [26]. As can be noticed clearly, the Scherzer conditions differ significantly from the optimal conditions. The precision, expressed in terms of the variance, at Scherzer conditions is about 62 times worse than the precision that could be reached at the optimal design. In terms of the standard deviation, which is defined as the square root of the variance, this factor is about 8. This is not surprising and can be explained from Eq. (52). Due to the large hole in the detector, the dose efficiency is very low at Scherzer conditions, thus affecting the precision in an unfavorable way. Due to the smaller probe size, the width of the intensity distribution is slightly smaller at Scherzer conditions than at the optimal conditions, thus affecting the precision in a favorable way. However, the extremely low number of detected electrons is the dominant factor, resulting in a low precision.

Table 5
Comparison between the optimal conditions (for an isolated Sr [1 0 0] column) and the Scherzer conditions for an annular detector, with $C_s = 0.5$ mm

                              Optimal conditions (Sr [1 0 0])   Scherzer conditions
$\varepsilon$ (nm)            $-16$                             $-32$
$g_{ap}$ (Å$^{-1}$)           0.40                              0.56
$g_{det}$ (Å$^{-1}$)          0.40                              1.12
$\sigma^2_{CR}$ (Å$^2$)       0.0037                            0.2304
DE (%)                        2.9                               0.034
$d_{50}$ (Å)                  0.94                              0.75

From this comparison, it may be concluded that the Scherzer conditions and the optimal conditions differ greatly. One has to keep in mind, however, that both sets of conditions are derived for different purposes: direct visual interpretability on the one hand and precise measurement of the atom column positions on the other hand. Nevertheless, both purposes may go hand in hand. As explained earlier, quantitative structure determination is done by numerically fitting the parametric model to the experimental data. The fit is evaluated using a criterion of goodness of fit. In practice, the search for the global optimum of the criterion of goodness of fit is an iterative numerical procedure. At each iteration, the coordinates are slightly changed in order to improve the fit. In order to guarantee convergence to the global optimum of the goodness of fit, good initial conditions are required.
This means that it is important to find a reasonable trial structure. Trial positions for the atom columns may be obtained from experimental images that are optimized for qualitative interpretation, whereas the refinement may result from experimental images that are optimized for quantitative interpretation.
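The factors quoted for Table 5 can be reproduced directly from its entries, and relation (52) can be used for a rough plausibility check. The Python sketch below does both. It assumes that, at equal incident dose, the number of interacting electrons $D_{int.}$ is proportional to the dose efficiency DE listed in Table 5, which is an assumption made here for illustration; the variable and function names are hypothetical.

```python
import math

# Entries copied from Table 5 (annular detector, Cs = 0.5 mm).
optimal = {"var": 0.0037, "dose_eff": 2.9, "d50": 0.94}
scherzer = {"var": 0.2304, "dose_eff": 0.034, "d50": 0.75}

# Ratio of the variances and of the standard deviations.
variance_ratio = scherzer["var"] / optimal["var"]
std_ratio = math.sqrt(variance_ratio)


def relation_52(d50, dose_eff):
    """Right-hand side of Eq. (52) up to a constant factor.

    D_int is taken proportional to the dose efficiency (assumption).
    """
    return d50 ** 2 / dose_eff


predicted_ratio = (relation_52(scherzer["d50"], scherzer["dose_eff"])
                   / relation_52(optimal["d50"], optimal["dose_eff"]))

print(f"variance ratio (Table 5):      {variance_ratio:.1f}")
print(f"standard deviation ratio:      {std_ratio:.1f}")
print(f"ratio predicted from Eq. (52): {predicted_ratio:.1f}")
```

The variance ratio of about 62 and the standard deviation ratio of about 8 match the text. Under the stated assumption, relation (52) predicts a ratio of about 54, the same order of magnitude, which supports the explanation that the loss of detected electrons dominates over the slightly narrower intensity distribution.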


6. Conclusions

Conventionally, the design of a STEM experiment is based on qualitative image interpretation. However, in terms of image interpretation, the future lies in quantitative measurement of structural parameters. Since this is a different purpose, the design has to be reconsidered. A quantitative measure has been proposed to evaluate and optimize the design of a high-resolution STEM experiment. It is based on the statistical precision with which the positions of atom columns can be estimated. In the resulting optimal design, the aperture radius has been found to be mainly determined by the object under study. More specifically, it is proportional to the weight of the atom column. The optimal inner or outer radius of an annular or an axial detector turns out to be equal to or slightly smaller than the optimal aperture radius, respectively. However, an annular detector results in a higher precision than an axial detector. The resulting optimal defocus is the one for which the transfer function comes close to unity over the whole angular range of the aperture. The merit of $C_s$-correctors in quantitative STEM applications depends on the object under study. It pays off more for heavy atom columns, although the precision that can be gained is only marginal. For annular detectors, increasing the size of the source image beyond the size of the diffraction-error disc, which increases the probe current at the expense of resolution, has a favorable effect on the attainable precision. For axial detectors, the optimal size of the source image is about equal to the size of the diffraction-error disc.

Acknowledgements

The authors would like to thank Dr. J.E. Barth and Dr. M.A.J. van der Stam for fruitful discussions related to this work. The research of Dr. A.J. den Dekker has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences.

References

[1] S.J. Pennycook, Scanning transmission electron microscopy: Z-contrast, in: S. Amelinckx, D. Van Dyck, J. Van Landuyt, G. Van Tendeloo (Eds.), Handbook of Microscopy: Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, VCH, Weinheim, 1997, pp. 595–620.
[2] J.M. Cowley, Scanning transmission electron microscopy, in: S. Amelinckx, D. Van Dyck, J. Van Landuyt, G. Van Tendeloo (Eds.), Handbook of Microscopy: Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, VCH, Weinheim, 1997, pp. 563–594.
[3] O. Scherzer, J. Appl. Phys. 20 (1949) 20.
[4] S.J. Pennycook, D.E. Jesson, Ultramicroscopy 37 (1991) 14.
[5] P.D. Nellist, S.J. Pennycook, Phys. Rev. Lett. 81 (19) (1998) 4156.
[6] P.D. Nellist, S.J. Pennycook, The principles and interpretation of annular dark-field Z-contrast imaging, in: P.W. Hawkes (Ed.), Advances in Imaging and Electron Physics, Vol. 113, Academic Press, San Diego, 2000, pp. 147–199.
[7] D. Van Dyck, Prospects of quantitative high resolution electron microscopy, in: L. Frank, F. Čiampor (Eds.), Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology, Vol. III, The Czechoslovak Society for Electron Microscopy, Brno, 2000, pp. 13–18.
[8] A. van den Bos, Measurement errors, in: J.G. Webster (Ed.), Encyclopedia of Electrical and Electronics Engineering, Vol. 12, Wiley, New York, 1999, pp. 448–459.
[9] A. van den Bos, Parameter estimation, in: P.H. Sydenham (Ed.), Handbook of Measurement Science, Vol. 1, Wiley, Chichester, 1982, pp. 331–377.
[10] A.J. den Dekker, J. Sijbers, D. Van Dyck, J. Microsc. 194 (1999) 95.
[11] J. Broeckx, M. Op de Beeck, D. Van Dyck, Ultramicroscopy 60 (1995) 71.
[12] D. Van Dyck, M. Op de Beeck, Ultramicroscopy 64 (1996) 99.
[13] S.J. Pennycook, D.E. Jesson, Acta Metall. Mater. 40 (1992) S149–S159.
[14] P. Geuens, J.H. Chen, A.J. den Dekker, D. Van Dyck, An analytic expression in closed form for the electron exit wave, Acta Crystallogr. Section A, 55 Supplement, Abstract P11.OE.002, 1999.
[15] J.M. Cowley, Ultramicroscopy 2 (1976) 3.
[16] C. Mory, M. Tence, C. Colliex, J. Microsc. Spectrosc. Electron. 10 (1985) 381.
[17] J.E. Barth, P. Kruit, Optik 3 (1996) 101.
[18] L. Reimer, Elements of a transmission electron microscope, in: Transmission Electron Microscopy, Physics of Image Formation and Microanalysis, Springer, Berlin, Heidelberg, 1993, pp. 86–135.
[19] A. van den Bos, A.J. den Dekker, Resolution reconsidered: conventional approaches and an alternative, in: P.W. Hawkes (Ed.), Advances in Imaging and Electron Physics, Vol. 117, Academic Press, San Diego, 2001, pp. 241–360.
[20] A.M. Mood, F.A. Graybill, D.C. Boes, Introduction to the Theory of Statistics, 3rd Edition, McGraw-Hill, Tokyo, 1974.
[21] M.G. Kendall, A. Stuart, The Advanced Theory of Statistics: Inference and Relationship, Vol. 2, 2nd Edition, Charles Griffin and Company Limited, London, 1967.
[22] V.V. Fedorov, Theory of Optimal Experiments, Academic Press, New York, London, 1972.
[23] A. Pázman, Foundations of Optimum Experimental Design, D. Reidel Publishing Company, Dordrecht, Boston, Lancaster, Tokyo, 1986.
[24] E. Bettens, D. Van Dyck, A.J. den Dekker, J. Sijbers, A. van den Bos, Ultramicroscopy 77 (1999) 37.
[25] D. Van Dyck, J.H. Chen, Solid State Commun. 109 (8) (1999) 501.
[26] S.J. Pennycook, D.E. Jesson, M.F. Chisholm, N.D. Browning, A.J. McGibbon, M.M. McGibbon, J. Microsc. Soc. Am. 1 (6) (1995) 231.

Optimal experimental design of STEM measurement of atom column positions

S. Van Aert (a,*), A.J. den Dekker (a), D. Van Dyck (b), A. van den Bos (a)

(a) Department of Applied Physics, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands
(b) Department of Physics, University of Antwerp (RUCA), Groenenborgerlaan 171, 2020 Antwerp, Belgium

Received 21 November 2000; received in revised form 6 September 2001

Abstract

A quantitative measure is proposed to evaluate and optimize the design of a high-resolution scanning transmission electron microscopy (STEM) experiment. The proposed measure is related to the measurement of atom column positions. Specifically, it is based on the statistical precision with which the positions of atom columns can be estimated. The optimal design, that is, the combination of tunable microscope parameters for which the precision is highest, is derived for different types of atom columns. The proposed measure is also used to find out if an annular detector is preferable to an axial one and if a $C_s$-corrector pays off in quantitative STEM experiments. In addition, the optimal settings of the STEM are compared to the Scherzer conditions for incoherent imaging and their dependence on the type of object is investigated. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 07.05.FB; 61.16.Bg; 02.50.r; 43.50.+y

Keywords: Scanning transmission electron microscopy (STEM); Electron microscope design and characterization; Data processing/image processing

*Corresponding author. Tel.: +31-15-2781823; fax: +31-152784263. E-mail address: [email protected] (S. Van Aert).
0304-3991/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0304-3991(01)00152-8

1. Introduction

For many years, it has been standard practice to evaluate the performance of STEM imaging modes qualitatively, that is, in terms of direct visual interpretability. The performance criteria used are resolution and contrast. For example, when axial bright-field coherent STEM is compared to annular dark-field incoherent STEM, the latter imaging mode is preferred. The basic ideas underlying this preference are the higher resolution for incoherent imaging than for coherent imaging [1] and the higher contrast in dark-field images than in bright-field images [2]. In annular dark-field incoherent STEM, visual interpretation of the images is optimal if the Scherzer conditions [3] for incoherent imaging are adopted [4]. As demonstrated in [5], the resolution can be improved further if the main lobe of the probe is narrowed. However, visual interpretability is then reduced as a result of a considerable rise of the sidelobes of the probe.

Two important aspects, however, are absent in these widely used performance criteria. First, the electron–object interaction is not taken into account. Second, the dose efficiency, which is defined as the fraction of the primary electron dose that is detected, is left out of consideration. Higher resolution and contrast are often obtained at the expense of dose efficiency, which leads to a decrease in signal-to-noise ratio (SNR). For example, the incoherence in annular dark-field incoherent STEM is only attained by using an annular dark-field detector with a geometry much larger than the objective aperture [6]. Its corresponding higher resolution, by adopting the Scherzer conditions for incoherent imaging, is thus obtained at the expense of dose efficiency. Another example is the following. It is well known that in bright-field images, decreasing the size of an axial detector leads to higher contrast, but also to a deterioration of the SNR. To compensate for such a decrease in SNR, longer recording times are necessary, which in turn increase the disturbing influence of specimen drift.

In our opinion, the ultimate goal of STEM is not providing optimal visual interpretability, but quantitative structure determination instead. The reason for this is that the positions of atom columns should be known within sub-ångström precision in order to understand the physical and chemical properties of materials such as superconductors, nanoparticles, quantum transistors, etc. [7]. The images are then to be considered as data planes from which the structural information has to be measured quantitatively. This can be done as follows: one has a model for the object and for the imaging process, including electron–object interaction, microscope transfer and image detection. This model describes the expectations of the intensity observations [8] and it contains parameters that have to be measured experimentally.
The quantitative structure determination is done by fitting the model to the experimental data by use of criteria of goodness of fit, such as least squares, least absolute values or maximum likelihood. Thus, quantitative structure determination becomes a statistical parameter estimation problem. Ultimately, the structural parameters, such as the positions of the atom columns, have to be measured as precisely as possible. However, this precision will always be limited by the presence of noise. Given the parametric model and knowledge about the statistics of the observations, use of the concept of Fisher information [9] allows us to derive an expression for the highest attainable precision with which the positions of the atom columns can be estimated. This expression, which is called the Cramér–Rao Lower Bound (CRLB), is a function of the object parameters, microscope parameters, and dose efficiency. Therefore, it may be used as an alternative performance measure in the optimization of the design of a STEM experiment for a given object. The optimal design is the set of microscope parameters resulting in the highest overall attainable precision. In practice, the design is optimized by minimizing a scalar measure of the CRLB. An overview of the concept of the CRLB and its use can be found in [9]. In [10], an example can be found in which the CRLB is computed in order to optimize the design of a quantitative HREM experiment.

The microscope parameters considered are related to the probe, the image recording and the detector configuration. The probe is described by the objective aperture radius, the defocus, the spherical aberration constant, the electron wavelength, the width of the source image, and the reduced brightness of the source. The parameters describing the image recording are the field of view (FOV), that is, the area scanned on the specimen, the probe sampling distances and the recording time. The detector configuration is described by either the inner radius of an annular detector or the outer radius of an axial detector.

The paper is organized as follows. The parametric model for the expectations of the intensity observations is described in Section 2. In Section 3, the joint probability density function of the observations is introduced. From this, the CRLB will be derived in Section 4. In Section 5, the CRLB is used to evaluate and optimize the design of a STEM experiment. It is assumed that the experimental design is limited by specimen drift.
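To make the role of the CRLB concrete, the following Python sketch computes it for a deliberately simple stand-in model. It is not the channeling-based model of this paper: the expected image is taken to be a two-dimensional Gaussian peak on a pixel grid, the pixel counts are assumed to be independent and Poisson distributed, and all names and numerical values (grid size, pixel size, peak width `rho`, total count `n_total`) are hypothetical. For Poisson counts with expectations $\lambda_{kl}$, the Fisher information matrix has elements $\sum_{kl} \lambda_{kl}^{-1}\, \partial\lambda_{kl}/\partial\theta_i\; \partial\lambda_{kl}/\partial\theta_j$, and the CRLB is the diagonal of its inverse.

```python
import math


def expected_counts(x, y, bx, by, rho, n_total, pixel):
    """Expected electron count in the pixel centred at (x, y)."""
    r2 = (x - bx) ** 2 + (y - by) ** 2
    peak = n_total * pixel ** 2 / (2.0 * math.pi * rho ** 2)
    return peak * math.exp(-r2 / (2.0 * rho ** 2))


def crlb_position(bx, by, rho, n_total, n_pix=99, pixel=0.2):
    """Return the CRLB on the variances of (bx, by) for Poisson counts."""
    half = (n_pix - 1) / 2.0
    fxx = fyy = fxy = 0.0
    for k in range(n_pix):
        for l in range(n_pix):
            x = (k - half) * pixel
            y = (l - half) * pixel
            lam = expected_counts(x, y, bx, by, rho, n_total, pixel)
            # Analytic derivatives of the expectation w.r.t. the position.
            dlam_dbx = lam * (x - bx) / rho ** 2
            dlam_dby = lam * (y - by) / rho ** 2
            # Poisson Fisher information: products of derivatives over lam.
            fxx += dlam_dbx * dlam_dbx / lam
            fyy += dlam_dby * dlam_dby / lam
            fxy += dlam_dbx * dlam_dby / lam
    det = fxx * fyy - fxy * fxy
    # Diagonal elements of the inverse of the 2x2 Fisher matrix.
    return fyy / det, fxx / det


var_x, var_y = crlb_position(bx=0.0, by=0.0, rho=1.0, n_total=1000.0)
print(f"CRLB on var(bx): {var_x:.3e}")
print(f"CRLB on var(by): {var_y:.3e}")
print(f"sum of the bounds: {var_x + var_y:.3e}")
```

For this toy model the bound on each coordinate is close to `rho**2 / n_total`: a narrower peak or more detected electrons gives a lower bound, which is the balance between peak width and dose that a scalar measure of the CRLB, such as the sum of the two bounds, trades off.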

2. Parametric model for the intensity observations

A parametric model, describing the expectations of the intensity observations recorded by the STEM, is needed in order to derive an expression for the CRLB. In this section, such a model will be derived by use of the simplified channeling theory [11–13] and of the fact that the scattering is dominated by the tightly bound 1s-type state of the atom columns. The source image will also be taken into account. The model contains both microscope parameters, such as objective aperture radius, defocus, spherical aberration constant, detector radius, and width of the source image, and object parameters, such as the positions of the atom columns, the energy of the 1s-states and the object thickness.

2.1. The exit wave

According to the simplified channeling theory, described in [11–13], an expression can be derived for the exit wave $\psi(\mathbf{r}, z)$ of an object consisting of $N$ atom columns. This is a complex wave function in the plane at the exit face of the object, resulting from the interaction of the probe with the object:

$$
\psi(\mathbf{r}, z) = p(\mathbf{r} - \mathbf{r}_0) + \sum_{n=1}^{N} c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n)\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right], \tag{1}
$$

where $\mathbf{r} = (x, y)^T$ is a two-dimensional vector in this plane. The symbol $T$ denotes taking the transpose. The specimen thickness is $z$, $E_0$ is the incident electron energy, $k$ is the inverse electron wavelength, and $\mathbf{r}_0 = (x_k, y_l)^T$ is the position of the STEM probe, described by the function $p(\mathbf{r} - \mathbf{r}_0)$. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the 1s-state of the column at position $\boldsymbol{\beta}_n = (\beta_{xn}, \beta_{yn})^T$. In accordance with the 1s-state of an atom, it is the lowest energy bound state with energy $E_{1s,n}$. In Eq. (1), it is assumed that the dynamical motion of the electron in a column can be expressed primarily in terms of this tightly bound 1s-state. The excitation coefficients $c_n$ can be found from

$$
c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = \int \phi^{*}_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, p(\mathbf{r} - \mathbf{r}_0)\, d\mathbf{r}, \tag{2}
$$

where '$*$' denotes taking the complex conjugate. Since the 1s-state is a real function and since the probe is assumed to be radially symmetric, so that $p(-\mathbf{r}) = p(\mathbf{r})$, Eq. (2) can be written as a convolution product:

$$
c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = p(\mathbf{r}_0 - \boldsymbol{\beta}_n) * \phi_{1s,n}(\mathbf{r}_0 - \boldsymbol{\beta}_n). \tag{3}
$$

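If the convolution theorem is used, Eq. (3) may be written as

$$
c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}_0 - \boldsymbol{\beta}_n} \left[ P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) \right] = \int P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) \exp\left( i 2\pi\, \mathbf{g} \cdot (\mathbf{r}_0 - \boldsymbol{\beta}_n) \right) d\mathbf{g}, \tag{4}
$$

where $P(\mathbf{g})$ is the Fourier transform of the probe $p(\mathbf{r})$, $\Phi_{1s,n}(\mathbf{g})$ is the Fourier transform of the 1s-state $\phi_{1s,n}(\mathbf{r})$, $\mathbf{g}$ is a two-dimensional spatial frequency vector in reciprocal space, and '$\cdot$' denotes the scalar product. In this paper, the two-dimensional Fourier transform $F(\mathbf{g})$ of an arbitrary function $f(\mathbf{r})$ is defined as

$$
F(\mathbf{g}) = \mathcal{F}_{\mathbf{r} \to \mathbf{g}} f(\mathbf{r}) = \int f(\mathbf{r}) \exp(-i 2\pi\, \mathbf{g} \cdot \mathbf{r})\, d\mathbf{r}. \tag{5}
$$

Consequently, the inverse Fourier transform is defined as

$$
f(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} F(\mathbf{g}) = \int F(\mathbf{g}) \exp(i 2\pi\, \mathbf{g} \cdot \mathbf{r})\, d\mathbf{g}. \tag{6}
$$

For radially symmetric 1s and probe functions, Eq. (4) can be written as

$$
c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n) = c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|) = 2\pi \int_0^{\infty} P(g)\, \Phi_{1s,n}(g)\, J_0(2\pi g |\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, g\, dg. \tag{7}
$$

This is an elementary result of the theory of Bessel functions, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind and

$$
|\mathbf{r}_0 - \boldsymbol{\beta}_n| = \sqrt{(x_k - \beta_{xn})^2 + (y_l - \beta_{yn})^2} \tag{8}
$$

is the distance from the probe to the atom column position.

The illuminating STEM probe $p(\mathbf{r})$ is the inverse Fourier transform of the transfer function of the objective lens $P(\mathbf{g})$:

$$
p(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} P(\mathbf{g}). \tag{9}
$$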

The transfer function $P(\mathbf{g})$ is radially symmetric and given by

$$
P(\mathbf{g}) = P(g) = A(g) \exp(-i\chi(g)), \tag{10}
$$

where $g = |\mathbf{g}|$ is the spatial frequency. The circular aperture function $A(g)$ is defined as

$$
A(g) = \begin{cases} 1 & \text{if } g \le g_{ap}, \\ 0 & \text{if } g > g_{ap}, \end{cases} \tag{11}
$$

with $g_{ap}$ being the aperture radius. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and given by

$$
\chi(g) = \pi \varepsilon \lambda g^2 + \tfrac{1}{2} \pi C_s \lambda^3 g^4, \tag{12}
$$

with $C_s$ being the spherical aberration constant, $\varepsilon$ the defocus, and $\lambda$ the electron wavelength.

According to [14], the 1s-state function is well approximated by a two-dimensional quadratically normalized (radially symmetric) Gaussian function:

$$
\phi_{1s}(r) = \frac{1}{a\sqrt{2\pi}} \exp\left( -\frac{r^2}{4a^2} \right), \tag{13}
$$

where $r = |\mathbf{r}|$ and $a$ represents the column-dependent width. This width is directly related to the energy of the 1s-state. The Fourier transform $\Phi_{1s}(g)$ of Eq. (13), that is, the Fourier spectrum of the object, is given by

$$
\Phi_{1s}(g) = \frac{1}{c\sqrt{2\pi}} \exp\left( -\frac{g^2}{4c^2} \right) \tag{14}
$$

with

$$
c = \frac{1}{4\pi a}, \tag{15}
$$

the width of the Fourier transformed 1s-state.
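As a numerical illustration of Eqs. (7) and (10)–(15), the following minimal sketch evaluates the excitation coefficient by direct quadrature. It is not the authors' code. The function names, the midpoint integration rule, the use of the integral representation of $J_0$, and the sign convention $\exp(-i\chi)$ for the transfer function are assumptions made for this example. Lengths are taken in Å and spatial frequencies in Å⁻¹.

```python
import cmath
import math


def bessel_j0(x, n=64):
    """J0(x) from its integral representation (1/pi) * int_0^pi cos(x sin t) dt,
    evaluated with the midpoint rule (an assumption of this sketch)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n)) / n


def chi(g, defocus, cs, wavelength):
    """Aberration phase shift of Eq. (12)."""
    return (math.pi * defocus * wavelength * g ** 2
            + 0.5 * math.pi * cs * wavelength ** 3 * g ** 4)


def phi_1s(r, a):
    """Real-space Gaussian 1s-state of Eq. (13)."""
    return math.exp(-r ** 2 / (4 * a ** 2)) / (a * math.sqrt(2 * math.pi))


def Phi_1s(g, a):
    """Fourier-space 1s-state of Eqs. (14)-(15)."""
    c = 1.0 / (4 * math.pi * a)
    return math.exp(-g ** 2 / (4 * c ** 2)) / (c * math.sqrt(2 * math.pi))


def excitation(d, a, g_ap, defocus, cs, wavelength, steps=400):
    """Excitation coefficient of Eq. (7) at probe-to-column distance d.
    The aperture function A(g) of Eq. (11) truncates the integral at g_ap."""
    h = g_ap / steps
    total = 0j
    for i in range(steps):
        g = (i + 0.5) * h
        transfer = cmath.exp(-1j * chi(g, defocus, cs, wavelength))  # assumed sign
        total += transfer * Phi_1s(g, a) * bessel_j0(2 * math.pi * g * d) * g
    return 2 * math.pi * h * total
```

For an aberration-free lens with a very wide aperture the probe approaches a delta function, so the excitation coefficient should reduce to the 1s-state itself, which gives a simple consistency check on Eqs. (13) and (14).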

2.2. The image intensity distribution

From the exit wave, which is given in Eq. (1), the total detected intensity in the Fourier detector plane of a STEM can be derived [15] as

$$
I_{ps}(\mathbf{r}_0) = \int |\Psi(\mathbf{g}, z)|^2 D(\mathbf{g})\, d\mathbf{g}, \tag{16}
$$

where $\Psi(\mathbf{g}, z)$ is the two-dimensional Fourier transform of the exit wave $\psi(\mathbf{r}, z)$, and $D(\mathbf{g})$ is the detector function, which is equal to one in the detected field and equal to zero elsewhere. The two-dimensional Fourier transform of the exit wave can be derived by combining Eqs. (1) and (5):

$$
\Psi(\mathbf{g}, z) = P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_0) + \sum_{n=1}^{N} c_n(\mathbf{r}_0 - \boldsymbol{\beta}_n)\, \Phi_{1s,n}(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right]. \tag{17}
$$

Thus far, it has been assumed that the source can be modeled as a point. Therefore, the subscript 'ps' in Eq. (16) refers to point source. Elaborating on the ideas given in [16], it follows that the finite size of the source image can be taken into account by a two-dimensional convolution of the intensity distribution $I_{ps}(\mathbf{r}_0)$ with the intensity distribution of the source image $S(\mathbf{r})$:

$$
I(k, l) = I(\mathbf{r}_0) = I_{ps}(\mathbf{r}_0) * S(\mathbf{r}_0). \tag{18}
$$

The effect of the source image is thus an additional blurring. The indices $(k, l)$ correspond to the probe at position $\mathbf{r}_0 = (x_k, y_l)^T$. A realistic form for the intensity distribution of the source image is Gaussian [16]. The function $S(\mathbf{r})$ is thus a two-dimensional normalized Gaussian distribution:

$$
S(\mathbf{r}) = S(r) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{r^2}{2\sigma^2} \right) \tag{19}
$$

with $\sigma$ representing the width corresponding to the radius containing 39% of the total intensity of $S$.

Up to now, no assumptions have been made about the shape or size of the detector. From now on, however, the detector is assumed to be radially symmetric. Mathematically this means that $D(\mathbf{g}) = D(g)$. Eq. (18) can be split up into three terms:

$$
I(k, l) = I_0 + I_1(k, l) + I_2(k, l). \tag{20}
$$

The zeroth order term $I_0$ corresponds to a non-interacting probe, the first order term $I_1(k, l)$ to the interference between the probe and the 1s-state and the second order term $I_2(k, l)$ to the interference between different 1s-states. The zeroth order term $I_0$ is given by

$$
I_0 = \left[ \int |P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_0)|^2 D(\mathbf{g})\, d\mathbf{g} \right] * S(\mathbf{r}_0). \tag{21}
$$

It describes a constant background intensity, resulting from the non-interacting electrons collected by the detector. This equation can be written as

$$
I_0 = \left[ 2\pi \int A^2(g) D(g)\, g\, dg \right] * S(\mathbf{r}_0), \tag{22}
$$

by substitution of Eq. (10) and using the fact that $D(\mathbf{g})$ is radially symmetric. Due to the definition of the aperture function, given in Eq. (11), the following equality may be used:

$$
A^2(g) = A(g). \tag{23}
$$

Therefore, Eq. (22) becomes

$$
I_0 = 2\pi \int A(g) D(g)\, g\, dg. \tag{24}
$$

The first order term $I_1(k, l)$ corresponds to the interference between the incident probe $p(\mathbf{r} - \mathbf{r}_0)$ and the 1s-state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$:

$$
I_1(k, l) = \sum_{n=1}^{N} \left[ 2\, \mathrm{Re}\left\{ c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right] 2\pi \int P^{*}(g)\, \Phi_{1s,n}(g)\, J_0(2\pi g |\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, D(g)\, g\, dg \right\} \right] * S(\mathbf{r}_0). \tag{25}
$$

This is a linear term in the sense that the contributions of different atom columns are added. The second order term $I_2(k, l)$ describes the mutual interference between different 1s-states $\phi_{1s,n}$ and $\phi_{1s,m}$:

$$
I_2(k, l) = \sum_{n=1}^{N} \sum_{m=1}^{N} \left[ c_n(|\mathbf{r}_0 - \boldsymbol{\beta}_n|)\, c^{*}_m(|\mathbf{r}_0 - \boldsymbol{\beta}_m|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} k z \right) - 1 \right] \left[ \exp\left( +i\pi \frac{E_{1s,m}}{E_0} k z \right) - 1 \right] 2\pi \int \Phi_{1s,n}(g)\, \Phi_{1s,m}(g)\, J_0(2\pi g\, d_{A_n A_m})\, g\, D(g)\, dg \right] * S(\mathbf{r}_0), \tag{26}
$$

where

$$
d_{A_n A_m} = |\boldsymbol{\beta}_n - \boldsymbol{\beta}_m| = \sqrt{(\beta_{xn} - \beta_{xm})^2 + (\beta_{yn} - \beta_{ym})^2} \tag{27}
$$

is the distance between the atom columns at positions $\boldsymbol{\beta}_n$ and $\boldsymbol{\beta}_m$. It is only the last term $I_2(k, l)$ of Eq. (20) that remains if the detector is annular and has an inner radius greater than or equal to the aperture radius. Then,

$$
P(g) D(g) = 0 \tag{28}
$$

or, equivalently,

$$
A(g) D(g) = 0. \tag{29}
$$

Therefore, Eq. (26) describes the model for annular dark-field STEM.

2.3. The image recording

In a STEM, the illuminating probe scans the specimen in a raster fashion. The image is thus recorded as a function of the probe position $\mathbf{r}_0 = (x_k, y_l)^T$. Without loss of generality, the image magnification is ignored. Therefore, the probe position $\mathbf{r}_0 = (x_k, y_l)^T$ directly corresponds to an image pixel at the same position. The recording device is characterized as consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the probe sampling distances in the x and y directions, respectively. Pixel $(k, l)$ corresponds to position $(x_k, y_l)^T \equiv (x_1 + (k - 1)\Delta x,\; y_1 + (l - 1)\Delta y)^T$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1, y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view FOV. The FOV is chosen to be centered about $(0, 0)^T$. Assuming a recording time $\tau$ for one pixel and a probe current $J$, one calculates the number of electrons per probe position:

$$
\frac{J\tau}{e} \tag{30}
$$

with $e = 1.6 \times 10^{-19}$ C the electron charge. The recording time $\tau$ for one pixel is the ratio of the frame time $t$, that is, the recording time for the whole FOV, to the total number of pixels $KL$:

$$
\tau = \frac{t}{KL}. \tag{31}
$$

The primary electron dose $D$ is thus given by

$$
D = KL\, \frac{J\tau}{e}. \tag{32}
$$

The probe current $J$ is given by [17]

$$
J = \frac{B_r E_0 \pi^2 d_{I50}^2 \alpha^2}{4e} \tag{33}
$$

with $B_r$ being the reduced brightness of the source, $E_0$ the beam energy, $d_{I50}$ the diameter of the source image containing 50% of the current and $\alpha$, which is equal to $g_{ap}\lambda$, the aperture angle. From Eq. (19), it follows that

$$
d_{I50} = 2\sqrt{-2 \ln 0.5}\; \sigma. \tag{34}
$$

As a consequence of the detector shape and size in STEM, only the electrons within a selected part of the diffraction pattern are used to produce the image. Mathematically, this is expressed in Eq. (16). The expected number of detected electrons per pixel position $(k, l)$ equals [18]

$$
\frac{J\tau}{e} f_{kl}, \tag{35}
$$

where $f_{kl}$ denotes the fraction of electrons recorded by the detector ($f_{kl} < 1$). This fraction can be expressed as

$$
f_{kl} = \frac{I(k, l)}{I_{D=1}}, \tag{36}
$$

with $I_{D=1}$ being the constant intensity obtained if the detector function $D(g)$ is uniform:

$$
I_{D=1} = 2\pi \int A(g)\, g\, dg, \tag{37}
$$

following from straightforward calculations, using Eqs. (20)–(26). The total number of detected electrons to form the image is now equal to

$$
D_{det} = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl} \frac{J\tau}{e}. \tag{38}
$$

Then, the dose efficiency DE, which is defined as the fraction of the primary electron dose that is detected, becomes

$$
DE = \frac{D_{det}}{D} = \frac{\sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}}{KL}. \tag{39}
$$

The result of Eq. (35) defines the expectation values of the intensity observations recorded by the detector, which are needed to derive the joint probability density function in the next section.
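The dose bookkeeping of Eqs. (30)–(34) and (39) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, SI units are used throughout, the beam energy is passed as an accelerating voltage $E_0/e$ in volts, and the relativistic wavelength formula is the standard textbook expression rather than something taken from the paper.

```python
import math

PLANCK = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31 # kg
ELECTRON_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 299792458.0        # m / s


def electron_wavelength(voltage):
    """Relativistic electron wavelength (m) for an accelerating voltage (V)."""
    energy = ELECTRON_CHARGE * voltage
    rest = ELECTRON_MASS * LIGHT_SPEED ** 2
    return PLANCK / math.sqrt(2 * ELECTRON_MASS * energy * (1 + energy / (2 * rest)))


def source_diameter_50(sigma):
    """Eq. (34): diameter of the Gaussian source image holding 50% of the current."""
    return 2 * math.sqrt(-2 * math.log(0.5)) * sigma


def probe_current(b_r, voltage, d_i50, alpha):
    """Eq. (33), with E0 / e written as the accelerating voltage in volts."""
    return b_r * voltage * math.pi ** 2 * d_i50 ** 2 * alpha ** 2 / 4


def electrons_per_pixel(current, tau):
    """Eq. (30): number of incident electrons per probe position."""
    return current * tau / ELECTRON_CHARGE


def primary_dose(current, tau, k, l):
    """Eq. (32): total number of incident electrons for K x L probe positions."""
    return k * l * electrons_per_pixel(current, tau)


def dose_efficiency(fractions):
    """Eq. (39): mean detected fraction f_kl over a K x L nested list."""
    flat = [f for row in fractions for f in row]
    return sum(flat) / len(flat)
```

With the source diameter tied to the aperture angle as $d_{I50} = 0.54\lambda/\alpha$ (the relation used later in Eq. (50)), the aperture angle cancels in `probe_current`, so the current depends only on the brightness, the voltage and the wavelength.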

3. The joint probability density function of the observations

In any STEM experiment, the observations will "contain errors". These errors have to be specified, which is the subject of this section. Generally, sets of observations made under the same conditions nevertheless differ from experiment to experiment. The usual way to describe this behavior is to model the observations as stochastic variables. The reason is that there is no viable alternative and that it has been found to work. Stochastic variables are defined by probability density functions [19]. In a STEM experiment the observations are counting results, for which the probability density function can be modeled as a Poisson distribution.

In what follows, the underlined characters represent stochastic variables. Consider a set of $KL$ stochastic observations $\{\underline{w}_{kl},\; k = 1, \ldots, K,\; l = 1, \ldots, L\}$, where the index $kl$ corresponds to the pixel at the position $(x_k, y_l)^T$. In what follows, the vector $\underline{\mathbf{w}}$ will represent the $KL \times 1$ column vector of these observations. The observations are assumed to be statistically independent and have a Poisson distribution. The probability that the observation $\underline{w}_{kl}$ is equal to $\omega_{kl}$ is given by [20]

$$
\frac{\lambda_{kl}^{\omega_{kl}}}{\omega_{kl}!} \exp(-\lambda_{kl}), \tag{40}
$$

where the parameter $\lambda_{kl}$ is equal to the expectation $E[\underline{w}_{kl}]$. The expectation values $E[\underline{w}_{kl}]$ are given in Eq. (35):

$$
E[\underline{w}_{kl}] = f_{kl} \frac{J\tau}{e}. \tag{41}
$$

The probability $f_w(\boldsymbol{\omega}; \boldsymbol{\beta})$ that a set of observations is equal to $\{\omega_{kl},\; k = 1, \ldots, K,\; l = 1, \ldots, L\}$ is the product of all probabilities described by Eq. (40), since the observations are assumed to be statistically independent:

$$
f_w(\boldsymbol{\omega}; \boldsymbol{\beta}) = \prod_{k=1}^{K} \prod_{l=1}^{L} \frac{\lambda_{kl}^{\omega_{kl}}}{\omega_{kl}!} \exp(-\lambda_{kl}), \tag{42}
$$

where the elements of $\boldsymbol{\omega}$ correspond with those of $\underline{\mathbf{w}}$ and the column vector $\boldsymbol{\beta}$ contains the unknown parameters. In our case $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T$ contains the coordinates in the x- and y-direction of the $N$ atom column positions. The function described in Eq. (42) is called the joint probability density function of the observations. The joint probability density function, which has been derived here, will now be used for the computation of the CRLB.
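The logarithm of Eq. (42), which is the quantity differentiated in the next section, can be written down directly. The sketch below is illustrative only: the function name and the nested-list layout of the $K \times L$ arrays are assumptions, and `math.lgamma` is used for $\ln(\omega_{kl}!)$ to avoid overflow for large counts.

```python
import math


def poisson_log_likelihood(omega, lam):
    """Natural logarithm of the joint Poisson probability of Eq. (42).

    omega: K x L nested list of observed counts (non-negative integers).
    lam:   K x L nested list of expectations lambda_kl (all positive).
    """
    total = 0.0
    for row_omega, row_lam in zip(omega, lam):
        for count, expectation in zip(row_omega, row_lam):
            # ln(lambda^omega / omega! * exp(-lambda)) for one pixel, Eq. (40)
            total += (count * math.log(expectation)
                      - expectation
                      - math.lgamma(count + 1))
    return total
```

Because the observations are assumed independent, the log-likelihood is a plain sum over pixels, which is what makes the Fisher information of Section 4 a sum over pixels as well.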

4. The Cramér–Rao Lower Bound

In this section, the CRLB is discussed. The CRLB is a lower bound on the variance of any unbiased estimator. What does this mean? Suppose that an experimentalist wants to measure the position parameters $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T$ of a set of $N$ atom columns quantitatively. For this purpose, one can use many estimators (estimation methods) such as least squares, least absolute values or maximum likelihood estimators. The precision of an estimator is represented by the variance or by its square root, the standard deviation. Generally, different estimators will have different precisions. It can be shown, however, that the variance of unbiased estimators will never be lower than the CRLB. Fortunately, there exists a class of estimators (including the maximum likelihood estimator) that achieves this bound at least asymptotically, that is, for the number of observations going to infinity. For the details of this lower bound we refer to [9].


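4.1. Fisher information

The CRLB follows from the concept of Fisher information. The Fisher information matrix $F_\beta$ for estimation of the position parameters of a set of $N$ atom columns $\boldsymbol{\beta} = (\beta_{x1}, \beta_{y1}, \ldots, \beta_{xN}, \beta_{yN})^T = (\beta_1, \ldots, \beta_{2N})^T$ is defined as

$$
F_\beta = -E\left[ \frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^T} \right], \tag{43}
$$

where $f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})$ is the joint probability density function of the observations described in Eq. (42) and

$$
\frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^T} \tag{44}
$$

is the $2N \times 2N$ Hessian matrix of $\ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})$ defined by its $(p, q)$th element

$$
\frac{\partial^2 \ln f_w(\underline{\mathbf{w}}; \boldsymbol{\beta})}{\partial \beta_p\, \partial \beta_q}, \tag{45}
$$

where $\beta_p$ and $\beta_q$ correspond to the $p$th and $q$th element of the vector $\boldsymbol{\beta}$, respectively.

4.2. Cramér–Rao inequality

Suppose that $\hat{\boldsymbol{\beta}} = (\hat{\beta}_{x1}, \hat{\beta}_{y1}, \ldots, \hat{\beta}_{xN}, \hat{\beta}_{yN})^T = (\hat{\beta}_1, \ldots, \hat{\beta}_{2N})^T$ is an unbiased estimator of $\boldsymbol{\beta}$. The Cramér–Rao inequality then states that [21]

$$
\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}}) \ge F_\beta^{-1}, \tag{46}
$$

where $\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}})$ is the $2N \times 2N$ variance–covariance matrix of the estimator $\hat{\boldsymbol{\beta}}$, defined by its $(p, q)$th element $\mathrm{cov}(\hat{\beta}_p, \hat{\beta}_q)$. Its diagonal elements are thus the variances of the elements of $\hat{\boldsymbol{\beta}}$. The matrix $F_\beta^{-1}$ is called the Cramér–Rao lower bound on the variance of $\hat{\boldsymbol{\beta}}$. The Cramér–Rao inequality (46) expresses that the difference between the left-hand and right-hand member is positive semi-definite. A property of a positive semi-definite matrix is that its diagonal elements cannot be negative. This means that the diagonal elements of $\mathrm{Cov}(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\beta}})$ will always be larger than or equal to the corresponding diagonal elements of the inverse of the Fisher information matrix. Therefore, the diagonal elements of $F_\beta^{-1}$ define lower bounds on the variances of the elements of $\hat{\boldsymbol{\beta}}$:

$$
\mathrm{Var}(\hat{\beta}_p) \ge F_\beta^{-1}(p, p), \tag{47}
$$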


where $p = 1, \ldots, 2N$ and $F_\beta^{-1}(p, p)$ is the $(p, p)$th element of the inverse of the Fisher information matrix. The elements $F_\beta(p, q)$ may be calculated explicitly using Eqs. (41)–(45):

$$
\begin{aligned}
F_\beta(p, q) &= -E\left[ \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{\partial^2}{\partial \beta_p\, \partial \beta_q} \left( \underline{w}_{kl} \ln\left( f_{kl} \frac{J\tau}{e} \right) - f_{kl} \frac{J\tau}{e} \right) \right] \\
&= -\sum_{k=1}^{K} \sum_{l=1}^{L} \left( E[\underline{w}_{kl}]\, \frac{\partial}{\partial \beta_p} \left( \frac{1}{f_{kl}} \frac{\partial f_{kl}}{\partial \beta_q} \right) - \frac{J\tau}{e} \frac{\partial^2 f_{kl}}{\partial \beta_p\, \partial \beta_q} \right) \\
&= \frac{J\tau}{e} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{f_{kl}} \frac{\partial f_{kl}}{\partial \beta_p} \frac{\partial f_{kl}}{\partial \beta_q}.
\end{aligned} \tag{48}
$$

The derivative of $f_{kl}(\boldsymbol{\beta})$ with respect to $\boldsymbol{\beta}$ may be calculated from the parametric model of the intensity observations described in Section 2.

In this section, it has been shown how from the parametric model of the intensity observations described in Section 2 and the joint probability density function described in Section 3, the elements of the Fisher information matrix $F_\beta$ of Eq. (48) may be calculated explicitly. From the latter, the CRLB may be computed. The diagonal elements of the CRLB give a lower bound on the variance of any unbiased estimator of the x- and y-coordinates of a set of $N$ atom columns. The CRLB is a function of microscope and object parameters. In the following section, this lower bound will be used to study the dependence of the precision on the microscope parameters for different objects.
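The last line of Eq. (48) lends itself to a direct numerical evaluation. The sketch below is a toy illustration and not the paper's channeling model: the detected fraction $f_{kl}$ is replaced by a hypothetical Gaussian-shaped image of a single column (`model_fraction`, with assumed peak fraction `f0` and width `s`), and the derivatives are taken by central finite differences. All names are assumptions introduced for this example.

```python
import math


def model_fraction(x, y, beta, f0=0.01, s=0.5):
    """Hypothetical detected fraction f_kl for one column at beta = (bx, by).
    A Gaussian stand-in for the model of Section 2, used only for illustration."""
    bx, by = beta
    return f0 * math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * s ** 2))


def fisher_matrix(beta, xs, ys, electrons_per_pixel, h=1e-5):
    """Fisher information of Eq. (48): (J tau / e) * sum (1/f) df/dbp df/dbq."""
    n = len(beta)
    fisher = [[0.0] * n for _ in range(n)]
    for x in xs:
        for y in ys:
            f = model_fraction(x, y, beta)
            grad = []
            for p in range(n):
                up = list(beta)
                down = list(beta)
                up[p] += h
                down[p] -= h
                # central finite difference for df/dbeta_p
                grad.append((model_fraction(x, y, up)
                             - model_fraction(x, y, down)) / (2 * h))
            for p in range(n):
                for q in range(n):
                    fisher[p][q] += electrons_per_pixel * grad[p] * grad[q] / f
    return fisher


def invert_2x2(m):
    """Inverse of a 2 x 2 matrix, giving the CRLB for a single column."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]
```

For this toy model, with the column centered in a field of view that captures the whole peak, the bound on each coordinate is close to $s^2$ divided by the total number of detected electrons, and doubling $J\tau/e$ halves the bound, in line with the proportionality visible in Eq. (48).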

5. Experimental design

The CRLB, which is discussed in Section 4, will be used to evaluate and optimize the experimental design of a quantitative STEM experiment. First, it will be stated which optimality criterion is chosen given the purpose of this paper. The optimal experimental design will be determined by minimizing a scalar measure $\sigma^2_{CR}$ of the CRLB as a function of the microscope parameters. Second, an overview of the microscope parameters will be given. Some of them are tunable, while others are fixed. Third, the results of the numerical evaluation of the dependence of $\sigma^2_{CR}$ on these parameters will be discussed. Fourth, an interpretation of these results will be given. Fifth, the resulting optimal design will be compared with the design that is assumed to be optimal in the conventional approach using performance criteria that are based on resolution and contrast.

5.1. Optimality criterion

The purpose of this paper is to optimize the design of a STEM so as to measure the atom column positions as precisely as possible. The precision with which the atom column coordinates can be measured is represented by the diagonal elements of the CRLB, that is, the right-hand side members of inequalities (47). However, simultaneous minimization of the diagonal elements of the CRLB as a function of the microscope parameters is in most cases impossible. A decrease of a particular diagonal element often has the unfavorable side effect of an increase of others. Therefore, the optimal experimental design has to be determined by minimizing a scalar measure of the CRLB. Several measures can be found in the literature [22,23]. Among these optimality criteria the obvious choice given the purpose of this paper is the A-optimality criterion. The A-optimality criterion is defined by the scalar measure $\sigma^2_{CR}$:

$$
\sigma^2_{CR} = \mathrm{trace}\, F_\beta^{-1}, \tag{49}
$$

that is, the trace of $F_\beta^{-1}$, or equivalently, the sum of the lower bounds on the variances of the estimators of the atom column coordinates. Minimization of $\sigma^2_{CR}$ as a function of the microscope parameters results in the optimal experimental design.

5.2. Microscope parameters

An overview of the microscope parameters, which enter the parametric model for the STEM intensity observations, described in Section 2, is given here. For simplicity, some of these parameters will be kept constant in the evaluation and optimization of the experimental design.
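The A-optimality criterion of Eq. (49) can be sketched as a small search over candidate designs. This is an illustration under assumptions: the Gauss–Jordan inversion, the function names, and the idea of passing candidate designs as a dictionary of precomputed Fisher matrices are all choices made for this example, not the procedure used in the paper.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("Fisher matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def trace_of_inverse(fisher):
    """Scalar measure of Eq. (49): trace of the inverse Fisher matrix."""
    inverse = invert(fisher)
    return sum(inverse[i][i] for i in range(len(inverse)))


def a_optimal_design(fisher_by_design):
    """Return (design, sigma2_cr) minimizing Eq. (49) over the given candidates.

    fisher_by_design maps a design label (e.g. a tuple of microscope settings)
    to the Fisher matrix computed for that design.
    """
    best = min(fisher_by_design, key=lambda d: trace_of_inverse(fisher_by_design[d]))
    return best, trace_of_inverse(fisher_by_design[best])
```

Taking the trace sums the bounds on all coordinates into one number, so a design that improves one coordinate at the expense of another is only preferred if the total decreases.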


The parameters describing the detector configuration are related to the detector function $D(g)$. In principle, the detector can have any shape or size. However, in the present work we will confine ourselves to the more common ones, namely annular and axial detectors. The inner or outer radius $g_{det}$ of the annular or axial detector, respectively, is tunable.

The parameters describing the probe are: the defocus $\varepsilon$, the spherical aberration constant $C_s$, the objective aperture radius $g_{ap}$, the electron wavelength $\lambda$, the width of the source image $\sigma$, and the reduced brightness $B_r$. In the evaluation, $\lambda$ and $B_r$ have been kept constant.

The parameters describing the image recording are the FOV, the probe sampling distances or, equivalently, the pixel sizes, $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $\tau$. The FOV is chosen fixed, but large enough so as to guarantee that the tails of the probe are collected. The pixel sizes $\Delta x$ and $\Delta y$ are kept constant. It can be shown that the precision will generally improve with smaller pixel sizes, with all other parameters kept constant. However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel SNR decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in [10,24]. It is directly clear from Eqs. (33), (48) and (49) that $\sigma^2_{CR}$ is inversely proportional to the recording time $\tau$ and the reduced brightness $B_r$, so that the precision improves when either is increased. In what follows, the recording time $\tau$ is kept constant, presuming that specimen drift sets practical limits on the exposure time. As already mentioned, $B_r$ has been kept constant too.

5.3. Numerical results

In this subsection, the results of the numerical evaluation of the dependence of $\sigma^2_{CR}$ on the microscope parameters will be discussed. The cases for which the parametric model consists of an isolated atom column and of two atom columns are considered in the first and second part, respectively. It will be discussed how the optimal experimental design of an isolated atom column is influenced by a neighboring atom column.

5.3.1. Isolated atom column

In this part, the experimental design is evaluated and optimized for the special case of a parametric model consisting of an isolated atom column. Therefore, $\sigma^2_{CR}$, which is defined in Eq. (49), is equal to the sum of the lowest variances with which the x- and y-coordinate of the atom column position can be estimated. The FOV is chosen centered about the atom column, because in that case it may be shown that the variances on the x- and y-coordinate are equal.

To begin with, it is assumed that the width of the source image is determined by the aperture angle, following the relation

$$
d_{I50} = \frac{0.54\lambda}{\alpha}, \tag{50}
$$

where $0.54\lambda/\alpha$ is equal to the diameter of the diffraction-error disc containing 50% of the total intensity. In this way, the contribution of the source image to the total probe size is small. Then, meeting Eq. (50), it follows from Eq. (33) that the probe current is constant and equal to $J \approx 10^{-18} B_r$ [17].

Next, the dependence of the precision on the aperture radius is studied. From the evaluation of $\sigma^2_{CR}$, it is found that the optimal aperture radius is mainly determined by the object and that it is the same for annular or axial detectors. The optimal aperture radius is proportional to the width of the column's 1s-state $\Phi_{1s}(g)$ described by Eq. (14). Fig. 1 compares the optimal aperture radius (for $C_s = 0.5$ mm and at optimal defocus) with the width $c = 1/(4\pi a)$ of the 1s-state $\Phi_{1s}(g)$. The optimal aperture radii are plotted as a function of $(d^2/Z + 0.276B)^{-1/2}$, since this is proportional to the width $c$, as shown in [25]. For a given atom column, $d$ represents the interatomic distance, $Z$ the atomic number and $B$ the Debye–Waller factor. The widths $a$ of the corresponding 1s-states $\phi_{1s}(r)$ in real space are given in Table 1. It is clear from Fig. 1 that the influence of the object on the optimal aperture radius is substantial. In contrast to what one may expect, the resulting probe in the optimal design is not as small as possible. It is even larger than the 1s-function $\phi_{1s}(r)$, which is shown in Fig. 2 for a silicon and a gold column in [1 0 0] direction.

Fig. 1. Comparison of the optimal aperture radius (for $C_s = 0.5$ mm) with the width of the 1s-state $\Phi_{1s}(g)$, which is proportional to $(d^2/Z + 0.276B)^{-1/2}$. The Debye–Waller factor $B$ is set to 0.6 Å². The optimal aperture radius increases with the width of the 1s-state. Therefore, it varies strongly for different atom columns. [Plot: width of 1s-state and optimal aperture radius (Å⁻¹) versus $(d^2/Z + 0.276B)^{-1/2}$ (Å⁻¹) for Si [100], Si [110], Sr [100], Sn [100], Cu [100] and Au [100].]

Fig. 2. The left and right figure represent the 1s-state $\phi_{1s}(r)$ for a Si [1 0 0] column and a Au [1 0 0] column, respectively. The amplitudes $|p(r)|$ of their associated optimal probes (for $C_s = 0.5$ mm) are also shown. [Plots: amplitude versus spatial coordinate (Å); legend entries: Probe, Si [100], Au [100].]

Table 1
Width of the 1s-state, and its energy (Debye–Waller factor = 0.6 Å²), interatomic distance, atomic number, and intercolumn distance for different atom columns

| | Si [1 0 0] | Si [1 1 0] | Sr [1 0 0] | Sn [1 0 0] | Cu [1 0 0] | Au [1 0 0] |
|---|---|---|---|---|---|---|
| $a$ (Å) | 0.34 | 0.27 | 0.22 | 0.20 | 0.18 | 0.13 |
| $E_{1s}$ (eV) | 20.2 | 37.4 | 57.3 | 69.8 | 78.3 | 210.8 |
| $d$ (Å) | 5.43 | 3.84 | 6.08 | 6.49 | 3.62 | 4.08 |
| $Z$ | 14 | 14 | 38 | 50 | 29 | 79 |
| $d_{AA}$ (Å) | 1.92 | 1.36 | 4.03 | 2.29 | 2.56 | 2.88 |

Furthermore, an increase of the spherical aberration constant results in a decrease of the optimal aperture radius and vice versa. This effect is especially important for heavy atom columns, such as a gold column in [1 0 0]-direction, where the optimal aperture radius for $C_s = 0$ mm is equal to 0.75 Å⁻¹ and for $C_s = 0.5$ mm it is equal to 0.5 Å⁻¹. For a lighter atom column, such as silicon in [1 0 0]-direction, the optimal aperture radii for $C_s = 0$ and 0.5 mm are the same and equal to 0.28 Å⁻¹.

Then, a comparison between an annular and an axial detector is presented. In Fig. 3, $\sigma^2_{CR}$ is evaluated for a Si [1 0 0] column as a function of the detector-to-aperture radius ratio $g_{det}/g_{ap}$, for an annular as well as an axial detector. The aperture radius and the defocus are set to their optimal values 0.28 Å⁻¹ and $-8$ nm, respectively. All other parameters are held fixed (Tables 2 and 3). Three relations can be derived from this figure. First, for an annular detector, the optimal design is obtained when the detector radius equals the optimal aperture radius. Second, for an axial detector, the optimal detector radius is slightly smaller than the optimal aperture radius. Third, an annular detector may result in higher precisions than an axial detector, when operating at the optimal conditions.

Fig. 3. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), computed as a function of the detector-to-aperture radius for a Si [1 0 0] column, using an annular and an axial detector. The aperture radius and the defocus are set to their optimal values 0.28 Å⁻¹ and $-8$ nm, respectively. The detector radius is variable. All other parameters are fixed (see Tables 2 and 3). [Plot: $\sigma^2_{CR}$ (Å²) versus $g_{det}/g_{ap}$; curves: annular detector, axial detector.]

Table 2
Initial microscope parameter values

| $E_0$ (keV) | $C_s$ (mm) | $K$ | $L$ | $\Delta x$ (pm) | $\Delta y$ (pm) | $\tau$ (s) | $B_r$ (A m⁻² sr⁻¹ V⁻¹) |
|---|---|---|---|---|---|---|---|
| 300 | 0.5 | 99 | 99 | 20 | 20 | $8 \times 10^{-8}$ | $2 \times 10^{7}$ |

Table 3
Parameters of an isolated atom column

| $\beta_x$ (Å) | $\beta_y$ (Å) | $z$ (Å) |
|---|---|---|
| 0 | 0 | $E_0 / (E_{1s} k)$ |

Subsequently, the dependence of the precision on the defocus is evaluated. In Fig. 4, $\sigma^2_{CR}$ is plotted for a Si [1 0 0] column as a function of the defocus for an annular detector, whereas in Fig. 5 this is done for an axial detector. These figures show that the optimal defocus is the same for an annular and an axial detector, for a given spherical aberration constant. Furthermore, the optimal defocus depends strongly on the spherical aberration constant. An empirical relation has been found between the optimal defocus $\varepsilon$, the optimal aperture radius $g_{ap}$ and the spherical aberration constant $C_s$:

Fig. 4. The criterion s2CR ; which is deﬁned in Eq. (49), computed as a function of defocus for a Si ½1 0 0 column, using an annular detector (for Cs ¼ 0 and 0:5 mm). The aperture and detector radius are set to their optimal value ( 1 : All other parameters are ﬁxed (see Tables 2 and 3). 0:28 A

1

eE 12Cs l2 g2ap :

ð51Þ

In this case, the transfer function is nearly equal to one over the whole angular range of the objective aperture. The optimal transfer function for a Si ½1 0 0 column is presented in Fig. 6, where the vertical line represents the optimal aperture radius

0.04

σ 2CR (Å2)

0.03 0.02 Cs = 0 mm Cs = 0.5 mm

0.01 0.00 -400

-200

0

200

400

Defocus (Å) Fig. 5. The criterion s2CR ; which is deﬁned in Eq. (49), computed as a function of defocus for a Si ½1 0 0 column, using an axial detector (for Cs ¼ 0 and 0:5 mm). The aperture and detector radius are set to their optimal values 0:28 and ( 1 ; respectively. All other parameters are ﬁxed (see 0:22 A Tables 2 and 3).

and Cs is set to 0:5 mm: Eq. (51) is derived from Eq. (12) by setting the phase shift wðgÞ exactly to zero for g ¼ gap : Then, the potential merit of Cs -correctors in quantitative STEM applications is studied. Figs. 7 and 8 show the ratio s2CR =s2CR ðCs ¼ 0 mmÞ as a function of the spherical aberration constant, for an annular as well as an axial detector. This is done for a Si ½1 0 0 column, as well as for a Au ½1 0 0 column. The aperture radius has been set to the value that is optimal for Cs ¼ 0:5 mm: From Fig. 7, it follows that the precision that can be gained by reducing Cs is only marginal for a Si

S. Van Aert et al. / Ultramicroscopy 90 (2002) 273–289

284

2.5 2 2 / σ CR (Cs=0 mm) σ CR

Transfer function

1.0 0.5 0.0 Real part Imaginary part

-0.5 -1.0 0.0

0.1

0.2

0.3

0.5

0.4

-1

Annular detector Axial detector

2.0

1.5

1.0 0.0

0.2

0.4

0.6

0.8

1.0

Cs (mm)

Å

Fig. 6. Transfer function for a spherical aberration constant of 0:5 mm and defocus of 8 nm: The vertical line represents the ( 1 : aperture radius, which is equal to 0:28 A

Fig. 8. The ratio s2CR =s2CR ðCs ¼ 0 mmÞ computed as a function of the spherical aberration constant, for a Au ½1 0 0 column, using an annular as well as an axial detector. The ( 1 : The defocus is aperture and detector radius are set to 0:50 A determined by Eq. (51). All other parameters are ﬁxed (see Tables 2 and 3).

2 2 σCR / σCR (Cs= 0 mm)

1.010 Annular detector Axial detector

1.005

1.000 0.0

0.2

0.4

0.6

0.8

1.0

Cs (mm)

Fig. 7. The ratio s2CR =s2CR ðCs ¼ 0 mmÞ computed as a function of the spherical aberration constant, for a Si ½1 0 0 column, using an annular as well as an axial detector. The aperture and ( 1 : The defocus is determined detector radius are set to 0:28 A by Eq. (51). All other parameters are ﬁxed (see Tables 2 and 3).

½1 0 0 column, whereas it follows from Fig. 8 that such a reduction would be more likely to pay off for a Au ½1 0 0 column. For an annular detector, the precision, expressed in terms of the variance, at Cs ¼ 1 mm is about 1:008 and 1:752 times worse than the precision at Cs ¼ 0 mm for a Si ½1 0 0 and a Au ½1 0 0 column, respectively. In terms of the standard deviation, which is deﬁned as the square root of the variance, these fractions are 1:004 and 1:324; respectively. This result may be explained by the fact that the optimal aperture setting is strongly dependent on the atom column. The optimal aperture radius for a Au ½1 0 0 column is much larger than for a Si ½1 0 0 column. Because spherical aberration is observable only for non-

paraxial rays, correction is only necessary for objective lenses working with larger apertures. In other applications, a Cs -corrector may be more worthwhile: a larger pole-piece gap may be possible, allowing greater access for X-ray detectors and sample holders for in situ experiments or a larger probe current is possible in a probe of a given size, which is of importance in microanalysis. The optimal experimental settings described in this part are derived for single isolated atom columns. One should keep in mind, that the precision with which the position of a single isolated column can be estimated is a good performance measure as long as neighboring columns are clearly separated in the image. In this case, the precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. However, images of atom columns taken under experimental settings that are optimal for isolated atom columns may show strong overlap for realistic materials, for example, for a Si ½1 0 0 crystal. Then, the precision with which the position of an atom column can be estimated is affected unfavorably by the presence of neighboring columns [24]. To ﬁnd out if the optimal experimental design changes in the case of neighboring atom columns, the parametric model for the intensity distribution derived in Section 2 will be extended from one to two atom columns. Then, another interesting


question that can be answered is whether an increase of the diameter of the source image, accompanied by larger probe currents, results in higher precision. Thus far, the diameter of the source image has been determined by the diameter of the diffraction-error disc, following Eq. (50). Increasing this diameter results in a higher precision in the case of a single isolated atom column. The 'optimum' value of $s$, and therefore of $d_{I50}$, would be infinity. However, this is not a realistic value, since neighboring columns will then strongly overlap. This overlap will be taken into account in the following part.
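The unfavorable effect of overlap on the attainable precision can be illustrated with a toy calculation. The sketch below is not the parametric STEM model of Section 2: it assumes a one-dimensional image of two equal Gaussian peaks recorded with Poisson counting noise, and every name and number in it (`crlb_sum`, the peak width, the counts per column, the field of view) is hypothetical. It returns the sum of the Cramér-Rao lower bounds on the variances of the two position estimates, which here plays the role that the sum of the lowest attainable variances plays in the criterion.

```python
import numpy as np

def crlb_sum(separation, width=0.5, counts_per_column=1000.0,
             half_fov=6.0, pixel=0.02):
    """Sum of the Cramer-Rao lower bounds on two peak positions (toy 1D model).

    Assumption: Poisson-distributed pixel counts whose expectation is the sum
    of two Gaussian peaks of equal width and equal total counts.
    """
    x = np.arange(-half_fov, half_fov + pixel / 2, pixel)   # pixel centres
    positions = np.array([-separation / 2, separation / 2])
    u = x[None, :] - positions[:, None]                      # shape (2, n_pixels)
    peaks = (counts_per_column * pixel
             * np.exp(-u**2 / (2 * width**2)) / (np.sqrt(2 * np.pi) * width))
    expected = peaks.sum(axis=0)                             # expected counts per pixel
    d_expected = peaks * u / width**2                        # derivative w.r.t. each position
    # Fisher information for Poisson data: sum_k (dE_k/db_i)(dE_k/db_j) / E_k
    fisher = (d_expected / expected) @ d_expected.T
    return np.trace(np.linalg.inv(fisher))

well_separated = crlb_sum(4.0)    # peaks eight widths apart
overlapping = crlb_sum(0.25)      # peaks half a width apart
print(f"separated: {well_separated:.2e}, overlapping: {overlapping:.2e}")
```

For well-separated peaks the bound approaches twice the single-peak value (width squared divided by the counts per column), whereas it grows markedly once the peaks overlap, in line with the qualitative statement above.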

5.3.2. Two neighboring atom columns

In this part, the experimental design is evaluated and optimized for the special case of a parametric model consisting of two neighboring atom columns. The atom column types are similar to those considered in Section 5.3.1 and the intercolumn distances are given in Table 1. The criterion $\sigma^2_{CR}$, which is defined in Eq. (49), is now equal to the sum of the lowest variances with which the x- and y-coordinates of the two atom column positions can be estimated. The FOV is chosen centered about the atom column positions, whose coordinates are given in Table 4. In this case, it may be shown that the variances on the x-coordinates as well as the variances on the y-coordinates are equal. First, it has been assumed that the diameter of the source image is given by Eq. (50). From the evaluation of the criterion $\sigma^2_{CR}$, it follows that the optimal design for two neighboring atom columns is almost equal to the one for an isolated atom column. This means that neighboring columns of some atom types may show strong overlap in images taken under the optimal conditions. Compared to the results given in Section 5.3.1,

Table 4
Parameters of two neighboring atom columns

βx1 (Å)      βy1 (Å)    βx2 (Å)     βy2 (Å)    z (Å)
−d_AA/2      0          d_AA/2      0          E0/(E1s k)


changes in the optimal aperture radius are only of the order of 5 percent. The optimal detector radius is still equal to or slightly smaller than the optimal aperture radius for annular and axial detectors, respectively. The attainable precision is again higher for annular than for axial detectors. Furthermore, the optimal defocus is still given by Eq. (51) and a Cs-corrector is more likely to pay off for heavy than for light atom columns, although the precision that can be gained is only marginal.

Second, the diameter of the source image has been treated as a variable. In practice, this is possible by adjusting the settings of the condenser lenses, allowing the demagnification of the source to be continuously varied. It is well known that an increase of the diameter of the source image is accompanied by two side effects: an increase of the source size and of the probe current, having an unfavorable and a favorable effect on the precision, respectively [6]. The potential merit of increasing the diameter of the source image is studied by evaluating the criterion $\sigma^2_{CR}$. For annular detectors, it has been found that the optimal diameter of the source image is about twice as large as the one defined in Eq. (50) for the atom column types and distances given in Table 1. For example, Fig. 9 shows the computed criterion $\sigma^2_{CR}$ as a function of the ratio $d_{I50}/(0.54\lambda/\alpha)$ for two neighboring Si [1 1 0] columns. The aperture radius has been set to 0.36 Å⁻¹, being optimal for Cs = 0.5 mm, and the radius of the annular detector is set equal to the aperture radius. For axial detectors, it has been found that the optimal diameter of the source image is about equal to the one defined in Eq. (50) for the atom column types and distances given in Table 1. Furthermore, it should be noted that the choice of the diameter of the source image has no effect on the optimal settings of the other microscope parameters.
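To give a feeling for the length scales involved, the following sketch converts the aperture radius used for Fig. 9 into a source image diameter. It rests on two assumptions that are not taken from this section: that Eq. (50) sets the source image diameter equal to the normalization 0.54λ/α used in Fig. 9, and that the aperture radius in reciprocal space is related to the aperture angle by α = λ·g_ap (small angles). The function name is hypothetical.

```python
def diffraction_disc_diameter(aperture_radius):
    """Diameter 0.54*lambda/alpha of the diffraction-error disc, in Angstrom.

    Assumption: alpha = lambda * g_ap (small-angle approximation), so that
    0.54 * lambda / alpha = 0.54 / g_ap, with g_ap given in 1/Angstrom.
    """
    return 0.54 / aperture_radius

g_ap = 0.36                                 # aperture radius used for Fig. 9 (1/Angstrom)
d_eq50 = diffraction_disc_diameter(g_ap)    # source image diameter assumed for Eq. (50)
d_annular = 2.0 * d_eq50                    # "about twice as large" for annular detectors
print(f"Eq. (50): {d_eq50:.2f} A, annular optimum: about {d_annular:.2f} A")
```

Under these assumptions the diffraction-error disc is about 1.5 Å wide, so the optimal source image for an annular detector would be roughly 3 Å.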
Finally, it should be mentioned that thermal diffuse scattering is not taken into account in the parametric model for the image intensity distribution used in this paper. This is justified by the fact that the inner radius of the annular detector must not be taken too large in a quantitative experiment, where a high dose efficiency is important in


Fig. 9. The criterion $\sigma^2_{CR}$ computed as a function of the ratio $d_{I50}/(0.54\lambda/\alpha)$ for two neighboring Si [1 1 0] columns for an annular detector. The aperture and detector radius are set to their optimal value 0.36 Å⁻¹. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 4).

order to provide high precision. In this case, the 1s-states dominate the scattering. In the following subsection, an intuitive interpretation of the described results is given.

5.4. Interpretation of the results

Fig. 10. The left-hand and right-hand side members of Eq. (52) computed as a function of the aperture radius for a Sr [1 0 0] column, using an annular detector. The radius of the annular detector is equal to the aperture radius. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

Proportionality relations for $\sigma^2_{CR}$ for dark-field and bright-field images consisting of an isolated atom column have been derived. In dark-field imaging, where the non-interacting electrons are eliminated from detection, it has been found that

$$\sigma^2_{CR} \sim \frac{d_{50}^2}{D_{\mathrm{int.}}}, \qquad (52)$$

where $D_{\mathrm{int.}}$ is equal to the total number of interacting electrons and $d_{50}$ represents the radius of the image intensity distribution of the interacting electrons containing 50% of its total intensity. In particular, relation (52) holds for annular detectors having an inner radius greater than or equal to the aperture radius. In bright-field imaging, where the non-interacting electrons contribute to the background intensity in the image, it has been found that

$$\sigma^2_{CR} \sim \frac{D_{\mathrm{n.int.}}\, d_{50}^4}{D_{\mathrm{int.}}^2\, K L\, \Delta x\, \Delta y}, \qquad (53)$$

where $D_{\mathrm{n.int.}}$ is equal to the total number of non-interacting electrons. Relation (53) holds for axial

Fig. 11. The left-hand and right-hand side members of Eq. (53) computed as a function of the aperture radius for a Sr [1 0 0] column, using an axial detector. The radius of the axial detector is equal to the aperture radius. The defocus is determined by Eq. (51). All other parameters are fixed (see Tables 2 and 3).

detectors as well as for annular detectors having an inner radius smaller than the aperture radius. The validity of these proportionality relations is illustrated in Figs. 10 and 11, where their left-hand and right-hand sides are shown as a function of the aperture radius for a Sr [1 0 0] column, using an annular and an axial detector, respectively. These relations allow us to gain deeper insight into the numerical results derived in Section 5.3. They show that, in order to obtain a higher precision, one has to balance the width of the image intensity


distribution and the number of interacting and non-interacting electrons. For example, at the optimal design, the optimal aperture radius $g_{ap}$ strongly depends on the atom column. Furthermore, an annular detector is to be used whose inner radius $g_{det}$ is equal to $g_{ap}$. How can this be explained? On the one hand, $d_{50}$ will become smaller if the probe becomes smaller, that is, if the aperture radius increases. However, the decrease of $d_{50}$ will become less important if the probe width is about equal to the width of the 1s-state. This is due to the fact that $d_{50}$ is mainly determined by the excitation of the 1s-state, described by Eq. (3). On the other hand, the accompanying increase of the detector radius results in an enormous loss of interacting electrons. As a consequence, the optimal design balances the excitation of the atom column-dependent 1s-state and the loss of electrons in the detector. In this way, the dependence of the optimal aperture radius on the atom column can be explained.

5.5. Comparison with conventional approach

In the conventional approach, which is based on direct visual interpretability, the Scherzer conditions for incoherent imaging are usually applied [3,4]:

$$g_{ap} = \frac{1}{\lambda}\left(\frac{4\lambda}{C_s}\right)^{1/4}, \qquad \varepsilon = (C_s \lambda)^{1/2}. \qquad (54)$$
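As a rough numerical illustration of Eq. (54), the sketch below evaluates the Scherzer aperture radius and defocus for Cs = 0.5 mm. The accelerating voltage of 300 kV is an assumption made here for the sake of the example; the value actually used in the paper is part of the fixed parameters in Table 2 and is not reproduced in this section. The function names are hypothetical, and the wavelength follows from the standard relativistic expression.

```python
import math

H = 6.62607015e-34       # Planck constant (J s)
M0 = 9.1093837015e-31    # electron rest mass (kg)
E = 1.602176634e-19      # elementary charge (C)
C = 299792458.0          # speed of light (m/s)

def electron_wavelength(voltage):
    """Relativistic electron wavelength (m) for an accelerating voltage (V)."""
    energy = E * voltage
    return H / math.sqrt(2 * M0 * energy * (1 + energy / (2 * M0 * C**2)))

def scherzer_incoherent(cs, wavelength):
    """Aperture radius (1/m) and defocus (m) according to Eq. (54)."""
    g_ap = (4 * wavelength / cs) ** 0.25 / wavelength
    defocus = math.sqrt(cs * wavelength)
    return g_ap, defocus

wavelength = electron_wavelength(300e3)             # assumed 300 kV, not from the paper
g_ap, defocus = scherzer_incoherent(0.5e-3, wavelength)
g_ap_per_angstrom = g_ap * 1e-10                    # convert 1/m to 1/Angstrom
defocus_nm = defocus * 1e9                          # convert m to nm
print(f"g_ap = {g_ap_per_angstrom:.3f} 1/A, defocus = {defocus_nm:.1f} nm")
```

Under this assumed voltage the results are of the same size as the Scherzer values listed in Table 5; small differences are to be expected because the wavelength used here is not taken from the paper.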

Table 5 compares these Scherzer conditions and the optimal conditions for a Sr [1 0 0] column.

Table 5
Comparison between the optimal conditions (for an isolated Sr [1 0 0] column) and the Scherzer conditions for an annular detector, with Cs = 0.5 mm

                Optimal conditions (Sr [1 0 0])   Scherzer conditions
ε (nm)          16                                32
g_ap (Å⁻¹)      0.40                              0.56
g_det (Å⁻¹)     0.40                              1.12
σ²_CR (Å²)      0.0037                            0.2304
DE (%)          2.9                               0.034
d50 (Å)         0.94                              0.75

In the column of the Scherzer conditions, the value of $g_{det}$ has been taken two times larger than $g_{ap}$, which is representative of a typical Crewe detector [26]. As can be noticed clearly, the Scherzer conditions differ significantly from the optimal conditions. The precision, expressed in terms of the variance, at Scherzer conditions is about 62 times worse than the precision that could be reached at the optimal design. In terms of the standard deviation, which is defined as the square root of the variance, this factor is about 8. This is not surprising and can be explained from Eq. (52). Due to the large hole in the detector, the dose efficiency is very low at Scherzer conditions, thus affecting the precision in an unfavorable way. Due to the smaller probe size, the width of the intensity distribution is slightly smaller at Scherzer conditions than at the optimal conditions, thus affecting the precision in a favorable way. However, the extremely low number of detected electrons is the dominant factor, resulting in a low precision.

From this comparison, it may be concluded that there is a world of difference between the Scherzer conditions and the optimal conditions. One has to keep in mind, though, that both sets of conditions are derived for different purposes: direct visual interpretability on the one hand and precise measurement of the atom column positions on the other hand. Nevertheless, both purposes may go hand in hand. As explained earlier, quantitative structure determination is done by numerically fitting the parametric model to the experimental data. The fit is evaluated using a criterion of goodness of fit. In practice, the search for the global optimum of the criterion of goodness of fit is an iterative numerical procedure. At each iteration, the coordinates are slightly changed in order to improve the fit. In order to guarantee convergence to the global optimum of the goodness of fit, good initial conditions are required. This means that it is important to find a reasonable trial structure. Trial positions for the atom columns may be obtained from experimental images that are optimized for qualitative interpretation, whereas the refinement may result from experimental images that are optimized for quantitative interpretation.
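The ratios quoted for Table 5 can be checked with a few lines of arithmetic, and relation (52) can be used for a rough consistency check. The sketch below makes one assumption that goes beyond the text: the number of interacting detected electrons is taken to be proportional to the DE entry of Table 5, read here as the dose efficiency mentioned above, at equal incident dose. The dictionary keys and the function name are hypothetical.

```python
import math

# Values copied from Table 5 (annular detector, Cs = 0.5 mm, Sr [1 0 0] column)
optimal = {"var": 0.0037, "dose_eff": 2.9, "d50": 0.94}
scherzer = {"var": 0.2304, "dose_eff": 0.034, "d50": 0.75}

# Loss in precision at Scherzer conditions, as variance and standard deviation
variance_ratio = scherzer["var"] / optimal["var"]
std_ratio = math.sqrt(variance_ratio)

def relation_52(d50, n_interacting):
    """Right-hand side of relation (52), up to a proportionality constant."""
    return d50**2 / n_interacting

# Assumption: D_int is proportional to the dose efficiency at equal incident dose
estimated_ratio = (relation_52(scherzer["d50"], scherzer["dose_eff"])
                   / relation_52(optimal["d50"], optimal["dose_eff"]))

print(f"variance ratio {variance_ratio:.1f}, std ratio {std_ratio:.1f}, "
      f"relation (52) estimate {estimated_ratio:.1f}")
```

Under this assumption, relation (52) yields a ratio of the same order as the tabulated one: the slightly narrower intensity distribution at Scherzer conditions is far outweighed by the much smaller number of detected electrons.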


6. Conclusions

Conventionally, the design of a STEM experiment is based on qualitative image interpretation. However, in terms of image interpretation, the future lies in quantitative measurement of structural parameters. Since this is a different purpose, the design has to be reconsidered. A quantitative measure has been proposed to evaluate and optimize the design of a high-resolution STEM experiment. It is based on the statistical precision with which the positions of atom columns can be estimated. In the resulting optimal design, the aperture radius has been found to be mainly determined by the object under study. More specifically, it is proportional to the weight of the atom column. The optimal inner or outer radius of an annular or an axial detector turns out to be equal to or slightly smaller than the optimal aperture radius, respectively. However, an annular detector results in a higher precision than an axial detector. The resulting optimal defocus is the one for which the transfer function comes close to unity over the whole angular range of the aperture. The merit of Cs-correctors in quantitative STEM applications depends on the object under study. It pays off more for heavy atom columns, although the precision that can be gained is only marginal. For annular detectors, increasing the size of the source image beyond the size of the diffraction-error disc, which increases the probe current at the expense of resolution, has a favorable effect on the attainable precision. For axial detectors, the optimal size of the source image is about equal to the size of the diffraction-error disc.

Acknowledgements

The authors would like to thank Dr. J.E. Barth and Dr. M.A.J. van der Stam for fruitful discussions related to this work. The research of Dr. A.J. den Dekker has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences.

References

[1] S.J. Pennycook, Scanning transmission electron microscopy: Z-contrast, in: S. Amelinckx, D. Van Dyck, J. Van Landuyt, G. Van Tendeloo (Eds.), Handbook of Microscopy: Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, VCH, Weinheim, 1997, pp. 595–620.
[2] J.M. Cowley, Scanning transmission electron microscopy, in: S. Amelinckx, D. Van Dyck, J. Van Landuyt, G. Van Tendeloo (Eds.), Handbook of Microscopy: Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, VCH, Weinheim, 1997, pp. 563–594.
[3] O. Scherzer, J. Appl. Phys. 20 (1949) 20.
[4] S.J. Pennycook, D.E. Jesson, Ultramicroscopy 37 (1991) 14.
[5] P.D. Nellist, S.J. Pennycook, Phys. Rev. Lett. 81 (19) (1998) 4156.
[6] P.D. Nellist, S.J. Pennycook, The principles and interpretation of annular dark-field Z-contrast imaging, in: P.W. Hawkes (Ed.), Advances in Imaging and Electron Physics, Vol. 113, Academic Press, San Diego, 2000, pp. 147–199.
[7] D. Van Dyck, Prospects of quantitative high resolution electron microscopy, in: L. Frank, F. Čiampor (Eds.), Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology, Vol. III, The Czechoslovak Society for Electron Microscopy, Brno, 2000, pp. 13–18.
[8] A. van den Bos, Measurement errors, in: J.G. Webster (Ed.), Encyclopedia of Electrical and Electronics Engineering, Vol. 12, Wiley, New York, 1999, pp. 448–459.
[9] A. van den Bos, Parameter estimation, in: P.H. Sydenham (Ed.), Handbook of Measurement Science, Vol. 1, Wiley, Chichester, 1982, pp. 331–377.
[10] A.J. den Dekker, J. Sijbers, D. Van Dyck, J. Microsc. 194 (1999) 95.
[11] J. Broeckx, M. Op de Beeck, D. Van Dyck, Ultramicroscopy 60 (1995) 71.
[12] D. Van Dyck, M. Op de Beeck, Ultramicroscopy 64 (1996) 99.
[13] S.J. Pennycook, D.E. Jesson, Acta Metall. Mater. 40 (1992) S149–S159.
[14] P. Geuens, J.H. Chen, A.J. den Dekker, D. Van Dyck, An analytic expression in closed form for the electron exit wave, Acta Crystallogr. Section A, 55 Supplement, Abstract P11.OE.002, 1999.
[15] J.M. Cowley, Ultramicroscopy 2 (1976) 3.
[16] C. Mory, M. Tence, C. Colliex, J. Microsc. Spectrosc. Electron. 10 (1985) 381.
[17] J.E. Barth, P. Kruit, Optik 3 (1996) 101.
[18] L. Reimer, Elements of a transmission electron microscope, in: Transmission Electron Microscopy, Physics of Image Formation and Microanalysis, Springer, Berlin, Heidelberg, 1993, pp. 86–135.
[19] A. van den Bos, A.J. den Dekker, Resolution reconsidered: conventional approaches and an alternative, in: P.W. Hawkes (Ed.), Advances in Imaging and Electron Physics, Vol. 117, Academic Press, San Diego, 2001, pp. 241–360.
[20] A.M. Mood, F.A. Graybill, D.C. Boes, Introduction to the Theory of Statistics, 3rd Edition, McGraw-Hill, Tokyo, 1974.
[21] M.G. Kendall, A. Stuart, The Advanced Theory of Statistics: Inference and Relationship, Vol. 2, 2nd Edition, Charles Griffin and Company Limited, London, 1967.
[22] V.V. Fedorov, Theory of Optimal Experiments, Academic Press, New York, London, 1972.
[23] A. Pázman, Foundations of Optimum Experimental Design, D. Reidel Publishing Company, Dordrecht, Boston, Lancaster, Tokyo, 1986.
[24] E. Bettens, D. Van Dyck, A.J. den Dekker, J. Sijbers, A. van den Bos, Ultramicroscopy 77 (1999) 37.
[25] D. Van Dyck, J.H. Chen, Solid State Commun. 109 (8) (1999) 501.
[26] S.J. Pennycook, D.E. Jesson, M.F. Chisholm, N.D. Browning, A.J. McGibbon, M.M. McGibbon, J. Microsc. Soc. Am. 1 (6) (1995) 231.