Deconvolution of core electron energy loss spectra


Ultramicroscopy 109 (2009) 1343–1352. doi:10.1016/j.ultramic.2009.06.010

J. Verbeeck a,b,*, G. Bertoni c,a

a EMAT, University of Antwerp, Groenenborgerlaan 171, B-2020 Antwerp, Belgium
b Institut für Festkörperphysik, Technische Universität Wien, A-1040 Wien, Austria
c Italian Institute of Technology (IIT), via Morego 30, IT-16163 Genoa, Italy

* Corresponding author at: EMAT, University of Antwerp, Groenenborgerlaan 171, B-2020 Antwerp, Belgium. E-mail address: [email protected] (J. Verbeeck).

Article history: received 13 February 2009; received in revised form 11 June 2009; accepted 23 June 2009.

Abstract

Different deconvolution methods for removing multiple scattering and instrumental broadening from core loss electron energy loss spectra are compared, with special attention to the artefacts they introduce. The Gaussian modifier method, the Wiener filter, maximum entropy, and model based methods are described. Their performance is compared on virtual spectra where the true single scattering distribution is known. A test on experimental spectra confirms the good performance of model based deconvolution in comparison to maximum entropy methods and shows the advantage of knowing the estimated error bars from a single spectrum acquisition.

© 2009 Elsevier B.V. All rights reserved.

PACS: 07.05.Kf; 79.20.Uv

Keywords: EELS; Energy resolution; Super resolution; Maximum entropy

1. Introduction

Experimental electron energy loss (EELS) spectra are related to the inelastic scattering events in the sample by the so-called single scattering distribution (SSD), which can in principle be calculated. The relationship between the experimental spectrum and this SSD is complicated by several effects:

- Multiple inelastic scattering events occur.
- The energy resolution of the instrument is finite due to a non-monochromatic gun.
- Aberrations occur in the spectrometer.

These effects mean that an experimental spectrum cannot be compared directly with a theoretically obtained SSD. For core loss spectra, it is common to solve this by deconvolving the experimental core loss spectrum with a low loss spectrum obtained under the same conditions. In principle this can remove all the effects described above, except that, unfortunately, noise in both the low loss and the core loss spectrum prevents this.


In a noise free system the energy resolution of the spectrometer, for instance, would be unimportant, since it could be completely undone by deconvolution, revealing a spectrum limited only by natural lifetime broadening. This has led several researchers to believe that, if only they could find an appropriate deconvolution method, monochromators might be replaced by software [1–5]. The hope was focused on maximum entropy (ME) methods, which had considerable success in astrophysics [6,7] since they were able to go beyond Fourier based deconvolution methods by applying the extra prior knowledge that images cannot contain negative values.

At present there seems to be considerable confusion in the EELS community about the application of these ME methods to EELS spectra. We will try to shed light on this confusion by comparing different deconvolution methods, including some ME variants, on both numerically generated virtual spectra and experimental spectra. The advantage of using virtual spectra is that the SSD is known, so it is possible to compare the output of the deconvolution methods objectively with the expected result. Unfortunately we cannot do this for real experiments, which may be one of the main reasons for the confusion: there is no way to be sure that the result obtained from deconvolution is even close to the SSD, except by comparing with other experimental results obtained with, e.g., a monochromator. The decision often depends on the subjective visual preferences of the experimenter.

This paper is structured as follows: we start with a description of the different methods and discuss their strengths and weaknesses. We then apply these techniques first to virtual spectra, to compare their performance objectively, and finally to experimentally obtained spectra.


2. Deconvolution methods


In general, we can view the observed spectrum I(E) as

$$I(E) = O(E) \otimes P(E) + N(E) \qquad (1)$$

with O(E) the single scattering EELS spectrum (SSD), P(E) the point spread function (PSF) including multiple scattering and instrumental broadening, and N(E) additive noise. Deconvolution consists of estimating Ô(E) ≈ O(E) in the presence of the noise N(E). We assume that the point spread function P(E) can be obtained with virtually no noise and that it is normalised, $\sum_j P(E_j) = 1$. In practice one uses a low loss spectrum, for two reasons:

- The excitations in the low loss region are the most probable, and therefore the only ones likely enough to occur in a multiple scattering process in combination with a core loss excitation.
- The excitations in the low loss region have a low energy loss, so they cause only a small energy shift of the core loss excitation when a combined core loss + low loss event occurs.

Taking the low loss spectrum as P(E) also includes the instrumental broadening and the energy width of the electron source, which in principle can be undone by deconvolution. We now describe different practical solutions for recovering an estimate Ô(E) from a recorded core loss spectrum I(E).

2.1. Straightforward deconvolution

It is straightforward to write the deconvolution making use of the convolution theorem:

$$\hat{O}(E) \approx \mathcal{F}^{-1}\left[\frac{\mathcal{F}(I(E))}{\mathcal{F}(P(E))}\right] = \mathcal{F}^{-1}\left[\frac{\tilde{I}(f)}{\tilde{P}(f)}\right] \qquad (2)$$

with $\mathcal{F}$ a Fourier transform and $\tilde{I}(f) = \mathcal{F}(I(E))$. This process only works in the absence of noise and if $\tilde{P}(f)$ has no zeros. In practice it fails badly: so-called noise amplification occurs in the regions where $\tilde{P}(f)$ is small [8].
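To make the noise amplification concrete, here is a minimal numerical sketch of Eq. (2) (not from the original paper; NumPy is assumed and the function name is ours):

```python
import numpy as np

def naive_deconvolution(I, P):
    """Straightforward Fourier deconvolution, Eq. (2).

    I: recorded core loss spectrum (1D array).
    P: normalised point spread function on the same energy axis, sum(P) == 1.
    Noise in I is divided by the Fourier transform of P, so frequencies
    where |P~(f)| is small blow up: the noise amplification described above.
    """
    return np.real(np.fft.ifft(np.fft.fft(I) / np.fft.fft(P)))
```

Running this on any realistically noisy spectrum shows the high frequency explosion immediately, which is what motivates the damped variants below.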

2.2. Gaussian modifier

The Gaussian modifier technique [8] tries to solve this issue by damping the higher frequencies (assuming $\tilde{P}(f)$ is small at higher frequencies) with a Gaussian function $\tilde{G}(f)$:

$$\hat{O}(E) \approx \mathcal{F}^{-1}\left[\frac{\tilde{I}(f)}{\tilde{P}(f)}\,\tilde{G}(f)\right] \qquad (3)$$

The width of this Gaussian can be tuned to find a compromise between noise amplification and resolution. This usually gives a smooth result, but a less than optimal deconvolution, because the Gaussian also attenuates frequencies that have a good signal to noise ratio. This is the most widely used method, as it is implemented in Digital Micrograph™. It already shows that there is a trade-off between high resolution and high signal to noise ratio (a weak versus a strong Gaussian modifier). The obtained result is equal to what one would get from an experiment with a point spread function $G(E) = \mathcal{F}^{-1}(\tilde{G}(f))$. In this sense the Gaussian modifier is a very natural method for removing plural scattering and (slightly) increasing the total resolution of the result at the expense of increased noise. Functions other than Gaussians can be used for G(E) if needed.
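A sketch of the Gaussian modifier along the same lines (again illustrative only; the FWHM-to-sigma conversion and the analytic Fourier transform of a unit-area Gaussian are standard results):

```python
import numpy as np

def gaussian_modifier(I, P, fwhm_eV, dispersion_eV):
    """Fourier ratio deconvolution damped by a Gaussian modifier, Eq. (3).

    fwhm_eV: width of the reconvolution Gaussian G(E); smaller values give
    higher resolution but more noise, larger values the opposite.
    dispersion_eV: energy width of one spectral channel.
    """
    f = np.fft.fftfreq(len(I), d=dispersion_eV)       # frequencies in 1/eV
    sigma = fwhm_eV / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    G_f = np.exp(-2.0 * (np.pi * f * sigma) ** 2)     # FT of unit-area Gaussian
    return np.real(np.fft.ifft(np.fft.fft(I) / np.fft.fft(P) * G_f))
```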

2.3. Wiener filter

A solution to the attenuation problem is the so-called Wiener filter, obtained by minimising the root mean square (RMS) error between I(E) and Ô(E) ⊗ P(E) [9,10]:

$$\hat{O}(E) \approx \mathcal{F}^{-1}[\tilde{Y}(f)\,\tilde{I}(f)] = I(E) \otimes Y(E) \qquad (4)$$

with $\tilde{Y}(f)$ an equalising function in Fourier space obtained from

$$\tilde{Y}(f) = \frac{\tilde{P}^{*}(f)}{|\tilde{P}(f)|^{2} + \dfrac{|\tilde{N}(f)|^{2}}{|\tilde{O}(f)|^{2}}} = \frac{\tilde{P}^{*}(f)}{|\tilde{P}(f)|^{2} + \dfrac{1}{\mathrm{SNR}}} \qquad (5)$$

This can intuitively be seen as the straightforward deconvolution filter if SNR ≫ 1; it becomes zero if SNR ≪ 1. Note that of course the true single scattering function O(E) is unknown, so in principle we cannot calculate the SNR. We can, however, iteratively replace O(E) by Ô(E):

$$\tilde{Y}_{i+1}(f) = \frac{\tilde{P}^{*}(f)}{|\tilde{P}(f)|^{2} + \dfrac{|\tilde{N}(f)|^{2}}{|\tilde{\hat{O}}_{i}(f)|^{2}}} \qquad (6)$$

$$\hat{O}_{i+1}(E) = \mathcal{F}^{-1}[\tilde{Y}_{i+1}(f)\,\tilde{I}(f)] \qquad (7)$$

starting with $\hat{O}_0(E) = I(E)$.

One of the main obstacles to practical use of the Wiener filter is that the power spectrum of the noise, $|\tilde{N}(f)|^2$, needs to be known. In [11] we describe a method to measure this for a given spectrometer, making use of a Digital Micrograph™ script which is available from the authors. The suppression of frequency components where SNR ≪ 1 can be seen as a downsampling of the data from N to N − m points, with m the number of frequencies suppressed to zero. If the suppressed frequencies are also the highest frequencies, we get a trade-off between SNR and resolution: the better the input SNR, the less downsampling takes place and the more energy resolution is recovered. This trade-off can be made flexible by changing the threshold for suppressing frequency components to zero; e.g. SNR > a could be required, with a a user-defined parameter. Ringing artefacts occur if the true SSD is not bandwidth limited to the maximum frequency restored by the Wiener filter (the only bandwidth limit on the true SSD is lifetime broadening). These ringing artefacts are exactly what would be observed for a system with a point spread function in the form of a squared sinc function [12] (the Fourier transform of the sharp bandwidth limit). A real microscope will usually have a smoother point spread function. For SSDs that are not bandwidth limited to the point where the Wiener filter starts to cut off, it may be advisable to use the Gaussian modifier method instead, with a modifier that is the Fourier transform of a realistic point spread function.
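The iterative scheme of Eqs. (6) and (7) can be sketched as follows (a simplification under the assumptions above; for Poisson-limited counting noise the noise power spectrum is flat and roughly equal to the total number of counts, otherwise it should be measured as in Ref. [11]):

```python
import numpy as np

def wiener_deconvolution(I, P, noise_power, n_iter=5):
    """Iterative Wiener filter, Eqs. (4)-(7), starting from O_hat_0 = I.

    noise_power: |N~(f)|^2, a scalar for white noise (e.g. I.sum() for
    Poisson-limited data) or an array for measured, non-white noise.
    """
    I_f = np.fft.fft(I)
    P_f = np.fft.fft(P)
    O_f = I_f.copy()                                       # O_hat_0 = I
    for _ in range(n_iter):
        inv_snr = noise_power / np.maximum(np.abs(O_f) ** 2, 1e-30)
        Y_f = np.conj(P_f) / (np.abs(P_f) ** 2 + inv_snr)  # Eq. (6)
        O_f = Y_f * I_f                                    # Eq. (7)
    return np.real(np.fft.ifft(O_f))
```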

2.4. Maximum entropy methods

Another class of deconvolution methods takes into account the fact that the observations are strictly positive. Two such iterative methods will be used in this paper. The iterative image space restoration algorithm (ISRA) [13,14] is based on the assumption of normally distributed noise:

$$\hat{O}_{i+1}(E) = \hat{O}_{i}(E)\,\frac{I(E) \otimes P(E)}{[P(E) \otimes \hat{O}_{i}(E)] \otimes P(E)} \qquad (8)$$

The Richardson–Lucy algorithm (RLA) [6] is based on the assumption of Poisson distributed noise:

$$\hat{O}_{i+1}(E) = \hat{O}_{i}(E)\left[P(E) \otimes \frac{I(E)}{P(E) \otimes \hat{O}_{i}(E)}\right] \qquad (9)$$

Both algorithms are based on maximum likelihood deconvolution, like the Wiener filter, extended with an extra constraint maximising the entropy, although this is not obvious from Eqs. (8) and (9). The reader is referred to [6,13,14] for details.
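Both update rules are compact enough to sketch numerically (a minimal sketch, not from the paper: circular FFT convolutions are used, so spectra are assumed suitably extrapolated at their ends, and the correlation form below coincides with Eqs. (8) and (9) for a symmetric PSF):

```python
import numpy as np

def _conv(a, b):   # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def _corr(a, b):   # correlation, i.e. convolution with b(-E)
    return np.real(np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))))

def isra(I, P, n_iter):
    """ISRA update of Eq. (8); I and P must be strictly positive."""
    O = I.copy()
    for _ in range(n_iter):
        O = O * _corr(I, P) / np.maximum(_corr(_conv(P, O), P), 1e-30)
    return O

def rla(I, P, n_iter):
    """Richardson-Lucy update of Eq. (9); a channel that hits zero stays zero."""
    O = I.copy()
    for _ in range(n_iter):
        O = O * _corr(I / np.maximum(_conv(P, O), 1e-30), P)
    return O
```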

It is important to note here that the constraint on entropy is not guaranteed to be applicable to EELS spectra. The Shannon entropy is given by [15]

$$S = -\sum_i p_i \ln p_i \qquad (10)$$

with $p_i = P(\hat{I}_i; I_i^0)$ the probability that the proposed solution $\hat{I}_i = \hat{O}(E) \otimes P(E)$ in pixel i agrees with the expectation spectrum $I_i^0 = O(E) \otimes P(E)$. An alternative, relative entropy (also known as the Kullback–Leibler divergence [16]) can be defined as minus the log likelihood in the case of independent Poisson observations:

$$D = -\sum_i \ln p_i \qquad (11)$$

Note that the two definitions of entropy are quite different and can only be calculated if O(E), P(E) and the noise model are known, which makes it impossible to use them in this form in practice. Therefore it is common in ME methods to assume that a flat spectrum would be the most probable outcome, approximating the Shannon entropy as

$$S_0 = -\sum_i p_{i0} \ln p_{i0} \qquad (12)$$

with $p_{i0} = P(\hat{I}_i; \bar{I})$ the probability that the proposed solution agrees with an expectation model of a flat spectrum with $\bar{I}$ counts. For the relative entropy this becomes

$$D_0 = -\sum_i \ln p_{i0} \qquad (13)$$

Note that only D is a good measure of how close the proposed solution is to the true expectation model $I_i$, since minimising D maximises the likelihood, and it is this solution that is obtained in a model based fitting algorithm based on maximum likelihood. The Shannon entropy S does not necessarily help us in selecting a solution unless we can invoke physical reasons why S should be maximal or minimal for any real solution $\hat{I}_i$.

ME methods maximise the Shannon entropy. An argument for this can be made by linking Shannon entropy with information: the higher S, or its approximation $S_0$, the more information is contained in the solution. Information, however, has to be understood in the following sense: how many parameters are needed to represent this solution, i.e. how redundant is it? In practice, maximising $S_0$ creates more and more noise, which makes the redundancy in the solution smaller; in other words, more information is needed to describe the solution. In the first few steps of an ME iteration this typically leads to increased resolution in the result, but after that the method starts to act as a sort of noise generator, as will be demonstrated on virtual spectra in Section 3. Replacing the entropy measures S, D by their estimates $S_0$, $D_0$ under a flat expectation model makes things even worse, and the maxima or minima of both are not guaranteed to be meaningful in any sense. In Section 3 it will be shown on numerically generated data that the closest solution in the RMS sense is obtained when minimising D, and that all other measures of entropy are unrelated to the RMS distance (between the solution and the true SSD) and to each other, and are therefore of little help in selecting a solution. In particular, the approximated Shannon entropy $S_0$, as typically used in ME methods, is not directly connected with finding a minimum in the RMS distance between the true solution O(E) and the estimate Ô(E).

Another issue to note is that both ISRA and RLA are multiplicative algorithms: whenever during the iteration a certain $\hat{O}_i(E_j) = 0$, it remains zero for all subsequent iterations. This has far reaching consequences for the application to EELS spectra. It is common to apply these techniques to the spectrum of an excitation edge obtained by background (BG) removal from a full spectrum (see [1] for an example with and without background subtraction). This, however, violates two assumptions needed for the application of both ISRA and RLA:

- After background removal the signal is no longer strictly positive. In practice this means that all negative values will be set to zero and will remain zero. This gives the false impression that these deconvolution techniques produce a flatter region with fewer oscillations before the edge onset. Positive noise values in the pre-edge region, on the other hand, tend to form sharp positive peaks that could be mistaken for prepeaks or exciton peaks.
- Removing the background changes the distribution function of the noise. In particular, the background subtracted spectrum will no longer be Poissonian, especially in the region before the edge onset.

A straightforward solution to these problems is to use both ISRA and RLA on spectra without first removing the background signal. This works well, but removes the advantage of having extra prior knowledge. The reason why these methods work well for astrophysical images is that such images contain large regions that are truly black: no photons were detected in those areas, as opposed to the EELS case, where every part of the spectrum contains a considerable number of counts. This leads to a deconvolution that largely leaves the black areas unchanged (once zero, always zero; a guaranteed positive result), while the brighter regions (e.g. a star) get sharper. This is a big step forward from Fourier based filters, which always create oscillations in the dark regions due to the Gibbs phenomenon (ringing artefacts).

A difficult problem with the iterative techniques is the stopping criterion. The number of iterations is undefined, and in practice the result acquires more and more high frequency noise as the number of iterations increases. It can be shown that after an infinite number of iterations the result of the straightforward deconvolution is obtained, which is of course undesirable. Usually it is left to the user to decide when to stop, which inevitably leads to a bias in the results and to an unreproducible method. Stopping criteria can be constructed, however; a very natural choice, proposed in Ref. [17], is to continue iterating until

$$\frac{1}{N}\sum_{j=1}^{N}\frac{(\hat{I}(E_j) - I(E_j))^{2}}{I(E_j)} \leq 1 \qquad (14)$$

with $\hat{I}(E) = \hat{O}(E) \otimes P(E)$. Alternatively, we can use the likelihood ratio (LR) as described in [18]:

$$\mathrm{LR} \leq \chi^{2}_{N,1-\alpha} \qquad (15)$$

with α the significance level, N the number of degrees of freedom (the number of pixels that are estimated), and the likelihood ratio defined for Poisson distributed noise as (see Eq. (21) in [18])

$$\mathrm{LR} = 2\sum_{j=1}^{N}\left[\hat{I}(E_j) - I(E_j) + I(E_j)\,\ln\frac{I(E_j)}{\hat{I}(E_j)}\right] \qquad (16)$$


Physically this means that the iterations are stopped at the moment when the difference between the experiment I(E) and the reconvolved estimate $\hat{I}(E)$ can be attributed to noise. If one continues to iterate, this difference decreases further and the estimated SSD starts to reproduce features that fit the noise in the experiment. This is generally unwanted, since it is essentially noise amplification. One is free to tune the significance level, but typically α = 0.05 is chosen. At α = 0.5 and for a large number of counts in each channel we recover Eq. (14). Increasing α leads to higher resolution and more noise in the final result. Note that for this likelihood ratio stopping criterion to work, one needs to know the noise properties of the detector. These properties can be measured and taken into account as explained in [11].
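As an illustration, the test of Eqs. (15) and (16) fits in a few lines (a sketch assuming pure Poisson noise; SciPy supplies the chi-squared quantile, and detector correlations as treated in [11] are ignored here):

```python
import numpy as np
from scipy.stats import chi2

def should_stop(I, I_hat, alpha=0.05):
    """Likelihood ratio stopping criterion, Eqs. (15)-(16).

    I: experimental spectrum; I_hat: reconvolved estimate O_hat (x) P.
    Returns True once the remaining misfit can be attributed to noise.
    """
    eps = 1e-30
    LR = 2.0 * np.sum(I_hat - I
                      + I * np.log(np.maximum(I, eps) / np.maximum(I_hat, eps)))
    return LR <= chi2.ppf(1.0 - alpha, df=len(I))
```

In an RLA or ISRA loop one would evaluate this after every update and stop at the first iteration for which it returns True.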

2.5. Model based method

In a model based method, the deconvolution problem is transformed into the convolution of a parametric model, which is then fitted to the experimental spectrum. This approach has been successful in the quantification of EELS spectra [19]. For a core loss spectrum we can write a model including the background as

$$M(E) = O(E) \otimes P(E) + BG(E) \qquad (17)$$

with BG a background model and O a model for the core loss single scattering distribution. The model for O can be constructed as an atomic single scattering distribution σ(E) multiplied by a parametric equalisation function f(E) that models the fine structure. This approach has been described in detail in [20]:

$$O(E) = \sigma(E)\, f(E) \qquad (18)$$

The equalisation function can be a linear or cubic spline through K parameters, which is zero before the edge onset and one above a certain threshold where the excitation is assumed to follow the atomic cross section closely [20]. The equalisation function can also be obtained by upsampling K points to a higher number of points that coincide with the energy values E_i. Performing a maximum likelihood fit of the model M(E) to the experimental spectrum I(E) leads to an estimate of the single scattering distribution Ô(E), which is the deconvolution result. This method has several advantages over the other methods described here (a minimal numerical sketch follows the list):

- The trade-off between SNR and resolution is clear in this method and is made via the choice of the number of parameters in the model. The more parameters, the higher the error bar on each of these parameters.
- An error bar is calculated on the parameters, which is missing in all other methods; this is of crucial importance to distinguish deconvolution artefacts from real, interpretable signal. It also gives feedback to the user on the SNR in the final result.
- The maximum likelihood method can be shown to be most precise. This means that for a given choice of parameters there is no algorithm that can produce an estimate Ô(E) with smaller error bars unless more prior knowledge is assumed.
- The parametric equalisation function can have non-equidistant points in energy, which is useful as a way of including extra prior knowledge about lifetime broadening.
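The following sketch condenses the idea; it is emphatically not the EELSMODEL implementation. The hydrogenic cross section enters as a precomputed array, the equalisation function is reduced to linear interpolation through K free values, and a generic optimiser stands in for the dedicated fitter:

```python
import numpy as np
from scipy.optimize import minimize

def model_based_ssd(I, P, E, sigma, onset, K=20):
    """Minimal model based deconvolution, Eqs. (17)-(18).

    I: experimental core loss spectrum; P: low loss spectrum (PSF).
    E: energy axis; sigma: fixed atomic cross section on that axis.
    The model M(E) = [sigma(E) f(E)] * P(E) + A E^-r is fitted to I by
    maximum likelihood for Poisson counts; K sets the noise/resolution
    trade-off via the number of points in the equalisation function f.
    """
    nodes = np.linspace(onset, E[-1], K)                   # knots of f(E)

    def model(theta):
        A, r = theta[0], theta[1]
        f = np.interp(E, nodes, theta[2:], left=0.0)       # zero before onset
        O = sigma * f                                      # Eq. (18)
        conv = np.real(np.fft.ifft(np.fft.fft(O) * np.fft.fft(P)))
        return A * E ** (-r) + conv                        # Eq. (17)

    def neg_log_likelihood(theta):                         # Poisson -ln L
        M = np.maximum(model(theta), 1e-30)
        return np.sum(M - I * np.log(M))

    theta0 = np.concatenate(([I[0] * E[0] ** 2.5, 2.5], np.ones(K)))
    res = minimize(neg_log_likelihood, theta0, method="Nelder-Mead",
                   options={"maxiter": 50000, "xatol": 1e-6})
    f_hat = np.interp(E, nodes, res.x[2:], left=0.0)
    return sigma * f_hat                                   # estimated SSD
```

Error bars on the fitted parameters would follow from the Fisher information matrix (Cramér–Rao lower bound) at the optimum [18], which is what gives the method its key advantage over the filters above.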

This method is readily available in the EELSMODEL program [18], which is freely available to the scientific community [21]. Several methods for calculating the equalisation function are implemented: linear interpolation, cubic spline interpolation, and upsampling. Different lifetime broadening functions could be used for the interpolating function, but this is not done in this publication, since it would give an unfair advantage over the other deconvolution methods, which do not have this extra prior knowledge. A remaining problem is the choice of the number of parameters: on the one hand, the number of parameters should be high enough to give a statistically acceptable model [18] for the experiment; on the other hand, increasing the number of points leads to an increase in the error bars on the parameters. An optimal setting for the number of points can be found by checking the model acceptance versus the number of points, as described in Section 3, but a trade-off between noise and resolution remains, as in most other deconvolution methods.

3. Testing the performance of different algorithms

Testing deconvolution methods on experimental data is difficult, since the correct single scattering distribution is usually unknown and an objective figure of merit is therefore not available. A solution is the use of so-called virtual spectra: computer generated spectra with a known SSD, including multiple scattering, obtained by convolution with a model for a low loss spectrum, and with Poisson noise added. This way the performance of the deconvolution algorithms can be tested for different thicknesses and different amounts of noise. As a figure of merit we will use the RMS distance between the true and the estimated SSD. The set of virtual spectra was created making use of the FEFF program [22], in the way described in Ref. [19]. The effect of the thickness is taken into account using a Poissonian distributed probability of inelastic scattering (thickness dependent). To see how the thickness (defined as t/λ, where λ is the inelastic mean free path of the electron in the material of interest) influences the distribution of intensity in the measured spectrum, the reader is referred to Section 5 of Ref. [19], where the model construction is detailed. Here we briefly repeat that the simulated spectra consist of the following components (a sketch of this construction in code follows the list):

- A power law background.
- A K edge for boron and a K edge for nitrogen in the cubic structure (c-BN), simulated with FEFF (at 300 kV incident energy, 6.8 mrad collection angle).
- Convolution with a low loss spectrum consisting of a zero loss peak and Poissonian distributed multiple plasmons (i.e. thickness dependent), with Lorentzian line shapes. The zero loss peak is taken to have a full width at half maximum of 3 eV, as a challenge for the deconvolution to improve on.
- Poissonian distributed counting noise.
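Such a virtual spectrum can be assembled along the following lines (a sketch: the FEFF cross section, Lorentzian plasmon shape and zero loss peak enter as caller-supplied arrays, and low_counts is an invented scaling parameter):

```python
import math
import numpy as np

def virtual_spectrum(ssd, zlp, plasmon, t_over_lambda, low_counts=1e6, n_max=10):
    """Generate a noisy virtual core loss spectrum and its noisy low loss PSF.

    The low loss spectrum is built in Fourier space as the zero loss peak
    times a Poisson-weighted sum of n-fold self-convolved plasmons, the
    weight for n-fold scattering being exp(-t/l) (t/l)^n / n!.  The known
    SSD is convolved with it and Poisson counting noise is added to both.
    """
    fft, ifft = np.fft.fft, np.fft.ifft
    pl_f = fft(plasmon / plasmon.sum())
    mix_f = sum(math.exp(-t_over_lambda) * t_over_lambda ** n / math.factorial(n)
                * pl_f ** n for n in range(n_max + 1))
    low_f = fft(zlp / zlp.sum()) * mix_f
    low = np.real(ifft(low_f))                     # unit-area low loss model
    core = np.real(ifft(fft(ssd) * low_f))         # SSD with multiple scattering
    return (np.random.poisson(np.maximum(core, 0.0)),
            np.random.poisson(np.maximum(low * low_counts, 0.0)))
```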

For statistical tests, a set of 100 spectra was created with random noise for two different thicknesses. Noise is also added to the corresponding low loss spectra, to simulate experimentally acquired low loss spectra. The generated virtual spectra are presented in Fig. 1 and can be obtained from the authors.

Fig. 1. (Colour online.) A numerically generated core loss spectrum from a set of 100 spectra for c-BN at thickness t/λ = 0.2 (top) and at thickness t/λ = 1.6 (bottom), together with the corresponding low loss spectra (right). The true SSD is superimposed for clarity.

We tested the retrieval of the SSD using several deconvolution techniques. In all the methods the point spread function P(E) is approximated by the low loss spectrum, including its random Poisson noise. Before proceeding with the comparison of the different deconvolution techniques, we define a few quantities to judge the quality of a deconvolution result. The goodness of the deconvolution methods can be compared by means of the root mean square distance between the retrieved SSD Ô(E_i) and the true SSD O(E_i):


$$\mathrm{RMS} = \sqrt{\sum_{i}\big(\hat{O}(E_i) - O(E_i)\big)^{2}} \qquad (19)$$

with i running over the number of pixels in the spectrum. The minimum in the RMS distance assures that the result is as close as possible to the true SSD. This is different from the bias value (Bias),

$$\mathrm{Bias} = \sqrt{\sum_{i}\big(\hat{O}_{0}(E_i) - O(E_i)\big)^{2}} \qquad (20)$$

where $\hat{O}_0$ is the retrieved SSD from a noise free experiment.
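Both figures of merit are one-liners (helper names are ours):

```python
import numpy as np

def rms_distance(O_hat, O_true):
    """RMS distance of Eq. (19) between retrieved and true SSD."""
    return np.sqrt(np.sum((O_hat - O_true) ** 2))

def bias(O_hat_noise_free, O_true):
    """Bias of Eq. (20): the same distance for a noise free reconstruction."""
    return np.sqrt(np.sum((O_hat_noise_free - O_true) ** 2))
```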

The bias measures a systematic error in the deconvolution process that occurs even with noise free data. The RMS value, on the other hand, measures both bias and noise, as both affect the total outcome of the deconvolution. In principle the maximum likelihood estimator [18] is unbiased (Bias = 0), but for ME methods this is not guaranteed. In the following we first quantitatively compare the different methods of deconvolution, then discuss the stopping criteria (the iteration at which to stop an iterative technique, for the ME methods, and the number of parameters for the model based method), and finally check the maximum entropy constraint and the bias of the ME methods. For the Gaussian modifier method we chose a modifier width of 3 eV (obtained by measuring the full width at half maximum of the zero loss peak, as is commonly done). For the Wiener filter we know the noise power spectrum $|\tilde{N}(f)|^2$ from the fact that we used Poissonian distributed noise and independent pixels, resulting in white noise. In real experiments the noise power is not white, because of correlations in the detector, but it can be measured experimentally [11].

The ME methods (ISRA and RLA) are tested with and without background removal. The background is removed by fitting a power law in the region preceding the boron K edge. For the non-background-removed case, an extrapolation of the data outside the experimentally recorded region is used to avoid artefacts near the edges of the spectra [23], especially on the low energy side. This step is essential for applying RLA to non-BG-removed spectra. The model based deconvolution is performed within the EELSMODEL program [21], with a model consisting of:

- a power law background,
- hydrogenic K edges for B and N, taking into account the collection and convergence angles as in Ref. [24] (see also Section 2 in Ref. [19]),
- two equalisation functions, for B and N, to simulate the fine structures, obtained by upsampling (using up to 60 points for boron and 30 points for nitrogen),
- convolution with the virtual low loss spectrum, including its noise.

Fig. 2 compares the RMS values for the different methods in the case of a thin specimen (t/λ = 0.2) and a very thick one (t/λ = 1.6), for 100 different realisations of random noise. It is interesting to note the statistical character of this result, which is fundamental in deconvolution problems: every realisation of the noise produces its own estimate Ô, which in turn has its own RMS distance from the true SSD. For the thin sample, the model based method gives the best results, followed closely by RLA. The close agreement between RLA with and without background subtraction for the thin sample demonstrates that the extra constraint of positivity does not bring a significant improvement.


Fig. 2. (Colour online.) Comparison between different deconvolution techniques, in terms of root mean square (RMS) deviation with respect to the true single scattering distribution (SSD), for a t = 0.2λ thick (left) and a t = 1.6λ thick (right) c-BN sample. Note the statistical nature of the result: each noise realisation leads to a different estimate Ô with its own RMS distance to the true SSD.

Table 1
Comparison between different methods of deconvolution of core loss EELS spectra, in terms of RMS with respect to the true SSD.

Method               Min RMS    LR stopping crit.
Gaussian modifier    4667       -
Wiener filter        3924       -
RLA                  3399       3907
RLA (no BG)          3332       -
ISRA (no BG)         5178       -
Model based          2899       -

The results are obtained by averaging the RMS value over 100 repetitions for the t/λ = 0.2 thick c-BN sample. See text for details.

A different scenario is found for the thick sample (note that t/λ = 1.6 is an extreme case of a very thick sample). The RLA result without background subtraction becomes poor, due to artefacts produced by the extrapolation of the background that is used to avoid more severe edge artefacts. For the thick sample, the RLA with background subtraction and the model based results are comparable.

The overall performance of all methods, averaged over the 100 virtual experiments, is compared in Table 1 for the t/λ = 0.2 thick sample. For RLA and ISRA we applied a stopping criterion by stopping at the minimum in RMS. The optimal number of iterations needed to reach the minimum in RMS is evaluated for every experiment and then averaged over the 100 experiments. The optimal number of iterations obtained in this way is 38 for RLA, 56 for RLA with background subtraction, and 24 for ISRA (while it is 87 for RLA, 88 for RLA with background subtraction, and 112 for ISRA for the t/λ = 1.6 thick specimen). We then evaluate the RMS distance corresponding to the mean optimal number of iterations (first column of Table 1). This method was also used in Fig. 2. Of course, this procedure is possible only if the true SSD is known, as is the case for the virtual experiments here, and it gives an unfair advantage to the ME methods (even a noise generator could in principle find the true SSD if one iterates long enough with this stopping criterion). If we use the stopping criterion proposed in Eq. (15), we obtain a slightly worse result, as displayed in the second column of Table 1 for RLA with background removal. The result is then close to the Wiener filter result. For ISRA and RLA without background removal, this stopping criterion does not work, because the bias caused by artefacts is higher than the expected noise and the criterion is therefore never reached.

For the model based method we can check the reliability of the retrieved SSD via the acceptance test described in [18]. In Fig. 3 we plot the RMS, the relative standard deviation, and the acceptance level as a function of the total number of free parameters in the fit. The total number of free parameters comprises six parameters describing the BG and the edge onsets and strengths, plus x parameters for the B fine structure and y parameters for the N fine structure. The acceptance level is related to the number of spectra (out of 100) that passed the likelihood ratio test [18]. The figure demonstrates that an acceptance level of 95% corresponds well with a minimum in the RMS distance, giving a criterion to decide on the number of parameters that is independent of knowledge of the true SSD.

The retrieved model based SSD for boron at t/λ = 0.2 is presented in Fig. 4. The results for two different numbers of parameters in the boron fine structure (corresponding to two different acceptance levels) are compared. From the figure we see that at an acceptance level of 95% most of the artefacts (especially the peak at the edge onset) are strongly suppressed. The results for RLA with background subtraction, obtained with the LR stopping criterion, and the result with the Gaussian modifier are shown for comparison. It is evident that the model based and RLA results (upper curves) are closest to the true SSD. Moreover, model based techniques give a prediction of the error (precision) from a single experiment that corresponds well with the standard deviation over a repeated set of measurements [19]. This is demonstrated in Fig. 5, which shows the standard deviation from a single experiment together with the full set of 100 repeated experiments, demonstrating the effect of random noise. It is evident from Fig. 3 that increasing the number of points in the fine structure (i.e. increasing the number of free parameters in the fit) increases the noise in the resulting SSD, as well as the calculated estimate of the standard deviation. This means there is a trade-off between resolution and precision, as is common to most deconvolution processes. This is also confirmed by Fig. 5. An optimal value for the number of parameters is then the one that gives 95% acceptance: fewer parameters would be statistically unacceptable, and more parameters would mainly reproduce the noise, as can be seen from the fact that the RMS value reaches its minimum near the 95% acceptance level. Note that if one chose the same number of parameters as the number of pixels, the result would become similar to the brute force deconvolution, or to the final result of ME after infinitely many iterations, and statistically this model would have 100% acceptance. Clearly such a solution is of little value.


Fig. 3. (Colour online.) Resulting root mean square (RMS), acceptance, and standard deviation (st. dev.) for the retrieved single scattering distribution (SSD), for a t = 0.2λ (left) and a t = 1.6λ (right) c-BN sample, as a function of the total number of parameters in the model.

Fig. 4. (Colour online.) The obtained SSD (full blue line) for boron in c-BN at a thickness t = 0.2λ for the model based method (left): recovered SSD using x = 30 parameters (top: total parameters = 30 + 15 + 6 = 51, giving an acceptance of 32%) or x = 60 parameters (bottom: total parameters = 60 + 15 + 6 = 81, giving an acceptance of 96%) for the boron equalisation function. The results are compared to RLA and the Gaussian modifier (right). The true SSD is superimposed for comparison (dashed red line).

The results from the Fourier ratio method on the same dataset are shown for comparison in Fig. 6. In this case a reduction of noise is expected from the Gaussian filter (the modifier), but the RMS distance is worse than for the model based results, as can be seen from Table 1, mainly due to reduced resolution. Setting the Gaussian modifier width gives control over the resolution versus noise trade-off.

It should be noted that there is no fundamental reason why precision estimates are missing from all methods except the model based one. In principle the Fisher information matrix can be written out and the Cramér–Rao lower bound (CRLB) calculated, just as for model based methods [18], except that there is no guarantee that the CRLB will be attained by estimators other than the maximum likelihood estimator.

As explained in Section 2.4, the ME methods (RLA and ISRA) use maximisation of the entropy as an extra constraint. The different expressions for the Shannon entropy S and the relative entropy D can only be calculated if the true SSD is known; therefore it is common to use their approximated counterparts $S_0$, $D_0$, assuming a flat spectrum to be the most probable.

These different measures of the entropy are shown, together with the RMS distance from the true SSD, for a varying number of iterations in Fig. 7; the bias is also shown in this plot. Note that in maximum likelihood methods we maximise the likelihood, which corresponds to minimising D. The other entropy measures have maxima at different positions, and only D has its minimum coinciding with the minimum in the RMS value. Fig. 7 shows that the approximated Shannon entropy $S_0$, which is commonly maximised in ME methods, reaches a maximum after approximately 40 iterations and then decreases again. This demonstrates the initial creation of more resolution in the estimated SSD, witnessed by the decrease in RMS and bias, followed by a stage in which amplification of noise takes over and makes the RMS go up again while the bias keeps decreasing. It could be argued that a good stopping point for ME would be when the RMS starts to deviate from the bias, because that is when noise amplification sets in.


Fig. 5. (Colour online.) The estimated standard deviation (blue) on a single deconvolution experiment, compared with the results from the entire set of 100 experiments. Two results from Fig. 3 are presented, using x = 59 parameters (left) and x = 79 parameters (right), showing the noise versus resolution trade-off and the prediction of the error bars from the model.

Fig. 6. Distribution of the results for the entire set of experiments in the case of Fourier ratio deconvolution. Note the reduction of noise compared to Fig. 5, due to the Gaussian filter, but with a larger RMS distance to the true SSD (red), due to a resolution reduction. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Unfortunately this can only be done if the true SSD is known, and it is therefore of little practical value. All these findings indicate that, in terms of minimising the RMS value, the entropy constraints commonly used in ME methods are of little value, although they may add some smoothness to the result.

Fig. 7. (Colour online.) Resulting true Shannon entropy S and relative entropy D, and the corresponding approximated entropies $S_0$, $D_0$, plotted together with the root mean square (RMS) and the Bias, for the RLA retrieved single scattering distribution, as a function of the number of iterations for a t = 0.2λ thick c-BN sample. For clarity all quantities are scaled to their maxima.

4. Applying to experiments

In a second step we apply the deconvolution methods to experimentally obtained spectra from Rutile TiO2 and Anatase TiO2, acquired on a Philips CM30 FEG equipped with a Gatan GIF200 energy filter. A set of 100 spectra was aligned in energy to reduce the effect of energy drift and then summed to improve the signal to noise ratio. The energy resolution at the optimal working conditions (low gun extractor voltage and smallest dispersion) was 0.7 eV. Fig. 8 compares the SSDs extracted with the different deconvolution methods for the titanium L2,3 edge in Anatase and Rutile (the Anatase spectra are taken from Ref. [25]). The model construction is similar to the one used for the virtual spectra:

- a power law background,
- hydrogenic L2,3 edges for Ti,
- an equalisation function for Ti (consisting of 65 points) to simulate the fine structure,
- convolution with an experimentally acquired low loss spectrum.

The model based results are compared to the standard Fourier ratio method, with a Gaussian modifier of 0.7 eV (chosen to be close to the energy resolution of the experimental system), and to the deconvolved spectra obtained using the Richardson–Lucy algorithm, the ME method that gave the best results in the previous section. In this case we cannot quantify the results in terms of RMS as before, because the true SSD is not known. To have reference spectra close to the true model, we compare the results with X-ray absorption spectra (XAS), for which the effect of multiple scattering with plasmons should be negligible.


Fig. 8. The obtained single scattering distribution (SSD) (full blue lines) for the Ti L2,3 edge in Anatase (left) and Rutile (right) TiO2. The results from the model based method are compared to the Richardson–Lucy deconvolution (RLA) and the Fourier ratio with Gaussian modifier (Gauss. modifier), and with X-ray measured spectra (XAS) (dashed red lines). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The energy resolution in XAS is also far superior to that of standard EELS in an electron microscope, so the width of the peaks can be considered mainly due to the natural width of the excitation (lifetime); XAS is therefore a good reference for testing the removal of the energy spread of the gun in the deconvolution of the EEL spectra. The XAS data in Fig. 8 are taken from the work of Ruus et al. on titania films [26]. The model based SSD gives comparable or even better results in terms of signal to noise ratio, while the Gaussian modifier gives the worst result (see in particular the curve for Rutile), due to noise amplification that is only partially suppressed by the filter. This effect is more evident in Rutile because of the lower signal to noise ratio, confirmed by the larger error bars in the model based curve. The intensities of the peaks in the model based SSD correspond well with those from the XAS measurements. However, both model based and RLA fail to correctly reproduce the expected small peaks at the edge onset [27,28], particularly for Rutile. The advantage of model based deconvolution, however, is that the error bars indicate that, in the prepeak area, the reproduced signal is small with respect to the calculated error bars, which signals that the small peaks in the deconvolved results are probably artefacts. The deviations from XAS in the high energy tails of the spectra can be due to the different matrix elements affecting the inelastic cross sections for EELS and XAS, respectively.

5. Discussion

Comparing the different deconvolution methods showed that model based methods perform well in comparison with all other methods, even the ME methods. To understand why, we can compare the assumptions made in ME methods with those made in model based methods. The ME methods we used here assume only two things:

- positivity,
- a good solution is one which maximises the approximated Shannon entropy $S_0$.

We have seen from the experiments on virtual spectra that both assumptions fail for core loss EELS spectra if the background is removed. In particular, we have shown that there is no good relation between reaching a minimum in RMS and the studied entropy measures, apart from the relative entropy D, which can only be used in model based methods since it requires an expectation and a noise model. It is therefore remarkable that the solutions of the ME methods are still that good. One should not forget, however, that we gave an unfair advantage to the ME methods by determining from the virtual spectra which number of iterations was optimal for minimising the RMS distance to the true SSD. In practice this information is missing, and the experimenter has to rely on visual preference to stop the iterative procedure, which can make the method difficult to reproduce. Using a likelihood ratio stopping criterion as proposed in Eq. (15) proved useful for RLA with background removal, but artefacts prevented its use for ISRA and for RLA without background removal. Practical application of this stopping criterion requires knowledge of the noise power in the experiment, as is also needed for model based methods and the Wiener filter.

In model based quantification, on the other hand, we also assume positivity, but we add extra prior knowledge about the physics of inelastic scattering by including a power law background and hydrogenic inelastic cross sections to model the overall shape of the excitation edges.


On top of this, it can be shown that the maximum likelihood method is the most precise, meaning that no other method can have better precision unless it uses more prior knowledge. Considering these points, it is no surprise that model based deconvolution performs very well. In addition there is the very important advantage of obtaining estimated error bars on the solution, which allow the experimenter to distinguish deconvolution artefacts from real signal in the obtained SSD. The model validation step can be used to learn how many parameters have to be used for the fine structure. This is similar to the Wiener filter method, which restores only those frequencies that lie above the noise. The Wiener filter, however, performs considerably worse because it misses the extra prior knowledge that we supplied in the model based method. The simple Gaussian modifier method is clearly not very good in terms of RMS when compared to the other methods; its strength lies in its simplicity and ease of use.

6. Conclusion

In this paper we reviewed different deconvolution algorithms and discussed their drawbacks. We tested the performance of these algorithms on virtual spectra where the true SSD is known. This showed a performance of the model based method similar to that of maximum entropy methods, provided optimal values for the number of iterations are used in the latter. If a more realistic stopping criterion based on the likelihood ratio is used, the ME results show an RMS value comparable to that of a Wiener filter, far worse than the model based result. On top of this good performance, model based deconvolution has another important advantage: calculated estimates of the error bars. These error bars are of fundamental importance when distinguishing between deconvolution artefacts and real features in the recovered spectra. The virtual experiments also demonstrated that there is no good basis for assuming that the approximated Shannon entropy is a suitable parameter for preferring one deconvolution solution over another, as is usually done in ME methods.

A test on experimental titania spectra showed that the retrieved single scattering distribution agrees well with XAS measurements in the ELNES part of the spectrum. This is of interest when the fine structure is to be extracted from the spectra for studying chemical and electronic properties, for instance to determine L3/L2 ratios from EELS for valence or magnetic moment measurements.

Acknowledgements

G.B. is grateful to the Fund for Scientific Research-Flanders under contract number G.0147.06. J.V. thanks the European Union for support under the Framework 6 programme, contract for an Integrated Infrastructure Initiative, reference 026019 ESTEEM.

References

[1] R.F. Egerton, F. Wang, M. Malac, M.S. Moreno, F. Hofer, Fourier-ratio deconvolution and its Bayesian equivalent, Micron 39 (6) (2007) 642–647.
[2] K. Ishizuka, Deconvolution processing in analytical STEM: monochromator for EELS and Cs-corrector for STEM-HAADF, Microsc. Microanal. 11 (Suppl. 2) (2005) 1430.
[3] A. Gloter, A. Douiri, M. Tencé, C. Colliex, Improving energy resolution of EELS spectra: an alternative to the monochromator solution, Ultramicroscopy 96 (2003) 385–400.
[4] M.H.F. Overwijk, D. Reefman, Micron 31 (2000) 325.
[5] J.M. Zuo, Microsc. Res. Tech. 49 (2000) 245.
[6] W.H. Richardson, J. Opt. Soc. Am. 62 (1972) 55.
[7] L.B. Lucy, Astron. J. 79 (1974) 745.
[8] R.F. Egerton, Electron Energy-Loss Spectroscopy in the Electron Microscope, second ed., Plenum Press, New York, 1996.
[9] P.A. Lynn, Introduction to the Analysis and Processing of Signals, third ed., Hemisphere, Washington, DC, 1989.
[10] P.E. Batson, in: Proceedings of the 49th Annual Meeting of the Electron Microscopy Society of America, 1991, p. 710.
[11] J. Verbeeck, G. Bertoni, Model-based quantification of EELS spectra: treating the effect of correlated noise, Ultramicroscopy 108 (2) (2008) 74–83.
[12] P. Magain, F. Courbin, S. Sohy, Deconvolution with correct sampling, Astrophys. J. 494 (1998) 472–477.
[13] M.E. Daube-Witherspoon, G. Muehllehner, IEEE Trans. Med. Imaging MI-5 (1986) 61.
[14] G.B. Archer, D.M. Titterington, Stat. Sinica 5 (1995) 77–96.
[15] C.E. Shannon, Prediction and entropy of printed English, Bell Syst. Tech. J. 30 (1951) 50–64.
[16] S. Kullback, The Kullback–Leibler distance, Am. Stat. 41 (1987) 340–341.
[17] J.W. Wells, K. Birkinshaw, A matrix approach to resolution enhancement of XPS spectra by a modified maximum entropy method, J. Electron Spectrosc. Relat. Phenom. 152 (2006) 37–48.
[18] J. Verbeeck, S. Van Aert, Model based quantification of EELS spectra, Ultramicroscopy 101 (2004) 207–224.
[19] G. Bertoni, J. Verbeeck, Accuracy and precision in model based EELS quantification, Ultramicroscopy 108 (2008) 782–790.
[20] J. Verbeeck, S. Van Aert, G. Bertoni, Model based quantification of electron energy loss spectroscopy: including the fine structure, Ultramicroscopy 106 (2006) 976–980.
[21] The program is freely available under the GNU public license and can be downloaded from http://www.eelsmodel.ua.ac.be.
[22] J.J. Rehr, R.C. Albers, Rev. Mod. Phys. 72 (3) (2000) 621–654.
[23] R.F. Egerton, Ultramicroscopy 107 (8) (2007) 575–586.
[24] H. Kohl, Ultramicroscopy 16 (2) (1985) 265–268.
[25] G. Bertoni, E. Beyers, J. Verbeeck, M. Mertens, P. Cool, E.F. Vansant, G. Van Tendeloo, Ultramicroscopy 106 (2006) 630–635.
[26] R. Ruus, A. Kikas, A. Saar, A. Ausmees, E. Nõmmiste, J. Aarik, A. Aidla, T. Uustare, I. Martinson, Solid State Commun. 104 (4) (1997) 199–203.
[27] F. de Groot, Chem. Rev. 101 (2001) 1779–1808.
[28] I. Tanaka, T. Mizoguchi, T. Yamamoto, J. Am. Ceram. Soc. 88 (2005) 2013–2029.
