Y-STR Frequency Surveying Method: A critical reappraisal


Forensic Science International: Genetics 5 (2011) 84–90

Journal homepage: www.elsevier.com/locate/fsig

Sascha Willuweit a,*, Amke Caliebe b, Mikkel Meyer Andersen c, Lutz Roewer a

a Charité – Universitätsmedizin Berlin, Institute of Legal Medicine, Department of Forensic Science, Berlin, Germany
b Christian-Albrechts-Universität zu Kiel, Institut für Medizinische Informatik und Statistik, Kiel, Germany
c Aalborg University, Department of Mathematical Sciences, Aalborg, Denmark

Keywords: Y-STR haplotype; Haplotype frequency estimation; YHRD; Bootstrap; MLE

Abstract

Reasonable formalized methods to estimate the frequencies of DNA profiles generated from lineage markers have been proposed in the past years and were discussed in the forensic community. Recently, collections of population data on the frequencies of variations in Y chromosomal STR profiles have reached a new quality with the establishment of the comprehensive, quality-controlled reference database YHRD. Grounded on this unrivalled empirical material from hundreds of population studies, the core assumption of the Haplotype Frequency Surveying Method, originally described 10 years ago, can be tested and improved. Here we provide new approaches to calculate the parameters used in the frequency surveying method: a maximum likelihood estimation of the regression parameters (r1, r2, s1 and s2) and a revised Frequency Surveying framework with variable binning and a database preprocessing step that takes the population sub-structure into account. We found good estimates for 11 Metapopulations using both approaches and demonstrate that the statistical basis of the method is well supported and independent of the population under study. The results of the estimation process are reliable and robust if the underlying datasets are large and representative and show small average and pairwise genetic distances.

© 2010 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Y chromosome DNA testing enables examination of the male-specific portion of biological evidence. Since fathers pass their Y chromosome on to their sons unchanged (except for an occasional mutation), all males in a paternal lineage will possess a common Y chromosome haplotype. Hence, in all populations an unknown number of men share their haplotypes by descent due to (known) familial or (mostly unknown) genealogical relationship. While the mutational process, drift and social factors continuously diversify the actual stock of Y chromosomes within a given population, the underlying relatedness is not erased and remains visible in the clustering of haplotypes of common descent (neighbours with small genetic distances). Access to reference databases representing the variance and relatedness of haplotypes within local populations and Metapopulations (groups of spatially separated population samples with genetic interactions) is thus of crucial importance for interpreting a Y-STR match [1]. In contrast to counting (frequentist) or probabilistic methods [2] of estimating the haplotype frequency, the "Frequency Surveying" method [3,4], developed in the year 2000, takes the molecular relationship of haplotypes into account. With worldwide Y-STR haplotype databases of adequate size, sampling scheme and scrutinized sample quality [5], the initial assumptions of frequency surveying can now be reappraised and further developed on the basis of a much larger dataset.

* Corresponding author. Tel.: +49 30 450525074; fax: +49 30 450525912. E-mail address: [email protected] (S. Willuweit). 1872-4973/$ – see front matter © 2010 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.fsigen.2010.10.014

1.1. Frequency Surveying Method

In this section, the Frequency Surveying Method introduced in [3,4] is explained briefly. Given \(N\) haplotypes of a population sample, let \(N_i\) for \(i = 1, 2, \ldots, M\) denote the number of times the \(i\)th haplotype has been observed. Using a Bayesian approach, we consider a Y-STR haplotype frequency \(f\) (subscript \(i\) is omitted) as a random variable. From classical population genetics theory one can expect it to follow a beta distribution with parameters \(\alpha\) and \(\beta\) and the density

\[ \varphi(f; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, f^{\alpha - 1} (1 - f)^{\beta - 1}. \tag{1} \]

Thus we choose this distribution as the prior distribution for the frequencies in our Bayesian estimation model. Note that for a specific haplotype the likelihood function is binomial, and the beta distribution is its conjugate prior in Bayesian statistics. The parameters \(\alpha\) and \(\beta\) are estimated using the weighted inverse molecular distance (\(W\), a quantity related to the effective population size and mutation rates)

\[ W_i = \frac{1}{N} \sum_{j \neq i} \frac{N_j}{d_{ij}}, \tag{2} \]

where the distance \(d_{ij}\) between the \(i\)th and the \(j\)th haplotype is the sum of their allelic differences at the considered loci. Alleles are denoted by their repeat number, so \(d_{ij}\) is the L1 or Manhattan distance between haplotypes \(i\) and \(j\). Based on \(W\) we estimate the parameters \(\alpha\) and \(\beta\) as follows: all haplotypes are binned into 15 groups according to their \(W_i\) value. For each group, the average of the \(W_i\) of the haplotypes in that group is calculated together with the average and standard deviation, \(\mu(f)\) and \(\sigma(f)\), of the observed relative frequency

\[ f_i = \frac{N_i - 1}{N - M}. \tag{3} \]
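As a minimal sketch of Eqs. (2) and (3), the following Python snippet computes the weighted inverse molecular distance and the observed relative frequency for a toy sample. The haplotypes (tuples of repeat numbers at three loci), their counts and all function names are hypothetical and serve only to illustrate the arithmetic. Eq. (3) is read here as \(f_i = (N_i - 1)/(N - M)\), which is undefined when every haplotype is a singleton.

```python
from typing import Dict, Tuple

Haplotype = Tuple[int, ...]  # repeat numbers, one entry per locus


def manhattan(h1: Haplotype, h2: Haplotype) -> int:
    """Sum of absolute allelic differences over all loci (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def weighted_inverse_distance(counts: Dict[Haplotype, int]) -> Dict[Haplotype, float]:
    """W_i = (1/N) * sum over j != i of N_j / d_ij, as in Eq. (2)."""
    n_total = sum(counts.values())
    return {
        hi: sum(nj / manhattan(hi, hj) for hj, nj in counts.items() if hj != hi) / n_total
        for hi in counts
    }


def observed_relative_frequency(counts: Dict[Haplotype, int]) -> Dict[Haplotype, float]:
    """f_i = (N_i - 1) / (N - M), Eq. (3) as read here (assumes N > M)."""
    n_total = sum(counts.values())
    m = len(counts)
    return {h: (n - 1) / (n_total - m) for h, n in counts.items()}


# Hypothetical toy sample: N = 6 observations of M = 3 distinct haplotypes.
toy_counts = {(14, 13, 29): 3, (14, 13, 30): 2, (15, 13, 30): 1}
W = weighted_inverse_distance(toy_counts)
f = observed_relative_frequency(toy_counts)
```

Haplotypes with many frequent close neighbours receive a large \(W_i\), which is the quantity used for binning in the method.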

These bin-wise averages and standard deviations are used to fit the parameters \(r_1\) and \(r_2\) in the exponential regression model¹ \(E(f) = \exp(r_1 W + r_2)\) (and similar parameters \(s_1\) and \(s_2\) for \(\sqrt{\operatorname{Var}(f)}\)). Here \(\mu_i\) and \(\sigma_i\) serve as estimators for \(E(f_i)\) and \(\sqrt{\operatorname{Var}(f_i)}\). Using these regression models, it is possible to obtain estimates of \(\mu_i = \mu(f_i)\) and \(\sigma_i = \sigma(f_i)\) for each haplotype, which in turn are used to obtain the prior parameters by exploiting that

\[ \alpha_i = \frac{E(f_i)^2 \,\bigl(1 - E(f_i)\bigr)}{\operatorname{Var}(f_i)} - E(f_i) \quad \text{and} \quad \beta_i = \alpha_i \, \frac{1 - E(f_i)}{E(f_i)}. \tag{4} \]
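The moment matching in Eq. (4) and the conjugate beta-binomial update can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the regression coefficients are the Western European binning values as read from Table 3 (with negative intercepts), and \(W = 0.1\) is an arbitrary illustrative value. The posterior mean uses the standard result that a Beta(\(\alpha\), \(\beta\)) prior with \(N_i\) observations out of \(N\) gives a Beta(\(\alpha + N_i\), \(\beta + N - N_i\)) posterior.

```python
import math
from typing import Tuple


def beta_prior_from_w(w: float, r1: float, r2: float, s1: float, s2: float) -> Tuple[float, float]:
    """Prior parameters (alpha, beta) from the two exponential regressions, Eq. (4)."""
    mean = math.exp(r1 * w + r2)       # E(f)
    var = math.exp(s1 * w + s2) ** 2   # Var(f); the regression models sqrt(Var(f))
    alpha = mean ** 2 * (1.0 - mean) / var - mean
    beta = alpha * (1.0 - mean) / mean
    return alpha, beta


def posterior_mean(alpha: float, beta: float, n_i: int, n_total: int) -> float:
    """Mean of the Beta(alpha + n_i, beta + n_total - n_i) posterior."""
    return (alpha + n_i) / (alpha + beta + n_total)


# Coefficients as read from Table 3 (Western European core, binning); W is illustrative.
alpha, beta = beta_prior_from_w(0.1, r1=30.82, r2=-13.17, s1=28.95, s2=-11.71)
```

By construction the returned prior has exactly the mean and variance predicted by the two regressions, so the regression fit fully determines the prior for each haplotype.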

Now the prior distribution is determined, and by using a binomial likelihood one obtains a posterior beta distribution for the haplotype frequency. A straightforward estimate of the haplotype frequency is the mean of the posterior distribution.

2. Material and methods

2.1. Database

All calculations and simulations made here are based upon the worldwide Y-STR haplotype reference database YHRD [5–8], release 34 from July 16th, 2010 (see http://yhrd.org). We analysed a total of 53,837 haplotypes as shown in Table 2. This extensive repository contains samples contributed by many different laboratories around the world. Each contribution to the YHRD is assigned to a sub-population cluster (Metapopulation [5,9]) according to the clusters defined by the corresponding author and an Analysis of Molecular Variance (AMOVA) [10,11] done by the YHRD curators. This procedure ensures that sub-population structures in the dataset are not ignored and are adequately accounted for in the Frequency Surveying Method, as it estimates haplotype frequencies from molecular distances reflecting their a priori distribution of dissimilarities.

To avoid sampling and assignment errors we analyse all sub-populations from a certain Metapopulation and assess their composition. We start with the manually assigned configuration of \(M\) sub-population samples of Metapopulation \(X\). Let \(A_i\) for \(i = 1, 2, \ldots, M\) denote the number of different haplotypes and \(B_i\) for \(i = 1, 2, \ldots, M\) the total number of haplotypes within sub-population sample \(i\). Further, let \(F\) denote the \(M \times M\) matrix of pairwise distances calculated using AMOVA's \(\Phi_{ST}\) [10,11]. Given that, we minimize the following function using standard linear programming.

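Minimize:

\[ \left(\frac{A}{B}\right)^{T} X + \left(\frac{1}{B}\right)^{T} X + \sum_{i=1}^{M} \sum_{j=i+1}^{M} X_i X_j F_{i,j} \]

Subject to:

\[ \max X_i X_j F_{i,j} \le 0.05 \quad \text{for all } i \text{ and } j \text{ with } i < j, \]

\[ \sum X \ge 2, \]

\[ \frac{1}{\sum X \left(\sum X - 1\right)} \sum_{i=1}^{M} \sum_{j=i+1}^{M} X_i X_j F_{i,j} \le 0.01. \]

¹ In the original publication the regressions included an intercept, i.e. \(r_1 + \exp(r_2 W + r_3)\), but since this can cause negative \(\mu_i\) and \(\sigma_i\) the intercepts are removed.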

Solving this problem gives us a vector \(X\) with \(X_i \in \{0, 1\}\) for \(i = 1, 2, \ldots, M\), where 0 means exclusion and 1 means inclusion of the corresponding sub-population sample. Basically, this method tries to find the largest set of genetically related sub-population samples that meets our constraints:

• The pairwise genetic distance shall be less than or equal to 0.05.
• A Metapopulation core shall contain at least 2 populations.
• The average genetic distance shall be less than or equal to 0.01.

A set of sub-population samples that meets these criteria is hereafter referred to as a "Metapopulation core".

2.2. Variable binning

The number of bins used in the original publication was fixed to 15 and was derived from the database size of 2439 European haplotypes at that time. Since the YHRD database is constantly growing, this static grouping has to be adapted. We apply a method called "bootstrapping" [12] to a reasonable number of bins \(l \in [10, 100] \cap \mathbb{N}\) for a population \(k\) with total number of haplotypes \(N_k\). For each population \(k\) and each number of bins \(l\) we proceeded as follows. In each bootstrap round we randomly draw with replacement \(N_k\) haplotypes from the database of population \(k\) and bin them into \(l\) bins according to their \(W\) values. The four parameters \(r_1\), \(r_2\), \(s_1\) and \(s_2\) are calculated subsequently. Repeating this 2000 times gives us an empirical distribution of all four parameters of the Frequency Surveying Method. To determine the optimal number of bins, we calculate the 95% confidence intervals of all parameters and find the number of bins with the minimal average confidence interval width.

2.3. Maximum likelihood estimation of regression parameters

A theoretically well-founded estimator is the maximum likelihood estimator (MLE). In this section we apply the MLE to estimate the regression parameters \(r_1\), \(r_2\), \(s_1\) and \(s_2\). First we derive the likelihood function. Given the frequencies \(f_i\), the observations \(N_i\) are multinomially distributed.
The likelihood function of the observations \(N_i\) is thus (\(x = (x_1, \ldots, x_M)\))

\[ L(r_1, r_2, s_1, s_2; N_i) = C \int \prod_{i=1}^{M} x_i^{N_i} \, \psi(x) \, dx, \]

where \(C\) is the multinomial coefficient and \(\psi : \mathbb{R}^{M-1} \to \mathbb{R}\) is a prior on the space of frequencies. As prior we employ the Dirichlet distribution, which is the conjugate prior to the multinomial distribution [13]. Note that for \(M = 2\) the Dirichlet distribution reduces to a beta distribution. The density function of the Dirichlet distribution with parameters \(\alpha_1, \ldots, \alpha_M\) is (\(x_1, \ldots, x_{M-1} > 0\), \(x_1 + \cdots + x_{M-1} < 1\))

\[ \psi(x_1, \ldots, x_{M-1}; \alpha_1, \ldots, \alpha_M) = \frac{1}{B(\alpha)} \prod_{i=1}^{M} x_i^{\alpha_i - 1}, \]

where \(\alpha := (\alpha_1, \ldots, \alpha_M)\), \(x_M := 1 - \sum_{i=1}^{M-1} x_i\) and \(B(\alpha) = \prod_{i=1}^{M} \Gamma(\alpha_i) \big/ \Gamma\bigl(\sum_{i=1}^{M} \alpha_i\bigr)\) is the multinomial beta function. The density is zero outside this open \((M-1)\)-dimensional simplex \(S\). Substituting the Dirichlet distribution into the likelihood function we obtain (\(n := (N_1, \ldots, N_M)\))

\[ \begin{aligned} L(r_1, r_2, s_1, s_2; N_i) &= C \, \frac{1}{B(\alpha)} \int_S \prod_{i=1}^{M} x_i^{N_i + \alpha_i - 1} \, dx_1 \ldots dx_{M-1} \\ &= C \, \frac{B(\alpha + n)}{B(\alpha)} \, \frac{1}{B(\alpha + n)} \int_S \prod_{i=1}^{M} x_i^{N_i + \alpha_i - 1} \, dx_1 \ldots dx_{M-1} \\ &= C \, \frac{B(\alpha + n)}{B(\alpha)} = C \, \frac{\Gamma\bigl(\sum_{i=1}^{M} \alpha_i\bigr) \prod_{i=1}^{M} \Gamma(\alpha_i + N_i)}{\Gamma\bigl(\sum_{i=1}^{M} (\alpha_i + N_i)\bigr) \prod_{i=1}^{M} \Gamma(\alpha_i)}. \end{aligned} \]

Therefore,

\[ \ln L(r_1, r_2, s_1, s_2; N_i) = \ln C + \sum_{i=1}^{M} \bigl[ \ln \Gamma(\alpha_i + N_i) - \ln \Gamma(\alpha_i) \bigr] + \ln \Gamma\Bigl(\sum_{i=1}^{M} \alpha_i\Bigr) - \ln \Gamma\Bigl(\sum_{i=1}^{M} (\alpha_i + N_i)\Bigr). \]

The parameters \(\alpha_i\) of the Dirichlet distribution are related to the first and second moments of \(f_i\) by [13]

\[ \alpha_i = \frac{\bigl(E(f_1) - E(f_1^2)\bigr) \, E(f_i)}{E(f_1^2) - \bigl(E(f_1)\bigr)^2}, \quad i = 1, \ldots, M - 1, \]

\[ \alpha_M = \frac{\bigl(E(f_1) - E(f_1^2)\bigr) \Bigl(1 - \sum_{k=1}^{M-1} E(f_k)\Bigr)}{E(f_1^2) - \bigl(E(f_1)\bigr)^2}. \]

For \(E(f_i)\) and \(E(f_i^2) = \operatorname{Var}(f_i) + E(f_i)^2\) we can use the same regression equations as above:

\[ E(f_i) = \exp(r_1 W_i + r_2), \qquad \sqrt{\operatorname{Var}(f_i)} = \exp(s_1 W_i + s_2). \]

Therefore we can express the log-likelihood function in terms of the regression parameters \(r_1\), \(r_2\), \(s_1\) and \(s_2\). The MLE of the regression parameters can then be calculated by maximizing the log-likelihood function.

3. Results

To evaluate the impact of a refined database structure and of variable binning on the evidential power of Y-STRs, we analysed all Metapopulation cores (see Section 2.1) shown in Table 1. Below we limit some of the results to the Western European Metapopulation core with 6016 different out of 16,714 haplotypes. The most common haplotype was observed 351 times in the dataset, whereas all singletons (haplotypes that occur only once in the dataset) sum up to 4215 haplotypes.

3.1. Database substructure

Solving the linear program explained earlier for each Metapopulation currently available in the YHRD (release 34), we end up with the compiled Metapopulation cores shown in Table 1. We used the program lp_solve written by Michel Berkelaar at the Delft University of Technology, Netherlands (http://lpsolve.sourceforge.net). The program did not find a result for 14 out of 25 existing Metapopulations, meaning that the requirements given in Section 2.1 are met by only 11 out of 25 Metapopulations. The main reasons are a small haplotype count (e.g. East Asian – Dravidian Metapopulation with 539 haplotypes), insufficient population samples (e.g. East Asian – Indo-Pacific Metapopulation with 1 population sample) and sparse and divergent haplotypes (e.g. Native American Metapopulation).

3.2. Variable binning

The optimal numbers of bins for each Metapopulation core and the corresponding 95% confidence interval widths of all four parameters (r1, r2, s1, s2) are given in Table 2. We see that the chosen number of bins does not necessarily correlate with the size of the dataset (e.g. the European populations in Table 2). In Fig. 1 the width of the confidence intervals is illustrated in detail for a range of bin numbers, based on the Western European Metapopulation core. The position of the minimal confidence interval width is highlighted with black markers. The resulting estimators for the parameters are given in Table 3. The result of the exponential regression for the Western European Metapopulation core, based on the optimal number of 26 bins, is shown in Fig. 2. Fig. 3 compares the new estimate, calculated using the refinements given here (removed intercept, assessment of Metapopulation cores and variable bootstrap binning), with the calculation based on the original publication [3,4] (full Western European Metapopulation database, fixed 15 bins) and with the observed frequencies. Each bar represents the count of haplotypes (y-axis) that occur a certain number of times (x-axis) in the database. Both estimates perform well, but the revised framework fits the actual observations better, especially in the case of more frequent haplotypes. The bootstrap program was implemented in Mathworks Matlab (http://www.mathworks.com/) using the Parallel Computing Toolbox. Because each bootstrap round is independent of the others, it scales almost linearly when parallelized on multiple computers or on a single computer with multiple CPUs.
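The bin-selection procedure of Section 2.2 can be sketched in Python as follows. This is a simplified illustration, not the Matlab program mentioned above, and it rests on several assumptions: precomputed (W, f) pairs are resampled instead of recomputing W and f from each resampled database, bins hold equal numbers of haplotypes, the exponential regressions are fitted by least squares on the logarithm of the bin mean and bin standard deviation, and percentile intervals are used for the 95% confidence intervals. All function names are hypothetical.

```python
import math
import random
import statistics
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]  # (W_i, f_i) for one haplotype


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def fit_regressions(pairs: Sequence[Pair], n_bins: int) -> Tuple[float, float, float, float]:
    """Bin by W (equal-count bins, an assumption) and fit r1, r2, s1, s2."""
    ordered = sorted(pairs)
    size = len(ordered) / n_bins
    w_means: List[float] = []
    log_mu: List[float] = []
    log_sigma: List[float] = []
    for b in range(n_bins):
        chunk = ordered[int(b * size):int((b + 1) * size)]
        if len(chunk) < 2:
            continue
        fs = [f for _, f in chunk]
        mu, sigma = statistics.fmean(fs), statistics.stdev(fs)
        if mu > 0 and sigma > 0:  # the logarithm needs positive values
            w_means.append(statistics.fmean([w for w, _ in chunk]))
            log_mu.append(math.log(mu))
            log_sigma.append(math.log(sigma))
    if len(w_means) < 2:
        raise ValueError("need at least two usable bins")
    r1, r2 = linear_fit(w_means, log_mu)
    s1, s2 = linear_fit(w_means, log_sigma)
    return r1, r2, s1, s2


def ci_width(values: Sequence[float], level: float = 0.95) -> float:
    """Width of the percentile interval of the bootstrap estimates."""
    ordered = sorted(values)
    lo = ordered[int((1 - level) / 2 * (len(ordered) - 1))]
    hi = ordered[int((1 + level) / 2 * (len(ordered) - 1))]
    return hi - lo


def optimal_bins(pairs: Sequence[Pair], candidates: Sequence[int],
                 rounds: int = 200, seed: int = 0) -> Tuple[int, float]:
    """Return (number of bins, average CI width) with minimal average width."""
    rng = random.Random(seed)
    best = None
    for n_bins in candidates:
        estimates = [fit_regressions(rng.choices(pairs, k=len(pairs)), n_bins)
                     for _ in range(rounds)]
        avg = statistics.fmean([ci_width([e[p] for e in estimates]) for p in range(4)])
        if best is None or avg < best[1]:
            best = (n_bins, avg)
    return best
```

Since the rounds are independent, the inner list comprehension is the natural place to parallelize, in line with the scaling behaviour noted above. The paper uses 2000 rounds; the default of 200 here only keeps the sketch fast.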

Table 1
Metapopulations and their resulting cores.

| Metapopulation | Samples | Core samples | max(FST) | avg(FST) |
|---|---|---|---|---|
| African – Afro-American | 15 | 14 | 0.021 | 0.004 |
| Afro-Asiatic – Semitic | 32 | 13 | 0.014 | 0.009 |
| East Asian – Japanese | 23 | 23 | 0.017 | 0.005 |
| East Asian – Korean | 13 | 13 | 0.018 | 0.007 |
| East Asian – Sino-Tibetan – Chinese | 16 | 14 | 0.016 | 0.009 |
| Eurasian – Altaic | 32 | 11 | 0.017 | 0.009 |
| Eurasian – European – Eastern European | 91 | 78 | 0.025 | 0.01 |
| Eurasian – European – South-Eastern European | 58 | 35 | 0.018 | 0.009 |
| Eurasian – European – Western European | 98 | 69 | 0.018 | 0.01 |
| Eurasian – Indian | 20 | 10 | 0.015 | 0.008 |
| Eurasian – Indo-Iranian | 20 | 11 | 0.016 | 0.009 |

Samples: original number of samples in the database; Core samples: samples identified by the algorithm given in Section 2.1; max(FST): maximal ΦST value for all population samples in this core; avg(FST): average ΦST value for all population samples in this core.
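The three core criteria of Section 2.1 (pairwise distance at most 0.05, average distance at most 0.01, at least two samples) can be checked for a candidate set as sketched below. This is an illustration only: it replaces the lp_solve optimisation by an exhaustive search that is feasible for a handful of samples but not for the 98 Western European samples, it takes the average as the plain mean over all selected pairs (an assumption about the normalisation), and the distance matrix and function names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def meets_core_criteria(F: Sequence[Sequence[float]], selected: Sequence[int],
                        max_pair: float = 0.05, max_avg: float = 0.01) -> bool:
    """Check the three constraints for the sub-population samples in `selected`."""
    if len(selected) < 2:
        return False
    pairs = [F[i][j] for i, j in combinations(selected, 2)]
    return max(pairs) <= max_pair and sum(pairs) / len(pairs) <= max_avg


def largest_core_bruteforce(F: Sequence[Sequence[float]]) -> Optional[Tuple[int, ...]]:
    """Largest subset satisfying the criteria, found by exhaustive search."""
    m = len(F)
    for size in range(m, 1, -1):  # try the largest subsets first
        for subset in combinations(range(m), size):
            if meets_core_criteria(F, subset):
                return subset
    return None  # no admissible core, as for 14 of the 25 Metapopulations


# Hypothetical pairwise distances: samples 0-2 are close, sample 3 is an outlier.
toy_F: List[List[float]] = [
    [0.000, 0.004, 0.006, 0.060],
    [0.004, 0.000, 0.008, 0.060],
    [0.006, 0.008, 0.000, 0.060],
    [0.060, 0.060, 0.060, 0.000],
]
core = largest_core_bruteforce(toy_F)
```

In the toy matrix the outlier sample violates the pairwise limit with every other sample, so the search drops it and keeps the remaining three.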

Table 2
Generated bin sizes for all Metapopulation cores.

| Metapopulation core | Haplotypes | Optimal number of bins | 95% CI width r1 | 95% CI width r2 | 95% CI width s1 | 95% CI width s2 |
|---|---|---|---|---|---|---|
| African – Afro-American | 2312 | 90 | 4.9227 | 0.6776 | 4.9134 | 0.6303 |
| Afro-Asiatic – Semitic | 2135 | 21 | 4.3783 | 0.5562 | 5.03668 | 0.5859 |
| East Asian – Japanese | 1835 | 13 | 4.5861 | 0.5759 | 5.7824 | 0.7032 |
| East Asian – Korean | 3685 | 35 | 3.2217 | 0.3705 | 3.9773 | 0.4813 |
| East Asian – Sino-Tibetan – Chinese | 7844 | 59 | 2.9283 | 0.3244 | 4.2361 | 0.4569 |
| Eurasian – Altaic | 1641 | 78 | 9.6451 | 1.0175 | 8.7315 | 0.9110 |
| Eurasian – European – Eastern European | 10,356 | 22 | 1.8711 | 0.2575 | 4.0369 | 0.5469 |
| Eurasian – European – South-Eastern European | 4164 | 29 | 4.3576 | 0.5482 | 5.9006 | 0.6965 |
| Eurasian – European – Western European | 17,340 | 26 | 1.5132 | 0.2069 | 2.5718 | 0.3232 |
| Eurasian – Indian | 1661 | 29 | 5.3347 | 0.6440 | 5.6280 | 0.6503 |
| Eurasian – Indo-Iranian | 864 | 26 | 7.6274 | 0.9018 | 8.1267 | 0.8949 |

Haplotypes: number of haplotypes in the corresponding Metapopulation core; Optimal number of bins: number of bins determined by the variable binning method (number of bins with the minimal average confidence interval width); 95% CI width: width of the 95% bootstrap confidence interval for the corresponding parameters.

3.3. Maximum likelihood estimation of regression parameters

The MLEs of the regression parameters are shown in Table 3. The estimates agree well with the estimates obtained by the variable binning method. The MLE method was implemented in the statistics software R, version 2.10.1 (http://www.R-project.org/) [14]. For the maximization of the likelihood, the function optim was applied using the Nelder–Mead simplex algorithm with up to 1500 iterations. The likelihood function is only meaningful for frequencies between 0 and 1. Outside this range it is defined as zero. Furthermore, the likelihood surface is rough with numerous local maxima. Therefore, the MLE method depends critically on a sensible choice of starting parameters. We chose the parameters of the variable binning method as starting parameters. As can be seen in Table 3, good estimates could be obtained for all populations.

3.4. Online tool

To enable the forensic scientist to calculate frequency estimates based on the haplotype repository of the YHRD, we extended the existing online Frequency Surveying tool available at http://yhrd.org. The frequency estimates for a given haplotype are calculated for all meaningful Metapopulations (see Table 1). The results are provided on a separate tab together with the database counts, grouped either by Metapopulations or by continents (see Fig. 4). In a future release of the YHRD the results of the MLE method will be made available.
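For readers who want to experiment with the MLE of Sections 2.3 and 3.3, the objective function can be sketched in Python as below. This is not the R implementation used for the results: the function names are hypothetical, the constant term \(\ln C\) is omitted because it does not depend on the regression parameters, and invalid parameter combinations return minus infinity, mirroring the convention that the likelihood is zero outside the meaningful range. A general-purpose optimizer such as Nelder–Mead would then be applied to the negative of this function.

```python
import math
from typing import List, Sequence


def dirichlet_alphas(w: Sequence[float], r1: float, r2: float,
                     s1: float, s2: float) -> List[float]:
    """Dirichlet parameters from the regressions via the moment relations of Section 2.3."""
    means = [math.exp(r1 * wi + r2) for wi in w]     # E(f_i)
    e1 = means[0]
    e1_sq = math.exp(s1 * w[0] + s2) ** 2 + e1 ** 2  # E(f_1^2) = Var(f_1) + E(f_1)^2
    scale = (e1 - e1_sq) / (e1_sq - e1 ** 2)
    alphas = [scale * m for m in means[:-1]]         # i = 1, ..., M-1
    alphas.append(scale * (1.0 - sum(means[:-1])))   # alpha_M
    return alphas


def log_likelihood(alphas: Sequence[float], counts: Sequence[int]) -> float:
    """Dirichlet-multinomial log-likelihood without the constant ln C."""
    if any(a <= 0 for a in alphas):
        return -math.inf  # likelihood defined as zero outside the valid range
    total = sum(math.lgamma(a + n) - math.lgamma(a) for a, n in zip(alphas, counts))
    total += math.lgamma(sum(alphas))
    total -= math.lgamma(sum(alphas) + sum(counts))
    return total
```

Working with `math.lgamma` instead of the gamma function itself avoids overflow for the large counts found in a database of this size.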

Fig. 1. Results of the bootstrapping method to determine the optimal number of bins for the Western European Metapopulation core. Black symbols represent the minimal average confidence interval width.

Fig. 2. Exponential regression of the expectation and standard deviation of the haplotype frequency (μ and σ) on haplotype similarity (W) in the Western European Metapopulation core, based on the optimal number of bins (26).

Fig. 3. Distribution of haplotype observations vs. frequency estimates in the Western European Metapopulation core. All estimates are based upon the full Western European Metapopulation database (18,845 haplotypes, release 34, July 2010). "Original estimate" refers to the Frequency Surveying method described in [3,4] with a fixed bin size of 15. "New estimate" refers to the refined Frequency Surveying method presented here.

Table 3
Exponential regression formula for each Metapopulation core for μ and σ.

| Metapopulation core | μ(W) (binning) | σ(W) (binning) | μ(W) (MLE) | σ(W) (MLE) |
|---|---|---|---|---|
| African – Afro-American | exp(30.79W − 11.69) | exp(21.57W − 9.74) | exp(32.19W − 15.15) | exp(25.55W − 9.04) |
| Afro-Asiatic – Semitic | exp(32.41W − 11.59) | exp(25.33W − 9.69) | exp(31.81W − 13.23) | exp(28.50W − 8.89) |
| East Asian – Japanese | exp(27.77W − 10.71) | exp(22.84W − 9.09) | exp(27.99W − 12.71) | exp(25.07W − 8.23) |
| East Asian – Korean | exp(27.30W − 10.96) | exp(23.58W − 9.61) | exp(27.29W − 11.62) | exp(25.39W − 8.29) |
| East Asian – Sino-Tibetan – Chinese | exp(29.34W − 11.74) | exp(18.16W − 9.58) | exp(30.54W − 13.48) | exp(18.00W − 7.20) |
| Eurasian – Altaic | exp(14.06W − 8.80) | exp(11.03W − 7.51) | exp(17.09W − 14.96) | exp(13.68W − 8.09) |
| Eurasian – European – Eastern European | exp(24.89W − 11.98) | exp(22.98W − 10.58) | exp(26.43W − 15.67) | exp(24.98W − 10.29) |
| Eurasian – European – South-Eastern European | exp(41.34W − 13.09) | exp(32.12W − 11.06) | exp(41.16W − 14.91) | exp(36.77W − 10.45) |
| Eurasian – European – Western European | exp(30.82W − 13.17) | exp(28.95W − 11.71) | exp(30.75W − 13.45) | exp(31.88W − 11.32) |
| Eurasian – Indian | exp(32.89W − 11.24) | exp(24.74W − 9.32) | exp(35.00W − 15.44) | exp(28.53W − 9.57) |
| Eurasian – Indo-Iranian | exp(34.17W − 10.96) | exp(26.04W − 8.97) | exp(34.30W − 11.23) | exp(27.86W − 7.37) |

μ(W) (binning) and σ(W) (binning): results of the variable binning method with the optimal number of bins (see Table 2); μ(W) (MLE) and σ(W) (MLE): results of the MLE method.

Fig. 4. Screenshot of the online Frequency Surveying tool (YHRD release 34).

3.5. Casework example

To demonstrate the application of this method in forensic casework we point out the following case (R v KOCH, NSW District Court, Australia, October 2008): A man was indicted for attempted murder of his former girlfriend by strangulation. The glove he had allegedly used gave a clear Y-STR profile but no autosomal signals. Both Y-chromosomal profiles matched. The Y-STR profile of the suspect, who had declared himself to be of Western European descent, was not found in the database (YHRD release 31), so we calculated a frequency estimate and compared it to the estimate of the counting method (f = 1/(N + 1)) based on the Western European Metapopulation sample. Both values were of the same order of magnitude (2.16 × 10^−5 and 6.12 × 10^−5) and therefore confirmed the rareness of the haplotype and the significance of the match, despite its non-observation, on the basis of its molecular composition. The accused confessed to the crime and was convicted.

4. Discussion

With almost 90,000 haplotypes contributed to the collaborative YHRD, this project shows the importance and the potential of Y-STRs in forensic casework. In a non-exclusion scenario, it is crucial to know the Y-STR frequency taken from a reference database, which can be used to assess a match between a stain and a suspect. Plain counting may lead to meaningless results (f = 0), and using the counting method (virtually including the profile of the query in the database) underestimates the discrimination power of Y-STRs. The counting method itself could be seen as a Bayesian approach with a uniform prior distribution. In contrast, the Frequency Surveying framework takes the allelic composition of each haplotype in a certain database (Metapopulation core) into account and generates prior and posterior frequency distributions. Since the prior distribution does not include the haplotype in question in the calculation, we are able to estimate frequency distributions even for unobserved haplotypes (Ni is set to 1). We extended this approach to:

• allow a variable number of bins used to characterize the relation between the molecular composition of a haplotype and the frequency using bootstrapping;
• apply the theoretically superior method of MLE for the parameter estimation, which is independent of a binning number;
• handle bigger datasets;
• extend this method to meaningful Metapopulations other than the European one, using an algorithm to score inclusion/exclusion of populations neutrally.

We found that, together with the approaches covered here, the Frequency Surveying Method can be expanded beyond the European population sample used in the original publication [3] and is therefore a reliable and robust tool to estimate Y-STR haplotype frequencies for worldwide populations.

Acknowledgements

We invite all scientists to apply for data access at the YHRD. See http://yhrd.org/Research for more details.

References

[1] L. Roewer, Y chromosome STR typing in crime casework, Forensic Sci. Med. Pathol. 5 (2) (2009) 77–84.
[2] C.H. Brenner, Fundamental problem of forensic mathematics – the evidential value of a rare haplotype, Forensic Sci. Int. Genet. January (2010).
[3] L. Roewer, M. Kayser, P. de Knijff, K. Anslinger, A. Betz, A. Caglia, D. Corach, S. Füredi, L. Henke, M. Hidding, H.J. Krügel, R. Lessig, M. Nagy, V.L. Pascali, W. Parson, B. Rolf, C. Schmitt, R. Szibor, J. Teifel-Greding, M. Krawczak, A new method for the evaluation of matches in non-recombining genomes: application to Y-chromosomal short tandem repeat (STR) haplotypes in European males, Forensic Sci. Int. 114 (October (1)) (2000) 31–43.
[4] M. Krawczak, Forensic evaluation of Y-STR haplotype matches: a comment, Forensic Sci. Int. 118 (May (2–3)) (2001) 114–115.
[5] S. Willuweit, L. Roewer, International Forensic Y Chromosome User Group, Y chromosome haplotype reference database (YHRD): update, Forensic Sci. Int. Genet. 1 (June (2)) (2007) 83–87.
[6] L. Roewer, M. Krawczak, S. Willuweit, M. Nagy, C. Alves, A. Amorim, K. Anslinger, C. Augustin, A. Betz, E. Bosch, A. Caglia, A. Carracedo, D. Corach, A.F. Dekairelle, T. Dobosz, B.M. Dupuy, S. Füredi, C. Gehrig, L. Gusmao, J. Henke, L. Henke, M. Hidding, C. Hohoff, B. Hoste, M.A. Jobling, H.J. Krügel, P. de Knijff, R. Lessig, E. Liebeherr, M. Lorente, B. Martinez-Jarreta, P. Nievas, M. Nowak, W. Parson, V.L. Pascali, G. Penacino, R. Ploski, B. Rolf, A. Sala, U. Schmidt, C. Schmitt, P.M. Schneider, R. Szibor, J. Teifel-Greding, M. Kayser, Online reference database of European Y-chromosomal short tandem repeat (STR) haplotypes, Forensic Sci. Int. 118 (May (2–3)) (2001) 106–113.
[7] M. Kayser, S. Brauer, S. Willuweit, H. Schädlich, M.A. Batzer, J. Zawacki, M. Prinz, L. Roewer, M. Stoneking, Online Y-chromosomal short tandem repeat haplotype reference database (YHRD) for U.S. populations, J. Forensic Sci. 47 (May (3)) (2002) 513–519.
[8] R. Lessig, S. Willuweit, M. Krawczak, F.-C. Wu, C.-E. Pu, W. Kim, L. Henke, J. Henke, J. Miranda, M. Hidding, M. Benecke, C. Schmitt, M. Magno, G. Calacal, F.C. Delfin, M. Corazon, A. de Ungria, S. Elias, C. Augustin, Z. Tun, K. Honda, M. Kayser, L. Gusmao, A. Amorim, C. Alves, Y. Hou, C. Keyser, B. Ludes, M. Klintschar, U.D. Immel, B. Reichenpfader, B. Zaharova, L. Roewer, Asian online Y-STR haplotype reference database, Leg. Med. (Tokyo) 5 (March (Suppl. 1)) (2003) S160–S163.
[9] L. Roewer, P.J.P. Croucher, S. Willuweit, T.T. Lu, M. Kayser, R. Lessig, P. de Knijff, M.A. Jobling, C. Tyler-Smith, M. Krawczak, Signature of recent historical events in the European Y-chromosomal STR haplotype distribution, Hum. Genet. 116 (March (4)) (2005) 279–291.
[10] L. Excoffier, P.E. Smouse, J.M. Quattro, Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data, Genetics 131 (June (2)) (1992) 479–491.
[11] M. Slatkin, Hitchhiking and associative overdominance at a microsatellite locus, Mol. Biol. Evol. 12 (May (3)) (1995) 473–480.
[12] B. Efron, Bootstrap methods: another look at the jackknife, Ann. Stat. 7 (1) (1979) 1–26.
[13] S. Kotz, N. Balakrishnan, N.L. Johnson, Continuous Multivariate Distributions, Volume 1: Models and Applications, 2nd edition, Wiley, 2000.
[14] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria (2009) ISBN: 3-900051-07-0.
