Maximising sensitivity for detecting changes in protein expression: Experimental design using minimal CyDyes


Proteomics 2005, 5, 3105–3115


DOI 10.1002/pmic.200500083

REGULAR ARTICLE

Natasha A. Karp and Kathryn S. Lilley

Department of Biochemistry, Cambridge University, Cambridge, UK

DIGE is a powerful tool for measuring changes in protein expression between samples. Here we assess the assumptions of normality and homogeneity of variance that underlie the univariate statistical tests routinely used to detect proteins with expression changes. The technical variance experienced in a multigel experiment is also assessed and found to be reproducible within and across sample types. Using the measured technical variance, a power study is completed for several ‘typical’ fold changes in expression commonly used as thresholds by researchers. Based on this study using DeCyder, guidance is given on the number of gel replicates needed for an experiment to have sufficient sensitivity to detect expression changes. A two-dye system using just Cy3 and Cy5 was found to be more reproducible than the three-dye system. However, a power and cost-benefit analysis suggests that the traditional three-dye system would use fewer resources in studies where multiple samples are compared. Technical variance was shown to encompass both experimental and analytical noise and is thus dependent on the analytical software used. Data are provided as a resource to the community to assess alternative software and upgrades.

Received: February 15, 2005 Revised: March 17, 2005 Accepted: March 17, 2005

Keywords: DIGE / Expression proteomics / Power / 2-DE / Variation

1

Introduction

2-DE is a common method used to simultaneously separate and quantitate thousands of proteins in order to assess changes in protein levels from one state to another. Proteins are separated according to their pI and molecular weight. The resulting spot patterns can be visualised by labelling the sample prior to separation, e.g. with radioactivity [1] or fluorescence [2–5], or post-separation with total protein stains such as colloidal CBB [6], SYPRO Ruby or Deep Purple [7, 8]. The gels are converted to digital images using scanning devices and these images are processed to detect the spots, quantitate the spot volumes and match the spot patterns across different gels. Statistical methods are then employed to detect protein spots with statistically significant changes in expression. Early comparative proteomics relied on using images from different gels. Gel-to-gel variation, however, made detection and quantification of differences in protein expression problematic. A significant advance in reproducibility was made by Ünlü et al. [9], who multiplexed fluorescently resolvable dyes (CyDyes) within a gel, enabling the comparison of multiple samples within the same gel. Gel running effects are thus shared by the samples within a gel, allowing a more accurate comparison of spot volume. This approach has been commercialised by GE Healthcare (Sweden) (formerly Amersham Biosciences). A multigel approach has been demonstrated to significantly improve accuracy of quantitation by using one dye to label a standard sample that is present in every gel [10]. This standard sample is used to match the spot patterns across a gel series and to calculate a standardised abundance (SA) value for a spot, which can then be compared across many gels.

Correspondence: Dr. Kathryn Lilley, Department of Biochemistry, University of Cambridge, Building O, Downing Site, Cambridge, CB2 1QW, UK
E-mail: [email protected]
Fax: +44-1223-333-345

Abbreviations: ASB14, amidosulfobetaine-14; ECA, Erwinia carotovora; log10SA, log10(standardised abundance); SA, standardised abundance

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

www.proteomics-journal.de
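As a rough illustration of the internal-standard idea described in the Introduction, the sketch below computes a standardised abundance for one spot as the ratio of the sample spot volume to the standard spot volume in the same gel, and then takes its log10. The simple ratio, the function names and the spot volumes are assumptions made for illustration only; the normalisation applied by the analysis software is not reproduced here.

```python
import math

def standardised_abundance(sample_volume, standard_volume):
    """Ratio of a sample spot volume to the internal-standard spot volume in the same gel."""
    if sample_volume <= 0 or standard_volume <= 0:
        raise ValueError("spot volumes must be positive")
    return sample_volume / standard_volume

def log10_sa(sample_volume, standard_volume):
    """log10 of the standardised abundance."""
    return math.log10(standardised_abundance(sample_volume, standard_volume))

# Hypothetical volumes for the same spot on two gels that ran differently:
# the raw sample volumes differ, but the ratios to the in-gel standard agree.
gel_a = log10_sa(sample_volume=12000.0, standard_volume=10000.0)
gel_b = log10_sa(sample_volume=6000.0, standard_volume=5000.0)
print(gel_a, gel_b)
```

Because both channels of a gel share the same running effects, dividing by the in-gel standard is what makes values from different gels comparable in this sketch.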


N. A. Karp and K. S. Lilley

Two types of CyDyes are commercially available: the Minimal CyDyes and the Saturation CyDyes. The first-generation Minimal CyDyes, supplied as N-hydroxysuccinimidyl ester derivatives, react with the epsilon amino group of lysine residues in 5% of the proteins in a sample [11]. Recently, new dyes have become available that maximally label cysteine residues through a maleimide group [11], which increases the sensitivity and thus reduces the quantity of sample required. This study focusses on the minimal dyes, which are currently well established in the field. Regardless of the visualisation method used, the data generated can be analysed with a variety of statistical approaches. These include thresholds above which a change is considered significant [3, 5, 12], univariate methods [2, 4] and multivariate methods [4, 13]. Univariate methods, e.g. Student’s t-test, examine each protein spot (a variable with a number of observations) individually to detect significant changes in that protein’s expression. Multivariate methods, e.g. principal components analysis, utilise all the protein spot data (all observations for all variables) simultaneously to look for patterns in expression changes. The univariate approach gives strong candidates that have had significant changes in expression, whilst the multivariate approaches can detect more subtle changes across sets of proteins that work in concert, such as those in a pathway. Both approaches have a role to play in data analysis; however, the univariate method is currently the simplest to interpret and, consequently, this study focusses on the use of univariate methods. Univariate methods, such as Student’s t-test, calculate the probability (p) that the groups to be compared are the same (i.e., there is no difference in protein expression) and that any difference arises from sampling variation.
An expression change is deemed significant if the calculated p value falls below a prescribed level, typically 0.05 (the ‘nominal significance level’). Two types of errors are possible. A Type I (α), or false-positive, error occurs when a spot is erroneously declared to be differentially expressed. A Type II (β), or false-negative, error occurs when the test fails to detect a differentially expressed spot. Power (1 − β) is the ability of a univariate test to detect a change; it depends on the variance (noise), the effect size (change in expression), the number of replicates and the nominal significance level the researcher sets. To increase the power for a given technique, the researcher has most control over the number of replicates. Replication is hence necessary to distinguish between true differences in expression and random fluctuations, but increasing the number of replicates beyond a certain point has little impact on the power. Different types of replicates exist and are related to the noise source. In this study, technical variability is used to describe the fluctuations that are independent of the protein source; hence technical replicates are replicate gels from the same protein sample, whereas biological replicates are different extractions.

Determining the number of replicates needed for a technique requires an estimate of the variance. An undersized study will be a waste of resources, as it will not have the capacity to detect scientifically important changes as statistically significant, whilst an oversized study will use more resources than are necessary. Variance in 2-DE has been studied in a variety of visualisation systems [14–18], but not utilising the internal standard approach. For the DIGE system, the underlying statistical assumptions and their impact on experimental design have yet to be questioned in detail. Same-same DIGE experiments, where an identical sample is used across the multigel experiment, were completed for a wide range of biological samples. These data sets were then used to investigate the technical noise and to assess the underlying assumptions that the data are normally distributed and that the variance is independent of the mean (homogeneity of variance). The impact of different CyDye combinations on the noise is also considered. From the estimation of variance, power calculations are completed. This will make researchers aware of the degree of variation within an experiment so that experiments can be adequately designed, ensuring that the desired sensitivity is obtained and that conclusions can be drawn accurately.
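The dependence of power on noise, effect size, replicate number and significance level can be illustrated with a small simulation. The sketch below is not the power tool used in this study: it estimates power by repeatedly simulating a two-group experiment and applying a Student's t-test, under the assumptions that the data are normally distributed on the log10 scale, that the effect size is log10(fold change) as defined in Section 2.7, and that the significance level is 0.01. The SD of 0.1 is a hypothetical noise level, and the critical values are standard two-sided t-table entries.

```python
import math
import random

# Two-sided critical t values at a significance level of 0.01, keyed by
# degrees of freedom (2n - 2 for two groups of n). Standard table values.
T_CRIT_ALPHA_001 = {2: 9.925, 4: 4.604, 6: 3.707, 8: 3.355, 10: 3.169, 12: 3.055}

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance

def simulated_power(fold_change, sd, n, trials=5000, seed=1):
    """Fraction of simulated experiments in which a two-sample t-test
    (equal variances, n replicates per group) detects the change."""
    rng = random.Random(seed)
    effect = math.log10(fold_change)  # difference in mean log10 abundance
    t_crit = T_CRIT_ALPHA_001[2 * n - 2]
    detected = 0
    for _ in range(trials):
        group1 = [rng.gauss(0.0, sd) for _ in range(n)]
        group2 = [rng.gauss(effect, sd) for _ in range(n)]
        mean1, var1 = mean_and_variance(group1)
        mean2, var2 = mean_and_variance(group2)
        pooled_var = (var1 + var2) / 2  # equal group sizes
        std_err = math.sqrt(2 * pooled_var / n)
        if abs((mean2 - mean1) / std_err) > t_crit:
            detected += 1
    return detected / trials

# Hypothetical noise level (SD of 0.1 on the log10 scale), two-fold change.
for replicates in (3, 4, 5, 6):
    print(replicates, simulated_power(2.0, sd=0.1, n=replicates))
```

With a fold change of 1.0 the same function estimates the false-positive rate, which should sit near the chosen significance level.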

2

Materials and methods

2.1 Data sets

To assess the reproducibility of technical noise, three sets of six same-same gels were obtained using an Erwinia carotovora (ECA) wild-type sample. In a same-same gel, three 50 µg portions of sample were labelled individually with Cy2, Cy3 and Cy5, pooled and separated by 2-DE as detailed in the following sections. To assess the reproducibility of technical noise across sample types, six same-same gels were obtained for mouse brain, mouse heart, mouse liver and the bacterial ECA soluble protein extract.

2.2 Sample preparation

Bacterial samples were grown in liquid broth media (10 g/L Bacto Tryptone, 5 g/L Bacto Yeast Extract, 5 g/L sodium chloride) at 30 °C, agitated at 300 rpm overnight and harvested by centrifugation for 10 min at 4 °C at 5000 rpm. Cells were resuspended in lysis buffer (8 M urea, 2% w/w amidosulfobetaine-14 (ASB14), 5 mM magnesium acetate, 10 mM Tris pH 8.0 and protease inhibitor cocktail set I at 1× concentration (Calbiochem, Germany)) and lysed by sonication (3 × 10 s pulses on ice). Brain, liver and heart tissue were harvested from a Wistar rat 24 h postbirth. Tissues were homogenised in lysis buffer using a motorised pestle and the cells lysed by three cycles of freeze-thawing and sonication.

Bioinformatics


Following lysis of samples, the soluble protein fractions were harvested by centrifugation at 13 000 rpm for 10 min at 4 °C and the pellet discarded. Samples were precipitated using 100 mM ammonium acetate in methanol and the subsequent pellets resuspended in lysis buffer. Protein concentrations were determined using the Bio-Rad DC protein assay as described by the manufacturer (Bio-Rad, UK).


2.3 CyDye labelling

Samples were labelled using the fluorescent Cyanine dyes developed for DIGE (GE Healthcare) following the manufacturer’s recommended protocols. Fifty micrograms of protein were labelled with 400 pmol of amine-reactive Cyanine dye, freshly dissolved in anhydrous dimethyl formamide. The labelling reaction was incubated at room temperature in the dark for 30 min and was terminated by addition of 10 nmol lysine. Equal volumes of 2× sample buffer (7 M urea, 2 M thiourea, 2% ASB14, 20 mg/mL DTT and 2% Pharmalytes 3–10) were added to each of the labelled protein samples and the three samples were mixed. Rehydration buffer (7 M urea, 2 M thiourea, 2% ASB14, 2 mg/mL DTT and 1% Pharmalytes 3–10) was added to make up the volume to 250 µL prior to IEF.

2.4 Protein separation by 2-DE

Nonlinear IPG strips (13 cm long), pH 3–10 (GE Healthcare) were rehydrated with CyDye-labelled samples for 10 h at 20 °C at 20 V using the IPGphor II apparatus following the manufacturer’s instructions (GE Healthcare). IEF was performed for a total of 40 000 Vh at 20 °C at 10 mA. Prior to SDS-PAGE, the strips were each equilibrated for 15 min in 100 mM Tris pH 6.8, 30% glycerol, 8 M urea, 1% SDS, 0.2 mg/mL bromophenol blue on a rocking table. The strips were loaded onto a 12%, pH 8.8, 13 cm (1 mm thick) acrylamide gel with a 1 cm 4%, pH 6.8, stacker gel. The strips were overlaid with 1% agarose in SDS running buffer containing 5 mg of bromophenol blue. The gels were run at 20 mV for 15 min and then at 40 mV at 20 °C until the bromophenol blue dye front had run off the bottom of the gels. A running buffer of 25 mM Tris pH 8.3, 192 mM glycine, and 0.1% SDS was used.

2.6 Multigel analysis

Gel analysis was performed using DeCyder BVA V5.0 (GE Healthcare), a 2-DE analysis software package designed specifically for DIGE, following the manufacturer’s recommendations. The estimated number of spots for each codetection was set to 2500.

2.7 Assessing data properties and power

Normalised spot volume was exported for spots that had been matched across all gels in the series and was used to calculate the SA. Normality of the SA for every spot was assessed using the Shapiro-Wilk statistical test. The Shapiro-Wilk test is a goodness-of-fit test that assesses whether a random sample comes from a normal distribution and was developed for small samples [19]. Spots with a significance score of less than 0.05 were considered to have failed the test, i.e. to have a non-normal distribution. The Lenth power tool [20] was used to calculate the power for various variances. Within DeCyder the log10(standardised abundance) (log10SA) is used in the statistical tests, whilst for the ratio change the SA values are used. To calculate the effect size the following arguments were used, where gp1 refers to group 1 and gp2 refers to group 2:

Null hypothesis: $\log_{10}\mathrm{SA}_{\mathrm{gp1}} - \log_{10}\mathrm{SA}_{\mathrm{gp2}} = 0$ (1)

Alternative hypothesis: $\log_{10}\mathrm{SA}_{\mathrm{gp1}} - \log_{10}\mathrm{SA}_{\mathrm{gp2}} = \text{effect size}$ (2)

Using the mathematical rule $\log(X/Y) = \log X - \log Y$, the alternative hypothesis can be rearranged to:

Alternative hypothesis: $\log_{10}(\mathrm{SA}_{\mathrm{gp1}}/\mathrm{SA}_{\mathrm{gp2}}) = \text{effect size}$ (3)

DeCyder calculates the ratio change between the average group one SA and the average group two SA; hence the ratio $\mathrm{SA}_{\mathrm{gp1}}/\mathrm{SA}_{\mathrm{gp2}}$ in Eq. (3) can be substituted with the fold change required:

Alternative hypothesis: $\log_{10}(\text{fold change}) = \text{effect size}$ (4)

Hence for a 1.5-fold change, the effect size will be 0.176.
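The conversion in Eq. (4) is a one-line calculation. The sketch below evaluates it for the three fold changes considered in this study; the function name is illustrative, and taking the absolute value (so that a halving and a doubling give the same effect size) is an assumption rather than something stated in the derivation.

```python
import math

def effect_size(fold_change):
    """Effect size on the log10(SA) scale for a given fold change, following Eq. (4)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return abs(math.log10(fold_change))

for fc in (1.25, 1.5, 2.0):
    print(f"{fc}-fold change -> effect size {effect_size(fc):.3f}")
```

For a 1.5-fold change this gives 0.176, matching the value quoted in the text.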

2.5 Gel imaging

Labelled proteins were visualised using a Typhoon™ 9410 imager (GE Healthcare). The Cy3 images were scanned using a 532 nm laser and a 580 nm band pass (BP) 30 emission filter. Cy5 images were scanned using a 633 nm laser and a 670 nm BP30 emission filter. Cy2 images were scanned using a 488 nm laser and a 520 nm BP40 emission filter. All gels were scanned at 100 µm resolution. The PMT was set to ensure a maximum pixel intensity between 40 000 and 60 000.

3

Results

3.1 Normality

The univariate statistical tests currently being applied to identify spots with significant changes in expression assume that the distribution of scores is normal. Frequently, transformations (e.g., a log function) are applied to improve the distribution characteristics. Depending on the software, either log10SA (DeCyder, GE Healthcare) or SA (Progenesis, Nonlinear Dynamics, and Delta2D, Decodon) is utilised with DIGE data. Table 1 shows the percentage of protein spots that failed the normality test for each data set obtained from DeCyder, for both SA and log10SA. Data set ECA-2 had a lower number of protein spots matched across the gel set, as two of the gels in the series were less well resolved. Across the data sets an average of 6.0 and 4.0% of the spots failed the normality test for the SA and log10SA, respectively. As 5% of spots are expected to fail through random sampling effects alone, the assumption of normality was found to hold for both the SA and log10SA.

Table 1. Assessing normality of the SA using the Shapiro-Wilk goodness-of-fit test

Dataset   Number of protein spots   Data type    Spots with a significance score < 0.05 (%)
ECA-1     980                       SA           4.5
ECA-1     980                       Log10(SA)    2.1
ECA-2     561                       SA           12.2
ECA-2     561                       Log10(SA)    8.2
ECA-3     906                       SA           8.5
ECA-3     906                       Log10(SA)    5.2
Heart     784                       SA           4.8
Heart     784                       Log10(SA)    3.3
Liver     747                       SA           3.3
Liver     747                       Log10(SA)    4.0
Brain     830                       SA           2.9
Brain     830                       Log10(SA)    2.7

3.2 Homogeneity of variance

An underlying assumption in the statistical tests being applied is that samples are obtained from populations of equal variance (described as homogeneity of variance). Variance heterogeneity can be divided into two main types. In the first type the variance depends on the mean of the signal, whilst in the second it depends on the experimental conditions. The first type is inherent in the technique but can be reduced by the appropriate application of variance-stabilising transformations, and it is this type of variance heterogeneity that is investigated in this article. Within the microarray community this problem has been considered extensively, and various variance-stabilising transformations, such as the logarithmic [21] or the arsinh function [22], have been proposed. To assess the homogeneity of variance, the dependence of the SD (variance) on the signal intensity of the SA and log10SA was assessed by visual plots and a Pearson test of correlation (Table 2 and Fig. 1). For very large samples (i.e., n > 100) very small correlations can become significant [23]; hence the percentage variance shared was calculated for those with a significant correlation (p < 0.01) to assess the size of the relationship. Unlike the study by Gustafsson et al. [14] on 2-DE with 35S labelling, which found a strong dependence between variance and intensity, only a weak, sample-dependent correlation was found between the SD and mean signal.
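The size of a mean-versus-SD relationship can be summarised from the Pearson correlation coefficient. In the sketch below the ‘percentage variance shared’ is taken to be the squared correlation coefficient expressed as a percentage, which is the usual definition but is an assumption here since the formula is not spelled out in the text. The per-spot means and SDs are hypothetical values, not data from this study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def percentage_variance_shared(xs, ys):
    """Square of the correlation coefficient, expressed as a percentage."""
    return 100.0 * pearson_r(xs, ys) ** 2

# Hypothetical per-spot summaries: mean and SD of log10SA across a gel series.
spot_means = [-0.02, 0.00, 0.01, 0.03, 0.10, 0.25]
spot_sds = [0.03, 0.02, 0.04, 0.03, 0.05, 0.09]
print(round(percentage_variance_shared(spot_means, spot_sds), 1))
```

Reporting the shared variance alongside the p value guards against reading too much into a correlation that is significant only because many hundreds of spots were tested.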
Figure 1. Assessing homogeneity of variance by examining the relationship between mean and SD of the log10SA signal across a variety of data sets. A: Brain, B: Liver and C: ECA-1.

Table 2. Assessing heterogeneity of variance by looking for a correlation between mean and SD of the SA and log10SA data obtained for a same-same study across a variety of samples

Dataset   Data type    Percentage variance shared
ECA-1     SA           59.9
ECA-1     Log10(SA)    21.2
ECA-2     SA           63.0
ECA-2     Log10(SA)    25.2
ECA-3     SA           83.7
ECA-3     Log10(SA)    40.8
Heart     SA           34.6
Heart     Log10(SA)    9.7
Liver     SA           18.8
Liver     Log10(SA)    3.0
Brain     SA           15.8
Brain     Log10(SA)    Not significant

For all data sets, logging the SA reduces the correlation, and for the heart, liver and brain data sets no significant correlation was found after applying the log transformation. For these same-same studies, the log10SA should centre around zero. For the ECA data sets a significant proportion of the data exhibited the expected cluster around zero; however, a group of spots deviates from this and appears to show a correlation between the mean and SD. These outliers are proposed to arise from a protein-dependent labelling artefact. A mean value that deviates from zero in these same-same studies suggests that preferential labelling is occurring for these spots (see Fig. 2 for an example). If this preferential labelling affects only the Cy3 or the Cy5 reaction, then the SD would automatically increase as the log10SA distribution would become bimodal. Furthermore, the larger the difference in preferential labelling, as shown by a higher mean value, the greater the difference between the SA values obtained from the Cy3/Cy2 versus the Cy5/Cy2 system, leading to a higher SD. Hence a correlation would be obtained between the mean and SD. Studies later in this manuscript (Section 3.5) also suggest that preferential labelling is a variable process, as shown by an increase in the SD for these spots. Protein-specific preferential labelling by a CyDye has previously been reported by Tonge et al. [16]. Since this effect is protein dependent, a change in signal due to an expression change would not trigger a change in the variance. From these studies, we can conclude that after a log transformation the correlation is minor, such that the assumption of homogeneity of variance will hold. Consequently, further studies within this manuscript focus on the log10SA.
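The argument that preferential labelling of one dye produces a correlation between mean and SD can be checked numerically. The sketch below pools six Cy3/Cy2 and six Cy5/Cy2 log10SA values for one spot and shifts only the Cy5 channel by a fixed bias. The noise values and bias sizes are hypothetical, and modelling the artefact as a constant offset is a simplifying assumption.

```python
import statistics

# Hypothetical technical noise around zero for six gels (log10SA units).
noise = [-0.02, -0.01, 0.00, 0.00, 0.01, 0.02]

def pooled_mean_sd(dye_bias):
    """Mean and SD of 12 log10SA values when one dye channel is offset by dye_bias."""
    cy3_over_cy2 = noise                          # unaffected channel
    cy5_over_cy2 = [x + dye_bias for x in noise]  # preferentially labelled channel
    pooled = cy3_over_cy2 + cy5_over_cy2
    return statistics.mean(pooled), statistics.stdev(pooled)

for bias in (0.0, 0.1, 0.2, 0.4):
    mean, sd = pooled_mean_sd(bias)
    print(f"bias {bias:.1f}: mean {mean:.3f}, SD {sd:.3f}")
```

In this toy model the pooled mean moves to half the bias while the SD grows with it, so spots with a larger labelling bias show both a larger mean and a larger SD.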

3.3 Reproducibility of the variance

To assess reproducibility of variance, the same-same approach was completed three times for the same ECA sample and then across an additional three sample types (liver, heart and brain tissue). Frequency plots of the SD (variance) experienced in a data set highlighted that a non-normal distribution of scores is obtained (Fig. 3); consequently, confidence boundaries cannot easily be obtained. To allow comparison across data sets, the SD obtained at various percentile positions was used to compare the scores obtained at various points within the distribution (Fig. 4B). To give the SD context, the values obtained were compared to the values obtained if an internal standard approach was not utilised (Fig. 4A). The variation seen across sample types was greater than that seen within a sample type (Fig. 4B); however, the variation is low and it can be concluded that the dependence on sample type is not significant.

Figure 2. An example of the preferential labelling effect that is seen with the ECA sample. Gel image A was obtained when an ECA wild-type sample was labelled with Cy3, whilst gel image B was obtained when the same sample was labelled with Cy5. The majority of spots have a similar intensity; however, the spot in the upper right side of the image is significantly more intense in the Cy5 image.

Figure 3. The frequency distribution obtained for the SD of the log10SA from the ECA-1 data set as an example of a typical distribution.

Figure 4. SD versus the percentile position for various data sets. (A) Impact of including the internal standard sample on the noise experienced, shown by comparing the SD for the ECA-1 data set with and without the internal standard sample. (B) Reproducibility of the SD obtained across the various samples. The percentile position of an SD is its relative position in the range of values obtained; for example, the 65th percentile is defined as the lowest score that is larger than 65% of the scores obtained.

3.4 Power

To assess sample size requirements for a study, a power analysis was completed for typical fold changes that researchers are interested in (target fold changes) (Fig. 5). In general, a target power of 0.8 is recommended [23]. Power depends on the variance, the significance level, the size of the difference between means (effect size) and the sample size. The variance that encompassed 75% of the spots (Section 3.3), averaged across the same-same data sets, was used as the estimate of technical noise. A significance level of 0.01 was used, as this is recommended by the manufacturer and is currently used by the community. In expression studies many thousands of statistical tests are conducted, one for each protein spot, and a substantial number of false positives may accumulate. This is known as the problem of multiple testing. Various approaches have been discussed and utilised within the microarray field to address this problem, and they frequently involve adjustments to the significance level [24]. The multiple testing issue is complex to address as not all the spots, and hence not all the tests, are independent. This lack of independence arises because proteins in vivo influence each other in complex interactions and a protein can be represented in multiple positions. The issue is more complex for 2-DE than for microarray systems owing to the presence of artefact spots, which could be generated by similar effects or system artefacts and thus could be connected. The issue of multiple testing is consequently beyond the scope of this study and no adjustment is currently made for this effect. The effect of increasing replicate number on power, and hence on the ability to detect expression changes, can be seen clearly (Fig. 5). As the fold change decreases, the number of replicates required increases significantly. By considering power it can be seen that utilising only three replicates in an experiment will result in only a 65% chance of detecting a two-fold change; by adding just one more replicate this can be increased to a 95% chance, reducing false negatives.

Figure 5. Relationship between power and number of replicates in detecting various fold changes when the variance encompasses the technical noise seen with 75% of spots. For the target power of 0.8, four replicates are required for a two-fold change, seven replicates for a 1.5-fold change and 18 replicates for a 1.25-fold change.

3.5 Impact of CyDye type usage

In an earlier study, Tonge et al. [16] considered the reproducibility of results in a pair-wise comparison for different CyDye combinations and found that Cy3 versus Cy5 was ‘marginally less variable and more reliable than other dye combination’. The Cy2 dye is a weaker fluor (unpublished observations) and hence the signal is closer to background. For each data set the log10SA was calculated using different dye combinations and the average SD calculated across the spot series. The Cy3 and Cy5 combination was found to give lower variance (Fig. 6). The assumptions of normality and homogeneity of variance were assessed and found to hold for each combination (Table 3). A slightly higher average failure rate (7.2 vs. 6.0%) was observed compared to the earlier study (Section 3.1). This probably arises from the lower number of data points available (6 data points rather than 12 in each study), which increases the effect of outliers. For some dye combinations the ECA data sets exhibited more correlation between the mean and SD. From earlier studies (Section 3.2), the ECA sample is known to have a protein preferential labelling issue, which leads to mean values for a subset of spots deviating from zero. The correlation between SD and mean for these spots suggests that the preferential labelling artefact is variable and thus increases the noise seen. For these spots specifically, the power to detect change will be reduced.

The variance figures that encompass 75% of the spots were used to complete a power calculation to investigate whether the difference in variance was large enough to affect the number of replicates required (Fig. 7). For both a two-fold and a 1.5-fold change the difference in variance lowered the number of replicates required for a power greater than 0.8. Utilising Cy3 with Cy5 gave the lowest noise and hence reduced the number of replicates required. However, the more traditional (Cy3 or Cy5 over Cy2) approach allows two data points to be obtained from one gel and utilises only one standard sample for two data points. A cost-benefit analysis was completed to consider the number of gels and CyDye aliquots required for the two-dye approach (Cy3/Cy5) versus the three-dye approach (Cy3orCy5/Cy2) (Table 4), where a dye aliquot refers to the quantity of dye added to each labelling reaction. The cost-benefit analysis found that, at low sample numbers, the two-dye system used fewer or the same number of dye aliquots depending on the target fold change. When comparing a higher number of samples, the amount of dye required was the same but the three-dye system required fewer gels. Using fewer gels will lead to less variability, as the study can be completed more rapidly with fewer separate gel batches. Considering variability, the traditional three-dye system, which uses fewer resources where multiple samples are compared, is the most suitable.

Figure 6. The average SD obtained across the various data sets when using different dye combinations to calculate the log10SA.

Figure 7. The calculated power as a function of number of replicates for various CyDye combinations in (A) detecting a two-fold change and (B) detecting a 1.5-fold change.

Table 3. Assessing normality and heterogeneity of variance for the various possible dye combinations across the data sets for the log10SA. Normality was assessed using the Shapiro-Wilk goodness-of-fit test. Heterogeneity of variance was investigated by looking for a correlation between the mean and SD for each spot

Dataset   Dye combination   Percentage spots failed normality   Percentage variance shared
ECA1      Cy3/Cy5           5.8                                 2.3
ECA1      Cy3/Cy2           6.4                                 14.1
ECA1      Cy5/Cy2           9.3                                 8.3
ECA2      Cy3/Cy5           8.0                                 26.3
ECA2      Cy3/Cy2           4.4                                 23.0
ECA2      Cy5/Cy2           6.9                                 45.9
ECA3      Cy3/Cy5           7.7                                 6.9
ECA3      Cy3/Cy2           7.9                                 26.1
ECA3      Cy5/Cy2           6.3                                 1.0
Heart     Cy3/Cy5           6.1                                 17.6
Heart     Cy3/Cy2           8.0                                 12.2
Heart     Cy5/Cy2           5.2                                 3.8
Liver     Cy3/Cy5           7.2                                 12.9
Liver     Cy3/Cy2           6.9                                 4.5
Liver     Cy5/Cy2           6.7                                 1.0
Brain     Cy3/Cy5           7.8                                 26.1
Brain     Cy3/Cy2           9.5                                 1.0
Brain     Cy5/Cy2           10.1                                1.4
Average                     7.2                                 13.8

3.6 Impact of analysis software utilised

The technical variance reported includes not only the technical noise from repeating the process experimentally but also analysis noise. Processing the same images from a data set with alternative DIGE analysis software gave different results for the assessment of variance (data not shown). This will arise from differences in the detection of spots, the assessment of background, the normalisation methods utilised, etc. Hence, the same-same gel images for data set ECA-1 are provided as Supplementary Figures to allow assessment of alternative software and software updates. To allow comparison with this study, Table 5 provides the variance data calculated at various percentile positions for the ECA-1 data set.

4

Discussion

Multiplexing by using an internal standard has simplified the process of multigel experiments, allowing collection of multiple observations for each protein spot. Underlying the subsequent data analysis to select proteins with expression changes by the use of univariate and multivariate approaches are assumptions of normality and homogeneity of variance. Given that less than 5% of spots failed the normality test, the assumption of normality for the log10SA was found to hold. The assumption of homogeneity of variance was also found to be robust. With these assumptions holding the univariate tests will be robust without an inflation of type one errors. Further improvements by the application of transformation will occur but in terms of assessing whether the current

Table 4. A cost-benefit analysis of only utilising the Cy3 and Cy5 CyDyes in calculating the log10SA compared to the more traditional three-dye approach. Impact on number of gels and number of CyDye aliquots for a two-fold and 1.5-fold change to compare two and six samples was considered for both approaches. Aliquot refers to the quantity of dye added to each labelling reaction

System

Fold change

Replicates for power 0.8

Number of samples to compare

Number of gels required

Number of dye aliquots required

2 dye: Log(Cy3/Cy5) 3 dye: Log(Cy3or5/Cy2) 2 dye: Log(Cy3/Cy5) 3 dye: Log(Cy3or5/Cy2) 2 dye: Log(Cy3/Cy5) 3 dye: Log(Cy3or5/Cy2) 2 dye: Log(Cy3/Cy5) 3 dye: Log(Cy3or5/Cy2)

1.5 1.5 1.5 1.5 2 2 2 2

3 6 3 6 3 4 3 4

2 2 6 6 2 2 6 6

6 6 18 12 6 4 18 12

12 18 36 36 12 12 36 36

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

www.proteomics-journal.de

3114

N. A. Karp and K. S. Lilley

Proteomics 2005, 5, 3105–3115

Table 5. SD value experienced at various percentile positions in the ECA-1 data set for different dye combinations that can be utilised in the calculation of the log10SA

Dye combination used in calculation of log10SA Percentile Cy3or5/cy2

Cy3/Cy2

Cy5/Cy2

Cy3/Cy5

1 5 10 15 25 35 50 75 85 90 95 100

0.007 0.012 0.015 0.018 0.024 0.028 0.036 0.057 0.074 0.086 0.112 0.249

0.008 0.012 0.014 0.016 0.020 0.025 0.032 0.048 0.060 0.069 0.094 0.210

0.006 0.011 0.014 0.016 0.020 0.024 0.030 0.043 0.053 0.060 0.069 0.144

0.011 0.016 0.020 0.023 0.027 0.033 0.043 0.067 0.085 0.101 0.127 0.223

approach is appropriate, the data have been found to be robust. James Lyons-Weiler [25] has recently stated ‘the question of optimising analysis is perennial, due to the complex manner in which random and systematic errors creep into genomic and proteomic data streams’. Incorporation of an internal standard sample through the ability to multiplex samples has significantly increased the ability to detect protein expression changes by increasing the reproducibility. Variation is still expected and unavoidable; hence the estimation of variance is fundamental to the design of cost-efficient experiments. If the power of an experiment is low, then there is a good chance that the experiment will be inconclusive because failing to detect changes does not mean that biological significant changes were not occurring just that a statistical significant change could not be detected. Within this study, the technical variance has been investigated to provide a benchmark to the minimum number of replicates needed for a study for a set of fold change. Frequently studies utilise biological replicates; for these systems greater noise will be experienced as each replicate will have technical and biological noise. This study thus provides a guide to the minimum number of replicates needed. The technical variance was found to be reproducible and fairly independent of sample. Many published studies currently have little power, yet changes are being detected, as some of the expression changes will be robust enough to be reliable, but the number of changes will be lower than if sufficient replicates had been utilised. This power study demonstrates the role of replicates in detecting change, and shows the potential for adding a few extra replicates with a slight increase in cost that significantly adds to the sensitivity of the experiment. 
The number of replicates to use depends on the researcher weighing cost against detection ability and knowing the size of the changes of interest. This relies on the researcher having prior understanding of the system: when subtle changes in expression are expected, the impact of low power will be significant.
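The trade-off between replicate number and detectable fold change can be illustrated with a rough calculation. The following Python sketch is not the power analysis used in this study. It assumes a two-sided, two-sample comparison of log10SA values and uses a normal approximation, which will understate the replicates needed when n is small. The function names, the standard deviation of 0.1 log10 units, the significance level and the target power are all illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def replicates_per_group(fold_change, sd_log10, alpha=0.01, power=0.8):
    """Approximate replicates per group needed to detect a fold change.

    fold_change: ratio to detect (e.g. 1.5), converted to a log10 difference.
    sd_log10: assumed standard deviation of log10SA (hypothetical input).
    """
    delta = abs(math.log10(fold_change))
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = STD_NORMAL.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd_log10 / delta) ** 2
    return max(2, math.ceil(n))  # at least two replicates to estimate variance


def approx_power(fold_change, sd_log10, n, alpha=0.01):
    """Approximate power with n replicates per group (opposite tail ignored)."""
    delta = abs(math.log10(fold_change))
    standard_error = sd_log10 * math.sqrt(2 / n)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(delta / standard_error - z_alpha)


if __name__ == "__main__":
    assumed_sd = 0.1  # illustrative value only, not a measurement from this study
    for fold in (1.5, 2.0):
        n = replicates_per_group(fold, assumed_sd)
        print(fold, n, round(approx_power(fold, assumed_sd, n), 3))
```

Substituting a laboratory's own estimate of technical (or technical plus biological) standard deviation for `assumed_sd` gives a first indication of how sharply the required replicate number rises as the fold change of interest shrinks.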

Calculating the log10SA from just the Cy3 and Cy5 dyes leads to less noise, which is proposed to arise from avoiding the weaker Cy2 fluor. A power and cost-benefit analysis comparing the two-dye approach with the traditional three-dye approach found that, in the majority of cases, the traditional three-dye system used fewer resources in terms of gels. Experimental variation has been assessed within this laboratory to provide guidelines for determining a significant change in expression. These guidelines, however, are based on the technical variation occurring within this laboratory. The values can only be taken as typical, as technical variation could differ between laboratories with different equipment and protocols. This paper does, however, provide a framework by which laboratories can assess technical variation. Technical variation was shown to encompass both experimental and analytical noise; hence the calculated noise and power are dependent on the software used. Consequently, some of the same-same data utilised within this study are provided as supplementary data to allow each laboratory to assess its own software.
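The gel-count side of such a cost-benefit comparison can be sketched in a few lines of Python. This is a simplified illustration rather than the analysis performed here: it assumes that a two-dye gel carries one sample plus the internal standard, that a three-dye gel carries two samples plus the standard, and that the replicate numbers passed in are hypothetical placeholders, not values from this study.

```python
import math


def gels_needed(n_groups, replicates_per_group, samples_per_gel):
    """Gels required to run every sample once, given samples per gel."""
    total_samples = n_groups * replicates_per_group
    return math.ceil(total_samples / samples_per_gel)


def compare_designs(n_groups, reps_two_dye, reps_three_dye):
    """Compare gel usage for the two designs under the stated assumptions.

    reps_two_dye / reps_three_dye: replicates per group judged necessary
    for each design (the two-dye value may be lower because of less noise).
    """
    return {
        "two_dye_gels": gels_needed(n_groups, reps_two_dye, samples_per_gel=1),
        "three_dye_gels": gels_needed(n_groups, reps_three_dye, samples_per_gel=2),
    }


if __name__ == "__main__":
    # Hypothetical case: two groups, 4 replicates (two-dye) vs 5 (three-dye)
    print(compare_designs(n_groups=2, reps_two_dye=4, reps_three_dye=5))
```

Under these assumptions the two-dye design only uses fewer gels when its lower noise at least halves the number of replicates required per group.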

This work was supported by a BBSRC Grant (BB/C50694/1), which also funds Dr. N. Karp as a BBSRC research associate. We would like to thank Paul McCormick for useful discussions regarding the analysis approaches used and Joe Byers, Sarah Coulthurst and Helen Bye for provision of samples. CyDye, DeCyder, Ettan, Typhoon, ImageQuant and Cy are trademarks of GE Healthcare.

5 References

[1] Norbeck, J., Blomberg, A., Yeast 1997, 13, 1519–1534.
[2] Yan, J. X., Devenish, A. T., Wait, R., Stone, T. et al., Proteomics 2002, 2, 1682–1698.
[3] Hu, Y., Wang, G., Chen, G. Y., Fu, X., Yao, S. Q., Electrophoresis 2003, 24, 1458–1470.
[4] Kleno, T. G., Leonardsen, L. R., Kjeldal, H. O., Laursen, S. M. et al., Proteomics 2004, 4, 868–880.
[5] Bergh, G. V. D., Clerens, S., Vandesande, F., Arckens, L., Electrophoresis 2003, 24, 1471–1481.
[6] Fievet, J., Dillmann, C., Lagniel, G., Davanture, M. et al., Proteomics 2004, 4, 1939–1949.
[7] Smejkal, G. B., Robinson, M. H., Lazarev, A., Electrophoresis 2004, 25, 2511–2519.
[8] Chevalier, F., Rofidal, V., Vanova, P., Bergoin, A., Rossignol, M., Phytochemistry 2004, 65, 1499–1506.
[9] Ünlü, M., Morgan, M. E., Minden, J. S., Electrophoresis 1997, 18, 2071–2077.
[10] Alban, A., David, S. O., Bjorkesten, L., Andersson, C. et al., Proteomics 2003, 3, 36–44.
[11] Shaw, J., Rowlinson, R., Nickson, J., Stone, T. et al., Proteomics 2003, 3, 1181–1195.
[12] Lee, J. R., Baxter, T. M., Yamaguchi, H., Wang, T. C. et al., Appl. Immunohistochem. Mol. Morphol. 2003, 11, 188–193.


[13] Karp, N., Griffin, J., Lilley, K., Proteomics 2005, 5, 81–90.
[14] Gustafsson, J. S., Ceasar, R., Glasbey, C. A., Blomberg, A., Rudemo, M., Proteomics 2004, 4, 3791–3799.
[15] Molloy, M., Brzezinski, E., Hang, J., McDowell, M., VanBogelen, R., Proteomics 2003, 3, 1912–1919.
[16] Tonge, R., Shaw, J., Middleton, B., Rowlinson, R. et al., Proteomics 2001, 1, 377–396.
[17] Blomberg, A., Blomberg, L., Norbeck, J., Fey, S. J. et al., Electrophoresis 1995, 16, 1935–1945.
[18] Burstin, J., Zivy, M., de Vienne, D., Damerval, C., Electrophoresis 1993, 14, 1067–1073.
[19] Shapiro, S. S., Wilk, M. B., Biometrika 1965, 52, 591–611.
[20] Lenth, R., Am. Stat. 2001, 55, 187–193.
[21] Chen, Y., Dougherty, E., Bittner, M., J. Biomed. Opt. 1997, 2, 364–374.
[22] Huber, W., Heydebreck, A. V., Sültmann, H., Poustka, A., Vingron, M., Bioinformatics 2002, 18, 96S–104S.
[23] Pallant, J., SPSS Survival Manual – A Step by Step Guide to Data Analysis Using SPSS (Version 10 and 11), Open University Press, Buckingham 2003, pp. 1–285.
[24] Bolstad, B. M., Collin, F., Simpson, K. M., Irizarry, R. A., Speed, T. P., Int. Rev. Neurobiol. 2004, 60, 25–58.
[25] Lyons-Weiler, J., Appl. Bioinform. 2003, 2, 193–195.

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
