High-Frequency Genetic Contents Variations in Clinical Candida albicans Isolates


Biol. Pharm. Bull. 34(5) 624–631 (2011)

Regular Article

Vol. 34, No. 5

Feng YANG,a,# Tian-Hua YAN,a,# Elena RUSTCHENKO,b Ping-Hui GAO,c Yan WANG,c Lan YAN,c Ying-Ying CAO,c Qiu-Juan WANG,c Hui JI,*,a Yong-Bing CAO,*,c and Yuan-Ying JIANGc

a Department of Pharmacology, School of Pharmacy, China Pharmaceutical University; Nanjing 210009, China: b Department of Biochemistry and Biophysics, School of Medicine and Dentistry, University of Rochester Medical School; Rochester, N.Y., 14642, U.S.A.: and c Department of Pharmacology, School of Pharmacy, Second Military Medical University; Shanghai 200433, China.

Received October 25, 2010; accepted January 24, 2011; published online February 16, 2011

Genome plasticity is a hallmark of Candida albicans and is believed to be an adaptation strategy, but the extent of this genomic variability has not been well investigated. In this study, the genetic contents of clinical C. albicans isolates were investigated at the whole-genome level with array-based comparative genomic hybridization (array CGH). The isolates showed variations in genetic content as well as aneuploidy. The variable genes were scattered across the chromosomes and also clustered in particular regions, including subtelomeric regions, retrotransposon-insertion sites, and a variable region on chromosome 6.

Key words  Candida albicans; comparative genomic; copy number variation; array-based comparative genomic hybridization

Candida albicans is the most common fungal pathogen. It causes skin and mucosal infections in generally healthy individuals and life-threatening infections in immunocompromised patients, and it leads to death in up to 50% of patients with bloodstream infections.1,2) C. albicans is known for its unstable genome. Instability in the copy number of entire chromosomes, as well as of large portions of chromosomes, has been studied extensively in laboratory and freshly isolated strains using pulsed-field gel electrophoresis (PFGE), as reviewed by Rustchenko,3) Rustchenko and Sherman,4) and Selmecki et al.5) Combined with Southern blot analysis, PFGE allowed limited analysis of gene copy number and could also be extended to the analysis of chromosome deletions and other rearrangements.6–15)

Recently, array technology opened a new dimension in the study of genome instability. Array comparative genomic hybridization (array CGH) allows the detection of genomic variations across a whole genome. When the CGH intensity data are plotted as a function of position on the genetic map, aneuploidy of chromosomes or chromosomal segments is readily identified. The availability of the C. albicans strain SC5314 genome sequence has allowed the construction of microarrays for the analysis of gene copy number. Berman's laboratory has used array CGH extensively to demonstrate aneuploidies in C. albicans derivatives of the sequencing strain SC5314, several laboratory strains, and clinical isolates, as reviewed by Selmecki et al.5) Thewes et al.16) also used array CGH to examine the genomic diversity between the less virulent C. albicans strain ATCC10231 and the reference sequencing strain SC5314, in the hope of uncovering the genetic basis of pathogenicity. Although some variable genes were identified, that study was limited to a single strain.

Despite extensive efforts by various laboratories, a comprehensive study that includes multiple strains and focuses on DNA sequence and gene copy number variability is still lacking, although this approach has been applied to other organisms, including Saccharomyces cerevisiae.17,18) To fill this gap, we examined the genomic contents of eight clinical isolates, as compared to the SC5314 reference strain, using array CGH. The Cluster Along Chromosomes (CLAC) algorithm was employed to identify variable genes.

MATERIALS AND METHODS

Isolates and Culture Conditions
A list of isolates used in this study is provided in Table 1. SC5314 was kindly provided by William A. Fonzi (Department of Microbiology and Immunology, Georgetown University, Washington, D.C.,

∗ To whom correspondence should be addressed. e-mail: [email protected]; [email protected]
# These authors contributed equally to this work.

Table 1. C. albicans Isolates Used in This Study

                                                      No. of unstable genes
Patient  Isolate  Anatomical source  Isolation date  Loss  Gain  Strain-specific  False discovery rate (FDR)
1        U885     Urine              08-11-06        111   31    75               0.10
2        S204     Sputum             13-02-06        291   5     74               0.08
3        S727     Sputum             26-05-06        88    9     50               0.15
4        F32      Feces              04-10-06        160   2     53               0.09
5        S241     Sputum             08-08-06        302   10    62               0.07
6        S904     Sputum             25-07-06        282   11    37               0.08
7        P546     Pharynx            19-04-06        125   0     55               0.12
8        S197     Sputum             12-05-06        92    354   356              0.03
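The Results quote a range of 97 to 446 variable genes per isolate. A minimal sketch of how that range relates to Table 1, assuming that the per-isolate total is simply the Loss count plus the Gain count (the paper does not spell this out), is given below. The dictionary is transcribed from Table 1; the variable names are illustrative only.

```python
# (loss, gain) counts per isolate, transcribed from Table 1.
TABLE1 = {
    "U885": (111, 31),
    "S204": (291, 5),
    "S727": (88, 9),
    "F32": (160, 2),
    "S241": (302, 10),
    "S904": (282, 11),
    "P546": (125, 0),
    "S197": (92, 354),
}

# Assumption: total variable genes per isolate = loss + gain.
totals = {isolate: loss + gain for isolate, (loss, gain) in TABLE1.items()}

fewest = min(totals, key=totals.get)
most = max(totals, key=totals.get)

print(fewest, totals[fewest])  # S727 97
print(most, totals[most])      # S197 446
```

Under this assumption the extremes match the quoted range, with S727 at the low end and the chromosome 1 trisomic isolate S197 at the high end.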

© 2011 Pharmaceutical Society of Japan

May 2011

U.S.A.). Clinical isolates were obtained from Changhai Hospital of Shanghai, China. All samples were maintained as −80 °C stocks in 30% glycerol. All isolates were cultivated in YEPD (1% yeast extract, 2% peptone, 2% glucose) at 30 °C with agitation at 200 rpm.

Genotyping Analysis
Polymerase chain reaction (PCR) for MTL status and the ribosomal RNA (rRNA) gene transcribed spacer region was done as previously described.19)

DNA Isolation
C. albicans isolates stored at −80 °C were streaked on a Sabouraud agar plate. After incubation overnight at 30 °C, several colonies were collected and grown overnight in 5 ml YEPD medium at 30 °C. Cells were harvested, washed with distilled water, and resuspended in 200 µl lysis buffer (2% Triton X-100, 1% sodium dodecyl sulfate (SDS), 100 mM NaCl, 1 mM ethylenediaminetetraacetic acid (EDTA), 10 mM Tris, pH 8.0). DNA was isolated as described by Hoffman and Winston20) and purified using the PCR Clean-up NucleoSpin Extract II Kit (Macherey-Nagel, Germany) according to the manufacturer's instructions.

Microarray Production
For the production of spotted DNA microarrays, 7925 70-mer oligonucleotides targeting the ORFeome of C. albicans were printed in triplicate on amino-silanized glass slides using a SmartArrayer™ microarrayer (CapitalBio Corp.). Prior to hybridization, the slides were rehydrated over 65 °C water for 10 s and UV cross-linked at 250 mJ/cm².

DNA Labeling for Array CGH Analysis
DNA fragments were generated by sonication of a 10 µg buffered DNA sample. For each labeling reaction, 3.5 µg of fragmented DNA and 4 µg of random nonamer were heated to 95 °C for 3 min and snap-cooled on ice. Then 10× Klenow buffer, dNTPs, and Cy5-dCTP or Cy3-dCTP (GE HealthCare) were added at final concentrations of 120 µM each of dATP, dGTP, and dTTP, 60 µM dCTP, and 40 µM Cy-dye. Klenow enzyme (1 µl, Takara, Dalian, China) was added and the reaction was performed at 37 °C for 1 h.
The labeled DNA was purified with a PCR Clean-up NucleoSpin Extract II Kit. All samples had dye-swap replicates to remove any dye bias.

Microarray Hybridization, Scanning and Data Processing
For array CGH, the labeled control and test samples were mixed into 80 µl hybridization solution (3× SSC, 0.2% SDS, 50% formamide). DNA in the hybridization solution was denatured at 95 °C for 3 min prior to loading on the microarray. The arrays were hybridized at 42 °C overnight and washed with two consecutive washing solutions (0.2% SDS, 2× SSC for 5 min at 42 °C, and 0.2% SSC for 5 min at room temperature). Two types of arrays were run under the same experimental conditions: a normal array (reference/reference hybridization) and test arrays (reference/test hybridization). Arrays were scanned with a confocal LuxScan™ scanner (CapitalBio Corp.), and data were extracted from the images with LuxScan 3.0 software (CapitalBio Corp.). A spatial and intensity-dependent normalization based on a LOWESS program was employed.21) The normalized log2 (test/control) ratio of signal intensity was taken as a measure of the abundance of each gene relative to that in the reference isolate SC5314.

We used CGH-Miner for statistical analysis of DNA copy number gains and losses.22) CGH-Miner uses the "Cluster Along Chromosomes (CLAC)" algorithm, which builds a hierarchical cluster-style tree along each chromosome (or chromosome arm), so that neighboring genes with positive and negative ratios are separated into different clusters. Gains and losses are then called based on the height and width of the clusters, and a false discovery rate (FDR) is estimated by comparison to normal–normal hybridization data. A consensus FDR, an estimator for the consensus gain/loss result across all samples, is also calculated. For data smoothing, the parameters were set as for BAC analysis, producing a moving window of three ORFs for averaging the hybridization signal.18) The cluster tree was built on whole chromosomes.

Variable genes identified by array CGH were validated by quantitative real-time PCR (qPCR) using a 7500 Real Time PCR system (Applied Biosystems) and SYBR Green I (Takara Bio, Tokyo, Japan). The DNA copy number of the variable genes was determined relative to the gene TDH3 (orf19.6814), a reference gene that array CGH showed not to vary in any of the clinical isolates. We used the comparative Ct method (2^(−ΔΔCt)) to determine target gene copy number in the test isolates relative to the reference gene and the reference DNA sample of SC5314.23,24)

Raw data have been deposited in NCBI's Gene Expression Omnibus (GEO) and are accessible through GEO series accession number GSE18819. Functional annotation and GO term association followed Candida Genome Database (CGD) annotations.

RESULTS

Characterization of Isolates
A total of eight C. albicans isolates were obtained from eight different patients attending the same hospital in 2006 (Table 1). All the test isolates and the reference isolate SC5314 were maintained on Sabouraud agar plates at 4 °C or as −80 °C stocks in 30% glycerol.
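The comparative Ct calculation named in Materials and Methods can be illustrated with a minimal sketch. It assumes roughly 100% amplification efficiency for both the target and the reference (TDH3) amplicons, which the text does not state. The function name and the Ct values below are hypothetical and are not taken from the study.

```python
def relative_copy_number(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative copy number by the comparative Ct method (2^-ddCt).

    *_test values come from the test isolate, *_ctrl values from the
    control DNA (SC5314 in the paper); "ref" is the reference gene (TDH3).
    Assumes perfect doubling per cycle for both amplicons.
    """
    delta_test = ct_target_test - ct_ref_test    # dCt in the test isolate
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl    # dCt in the control
    delta_delta = delta_test - delta_ctrl        # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold one cycle later in
# the test isolate, while the reference gene is unchanged.
print(relative_copy_number(21.0, 20.0, 20.0, 20.0))  # 0.5
print(relative_copy_number(20.0, 20.0, 20.0, 20.0))  # 1.0
```

A one-cycle delay of the target relative to the reference therefore corresponds to a test/control ratio of 0.5, the value the Results later interpret as loss of one copy.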
Initially, the isolates were streaked for independent colonies on CHROMagar medium (CHROMagar Company, Paris, France) and incubated as recommended by the manufacturer. When the green color of the colonies indicated C. albicans, we additionally performed polymerase chain reaction (PCR) with primers that amplified the ITS1 region of ribosomal DNA (rDNA), which also designated the ATP-binding cassette (ABC) type of each isolate.19) ABC typing revealed that isolates SC5314, U885, S204, and S727 were of genotype A; isolates F32, S241, and S904 were of genotype B; and the remaining two isolates (P546, S197) were of genotype C (data not shown).

Statistical Analysis of Array CGH Data
We estimated the DNA content of each of the eight clinical isolates with the array CGH approach, as compared to the control sequencing strain SC5314. Every microarray contained 19056 probes representing 6111 ORFs of SC5314 (Materials and Methods). We calculated the DNA content of every gene as the test/control ratio and subsequently averaged the six values corresponding to six data points (Materials and Methods). Furthermore, we used CGH-Miner for statistical analysis of DNA copy number gains and losses (Materials and Methods).

In order to determine whether the differences in hybridization efficiency were due to divergence of the DNA sequence, we


compared the sequence of the 70-mer oligonucleotides between test strains and the reference strain for 13 genes that displayed diminished hybridization signals in test strains. Fragments of the 13 genes corresponding to the 70-mer oligonucleotides were PCR-amplified from test strains and sequenced (primer sequences are provided in Table 2). Eight genes (orf19.5472, orf19.5474, orf19.5469, orf19.5475, orf19.109, orf19.48, orf19.107, orf19.6192) had between more than 97% and 100% similarity to the control sequences, while five genes (orf19.101, PHO81 (orf19.7475), orf19.5370, orf19.1831, HAL22 (orf19.105)) had between 70% and 92% similarity (data not shown). Thus, in five of the 13 cases, sequence divergence might account for the diminished signal intensities in the test strains.

We used quantitative real-time PCR (qPCR) to validate the loss of DNA in the eight genes with more than 97% similarity to the control sequences. Since array CGH revealed that the variable genes were scattered across the chromosomes as well as clustered in particular regions, several genes from these locations were also included in the qPCR analysis: the retrotransposon Tca4 open reading frames (ORFs) (RHD2 (orf19.2668), orf19.2669), the retrotransposon Tca8 ORFs (POL93 (orf19.6078), orf19.6079), two genes on chromosome 6 (HAL22 (orf19.105), orf19.111), one gene near the right sub-telomeric region of chromosome 3 (orf19.6191), and two genes located on the arm of chromosome 2 (orf19.4069, orf19.4070). qPCR revealed that, except for one gene, orf19.48, all the genes analyzed possessed copy number variations, as indicated by array CGH. This finding suggests that, in addition to sequence divergence, copy number variations might also account for the diminished hybridization signals in the test strains. Primer sequences and results are provided in Tables 3 and 4, respectively.

Determining Variable Genes
For the self–self control experiment, no genes with a log2 fluorescence ratio greater than 1 or less than −1 were found. Using the CLAC method, we found a total of 1116 variable genes with significant changes in signal intensity among the test isolates. Of the 1116

Table 2. Primers Used for PCR Amplification and Sequencing

Name         Sequence (5′–3′)
orf19.5370   Fwd: AGCCTCTGAACACCTTATC      Rev: GTAGTTGCCCTTCTCTCTG
orf19.5469   Fwd: GGGATTTCTGTCGCATGAAC     Rev: TGTCTAAAACACCGCACCTC
orf19.5472   Fwd: CAACTGAAGCGGGTAGAAC      Rev: ATCAAGGTGACGACGGACT
orf19.5474   Fwd: ATGAGGTGCGGTGTTTTAG      Rev: CTCGTTCCTCCAGTTGCTT
orf19.5475   Fwd: CAACATACCCCCGCATCCT      Rev: GTTCAAGAGCCAGCCCACG
orf19.1831   Fwd: TTCTAACATCAGGCGGTCCCAT   Rev: ACCAGACCCCTTATTGCTCGGC
orf19.7475   Fwd: CGAGAAACCCTCCCTACTG      Rev: CTTTGCGTAAGATTGCGTC
orf19.101    Fwd: AGCAGAGGAGTGAACGAA       Rev: AGCAGAGGAGTGAACGAA
orf19.105    Fwd: TTCTACTCACCCATACCAA      Rev: CACTTTCCCATCTTCAATC
orf19.107    Fwd: GTGAAGCCAGAGATGAAAT      Rev: AAGCGATACATACCGTGAG
orf19.109    Fwd: ATCCTACGGCATCATCACTAC    Rev: CAATCTTCTCATTTCACCCTT
orf19.48     Fwd: CACCATCTCAACCACATA       Rev: GACCATTCACCACACTTT
orf19.6192   Fwd: GTCATCAATTATCCACGGGTT    Rev: AGCAAGAAAGTTGGTAAGAAG

Table 3. Primers Used for qPCR Validation of Array CGH Data

Name         Sequence (5′–3′)
orf19.6078   Fwd: TGCTTATGAACTTGATTTGCC    Rev: TTCACTTTCTTTACCTGGACG
orf19.6079   Fwd: GCCGAAGCAAGGAACATTA      Rev: CACTCCGAGCGAACATACC
orf19.2669   Fwd: GACTTAGGGTCTGGAACAA      Rev: CCGTTAAGCATAGGAGAGT
orf19.2668   Fwd: TACCTGGATCATGTGTTTTA     Rev: ATTCAAGTGTTTACCTGTGT
orf19.5472   Fwd: GTCTCGCCACATCATAC        Rev: GGTGACGACGGACTACAT
orf19.5474   Fwd: TTCAAGAGCCAGCCCACG       Rev: ATCCACCTCACCATCATCACAT
orf19.5469   Fwd: TCAGCGACTCTGAGGACG       Rev: GATGACAACATTGCCACTT
orf19.5475   Fwd: CCGACAATACTCCGAAT        Rev: GTGATGATGGTGAGGTGG
orf19.109    Fwd: TTTTATTCCCTACTCCA        Rev: ATCTTCTCATTTCACCC
orf19.48     Fwd: AGATACCGTGGAAGACAGA      Rev: TGGATAATGGTGGACAGAG
orf19.107    Fwd: CGCATGAAAGAACTAT         Rev: CAAAGAACATCACCCT
orf19.6192   Fwd: CCCCGAGCAGTTTGAC         Rev: AAGCGAACAAGGATAGGT
orf19.6191   Fwd: ATCCATAACCCAACTGCT       Rev: ACTTCTTCGCTTCCTCTG
orf19.111    Fwd: TAACATCCCTCAAAGACAA      Rev: CAATGGCAATCATAGAAACA
orf19.4069   Fwd: TTGGTAACGCTAATGCT        Rev: CGAAAGTGGGACTGTATC
orf19.4070   Fwd: CATTACTAAACTTGCTGCTC     Rev: AATGGCTCCTTGTCAATC
orf19.105    Fwd: GCAGTAAAGCGTGCCTCAT      Rev: TTCTTCACCCACAATCTCG
orf19.101    Fwd: TTGTTTACTCCGAACTTATC     Rev: AAGTAGGTTGCTGGACAT
orf19.7475   Fwd: CCTGCTTCCATTGTTTGAC      Rev: GACACTGATCCTGGCGATA
orf19.5370   Fwd: ACTATGACTGGTCGTTGC       Rev: AGATAAGGTGTTCAGAGGC
orf19.1831   Fwd: TTGTTATCTATTTAGTGTCGTT   Rev: GGTGAATTTATTATTAGTCGT
MTLa1 a)     Fwd: AGAACAAACAGCCTAATCG      Rev: ATCATCAATCCCACCAAGA
MTLα2 a)     Fwd: GTGTTAGAAGGGTGGTT        Rev: TAGGGTTACAAAGAATG
PAP1 a)      Fwd: CTGATTTGTTAGAGCGAC       Rev: CACCATCCCACTGTATTT
PAPα a)      Fwd: GAACACGAAGACATACGGAG     Rev: GCCATTGAATCGGACAT
OBPa a)      Fwd: TGAAATGGATAACGAGGGA      Rev: CACGCAAGAACTGAAACAA
OBPα a)      Fwd: TGGCATATTTCTCCTA         Rev: GTAAACCTCGTTGTCC
PIKa a)      Fwd: TAATAACGAGTGCGAAT        Rev: GTGAGTCAACCAGTCCG
PIKα a)      Fwd: GGCTGCCAAACTCTACT        Rev: CACTATCAACACCACCA
TDH3 b)      Fwd: TAACATTATCCCATCTTCCA     Rev: AGCATCTTCAGTGTAGCCCA

a) Primer sequences of the MTL locus genes are also included in this table. b) TDH3 is the reference gene used as internal control for qPCR.


Table 4. qPCR Validation of Array CGH Data

                      Ratio (test/control)
Gene         Method   P546   U885   F32    S241   S904   S204   S197   S727
orf19.5472   aCGH     0.42   0.40   0.93   1.22   1.78   0.16   0.00   0.05
             qPCR     0.31   0.35   0.85   0.77   2.26   0.22   0.37   0.43
orf19.5474   aCGH     0.59   0.23   0.96   1.17   1.40   0.11   0.00   0.06
             qPCR     0.37   0.11   0.72   1.36   0.84   0.18   0.44   0.15
orf19.5469   aCGH     0.67   0.26   0.92   0.96   1.10   0.09   0.00   0.06
             qPCR     0.22   0.17   1.29   0.73   0.98   0.36   0.12   0.15
orf19.5475   aCGH     0.54   0.40   0.76   1.39   2.13   0.12   0.01   0.07
             qPCR     0.23   0.03   0.81   1.83   2.06   0.10   0.01   0.22
orf19.109    aCGH     1.10   1.09   1.05   1.02   1.01   1.08   0.70   0.54
             qPCR     0.84   0.96   1.27   0.62   1.14   0.92   0.89   0.66
orf19.48     aCGH     0.12   0.10   0.18   0.67   0.14   0.30   0.16   0.09
             qPCR     1.06   1.02   1.41   1.18   2.02   0.84   1.19   0.31
orf19.107    aCGH     1.14   1.02   1.53   1.30   0.75   1.34   0.59   0.45
             qPCR     0.93   0.85   1.39   1.11   0.87   0.88   0.31   0.51
orf19.6192   aCGH     2.12   3.00   3.32   1.12   0.02   3.06   0.01   0.02
             qPCR     2.00   4.28   4.47   0.85   0.01   4.48   0.19   0.01
orf19.6191   aCGH     0.56   0.50   0.79   0.63   0.44   0.45   0.55   0.30
             qPCR     0.68   0.09   0.21   0.38   0.86   0.29   0.23   0.18
orf19.111    aCGH     0.76   0.72   0.67   0.96   0.88   0.69   0.75   0.56
             qPCR     0.33   0.15   0.01   1.19   1.18   1.29   1.27   0.73
orf19.2668   aCGH     0.01   0.02   0.01   0.98   5.18   1.00   1.01   1.03
             qPCR     0.00   0.00   0.00   1.10   2.54   1.22   1.12   1.17
orf19.2669   aCGH     0.10   0.06   0.04   0.96   4.61   1.16   1.10   1.03
             qPCR     0.00   0.00   0.00   1.18   2.36   1.06   1.36   1.22
orf19.6079   aCGH     0.03   0.11   0.02   0.85   0.02   0.04   0.61   0.83
             qPCR     0.00   0.00   0.00   1.66   0.00   0.00   1.30   0.77
orf19.6078   aCGH     0.09   0.16   0.12   0.79   0.09   0.10   0.55   0.80
             qPCR     0.00   0.01   0.04   1.80   0.00   0.00   1.36   0.85
orf19.4069   aCGH     0.01   0.02   0.02   0.62   0.02   1.43   0.01   0.81
             qPCR     0.01   0.01   0.00   0.98   0.00   1.76   0.00   1.09
orf19.4070   aCGH     0.10   0.23   0.09   0.86   0.12   1.51   0.08   1.69
             qPCR     0.11   0.08   0.08   1.13   0.01   2.24   0.04   2.20
orf19.105    aCGH     0.44   0.67   0.21   0.41   0.55   0.24   0.57   1.17
             qPCR     0.52   0.24   0.35   0.02   0.28   0.54   0.15   0.94
orf19.101    aCGH     0.65   0.48   0.90   1.10   0.79   1.61   0.76   0.51
             qPCR     0.38   0.00   0.75   0.69   0.40   2.30   0.44   0.38
orf19.7475   aCGH     0.52   0.52   0.18   0.22   0.65   0.30   0.19   1.03
             qPCR     0.22   0.00   0.00   0.30   0.38   0.18   0.33   0.82
orf19.5370   aCGH     0.28   0.48   0.54   0.51   0.76   0.21   0.31   0.16
             qPCR     0.34   0.05   0.53   0.30   0.30   0.75   0.41   0.36
orf19.1831   aCGH     0.24   0.17   0.29   0.51   0.40   0.87   0.31   0.30
             qPCR     0.36   0.01   0.26   0.16   0.71   0.01   0.01   0.00

genes, 702 were associated with diminished signal intensities and were thus considered absent/divergent, while 383 were associated with increased signal intensities and were thus considered amplified. A small number of genes, 31, were either absent/divergent or amplified in different isolates. The number of variable genes differed between isolates, ranging from 97 to 446 (Table 1). Of the 1116 variable genes, 354 were shared between at least two isolates, including 196 putative genes of unknown function and 158 genes annotated with a molecular function. Variable genes were found scattered on each of the eight chromosomes. However, clusters of variable genes could be clearly identified in particular regions, as presented in Fig. 1 (also see below).

Copy Number Variations of Genes Near Sub-telomeric Regions
Clusters of variable genes were found at sub-telomeric regions of chromosomes 1, 2, and 3 (Fig. 1). Near the right sub-telomeric region of chromosome 1, seven consecutive genes (orf19.7276.1, orf19.7278, orf19.7271, orf19.7272, orf19.7274, orf19.7275, orf19.7277) spanning approximately 7.2 kb displayed copy number variations in multiple strains. In isolate P546, all seven genes were absent. In isolate S197, three of the seven genes (orf19.7277, orf19.7276.1, orf19.7278) were absent. In isolate S904, five of the seven genes (orf19.7274, orf19.7275, orf19.7277, orf19.7276.1, orf19.7278) were amplified, and in isolate S204, six of the seven genes (orf19.7271, orf19.7272, orf19.7274, orf19.7275, orf19.7277, orf19.7276.1) were amplified. Consensus FDRs (see Materials and Methods) were 0.192 for orf19.7271 and orf19.7272; 0.017 for orf19.7274, orf19.7275, and orf19.7278; and 0.002 for orf19.7277 and orf19.7276. Near the right sub-telomeric region of chromosome 2, three consecutive genes (orf19.5370, orf19.5369, orf19.5368) spanning approximately 4.6 kb were absent in three isolates (P546, S204, S727), with a consensus FDR of 0.017. Genes near both the left and the right sub-telomeric


Fig. 1. Consensus Plot for Each Chromosome from Eight Test Arrays, as Determined with the CLAC Method
Genes are ordered according to their nucleotide position on the 8 chromosomes. Both the height and the color of the vertical bar for each gene represent the percentage of arrays in which the corresponding gene has a DNA copy number variation. Gain and loss regions are plotted in red and green, respectively. The relationship between the percentage and the color and height is illustrated in the legend. The distance between two background gray horizontal lines is 20%. Black arrows indicate the positions of retrotransposons.
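The per-gene percentage plotted in Fig. 1 can be illustrated with a small sketch. This is not CGH-Miner's implementation; it only shows the arithmetic of a consensus percentage, assuming each array contributes one call per gene (+1 gain, −1 loss, 0 no change). The example calls follow the Results text for orf19.5472 (absent in P546, U885, S204, S197, and S727, amplified in S904, with F32 and S241 taken as unchanged); the function name is hypothetical.

```python
def consensus_percentages(calls):
    """Return (percent of arrays with a gain, percent with a loss) for one gene.

    calls: one entry per test array, +1 for gain, -1 for loss, 0 for no change.
    """
    n = len(calls)
    gain = 100.0 * sum(1 for c in calls if c > 0) / n
    loss = 100.0 * sum(1 for c in calls if c < 0) / n
    return gain, loss


# orf19.5472 as described in the text: five losses, one gain, two unchanged.
orf19_5472_calls = [-1, -1, -1, -1, -1, +1, 0, 0]
print(consensus_percentages(orf19_5472_calls))  # (12.5, 62.5)
```

With eight test arrays, each array contributes 12.5 percentage points, so the 20% grid lines in the figure fall between one and two arrays.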

regions of chromosome 3 showed copy number variations in seven of the eight test isolates. Near the left sub-telomeric region of chromosome 3, five consecutive genes spanning approximately 10 kb (orf19.5475, orf19.5474, orf19.5472, orf19.5469, orf19.5467) were absent in five test isolates (P546, U885, S204, S197, S727), while three of the five genes (orf19.5475, orf19.5474, orf19.5472) were amplified in isolate S904. In addition, one gene (orf19.5466) downstream of the five consecutive genes was absent in isolate S204, and two genes (orf19.5466, orf19.5465) downstream of them were absent in isolate U885. The consensus FDR of this region was 0.01. Near the right sub-telomeric region of chromosome 3, four consecutive genes (orf19.6192, orf19.6191, orf19.6190, orf19.6189) spanning approximately 12 kb were absent in three test isolates (S904, S197, S727). The consensus FDR for these genes was 0.017.

Variability of Retrotransposon-Encoded ORFs
In this study, probes representing ORFs of eight retrotransposons were spotted on the microarrays, providing an opportunity to investigate the copy number of the corresponding ORFs. As presented in Table 5 and Fig. 1, there were four patterns of copy number change in the test isolates: copies of the non-long terminal repeat (LTR) retrotransposons Zorro3 and Zorro2 were equivalent to or more than in SC5314; copies of the LTR-retrotransposons Tca2, Tca3, and Tca8 were equivalent to or fewer than in SC5314; copies of the LTR-retrotransposon Tca17 were equivalent to SC5314 in all test isolates; and copies of the LTR-retrotransposons Tca4 and Tca11 were equivalent to, more than, or fewer than in SC5314 in different isolates. Taken together, of the 354 shared variable genes, eleven (orf19.7274, orf19.7275, orf19.559, orf19.2371, orf19.2372, orf19.2219, orf19.6078, orf19.6079, orf19.2668, orf19.2669, orf19.6469) were retrotransposon ORFs. Within the retrotransposon sequences, in addition to the ORFs encoding gag or pol proteins, there are other predicted ORFs whose sequences are not similar to the gag or pol regions. These genes can be classified as predicted ORFs located in the

Fig. 2. Scatter Plot for Chromosome 6 from Eight Clinical Isolates
The X axis indicates the position of the genes on chromosome 6. The Y axis is the average ratio (test/control) of each gene, as revealed by array CGH. The name of each clinical isolate is indicated on the right. The black frame encompasses the variable region.

retrotransposon. Array CGH revealed that five variable genes (orf19.7272, orf19.7277, orf19.562, orf19.6078.1, orf19.6465) belonged to this category.

A Variable Region on Chromosome 6
In addition to sub-telomeric regions and retrotransposon-insertion sites, we found a region on chromosome 6 where genes displayed high-frequency loss or gain of copy number. This region spans approximately 11.782 kb, with chromosomal coordinates 195973 to 207755 (Fig. 2), and contains eight ORFs: RIM9 (orf19.101), orf19.102, KAR5 (orf19.103), orf19.104, HAL22 (orf19.105), orf19.107, orf19.109, and CAN2


Table 5. Comparison of Retrotransposon Copy Numbers between Control and Test Isolates as Revealed by Array CGH

                                   Ratio (test/control)
Retrotransposon    Orf          P546   U885   F32    S241   S904   S204   S197   S727
Zorro2 (0.017 a))  orf19.7274   0.94   1.41   1.25   1.03   1.42   1.68   0.88   1.47
                   orf19.7275   1.12   1.23   1.54   1.02   1.44   1.65   0.80   1.66
Zorro3 (0.19)      orf19.559    1.04   1.06   1.32   1.09   0.92   1.11   1.05   3.30
Tca2 (1.2E-06)     orf19.2371   0.01   0.46   0.97   0.36   1.00   0.52   0.01   0.48
                   orf19.2372   0.03   0.46   1.34   0.36   1.80   0.46   0.02   0.42
Tca3 (5.1E-05)     orf19.2219   0.42   0.68   0.30   0.41   0.39   0.80   0.25   0.86
Tca8 (5.12E-05)    orf19.6078   0.42   0.39   0.58   0.97   0.30   0.33   0.75   0.81
                   orf19.6079   0.29   0.34   0.29   0.91   0.28   0.32   0.81   0.81
Tca17 (1)          orf19.6807   1.01   0.98   0.94   0.96   0.98   0.95   0.97   0.97
Tca4 (0.0016)      orf19.2668   0.10   0.10   0.07   0.97   1.89   1.03   1.00   0.93
                   orf19.2669   0.10   0.10   0.06   0.96   1.90   1.02   1.08   1.03
Tca11 (1.2E-06)    orf19.6469   0.26   0.53   0.22   1.59   1.08   0.49   0.12   1.20

a) Consensus FDR across all the test samples.

Fig. 3. Segmental Loss on Chromosome R in Isolate S241
Panel A: Each row corresponds to a specific spot (gene) on the microarray. The genes are arranged according to their chromosomal coordinates on chromosome R. Each column corresponds to a test isolate. The status of each gene is indicated as follows: black, present; red, amplified; green, absent. The hybridization ratios are on a logarithmic scale. Panel B presents the systematic names and array CGH profiles of these genes.
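The reading of test/control ratios used for the segmental loss in Fig. 3 can be sketched as simple arithmetic. The sketch assumes that the reference strain carries two copies of each locus (C. albicans is diploid) and that the ratio scales linearly with copy number; the function names are hypothetical. The coordinates are the chromosome R boundaries reported for S241 in the Results.

```python
def estimated_copies(ratio, reference_copies=2):
    """Copies in the test isolate implied by a test/control ratio,
    assuming the control carries `reference_copies` copies."""
    return ratio * reference_copies


def segment_length_bp(start, end):
    """Length of a chromosomal segment from its boundary coordinates."""
    return end - start


print(estimated_copies(0.5))                # 1.0, i.e. one copy lost
print(estimated_copies(1.0))                # 2.0, i.e. unchanged
print(estimated_copies(1.5))                # 3.0, i.e. one extra copy (trisomy)
print(segment_length_bp(479692, 535105))    # 55413, i.e. about 55 kb
```

Under these assumptions, a ratio near 0.5 corresponds to loss of one of two copies, and a whole-chromosome ratio near 1.5 would correspond to a trisomy such as the one reported for chromosome 1 in S197.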

(orf19.111). Four of these genes (RIM9, orf19.102, KAR5, and orf19.104) encode proteins of unknown function, and the remaining four are annotated with molecular functions: HAL22 is a putative phosphoadenosine-5′-phosphate (PAP) or 3′-phosphoadenosine 5′-phosphosulfate (PAPS) phosphatase; orf19.107 is an RNA helicase; orf19.109 is a tyrosine-tRNA ligase; and CAN2 is an arginine transmembrane transporter. The consensus FDR was less than 0.001 for this region. qPCR confirmed the copy number variations of the above genes (Table 4).

Functional Analysis of Genes with Copy Number Variations
We analyzed the molecular functions of the 158 annotated genes that were shared between at least two isolates with the Gene Ontology (GO) Term Finder and with GO Slim Mapper. Both tools are provided by the Candida Genome Database (CGD). GO Term Finder searches for GO terms significantly shared by the query genes. The p-value cutoff was set to 0.1. GO Slim Mapper maps annotations of a group of genes to more general terms and/or bins them into broad categories. We used the chi-square test for significance analysis of the GO Slim Mapper result. For GO Term Finder, the query set was the 158 variable genes with molecular function annotations, and the background set was all C. albicans genes annotated with molecular functions, excluding genes of unknown function. We found no enrichment in any functional category with either tool.

Aneuploidy of an Entire Chromosome or a Large Portion of a Chromosome
Array CGH unambiguously revealed chromosomal aneuploidy in two isolates: duplication of an entire chromosome 1 in S197 and loss of an approximately 55 kb portion of chromosome R in S241, as presented in Fig. 3. The segmental aneuploidy in S241 extended from 479692 to 535105 bp, encompassing a total of twenty genes. One gene, orf19.3746, was excluded from the array CGH analysis because genomic DNA from both SC5314 and S241 hybridized poorly with its probes on the microarray. Ratios (S241/SC5314) of three genes, orf19.3749, orf19.3735, and orf19.3734, were approximately 1.0, implying no change in DNA amount. A simple explanation of this result is translocation of these genes to other chromosome(s). Ratios of the remaining sixteen genes were approximately 0.5, which indicated the loss of one copy.

In addition, we found that the MTLα locus on chromosome 5, which spans approximately 9 kb, is homozygous in S727. Although our microarrays lacked probes representing the OBPα, PIKα, PAPα, and MTLα2 genes from the MTLα locus, we used PCR to amplify genes from both MTL loci from the same batch of genomic DNA that we used for array CGH. We found the MTLα locus but not the MTLa locus (Fig. 4A, Table 6). Furthermore, we used qPCR to confirm that the MTLα locus is present in two copies (Fig. 4B, Table 3). Future research is needed to establish whether the entire chromosome 5 or, alternatively, only one copy of MTL was lost and subsequently duplicated.

DISCUSSION

The extent of genomic variation within a species is


Fig. 4. Homozygosity at the MTL Locus in Isolate S727
Panel A is the result of PCR amplification of the MTLa locus genes (MTLa1, PIKa, PAP1, OBPa) and the MTLα locus genes (MTLα2, PIKα, PAPα, OBPα) with genomic DNA of isolates SC5314 and S727 as templates. Panel B is the result of qPCR analysis of the copies of MTL locus genes in S727 as compared to SC5314. The array CGH profile of the MTLa locus genes is also presented.

Table 6. Primers Used for PCR Amplification of MTL Locus Genes

Name     Sequence (5′–3′)
MTLa1    Fwd: AGAATGAAGACAACGAGGA      Rev: CTTACTGTGGGAAAAATGA
MTLα2    Fwd: ATGAATTCACATCTGGAGGC     Rev: CTGTTAATAGCAAAGCAGCC
PIKa     Fwd: CATCTGAGGTCATCAAGTAGG    Rev: GTGAGTCAACCAGTCCGTAAA
PIKα     Fwd: GTTACCCCTTCTATTACGG      Rev: TGACCATCTCCATCTACCA
OBPa     Fwd: AATTGCTGGTCGCTGATCG      Rev: ATTATTCCCAATGTGTGCCAAC
OBPα     Fwd: AATTTATCCAGCGAACATGCAC   Rev: CTTCTGTCCTGGAACAATCGG
PAP1     Fwd: AATCAAGCATACGGTGTTACAC   Rev: CCTCATGTCGCCAACCACAG
PAPα     Fwd: CAAGAGTGACCGATGAGATA     Rev: CGCCTTCAGTAAAAGATGTA

believed to contribute to the survival of the species in its natural environment.25) A previous study of the C. albicans isolate ATCC10231 identified 42 variable genes, including 5 amplified and 37 absent/divergent genes, as compared with SC5314.16) Interestingly, we found that 20 of the 42 variable genes in ATCC10231 were also amplified or absent/divergent in our isolates. However, in contrast to ATCC10231, the eight clinical isolates in this study showed higher genomic variability. Of the 6111 chromosomal genes represented on our microarrays, 1116 genes (18.3%) differed from the reference strain SC5314. Absent/divergent genes, ranging from 88 to 302 per isolate, prevailed in seven isolates, while amplified genes prevailed in one isolate, S197, which was trisomic for chromosome 1. In this regard, full or segmental aneuploidy of different chromosomes, which we found in three isolates, also seems to be a frequent event. Aneuploidy of entire chromosomes or large portions of chromosomes has been investigated extensively in C. albicans.3,14,15,26) Aneuploidy was found in clinical isolates,27–29) as well as in in vitro experiments that established the formation of a specific chromosome alteration in response to a specific stressful environment.3,30–32)

Our analysis with GO Term Finder showed no significant enrichment of any functional category among the variable genes from different isolates. We interpret this as an indication that variability is related not to the function of the variable genes but rather to structural features of the DNA. Indeed, a substantial portion of the variations among genes occurred in the same chromosomal regions in different isolates. These included sub-telomeric regions, retrotransposon-insertion sites, and other regions.

Genome variability at sub-telomeric regions has been reported in organisms as different as humans and S. cerevisiae, and is thought to result from ectopic recombination.33,34) C. albicans sub-telomeric regions contain repetitive sequences, for example CARE-2 or Rel-2.35) These sequences might contribute to the sub-telomeric instability.

Retrotransposon-insertion sites are known as regions that generate high genome diversity between yeast isolates and species.36,37) In the strain ATCC10231, 11 of the 42 variable genes were shown to be retrotransposon ORFs.16) In this study, we found high variability in retrotransposon composition, with copy number variations of retrotransposon ORFs as well as of predicted ORFs located within retrotransposons. The high retrotransposon variability observed in this study raises the question of whether the reduced number of retrotransposons could result from selective pressures that affect particular regions of the genome during adaptation to particular environments, and whether the increased number of retrotransposons results from stress-activated transposition. Among the seven retrotransposons identified with copy number variations, Zorro2, Zorro3, and Tca2 are active.38,39)

In addition to sub-telomeric regions and retrotransposon-insertion sites, some variable genes were clustered in a region on chromosome 6 starting from 19.5973 to 20.6049 kb. A total of 6 of the 8 genes in this region varied in at least 3 of the 8 test isolates. This region was previously reported to be highly polymorphic in the C. albicans genome.40)

In summary, this work provides the first comprehensive evaluation of intraspecies genomic variation in clinical isolates of C. albicans. We found genomic variability at different chromosomal locations, both scattered across the chromosomes and clustered at high frequency in certain portions of chromosomes, including the sub-telomeric regions, retrotransposon-insertion sites, and a variable region on chromosome 6. Genes of diverse molecular functions were involved, indicating that, at least in some cases, the cause of the variability could be the structural features of the chromosomal regions. Further research is needed to elucidate the molecular mechanisms of the variability.

Acknowledgements
We thank Professor William A. Fonzi for the gift of SC5314. We are grateful to Dr. Liang Zhang, Dr. Yi-Min Sun, and Dr. Jian-Qing Zhao of Beijing CapitalBio Corporation for their kind help with the microarray experiments.
This work was supported by the National High Technology Research and Development Program of China (2008AA02Z128), the Major State Basic Research Development Program (2005CB523105), and the Shanghai Major Basic Research Development Program (08JC1405900).

REFERENCES

1) Eggimann P., Garbino J., Pittet D., Lancet Infect. Dis., 3, 685–702 (2003).
2) Gudlaugsson O., Gillespie S., Lee K., Vande Berg J., Hu J., Messer S., Herwaldt L., Pfaller M., Diekema D., Clin. Infect. Dis., 37, 1172–1177 (2003).
3) Rustchenko E., FEMS Yeast Res., 7, 2–11 (2007).
4) Rustchenko E., Sherman F., “Fungi Pathogenic for Humans and Animals,” ed. by Howard D. H., Vol. 16, Marcel Dekker, New York, 2002.
5) Selmecki A., Forche A., Berman J., Eukaryot. Cell., 9, 991–1008 (2010).
6) Iwaguchi S., Homma M., Tanaka K., J. Gen. Microbiol., 136, 2433–2442 (1990).
7) Rustchenko-Bulgac E. P., J. Bacteriol., 173, 6586–6596 (1991).
8) Thrash-Bingham C., Gorman J. A., Curr. Genet., 22, 93–100 (1992).
9) Rustchenko E. P., Curran T. M., Sherman F., J. Bacteriol., 175, 7189–7199 (1993).
10) Rustchenko E. P., Howard D. H., Sherman F., J. Bacteriol., 176, 3231–3241 (1994).
11) Perepnikhatka V., Fischer F. J., Niimi M., Baker R. A., Cannon R. D., Wang Y. K., Sherman F., Rustchenko E., J. Bacteriol., 181, 4041–4049 (1999).
12) Chibana H., Beckerman J. L., Magee P. T., Genome Res., 10, 1865–1877 (2000).
13) Magee B. B., Sanchez M. D., Saunders D., Harris D., Berriman M., Magee P. T., Fungal Genet. Biol., 45, 338–350 (2008).
14) Andaluz E., Gomez-Raja J., Hermosa B., Ciudad T., Rustchenko E., Calderone R., Larriba G., Fungal Genet. Biol., 44, 789–798 (2007).
15) Ahmad A., Kabir M. A., Kravets A., Andaluz E., Larriba G., Rustchenko E., Yeast, 25, 433–448 (2008).
16) Thewes S., Moran G. P., Magee B. B., Schaller M., Sullivan D. J., Hube B., BMC Microbiol., 8, 187 (2008).
17) Daran-Lapujade P., Daran J. M., Kotter P., Petit T., Piper M. D., Pronk J. T., FEMS Yeast Res., 4, 259–269 (2003).
18) Carreto L., Eiriz M. F., Gomes A. C., Pereira P. M., Schuller D., Santos M. A., BMC Genomics, 9, 524 (2008).
19) McCullough M. J., Clemons K. V., Stevens D. A., J. Clin. Microbiol., 37, 417–421 (1999).
20) Hoffman C. S., Winston F., Gene, 57, 267–272 (1987).
21) Yang Y. H., Dudoit S., Luu P., Lin D. M., Peng V., Ngai J., Speed T. P., Nucleic Acids Res., 30, e15 (2002).
22) Wang P., Kim Y., Pollack J., Narasimhan B., Tibshirani R., Biostatistics, 6, 45–58 (2005).
23) Brennan C., Zhang Y., Leo C., Feng B., Cauwels C., Aguirre A. J., Kim M., Protopopov A., Chin L., Cancer Res., 64, 4744–4748 (2004).
24) Nicolet C., Guerin E., Neuville A., Kerckaert J. P., Wicker N., Bergmann E., Brigand C., Kedinger M., Gaub M. P., Guenot D., Cancer Lett., 282, 195–204 (2009).
25) Chaillou S., Daty M., Baraige F., Dudez A. M., Anglade P., Jones R., Alpert C. A., Champomier-Verges M. C., Zagorec M., Appl. Environ. Microbiol., 75, 970–980 (2009).
26) Selmecki A., Bergmann S., Berman J., Mol. Microbiol., 55, 1553–1565 (2005).
27) Legrand M., Forche A., Selmecki A., Chan C., Kirkpatrick D. T., Berman J., PLoS Genet., 4, e1 (2008).
28) Wu W., Pujol C., Lockhart S. R., Soll D. R., Genetics, 169, 1311–1327 (2005).
29) Diogo D., Bouchier C., d’Enfert C., Bougnoux M. E., Fungal Genet. Biol., 46, 159–168 (2009).
30) Rustad T. R., Stevens D. A., Pfaller M. A., White T. C., Microbiology, 148, 1061–1072 (2002).
31) Selmecki A., Forche A., Berman J., Science, 313, 367–370 (2006).
32) Cowen L. E., Anderson J. B., Kohn L. M., Annu. Rev. Microbiol., 56, 139–165 (2002).
33) Liti G., Louis E. J., Annu. Rev. Microbiol., 59, 135–153 (2005).
34) Winzeler E. A., Castillo-Davis C. I., Oshiro G., Liang D., Richards D. R., Zhou Y., Hartl D. L., Genetics, 163, 79–89 (2003).
35) Chibana H., Magee B. B., Grindle S., Ran Y., Scherer S., Magee P. T., Genetics, 149, 1739–1752 (1998).
36) Fischer G., James S. A., Roberts I. N., Oliver S. G., Louis E. J., Nature (London), 405, 451–454 (2000).
37) Garfinkel D. J., Cytogenet. Genome Res., 110, 63–69 (2005).
38) Goodwin T. J., Ormandy J. E., Poulter R. T., Curr. Genet., 39, 83–91 (2001).
39) Holton N. J., Goodwin T. J., Butler M. I., Poulter R. T., Nucleic Acids Res., 29, 4014–4024 (2001).
40) Jones T., Federspiel N. A., Chibana H., Dungan J., Kalman S., Magee B. B., Newport G., Thorstenson Y. R., Agabian N., Magee P. T., Davis R. W., Scherer S., Proc. Natl. Acad. Sci. U.S.A., 101, 7329–7334 (2004).
