Complete Gene Map of the Plastid-like DNA of the Malaria Parasite Plasmodium falciparum


J. Mol. Biol. (1996) 261, 155–172

R. J. M. (Iain) Wilson*, Paul W. Denny, Peter R. Preiser, Kaveri Rangachari, Kate Roberts, Anjana Roy, Andrea Whyte, Malcolm Strath, Daphne J. Moore, Peter W. Moore and Donald H. Williamson

National Institute for Medical Research, Mill Hill, London NW7 1AA, UK

Malaria parasites, and other parasitic protists of the Phylum Apicomplexa, carry a plastid-like genome with greatly reduced sequence complexity. This 35 kb DNA circle resembles the plastid DNA of non-photosynthetic plants, encoding almost exclusively components involved in gene expression. The complete gene map described here includes genes for duplicated large and small subunit rRNAs, 25 species of tRNA, three subunits of a eubacterial RNA polymerase, 17 ribosomal proteins, and a translation elongation factor. In addition, it codes for an unusual member of the Clp family of chaperones, as well as an open reading frame of unknown function found in red algal plastids. Transcription is polycistronic. This plastid-like DNA molecule is conserved in several genera of apicomplexans and is conjectured to have been acquired by an early progenitor of the Phylum by secondary endosymbiosis. The function of the organelle (plastid) carrying this DNA remains obscure, but appears to be specified by genes transferred to the nucleus.

© 1996 Academic Press Limited

*Corresponding author

Keywords: evolution; malaria; non-photosynthetic plastids; plastid DNA

Introduction

Like plants, malaria parasites (Plasmodium spp.) carry two organellar DNAs; one is mitochondrial, the other is a 35 kb circular molecule resembling the remnant of an algal plastid genome (Wilson et al., 1994). To account for the plastid-like DNA (plDNA), we proposed that an early progenitor of the Phylum acquired an algal plastid by secondary endosymbiosis (Williamson et al., 1994). In keeping with this hypothesis, the plDNA occurs in several genera of apicomplexans besides Plasmodium (Wilson et al., 1993). Exploratory studies have shown that the circular DNA does not co-fractionate with the mitochondrion (Wilson et al., 1992), but resides in another intracellular compartment: a presumed vestigial plastid, which is transmitted uniparentally by the macrogamete (female) in sexual reproduction (Vaidya et al., 1993; Creasey et al., 1994).

The reduced malarial plastid genome is only half the size of the other two best known vestigial plastomes, those of Astasia longa (a non-photosynthetic euglenoid; Gockel et al., 1994) and Epifagus virginiana (a parasitic, non-photosynthetic higher plant; Wolfe et al., 1992), and it has a notably different gene content from both of these. Yet the transcriptional activity of the malarial plDNA and its comprehensive set of tRNA genes (Preiser et al., 1995) imply that it is functional.

To obtain more insight into the plDNA’s origin and function, we have sequenced the 35 kb circle from the human malarial parasite Plasmodium falciparum (Pf) and obtained a complete gene map. Approximately half the sequence of the Pf 35 kb circle has been described (see references in Preiser et al., 1995). Here we describe further recently identified genes and give an overview of the complete gene map. Nearly all the genes turn out to be involved with gene expression, including those encoding three subunits of a eubacterial type of RNA polymerase, 17 ribosomal proteins, the elongation factor Tu (EF-Tu), duplicated large and small subunit (LSU and SSU) rRNAs, and 25 tRNAs, nine of which are duplicated. In addition, there is an open reading frame (ORF) encoding a putative regulatory subunit similar to the Clp family of molecular chaperones, as well as a highly conserved ORF of unknown function found in bacteria and ‘‘primitive’’ red algal plastids, and seven small potential ORFs of various sizes. All the genes appear to be transcribed polycistronically. Further evidence for conservation of the 35 kb circle amongst apicomplexans supports our contention that it has a common origin and conserved function.

Abbreviations used: plDNA, plastid-like DNA; Pf, Plasmodium falciparum; EF-Tu, elongation factor Tu; LSU, SSU, large and small subunits; ORF, open reading frame; IR, inverted repeat; rp, ribosomal protein; RT-PCR, reverse transcription–polymerase chain reaction; DAPI, 4',6-diamidino-2-phenylindole; Tg, Toxoplasma gondii; Et, Eimeria tenella; Ta, Theileria annulata.

0022–2836/96/320155–18 $18.00/0

Results

Physical features

A continuous sequence of 34,682 nt has been constructed, corresponding to the ‘‘35 kb’’ circular DNA of Pf. A small stretch, estimated to be only tens of nucleotides in length, remains unsequenced in the centre of the rDNA inverted repeat. This repeat has the propensity to form a large cruciform structure (Wilson et al., 1993), with arms that can reach 0.5 µm in length, and the torsional constraints on formation of the cruciform have been discussed for the corresponding circular molecule from the related apicomplexan parasite Toxoplasma gondii (Borst et al., 1984). The high A + T (86.9%) content of the circle sequence is consistent with its buoyancy in caesium chloride gradients: it forms a ‘‘light’’ satellite band in density gradients of DNA from apicomplexans with genomic DNA of moderate G + C richness; for example, P. knowlesi (Williamson et al., 1985). In Pf, separation of the circular DNA from the almost equally A + T-rich genomic DNA (82%; Weber, 1988) requires a multi-step fractionation procedure (Gardner et al., 1988).

Gene content and organization

A gene map of the 35 kb circle of Pf is illustrated in Figure 1 and the genetic content is listed in Table 1. The main features are as follows.

Inverted repeat

The inverted repeat (IR) covers about one third of the circle and encodes duplicated large and small subunit rRNA genes (Gardner et al., 1991a), as well as nine duplicated tRNA genes (Gardner et al., 1994b). Sequence analysis of the rRNA genes showed that they are not closely related to those of the mitochondrion or nucleus (Feagin et al., 1992).

tRNA genes

Downstream of the IRB arm of the inverted repeat there is a cluster of ten tRNA genes, one of which, trnL, carries the only intron recognized so far on the circle (Preiser et al., 1995). Along with the tRNAs encoded within the inverted repeat and six others located in two small groups at remote sites on the circle, a total complement of 25 species has been found. These are believed to be sufficient to provide a minimal but complete set for translation of the protein-encoding genes on the circle (Preiser et al., 1995).

Ribosomal protein genes

A string of 15 putative ribosomal protein (rp) genes forms another prominent feature of the circle, occupying a sector of about 7 kb (Figure 1). The order of rp genes in this large cluster implies a fusion of the S10, spc, alpha and str operons of Escherichia coli, as found in other plastids (Figure 2). Another rp gene, rps4, present in the alpha operon of E. coli, is separately located upstream of the clustered malarial rp genes (Figure 1). In addition, rps2 lies at a distant site immediately downstream of the rpo genes in a location characteristic of plastid DNAs rather than those of cyanobacteria. It is possible that some of the small unidentified ORFs on the circle also correspond to rp genes but, if so, they are extremely divergent. Although several rp genes characteristic of bacterial operons are no longer present on the malarial circle, the retention of gene order is striking when one takes into account the many deletions and rearrangements required to reduce what was presumably a typical plastid genome of some 150 kb or more, to its present small size and selected gene content. The clustered rp genes are closely packed, open reading frames often being separated by no more than 30 nt, or even in two cases overlapping by a single nt. One exception is an overlap of 30 nt between rpl36 and ORF91, which could invalidate the latter. In the plDNA of the primitive red alga Porphyra purpurea, the S10 operon begins with rpl3 followed by rpl4, the preceding gene of the corresponding bacterial operon (rps10) having been transposed to follow tufA at the end of the str operon (see Figure 2). In Plasmodium, the first rp gene in the S10 series is rpl4, but an open reading frame (ORF78) following the tufA gene shows no apparent similarity to rps10, which presumably has been transferred to the nucleus.
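The close packing described above (gaps of no more than 30 nt, occasional single-nucleotide overlaps) is a property that can be read directly from gene coordinates. The following is a minimal Python sketch of that calculation, assuming each gene is given as a (name, start, end) tuple with 1-based inclusive coordinates on one strand. The function name `spacings` and the coordinates are invented for illustration; they are not the coordinates of the Pf circle.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end), 1-based inclusive


def spacings(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (upstream, downstream, gap) for consecutive genes.

    A positive gap is the number of intergenic nucleotides; a negative
    gap is the number of nucleotides shared by overlapping genes.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    result = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        result.append((name_a, name_b, start_b - end_a - 1))
    return result


# Hypothetical coordinates, for illustration only.
example = [("geneA", 1, 300), ("geneB", 331, 600), ("geneC", 600, 900)]
for upstream, downstream, gap in spacings(example):
    kind = "overlap" if gap < 0 else "gap"
    print(f"{upstream} -> {downstream}: {kind} of {abs(gap)} nt")
```

With these invented coordinates the first pair is separated by 30 nt and the second pair overlaps by a single nucleotide, mirroring the two situations reported for the clustered rp genes.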
The order of rp genes on the circular DNA has been useful for identification purposes and, likewise, the juxtaposition of rp and other genes occasionally helps to signify the circle’s origin. One example is the presence of a tuf gene downstream of rps7 and rps12. Unlike most algal genomes, tufA is encoded in the nucleus of higher plants (Baldauf & Palmer, 1990), so its presence on the malarial circle next to the truncated str operon strengthens our proposal that the circle is of algal origin, like the plDNA of euglenoids, rather than derived from higher plants by lateral transfer. Moreover, like algal pl genomes but unlike those of higher plants, the malarial circle encodes rps5, rps17 and rpl4. On the other hand, although the isolated rps4 gene is preceded by trnT as in P. purpurea, Cyanidium caldarium, Marchantia polymorpha and some higher plants, it is known that different trn genes abut rps4 in other plastomes (Harris et al., 1994). In this instance then, gene order seems fluid.

Figure 1. Gene map of the 35 kb circular DNA of Plasmodium falciparum. The two halves (A and B) of the inverted repeat (IR) are indicated. tRNA genes are specified by the anticodon, as well as the single letter amino acid code. ORFs specify the number of amino acid residues in the open reading frame. Genes on the outer strand are transcribed clockwise, those on the inner strand anti-clockwise.

The rp genes on the malarial circle show typical as well as unusual compositional features. Their nucleotide composition is extremely rich in A + T residues and the codon usage is notably biased; consequently, similarity at the peptide level to other plastid rp genes is often borderline. Nonetheless, all the putative peptides have a basic charge as would be predicted for rps (Table 2). As is evident in the 17 pairs of DOT MATRIX plots given in Figure 3 (ten pairs for small subunit rps and seven pairs for large subunit rps), identification has relied for the most part on small stretches of conserved sequence indicating regions of global similarity to equivalent peptides from E. coli, a plastid, or in one case a mycoplasma source. It is interesting that most of the rps encoded on the 35 kb plDNA are important in the initial assembly of the 30 S subunit (Wittmann, 1983).
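A dot-matrix comparison marks every pair of positions at which short windows of two sequences are similar, so that conserved stretches show up as diagonal runs. The following is a minimal Python sketch of the general idea, assuming a simple count of identical residues per window. The window size, threshold, function names and toy peptides are illustrative choices; the program and scoring scheme used for Figure 3 are not stated in this section and may well differ.

```python
from typing import List, Tuple


def dot_matrix(a: str, b: str, window: int = 5, min_matches: int = 4) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) window starts where a[i:i+window] and
    b[j:j+window] share at least min_matches identical positions."""
    dots = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


def render(a: str, b: str, dots: List[Tuple[int, int]]) -> str:
    """Draw the matrix as text: rows follow a, columns follow b."""
    marked = set(dots)
    return "\n".join(
        "".join("*" if (i, j) in marked else "." for j in range(len(b)))
        for i in range(len(a))
    )


# Toy peptides for illustration only. The shared block DHPHGGG is the
# conserved motif reported for rpl2; the flanking residues are invented.
seq_a = "MKLDHPHGGGKT"
seq_b = "AVDHPHGGGRS"
print(render(seq_a, seq_b, dot_matrix(seq_a, seq_b)))
```

Allowing one mismatch per window (four identities out of five) is what lets a short conserved block stand out even when the flanking sequence has diverged, which is the situation described for the malarial rp peptides.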


Table 1. Gene content of the 35 kb circular DNA of P. falciparum

Class                   Genes
Ribosomal RNA           16 S, 23 S
Transfer RNA (a, b)
Ribosomal proteins
  rps                   2, 3, 4, 5, 7, 8, 11, 12, 17, 19
  rpl                   2, 4, 6, 14, 16, 23, 36
RNA polymerase          rpoB, C1, C2
Other proteins          clpC, tufA
Unassigned ORFs         ORF470, 51, 78, 79, 91, 101, 105, 129

(a) Single letter amino acid code and anti-codon.
(b) Asterisk represents an intron.
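As a consistency check, the ribosomal protein entries of Table 1 can be tallied against the counts quoted in the text. The following is a minimal Python sketch; the dictionary simply transcribes the rps and rpl gene numbers from Table 1, and the variable names are illustrative.

```python
# Ribosomal protein genes transcribed from Table 1 (gene numbers only).
ribosomal_proteins = {
    "rps": [2, 3, 4, 5, 7, 8, 11, 12, 17, 19],  # small subunit
    "rpl": [2, 4, 6, 14, 16, 23, 36],           # large subunit
}

counts = {family: len(numbers) for family, numbers in ribosomal_proteins.items()}
total = sum(counts.values())

for family, n in counts.items():
    print(f"{family}: {n} genes")
print(f"total ribosomal protein genes: {total}")
```

The tally gives ten rps and seven rpl genes, 17 in all, which agrees with the 17 ribosomal proteins stated in the text.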

Features of interest in a selection of rp genes are as follows.

rps2

This maps downstream of the RNA polymerase genes rpoB/C1/C2, as in other plastid genomes. However, it is not followed by atp genes as in other plastids, the 3' end of rps2 marking instead a possible deletion/recombination site and the cross-over point for the directions of transcription from the two arms of the inverted repeat (see Figure 1).

rpl2

This commences with an ATG codon like other plant homologues except for rice and maize, which have an ACG codon edited to AUG at the transcript level (Kossel et al., 1993). The C terminus of the predicted malarial peptide contains the usual block of conserved amino acid residues (DHPHGGG), otherwise the peptide is truncated at both ends.

rpl4

This has not been found in other pl genomes except in the primitive red alga P. purpurea (Reith & Munholland, see Harris et al., 1994).

rps4

This is one of the rRNA binding proteins that initiate assembly of the 30 S ribosome. Only the first 20 amino acid residues and a large central portion show any similarity to other versions of this protein (