Tacaribe virus L gene encodes a protein of 2210 amino acid residues


VIROLOGY 170, 40-47 (1989)

SILVIA IAPALUCCI,*,1 RICARDO LOPEZ,* OSVALDO REY,* NORA LOPEZ,* MARIA T. FRANZE-FERNANDEZ,* GEORGES N. COHEN,† MIGUEL LUCERO,† ALBERTO OCHOA,† AND MARIO M. ZAKIN†

*Centro de Virologia Animal, Serrano 661, 1414 Buenos Aires, Argentina, and †Unité de Biochimie Cellulaire, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France

Received November 4, 1988; accepted December 20, 1988

The nucleotide sequence of the Tacaribe virus (TV) L gene was obtained from two sets of overlapping cDNA clones constructed by walking along the virus L RNA using two successive synthetic DNA primers. Analysis of the sequence indicated the existence of a unique long open reading frame in the viral complementary strand. The first in-phase AUG codon is at positions 31-33 from the 5' end of the viral complementary L RNA and is surrounded by a sequence favorable for initiation of protein synthesis. The open reading frame ends at positions 6661-6663. The predicted TV L protein is a polypeptide 2210 amino acids long with an estimated molecular weight of 251,942. Comparison of the amino acid sequence of TV L protein with peptide sequences predicted from L-derived cDNA clones of lymphocytic choriomeningitis virus shows an overall homology of 42%. © 1989 Academic Press, Inc.

INTRODUCTION

Tacaribe virus (TV), a member of the Arenaviridae family, is an enveloped virus with genetic information encoded in two segments of single-stranded RNA: a large segment, designated L (≈7 kb), and a small segment, designated S, of approximately 3.4 kb. TV gave its name to a group of serologically defined viruses that are geographically distributed in the Americas. Two of these viruses are human pathogens responsible for hemorrhagic fevers: Junin virus in Argentina and Machupo virus in Bolivia. Biological and serological studies indicated that the Tacaribe complex member most closely related to TV is Junin virus (reviewed by Howard, 1986). It is now well established for a number of arenaviruses that the S RNA encodes the major structural viral proteins, the nucleoprotein (N) and the glycoproteins in the form of a precursor protein (GPC) (Auperin et al., 1984; Romanowski et al., 1985; Clegg and Oram, 1985; Auperin et al., 1986; Franze-Fernandez et al., 1987; Southern et al., 1987). In all arenaviruses but TV and Tamiami virus, two glycoproteins can be distinguished in the virion (GP1 and GP2) (reviewed by Howard, 1986; Buchmeier and Parekh, 1987). TV virions appear to contain a single glycoprotein (Gard et al., 1977; Boersma et al., 1982). With regard to the L RNA, studies performed with Pichinde and Munchique viruses and a reassortant of the two provided the basis for assigning to this segment the coding of a polypeptide of ≈200 kDa (Harnish et al., 1983). Recently, Singh et al. (1987) presented direct evidence for a 200-kDa L protein in lymphocytic choriomeningitis virus (LCMV) L RNA. A protein of about 200 kDa is not detected in TV or in TV-infected cells. Two minor polypeptides with estimated sizes of ≈79 kDa and ≈105 kDa are visualized in TV infections (Gimenez et al., 1983; Lopez and Franze-Fernandez, 1985).

The close relatedness between TV and the pathogenic Junin virus and the apparently specific features of TV regarding its protein composition prompted us to study the molecular biology of this virus. We reported in a previous study the complete structure of TV S RNA. It was found that, as in the arenaviruses Pichinde, LCMV, and Lassa (reviewed by Bishop and Auperin, 1987; Southern and Bishop, 1987), TV S RNA encodes the structural proteins N and GPC in an ambisense coding arrangement (Franze-Fernandez et al., 1987). It was also found that the virus-associated RNA polymerase transcribes the N mRNA in vivo but is unable to replicate the S RNA unless there is ongoing protein synthesis (Franze-Fernandez et al., 1987). To continue with the studies on RNA replication and to gain insight into the problems posed by arenavirus biology, it is essential to know the molecular organization of the L segment, which comprises about 70% of the coding potential of the arenavirus genome. In pursuing this goal, we have cloned and sequenced the TV gene encoding the L protein.

MATERIALS AND METHODS

Virus purification and RNA extraction

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. J04340.

1 To whom requests for reprints should be addressed.


The origin of Tacaribe virus used in these studies has been described (Lopez and Franze-Fernandez, 1985).

FIG. 1. Molecular cloning and sequencing strategy of TV L gene. The cDNA clones and subclones were obtained as indicated under Materials and Methods. Sequences performed by the method of Sanger et al. (1977) and by the procedure of Maxam and Gilbert (1980) are indicated by thin and thick arrows, respectively. The scale in kb is numbered from the 3' end in the viral sense.

Virus was grown in monolayers of BHK21 cells and purified as indicated before (Lopez and Franze-Fernandez, 1985) with the following modification: the pelleted virions were disrupted with a solution containing 4 M guanidinium thiocyanate, 2 mM sodium citrate, pH 7.0, 0.1 M 2-mercaptoethanol, and 0.2% sodium N-laurylsarcosine (w/v). The viral RNA was then purified by sedimentation in CsCl as described by Chirgwin et al. (1979).

Genome cloning and cDNA sequencing

cDNA was synthesized from viral RNA using synthetic oligodeoxynucleotides as primers. The method proposed by Gubler and Hoffman (1983) with the modifications described previously (Franze-Fernandez et al., 1987) was used. The cDNA was ligated into the pAT153/PvuII/8 vector as indicated (Franze-Fernandez et al., 1987). Transformations were performed using competent Escherichia coli MC1061 cells (Casadaban and Cohen, 1980). Clones were screened by colony hybridization with the 5'-labeled oligodeoxynucleotide used as primer and, in some experiments, with a labeled restriction fragment obtained from a previously characterized clone. Sequences were mostly obtained from plasmids p14 and p96 (Fig. 1), which contain inserts covering the entire gene encoding the L protein. Two overlapping restriction fragments that covered the whole insert of plasmid p14 and the entire insert of clone p96 were cloned in both orientations in M13 vectors. From the six constructions, a series of overlapping subclones for sequencing were generated by using the Cyclone System kit (International Biotechnologies, New Haven, CT). A 0.78-kb restriction fragment from clone p19 and the entire insert from clone p6bl were also inserted in M13 vectors for sequencing. Sequences were determined by the dideoxy chain termination procedure of Sanger et al. (1977). Certain genomic regions were sequenced, in addition, by the procedure of Maxam and Gilbert (1980).

RESULTS AND DISCUSSION

Cloning of the L gene

The nucleotide sequence of TV L gene was obtained from two sets of overlapping clones that were isolated from two cDNA libraries (Fig. 1). The initial library was generated by priming first strand cDNA synthesis with a synthetic 19-mer oligonucleotide known to be complementary to the 3' terminus of arenavirus S RNA (Auperin et al., 1982). The longest inserts of positive clones obtained by colony hybridization with the 5'-labeled 19-mer were characterized by restriction mapping. Six of the inserts exhibited a common restriction pattern that was different from the known pattern of TV S RNA inserts (Franze-Fernandez et al., 1987). The clones were identified by Northern blotting as containing TV L gene sequences (not shown). The recovery of L cDNA clones from this library was presumably due to the similarities between TV L and S RNA sequences (Auperin et al., 1982). From the 5' proximal end of the largest of the inserts obtained in this first screening (p6bl), a nick-translated restriction fragment was used, in addition to the 19-mer primer, to further screen the cDNA library. Four clones with inserts ranging from 3.4 to 5.1 kb were recovered and studied by digestion with 13 different endonucleases and by Northern blotting. All four inserts gave identical cleavage patterns and hybridized to TV L RNA. From the sequence of the 5' proximal end of the two largest inserts (p14 and p19) an oligodeoxynucleotide was synthesized and used to prime the first strand cDNA synthesis of a second preparation of cDNA. Three clones with the largest inserts (p96, p65, and p92) were characterized by restriction mapping and shown by Northern blotting to contain L RNA sequences.

FIG. 3. Translation of TV L gene in all six reading frames (vc sense).

Nucleotide sequence of the L gene

Figure 2 shows the nucleotide sequence (presented as the DNA sequence in the viral complementary (vc) sense) of the first 6683 nucleotides of TV L RNA. The sequence at the 3' end of the viral RNA was determined in clones that were obtained employing a primer corresponding to the conserved 19 nucleotides at the 3' end of arenavirus S RNA. This sequence was shown to differ from the conserved sequence of the L RNA at positions 6 and 8 (Auperin et al., 1982). Therefore our sequence at the 3' end (5' in the vc sense) may not represent the exact TV L RNA sequence. Analysis of the nucleotide sequence in all six reading frames (Fig. 3) indicates the existence of a unique long open reading frame in the vc strand. The first in-phase AUG codon is at positions 31-33, surrounded by a sequence very favorable for initiation by eukaryotic ribosomes (ACCAUGG) (Kozak, 1984, 1987). The second in-frame methionine codon is at positions 388-390, having a G in the -3 and an A in the +4 position. Since the first methionine codon proximal to the 5' end of the vc RNA is in a favored context, we presume that translation initiation in the L protein gene begins at this codon. The open reading frame ends at an amber codon at positions 6661-6663.

FIG. 2. Nucleotide sequence of TV L gene. The sequence is shown as a DNA sequence in the message sense. Nucleotides corresponding to positions 6 and 8 are those of the synthetic 19-mer oligonucleotide used as a primer for cDNA synthesis. This is complementary to the 3' terminus of arenavirus S RNA. In TV L RNA, nucleotides at positions 6 and 8 are, respectively, C and A (Auperin et al., 1982). The boxed nucleotide sequences represent the start AUG codon and the stop codon.

The L protein

The L protein of TV is a polypeptide 2210 amino acids long with an estimated molecular weight of 251,942 (Fig. 4). It is very similar in size to the L proteins of the nonsegmented negative strand viruses, which range from 2233 to 2109 amino acids in length (Schubert et al., 1984; Shioda et al., 1986; Yusoff et al., 1987; Tordo et al., 1988; Galinski et al., 1988; Blumberg et al., 1988). The amino acid composition of TV L protein (Table 1) exhibits a higher content of Leu plus Ile and a lower percentage of Ala than those found in an average protein (Dayhoff et al., 1978). A similar characteristic has been reported for the L proteins of rhabdoviruses and paramyxoviruses (Morgan and Rakestraw, 1986; Blumberg et al., 1988; Galinski et al., 1988). Another observation is the low content of Pro in TV L protein, which is half of that in an average protein (Dayhoff et al., 1978). Assuming the charge of amino acids at neutral pH to be +1 for lysine or arginine, +0.5 for histidine, and -1 for aspartic or glutamic acid, the L protein of TV is negatively charged (-7.5). The charged amino acids are evenly distributed throughout the molecule. The hydropathicity profile calculated according to the program of Kyte and Doolittle (1982) (not shown) shows no hydrophobic domains like the ones described in rabies virus and vesicular stomatitis virus L proteins (Tordo et al., 1988). The amino acid following the amino terminal Met in TV and in LCMV L protein is Asp (Fig. 5). This same amino acid follows the N-terminal Met in the L proteins of Sendai virus (Shioda et al., 1986), measles virus (Blumberg et al., 1988), and human parainfluenza 3 virus (Galinski et al., 1988). Since the initiator Met is frequently removed in vivo (Wold, 1981), it would be predicted that proteins with N-terminal Asp would have a very short half-life in the cytoplasm (Bachmair et al., 1986). This was pointed out initially by Blumberg et al. (1988) for the L protein of measles virus but appears to be a more general feature of virus L proteins.

FIG. 4. Predicted amino acid sequence of TV L gene.
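Several of the figures quoted for the L protein can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis: the residue counts are those listed in Table 1, the charge convention is the one stated in the text, and the helper names (`orf_length_aa`, `composition_percent`, `net_charge`) are hypothetical, introduced here for the example.

```python
# Residue counts as listed in Table 1 (three-letter amino acid codes).
COUNTS = {
    "Phe": 117, "Leu": 275, "Ile": 121, "Met": 56, "Val": 167,
    "Ser": 197, "Pro": 64, "Thr": 100, "Ala": 74, "Tyr": 69,
    "His": 37, "Gln": 74, "Asn": 104, "Lys": 149, "Asp": 143,
    "Glu": 151, "Cys": 57, "Trp": 20, "Arg": 119, "Gly": 116,
}

# Charge convention stated in the text for neutral pH.
CHARGE = {"Lys": 1.0, "Arg": 1.0, "His": 0.5, "Asp": -1.0, "Glu": -1.0}


def orf_length_aa(first_nt, last_nt):
    """Number of amino acids encoded by an ORF.

    Coordinates are 1-based and inclusive; last_nt is the final base
    of the stop codon, so one codon is subtracted for the stop.
    """
    n_nt = last_nt - first_nt + 1
    if n_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return n_nt // 3 - 1


def composition_percent(counts):
    """Percentage of each residue, rounded to two decimals."""
    total = sum(counts.values())
    return {aa: round(100.0 * n / total, 2) for aa, n in counts.items()}


def net_charge(counts, charge=CHARGE):
    """Net charge under the simple per-residue convention above."""
    return sum(charge.get(aa, 0.0) * n for aa, n in counts.items())


if __name__ == "__main__":
    print("ORF length (aa):", orf_length_aa(31, 6663))
    print("Table 1 total:", sum(COUNTS.values()))
    print("Leu (%):", composition_percent(COUNTS)["Leu"])
    print("Net charge:", net_charge(COUNTS))
```

With these inputs the ORF coordinates (31 to 6663, stop codon included) and the Table 1 total both give 2210 residues, and the net charge comes to -7.5, in agreement with the values stated in the text.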

TABLE 1
AMINO ACID COMPOSITION OF THE L PROTEIN

Amino acid   Number   Percentage
Phe             117         5.29
Leu             275        12.44
Ile             121         5.48
Met              56         2.53
Val             167         7.56
Ser             197         8.91
Pro              64         2.90
Thr             100         4.52
Ala              74         3.35
Tyr              69         3.12
His              37         1.67
Gln              74         3.35
Asn             104         4.71
Lys             149         6.74
Asp             143         6.47
Glu             151         6.83
Cys              57         2.58
Trp              20         0.90
Arg             119         5.38
Gly             116         5.25

Note. Total amino acids, 2210; molecular weight, 251,942.

Our finding of a putative protein of 2210 amino acids encoded in the viral complementary sense of TV L RNA is in agreement with earlier reports on LCMV L RNA. Romanowski and Bishop (1985) had predicted a genomic complementary L mRNA from the analysis of the 3' terminal sequence of LCMV-WE strain. Later, Singh et al. (1987) raised antisera against synthetic peptides whose sequences were predicted from the nucleotide sequence, in the vc sense, of L-specific cDNA clones of LCMV Armstrong strain. The antisera reacted with a protein of ≈200,000 Da in LCMV-infected cells. Why a protein of this size is not detected in TV-infected cells is a matter of conjecture. It is hoped that this question and that concerning the origin of the p105 and p79 proteins consistently found in TV-infected cells will be answered with the aid of antibodies directed against the L protein.

Homology of TV L protein with L-specific predicted peptides of LCMV

The amino acid sequence of TV L protein was compared with peptide sequences predicted from L-derived cDNA clones of LCMV strains WE (Romanowski and Bishop, 1985) and Armstrong (Singh et al., 1987). One of the clones corresponds to the 3' end of the genomic L segment of LCMV-WE strain and spans amino acids 1-364 of a predicted L protein. The other L-specific sequences derive from LCMV Armstrong strain and include two peptides that align in the first quarter of the molecule and a third sequence that shares homology with a region closer to the carboxyl end of TV L protein (Fig. 5). The overall amino acid identity between the compared sequences is 34% and increases to 42% when conservative amino acid changes are considered. Three more conserved regions can be distinguished near the amino terminal end of the molecule (Figs. 5A and 5B). The first region with higher homology comprises amino acids 1-208, with 55% of the positions strictly or conservatively maintained. Noteworthy is the frequency of Pro residues (two to three times the average abundance in TV L protein) occupying invariant positions within amino acids 80-151. A second homologous peptide spans positions 327-356. The third more conserved region begins at TV position 490 and extends up to the end of the aligned LCMV L sequence. In the region proximal to the carboxyl end of the molecule (Fig. 5C), alignment with the LCMV peptide showed that the invariant positions are more evenly distributed throughout the sequence. Here, the overall homology is 44%.
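The distinction drawn above between strict identity (34%) and identity plus conservative changes (42%) can be made concrete with a small scoring routine over an already aligned pair of sequences. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the alignments here were produced with the program of Lipman and Pearson (1985), whose scoring is not described in the text, so the conservative-substitution classes below, the choice to ignore gapped columns, and the function name `score_alignment` are all assumptions made for illustration. The two short sequences are invented toy data.

```python
# Hypothetical conservative-substitution classes (an assumption made
# for this example; the paper does not state which classes were used).
CONSERVATIVE_GROUPS = ("DE", "KRH", "ST", "NQ", "LIVM", "FYW", "AG")


def _same_group(x, y):
    """True if residues x and y fall in the same conservative class."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def score_alignment(seq_a, seq_b, gap="-"):
    """Count identical and conservative positions in an aligned pair.

    Columns containing a gap in either sequence are skipped (assumption).
    Returns (identical, conservative, compared).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = conservative = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
        elif _same_group(x, y):
            conservative += 1
    return identical, conservative, compared


def percent(part, whole):
    """Percentage helper that tolerates an empty comparison."""
    return 100.0 * part / whole if whole else 0.0


if __name__ == "__main__":
    # Invented toy alignment, not taken from Fig. 5.
    ident, cons, n = score_alignment("ACDEK-LV", "ACEDKGIW")
    print("identity (%):", round(percent(ident, n), 1))
    print("identity + conservative (%):", round(percent(ident + cons, n), 1))
```

Reporting both numbers side by side, as the text does, makes clear how much of the apparent homology depends on the choice of conservative classes.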

We should like to emphasize that comparison of the entire sequence of TV and LCMV L proteins showed that the more conserved region lies in the central part of the molecule, spanning positions 680-880 (56% overall homology) and 1000-1600 (62% overall homology). In these regions, the runs of consecutive invariant amino acids are up to 20 residues in length and lie in highly conserved stretches 50 to 150 residues long; homology in these latter sequences is 70-80% (not shown). It is noteworthy that when the L proteins of nonsegmented negative strand viruses are compared, the profile of conservation obtained (Tordo et al., 1988) is similar to that of TV and LCMV L proteins. (The entire unpublished sequence of LCMV L protein was kindly provided by Maria Salvato for comparison.)

Since the L protein should function in transcription and replication of the virus RNA, we examined its sequence in search of motifs with invariant or highly similar residues conserved in RNA-dependent polymerases. One of these motifs is a 14 amino acid sequence consisting of an Asp-Asp sequence flanked by hydrophobic residues (Kamer and Argos, 1984). The amino acid sequence LIMVNFLDDLFTAK in TV L protein (residues 356-368) exhibits the consensus sequence of Kamer and Argos (1984). However, in the corresponding region of the predicted LCMV L protein sequence, this motif is not conserved. Instead, the peptide consists of a Ser-Ser sequence surrounded by hydrophobic amino acids (Fig. 5). Another motif that we searched for is the pentapeptide QGDNQ, invariably present in the L protein of unsegmented negative strand viruses (Tordo et al., 1988; Galinski et al., 1988). This motif was not found in TV L protein. In one of the influenza virus polymerase (P) proteins (PB1) a peptide sequence was detected that shares homology with a sequence predicted in TV L protein (Table 2). The peptide is conserved in influenza virus A and in influenza virus B PB1 protein (Kemdirim et al., 1986).
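The motif searches described in this section amount to simple pattern scans over a protein sequence. The following is a hedged sketch rather than the authors' method: the hydrophobic residue set, the flank width of five residues, the threshold of three hydrophobic residues per flank, and the function names `find_motif` and `hydrophobic_flanked_dd` are all assumptions chosen for illustration. The peptides used as input are the ones quoted in the text and in Table 2 (gap characters removed).

```python
import re

# Residues treated as hydrophobic here (an assumption for this example).
HYDROPHOBIC = set("AVLIMFWC")


def find_motif(seq, motif):
    """Return 1-based start positions of every (overlapping) exact match."""
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() + 1 for m in re.finditer(pattern, seq)]


def hydrophobic_flanked_dd(seq, flank=5, min_hydrophobic=3):
    """Return 1-based positions of Asp-Asp pairs with hydrophobic flanks.

    A pair qualifies when at least `min_hydrophobic` of the `flank`
    residues on each side belong to HYDROPHOBIC.
    """
    hits = []
    for pos in find_motif(seq, "DD"):
        i = pos - 1  # 0-based index of the first Asp
        left = seq[max(0, i - flank):i]
        right = seq[i + 2:i + 2 + flank]
        n_left = sum(aa in HYDROPHOBIC for aa in left)
        n_right = sum(aa in HYDROPHOBIC for aa in right)
        if n_left >= min_hydrophobic and n_right >= min_hydrophobic:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    tv_peptide = "LIMVNFLDDLFTAK"   # TV L peptide quoted in the text
    print(hydrophobic_flanked_dd(tv_peptide))
    print(find_motif(tv_peptide, "QGDNQ"))     # empty list: motif absent
    print(find_motif("YTSSDDQVTLI", "SSDD"))   # TV L segment of Table 2
    print(find_motif("LQSSDDFALF", "SSDD"))    # influenza B PB1 segment
```

Positions returned by these helpers are relative to the peptide passed in, not to the full-length protein, so an offset would have to be added to recover the residue numbering used in the paper.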
Moreover, the homologous sequence in TV L protein is conserved in the predicted L protein of LCMV. A common motif in all the sequences is the tetrapeptide SSDD. (Comparison was made with the LCMV L protein sequence provided by Maria Salvato.) It would be expected that as more information on arenavirus L proteins becomes available, conserved regions will be more clearly distinguished. Some of these sequences would share homology with other proteins involved in RNA transcription and replication, while others may be specific structures that fulfill the functional requirements for transcription and replication of an ambisense RNA.

TABLE 2
HOMOLOGY BETWEEN A SHORT PREDICTED SEQUENCE IN TV L PROTEIN AND A CONSERVED SEQUENCE IN THE PB1 PROTEIN OF INFLUENZA A AND INFLUENZA B VIRUSES

Protein                    Amino acid residues   Sequence
TV L (a)                   1325-1335             YTSSDDQVTLI
Influenza A virus PB1 (b)  441-450               LQSSDD-FALI
Influenza B virus PB1 (c)  440-449               LQSSDD-FALF

(a) Fig. 4. (b) Sivasubramanian and Nayak (1982). (c) Kemdirim et al. (1986).

FIG. 5. Comparison of the amino acid sequence of TV L protein with peptide sequences predicted from L-derived cDNA clones of LCMV. Comparison was made according to the programme of Lipman and Pearson (1985). Conservations and conservative changes of amino acids are indicated between the sequences by colons and stars, respectively. Numbers indicate amino acid positions in TV L protein (TAC). The LCMV WE strain (LCMWE) sequence was taken from Romanowski and Bishop (1985) (A). Sequences of LCMV Armstrong strain derive from L cDNA clones L122 (A), L39 (B), and L123 (C) as reported by Singh et al. (1987).

ACKNOWLEDGMENTS

We are grateful to Maria Salvato (Scripps Clinic) for providing the entire sequence of LCMV before publication. We greatly thank Noel Tordo (Unité Rage Recherche, Institut Pasteur) and Olivier Poch (Institut de Biologie Moleculaire du C.N.R.S., Strasbourg) for helpful discussions and pertinent suggestions. This work was supported by grants from Centre National de la Recherche Scientifique (U.A. 1129), France; Consejo Nacional de Investigaciones Cientificas; Secretaria de Ciencia y Tecnica and Universidad de Buenos Aires, Argentina. The collaboration between our two laboratories was made possible by an INSERM (France)-CONICET (Argentina) convention.

REFERENCES

AUPERIN, D. D., COMPANS, R. W., and BISHOP, D. H. L. (1982). Nucleotide sequence conservation at the 3' termini of the virion RNA species of New World and Old World arenaviruses. Virology 121, 200-203.
AUPERIN, D. D., ROMANOWSKI, V., GALINSKI, M., and BISHOP, D. H. L. (1984). Sequencing studies of Pichinde arenavirus S RNA indicate a novel coding strategy, an ambisense viral S RNA. J. Virol. 52, 897-904.
AUPERIN, D. D., SASSO, D. R., and MCCORMICK, J. B. (1986). Nucleotide sequence of the glycoprotein gene and intergenic region of the Lassa virus S genome RNA. Virology 154, 155-167.
BACHMAIR, A., FINLEY, D., and VARSHAVSKY, A. (1986). In vivo half-life of a protein is a function of its amino-terminal residue. Science 234, 179-186.
BISHOP, D. H. L., and AUPERIN, D. D. (1987). Arenavirus gene structure and organization. In "Arenaviruses"-Current Topics in Microbiology and Immunology (M. B. A. Oldstone, Ed.), Vol. 133, pp. 5-17. Springer-Verlag, Berlin.
BLUMBERG, B. M., CROWLEY, J. C., SILVERMAN, J. I., MENONNA, J., COOK, S. D., and DOWLING, P. C. (1988). Measles virus L protein evidences elements of ancestral RNA polymerase. Virology 164, 487-497.
BOERSMA, D. P., SALEH, F., NAKAMURA, K., and COMPANS, R. W. (1982). Structure and glycosylation of Tacaribe viral glycoproteins. Virology 123, 452-456.
BUCHMEIER, M. J., and PAREKH, B. S. (1987). Protein structure and expression among arenaviruses. In "Arenaviruses"-Current Topics in Microbiology and Immunology (M. B. A. Oldstone, Ed.), Vol. 133, pp. 41-57. Springer-Verlag, Berlin.
CASADABAN, M. J., and COHEN, S. N. (1980). Analysis of gene control signals by DNA fusion and cloning in Escherichia coli. J. Mol. Biol. 138, 179-207.
CHIRGWIN, J. M., PRZYBYLA, A. E., MCDONALD, R. J., and RUTTER, W. J. (1979). Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18, 5294-5299.
CLEGG, J. C. S., and ORAM, J. D. (1985). Molecular cloning of Lassa virus RNA: Nucleotide sequence and expression of the nucleocapsid protein gene. Virology 144, 363-372.
DAYHOFF, M. O., HUNT, L. T., and HURST-CALDERONE, S. (1978). Composition of proteins. In "Atlas of Protein Sequence and Structure" (M. O. Dayhoff, Ed.), Vol. 5, Suppl. 3, pp. 363-373. Natl. Biomed. Res. Found., Washington, DC.
FRANZE-FERNANDEZ, M. T., ZETINA, C., IAPALUCCI, S., LUCERO, M. A., BOUISSOU, C., LOPEZ, R., REY, O., DAHELI, M., COHEN, G. N., and ZAKIN, M. M. (1987). Molecular structure and early events in the replication of Tacaribe arenavirus S RNA. Virus Res. 7, 309-324.
GALINSKI, M. S., MINK, M. A., and PONS, M. W. (1988). Molecular cloning and sequence analysis of the human parainfluenza 3 virus gene encoding the L protein. Virology 165, 499-510.
GARD, G. P., VEZZA, A. C., BISHOP, D. H. L., and COMPANS, R. W. (1977). Structural proteins of Tacaribe and Tamiami virions. Virology 83, 84-95.
GIMENEZ, H. B., BOERSMA, D. P., and COMPANS, R. W. (1983). Analysis of polypeptides in Tacaribe virus-infected cells. Virology 128, 469-473.
GUBLER, U., and HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
HARNISH, D. G., DIMOCK, K., BISHOP, D. H. L., and RAWLS, W. E. (1983). Gene mapping in Pichinde virus: Assignment of viral polypeptides to genomic L and S RNAs. J. Virol. 48, 638-641.
HOWARD, C. R. (1986). In "Arenavirus"-Perspectives in Medical Virology (A. J. Zuckerman, Ed.), Vol. 2, pp. 130-138. Elsevier, Amsterdam.
KAMER, G., and ARGOS, P. (1984). Primary structural comparison of RNA-dependent polymerases from plant, animal and bacterial viruses. Nucleic Acids Res. 12, 7269-7282.
KEMDIRIM, S., PALEFSKY, J., and BRIEDIS, D. J. (1986). Influenza B virus PB1 protein: Nucleotide sequence of the genome RNA segment predicts a high degree of structural homology with the corresponding influenza A virus polymerase protein. Virology 152, 126-135.
KOZAK, M. (1984). Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12, 857-872.
KOZAK, M. (1987). Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.
KYTE, J., and DOOLITTLE, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.
LIPMAN, D. J., and PEARSON, W. R. (1985). Rapid and sensitive protein similarity searches. Science 227, 1435-1441.
LOPEZ, R., and FRANZE-FERNANDEZ, M. T. (1985). Effect of Tacaribe virus infection on host cell protein and nucleic acid synthesis. J. Gen. Virol. 66, 1753-1761.
MAXAM, A. M., and GILBERT, W. (1980). Sequencing end-labeled DNA with base specific chemical cleavages. In "Methods in Enzymology" (L. Grossman and K. Moldave, Eds.), Vol. 65, pp. 499-560. Academic Press, New York.
MORGAN, E. M., and RAKESTRAW, K. M. (1986). Sequence of the Sendai virus L gene: Open reading frames upstream of the main coding region suggest that the gene may be polycistronic. Virology 154, 31-40.
ROMANOWSKI, V., MATSUURA, Y., and BISHOP, D. H. L. (1985). Complete sequence of the S RNA of lymphocytic choriomeningitis virus (WE strain) compared to that of Pichinde arenavirus. Virus Res. 3, 101-114.
ROMANOWSKI, V., and BISHOP, D. H. L. (1985). Conserved sequences and coding of two strains of lymphocytic choriomeningitis virus (WE and ARM) and Pichinde arenavirus. Virus Res. 2, 35-51.
SANGER, F., NICKLEN, S., and COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
SCHUBERT, M., HARMISON, G. G., and MEIER, E. (1984). Primary structure of the vesicular stomatitis virus polymerase (L) gene: Evidence for a high frequency of mutations. J. Virol. 51, 505-514.
SHIODA, T., IWASAKI, K., and SHIBUTA, H. (1986). Determination of the complete nucleotide sequence of the Sendai virus genome RNA and the predicted amino acid sequences of the F, HN and L proteins. Nucleic Acids Res. 14, 1545-1563.
SINGH, M. K., FULLER-PACE, F. V., BUCHMEIER, M. J., and SOUTHERN, P. J. (1987). Analysis of the genomic L segment from lymphocytic choriomeningitis virus. Virology 161, 448-456.
SIVASUBRAMANIAN, N., and NAYAK, D. P. (1982). Sequence analysis of the polymerase 1 gene and the secondary structure prediction of polymerase 1 protein of human influenza virus A/WSN/33. J. Virol. 44, 321-329.
SOUTHERN, P. J., and BISHOP, D. H. L. (1987). Sequence comparison among arenaviruses. In "Arenaviruses"-Current Topics in Microbiology and Immunology (M. B. A. Oldstone, Ed.), Vol. 133, pp. 19-39. Springer-Verlag, Berlin.
SOUTHERN, P. J., SINGH, M. K., RIVIERE, Y., JACOBY, D. R., BUCHMEIER, M. J., and OLDSTONE, M. B. A. (1987). Molecular characterization of the genomic S RNA segment from lymphocytic choriomeningitis virus. Virology 157, 145-155.
TORDO, N., POCH, O., ERMINE, A., KEITH, G., and ROUGEON, F. (1988). Completion of the rabies virus genome sequence determination: Highly conserved domains among the L (polymerase) proteins of unsegmented negative-strand RNA viruses. Virology 165, 565-576.
WOLD, F. (1981). In vivo chemical modification of proteins (posttranslational modification). Annu. Rev. Biochem. 50, 783-814.
YUSOFF, K., MILLAR, N. S., CHAMBERS, P., and EMMERSON, P. T. (1987). Nucleotide sequence analysis of the L gene of Newcastle disease virus: Homologies with Sendai and vesicular stomatitis viruses. Nucleic Acids Res. 15, 3961-3976.
