DNA sequence comparison of micropia transposable elements from Drosophila hydei and Drosophila melanogaster


Chromosoma (Berl) (1990) 99:111-117

CHROMOSOMA © Springer-Verlag 1990

Dirk-Henner Lankenau, Peter Huijser*, Erik Jansen, Koos Miedema, and Wolfgang Hennig
Department of Molecular and Developmental Genetics, Catholic University, Toernooiveld, NL-6525 ED Nijmegen, The Netherlands

Received October 1, 1989 / in revised form January 19, 1990
Accepted January 19, 1990 by H. Jäckle

Abstract. Members of the retrotransposon family micropia were discovered as constituents of wild-type Y chromosomal fertility genes from Drosophila hydei. Several members of the micropia family have subsequently been recovered from Drosophila melanogaster, and four micropia elements, micropia-DhMiF2, -DhMiF8, -Dm11 and -Dm2, two each from D. hydei and D. melanogaster, have been totally sequenced (17 kb of micropia sequences and 6.8 kb from insertions)¹. Comparative analysis of micropia sequences revealed a complex pattern of divergence within a single Drosophila genome. The divergence includes deletions, possibly by a slipped mispairing mechanism, insertions of a retroposon and of another retrotransposon (copia), and "positional nucleotide shuffling" within the tandem repeats of the 3' non-protein-coding region of micropia elements. A 10 bp long sequence of each repeat unit of the 3' tandem repeats of micropia elements is highly conserved and is therefore a candidate for functional importance either in transposition events or in regulatory activity on flanking DNA sequences.

* Present address: Max-Planck-Institut für Züchtungsforschung, Egelspfad 3, D-5000 Köln 30, Federal Republic of Germany
Abbreviations: LTR long terminal repeat; PBS primer binding site; PolII RNA polymerase II; bp base pairs; kb kilobases (pairs); LINE long interspersed sequence; MHC major histocompatibility complex
¹ The DNA sequence of micropia-Dm2 has not been published (EMBL sequence data library accession no. X14173). The other sequences have been published by Huijser et al. (1988) and Lankenau et al. (1988) (accession no. X14037)
This paper is dedicated to the 90th birthday of Prof. Dr. Bernhard Rensch.
Offprint requests to: W. Hennig

Introduction

The putative biological functions of transposable elements relevant in the regulation of cellular genes have already been considered by McClintock (1956). Whether transposons became regulatory modules and pacemakers of wild-type genes in the course of evolution has not yet been confirmed. It has been pointed out that none of the genuine transposable elements analysed so far is an integral part of wild-type genes (Schwarz-Sommer and Saedler 1987). Recently, however, retrotransposons of the micropia family have been identified as natural constituents of fertility genes of Drosophila hydei (Huijser et al. 1988). Micropia element transcripts are part of the giant Y chromosomal transcription units of the lampbrush loops "Threads" and "Pseudonucleolus" in primary spermatocytes. To approach an understanding of the function of micropia elements within these wild-type fertility genes we compared, by sequence comparison, the evolutionary changes of two micropia elements microdissected from the lampbrush loops Threads of D. hydei (Hennig et al. 1983; Huijser et al. 1988) and of two randomly chosen micropia elements from Drosophila melanogaster (Lankenau et al. 1988; D.-H. Lankenau, unpublished results). Under certain conditions, as outlined in this paper, the evolutionary differences and similarities can serve as experiments of nature to identify putative functional sequences, because functional sequences are expected to be conserved. Of particular interest is the 3' tandem repeat region. We identified sequences of 10 bp within each repeat unit of all micropia elements which are highly conserved. They are therefore candidates for functional importance either in transposition events or in regulatory activity on flanking DNA sequences. Additionally, molecular drive events which have taken place in members of the micropia family are described.

Materials and methods

Molecular techniques. Isolation of nucleic acids was carried out according to standard protocols (Maniatis et al. 1982). DNA blotting, labelling by nick translation and hybridization are described by Hennig et al. (1982) and Huijser and Hennig (1987). DNA sequencing was performed by the dideoxy chain termination method of Sanger et al. (1977) as described (Lankenau et al. 1988).

Computer analysis. The analysis of DNA sequences was performed with the aid of computer programs from Pustell and Kafatos (1984) and a Turbo-Pascal program package from C.R. Lankenau. Dot matrices were computed as described (Pustell and Kafatos 1982). Codon bias and coding prediction analysis was done with the Cstatistics program of Pustell and Kafatos (1986). DNA sequence data were taken from GenBank release 52.0. RNY-rhythm analysis was done according to Shepherd (1981). CAP site-TATA box correlation analysis was used to produce constraint profiles of RNA polymerase II (PolII) promoters as described (Lankenau et al. 1988, 1989). The retroposon inserted into micropia-Dm2 was screened against the EMBL sequence data library with the aid of the search program FASTN (Lipman and Pearson 1985).
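As an illustration of what a dot matrix comparison does, the following minimal sketch marks every pair of positions at which two sequences agree over a short window. It is a generic windowed dot plot written for this purpose, not the Pustell and Kafatos program; the function name, the `window` and `min_matches` parameters and the toy input sequence are all assumptions made for the example.

```python
def dot_matrix(seq_a, seq_b, window=5, min_matches=4):
    """Return (i, j) pairs where the windows starting at seq_a[i] and
    seq_b[j] agree in at least `min_matches` of `window` positions."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                hits.append((i, j))
    return hits


# Toy input (hypothetical, not taken from the micropia sequences):
# comparing a sequence with itself shows the main diagonal plus
# off-diagonal dots wherever an internal repeat occurs.
toy = "ACGTACGT"
hits = dot_matrix(toy, toy, window=4, min_matches=4)
for i, j in hits:
    print(i, j)
```

Runs of dots parallel to the main diagonal indicate repeated or similar stretches, and breaks or shifts in a diagonal indicate insertions or deletions, which is the kind of signal a comparison of related transposable elements would be read for.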

Results

An overview of rearrangements in micropia elements

Minor modifications. Among the four sequenced micropia elements (Fig. 1) micropia-Dm11 shares most of the properties of an intact retrotransposon (Lankenau et al. 1988). However, this element also has two defects which might destroy its ability to transpose autonomously. The first defect is a deletion of 30 nucleotides in the 5' LTR, destroying the CCAAT box. This deletion, like most others in different micropia elements, is flanked by short direct repeats (Fig. 2). The second defect is a 4 bp deletion within the integrase coding region, causing a reading frame shift that does not exist in the other micropia elements (Fig. 1) (Lankenau et al. 1988). Many other reading frame defects are based on single nucleotide mutations in micropia-DhMiF2, -DhMiF8 and -Dm2 (Huijser et al. 1988). In the 3' non-protein-coding region and also in the coding sequences of the elements we find deletions which may be based on a "slipped mispairing" mechanism (Figs. 2 and 4), as has been suggested to account for some deletions in human globin genes (Efstratiadis et al. 1980). Statistically it is unlikely that short direct repeats (4-8 bp) always occur in the flanking sequences close to the deleted fragments just by chance. Therefore one might suggest that the direct repeats promoted deletions, perhaps by slipped mispairing during DNA replication according to the model of Efstratiadis et al. (1980) (Fig. 2).

[Fig. 1 graphic not reproduced: schematic maps of micropia-Dm11 and micropia-Dm2 (Drosophila melanogaster) and micropia-DhMiF2 and micropia-DhMiF8 (Drosophila hydei), marking LTRs, 5' and 3' pbs, the MHC-similar region, PROT, RT, RNase, INT, ORFs, 3' tandem repeats, the copia and retroposon insertions, and EcoRI (E) and HindIII (H) sites; scale bars 1 kb and 100 bp.]

Fig. 1. Macro-alignment of four micropia elements from Drosophila hydei and Drosophila melanogaster. All positions are exactly defined at the DNA sequence level. A detailed structural analysis of one element, micropia-Dm11, has been given by Lankenau et al. (1988). The most conserved part of all elements includes the protease (PROT) and parts of the reverse transcriptase (RT). The 3' non-protein-coding region, including the 3' tandem repeats (tandem; cf. Fig. 5), is highly diverged (compare Fig. 4). The long terminal repeats (LTRs) of the D. melanogaster micropia-Dm11 and -Dm2 elements are homologous and the LTRs of the D. hydei micropia-DhMiF2 and -DhMiF8 elements are at least partially homologous. In contrast, the LTRs of D. hydei and D. melanogaster share no sequence similarity. LTRs of micropia-Dm11 and -DhMiF8 both possess internal tandemly duplicated sequences (arrowheads) which are not homologous between the two species. The most obvious rearrangements of these elements are (1) a short deletion within the 5' LTR of micropia-Dm11, (2) the insertion of a complete copia element and a retroposon into micropia-Dm2, (3) an abrupt ending at the 3' end of micropia-Dm2, perhaps caused by another insertion of an unidentified transposable element, (4) a large unidentified insertion imprecisely flanked by 19 bp duplications in micropia-DhMiF8 (Huijser et al. 1988), and (5) a shorter 3' end of micropia-DhMiF8 followed by a poly(T/A) tail which might belong to another unidentified retroposon or to DhMiF8 itself. The reading frame shifts shown for D. melanogaster are explained in the text. An alignment gap was introduced because of a shorter 3' non-protein-coding sequence region in micropia-DhMiF8 (no deletion can be observed). Target site duplications are indicated for the copia element and micropia-Dm11. The MHC region is only similar to the major histocompatibility complex genes of mammals, and is most likely not homologous (Lankenau et al. 1988). put putative; ORF open reading frames of micropia-Dm11 and -Dm2; pbs primer binding site; prr purine rich region; f CCHC finger motif of retroelements; t tether; RNase homology to bacterial RNase H; INT integrase; E EcoRI; H HindIII; wavy lines show the sequenced parts of copia inserted into micropia-Dm2; E and H in copia are mapped restriction sites which are consistent with the published sequence of copia (Emori et al. 1985)

[Fig. 2 graphic not reproduced: pairwise sequence alignments across the major deletions (Dm11 5' LTR versus 3' LTR, and Dm11, Dm2 and Mi8 versus Mi2), with the flanking direct repeats and slippages 1, 3 and 4 of Fig. 4 marked.]

Fig. 2. The major deletions in micropia elements. Ten major deletions of 14 to 300 bp were detected. Eight of these are flanked by short direct repeats of between 4 and 8 bp (the other two are flanked only by 2 bp direct repeats, which is not significant). The distance of repeats to the site of deletion does not exceed 12 nucleotides, while the chance of such a duplication is only 4^(-n) (where n is the length of one repeat unit). Since eight of the ten deletions possess direct repeats with a length of 4 bp (five times), 5 bp (twice) and 6 bp (once), which statistically would be expected once in 256, 1024 and 4096 bp respectively, a significant correlation seems to exist between deletions and short direct repeats. This holds even more strongly when it is taken into account that the accuracy of the alignments is reduced by evolutionary divergence. Seven of the ten deletions remove one of the repeats entirely and either none or part of the other repeat. This pattern is very similar to that observed by Farabaugh and Miller (1978) and Efstratiadis et al. (1980). These authors point out that the presence of direct repeats could promote deletions by slipped mispairing during DNA replication according to a model proposed by Streisinger et al. (1966). Compare also with the slippages shown in Figure 4. The deletion of Dm11-5'LTR has been confirmed by sequencing several different clones originating from different transformations. Mi2 and Mi8 micropia-DhMiF2 and -DhMiF8 respectively; Dm11 and Dm2 micropia-Dm11 and -Dm2; numbers within dotted lines indicate the number of nucleotides not shown. T2 tandem repeat cluster 2 of micropia-DhMiF2. The numbers in front of each sequence represent the published sequence numbers. For micropia-Dm2 the numbers refer to the element without the copia insertion
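The expectation quoted in the legend can be checked with a few lines of arithmetic. The sketch below assumes, as the legend implicitly does, that the four bases are equally frequent and independent; the function names and the 12-position window calculation are illustrative additions and are not part of the original analysis.

```python
from fractions import Fraction


def chance_match(n: int) -> Fraction:
    """Chance that one n-nucleotide stretch equals another at a fixed
    position, assuming equally frequent, independent bases (4^-n)."""
    return Fraction(1, 4 ** n)


def chance_within_window(n: int, window: int = 12) -> float:
    """Chance of at least one match among `window` candidate start
    positions (a simplification: positions are treated as independent)."""
    return 1.0 - float(1 - chance_match(n)) ** window


# Repeat length -> number of deletions, as listed in the Fig. 2 legend.
observed = {4: 5, 5: 2, 6: 1}

for n, count in observed.items():
    p = chance_match(n)
    print(f"{n} bp repeat, seen {count}x: 1 in {p.denominator}; "
          f"within 12 nt: {chance_within_window(n):.4f}")
```

Under these assumptions even the shortest (4 bp) repeat would turn up within 12 nucleotides of a given site only a few percent of the time, which is why finding such repeats at eight of ten deletions is read as more than coincidence.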

Large insertions: retroposon, copia, and "unidentified" insertion. The frame shift within the protease region of micropia-Dm2 is caused by an insertion 90 nucleotides in length (Fig. 3) flanked by an 8 or 6 nucleotide target site duplication. The inserted sequence possesses two open reading frames extending to the end of the insertion and two polyadenylation signals. Since the insertion carries a poly(A) tail sequence at the 3' end, as typically added in a polyadenylation reaction to RNA transcripts, all characteristics fit different insertion models for retroposons, like the "in situ cDNA synthesis" model (Rogers 1985) or the mechanism proposed for Alu sequences by Jagadeeswaran et al. (1981).

Another large insertion into the RNase/integrase region of micropia-DhMiF8 has been described by Huijser et al. (1988) (Fig. 1). The exact ends of this insertion cannot be identified even though it is characterized by a duplication of 10 bp. It might carry a DNA sequence derived from a prior site of insertion, as does the jockey element near the yellow gene (Geyer et al. 1988). One of the 10 bp duplications created as a consequence of the insertion of jockey is not immediately adjacent to its insertion site but at a distance of 25 bp. The DNA between the duplication and the jockey element seems to originate from the chromosomal location where jockey was located before the transposition event.

The largest insertion found in micropia occurred within the region similar to MHC of micropia-Dm2. Here, a complete copia element is integrated with the target site duplication 5'-AGCAA-copia-AGCAA-3'. We sequenced parts (1.4 kb) of the copia element and found no differences at positions 1-70, 2768-3220, 4146-4650 and 4768-5143 from the published sequence (Emori et al. 1985). In the regions that had not been sequenced we mapped the EcoRI, HindIII and XmnI sites at the expected positions. Thus it seems that this copia element is not modified at all and may have been actively transposed more recently into micropia-Dm2 (Fig. 1).

[Fig. 3 graphic not reproduced in full: the insertion site in micropia-Dm2 aligned with the uninterrupted sequence of micropia-Dm11, with the ORFs, the poly A signals and the target site duplication (TTGTTC, or AATTGTTC) marked. Recoverable parts of the inserted sequence:
TTTTATATGTTAATTGCGCTGTTATGTTACTGTTACTGCATTGTATTGATTCATCGC
TTCTAAATAAATAAATATATAAAAAAAAAAAA]

Fig. 3. Retroposon insertion into micropia-Dm2. Two putative polyadenylation signals are located unusually close to the poly(A) tail. No significant homologies were detected in the EMBL DNA sequence library, searching 22 × 10^6 nucleotides. poly A polyadenylation signal; ORF open reading frame
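The statement that two putative polyadenylation signals lie close to the poly(A) tail can be illustrated with a short motif search. The sketch below assumes the canonical AATAAA hexamer as the signal and uses the 3' end of the inserted sequence as it reads in the Fig. 3 text of this copy, which may contain transcription errors; the helper names and the minimum tail length of 8 are choices made for the example.

```python
import re

# 3' end of the retroposon insertion as read from Fig. 3 (assumption:
# the extracted text reproduces the printed sequence faithfully).
insert_3prime = "TTCTAAATAAATAAATATATAAAAAAAAAAAA"


def find_motif(seq, motif="AATAAA"):
    """0-based start positions of all occurrences of `motif`,
    including overlapping ones (found via a lookahead)."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def poly_a_tail_start(seq, min_len=8):
    """Start of a terminal run of at least `min_len` A residues,
    or None if the sequence does not end in such a run."""
    m = re.search(r"A{%d,}$" % min_len, seq)
    return m.start() if m else None


signals = find_motif(insert_3prime)
tail = poly_a_tail_start(insert_3prime)
for start in signals:
    # Distance from the end of the hexamer to the first A of the tail.
    print(f"AATAAA at {start}, {tail - (start + 6)} nt upstream of the tail")
```

On this input the search reports two overlapping AATAAA hexamers within a few nucleotides of the terminal A run, in line with the legend's remark that the signals sit unusually close to the tail.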

Identification of conserved sequences in the 3' non-protein-coding region including the 3' tandem repeats of micropia elements

The pattern of divergence in micropia 3' non-protein-coding region in D. hydei and D. melanogaster. Micropia elements in both D. melanogaster and D. hydei possess a non-protein-coding region between the 3' end of the large open reading frame and the 3' LTR. In micropia-Dm11, -Dm2 and -DhMiF2 this region is about 550 bp long, while in micropia-DhMiF8 it spans only 180 nucleotides. While it has been possible to assign well-known functions to all other regions of micropia elements (Lankenau et al. 1988), the highly conserved 3' tandem repeats within this region represent a new feature of retroelements. To assess possible constraints against mutations we carried out a comparative sequence analysis (Fig. 4). Such an analysis is dependent on a proper alignment of the sequences. When sequences are compared that have been constrained by clear functional pressure during evolution (e.g. RNA- or protein-coding sequences), a sufficient number of homologies and invariances distributed along the entire sequence almost always allows an assignment of positions (Eigen et al. 1985). A priori this does not hold true for non-coding regions because the sequences might have diverged to complete randomization. Other problems are caused by molecular drive events, modifying the sequences to such an extent that alignments are difficult or impossible. Figure 4 shows an alignment of the non-protein-coding region of the three micropia elements, -DhMiF8, -DhMiF2 and -Dm11. Some sequence blocks within the alignment (Fig. 4) are better conserved than others. The overall similarity in different stretches of the compared sequences ranges between the lowest K_min value (54.3%) and the highest K_max value (62.9%) (Miyata 1982; see Fig. 4 legend). Two long sequence blocks are nearly identical in D. melanogaster and D. hydei: 5'-TXGAACTGAATAT-3' (DhMiF2 position 4524; micropia-Dm11 position 4742) and the (+) strand primer binding site region 5'-TTACAXGAGGACGTGXXAAXGTCAGXATGGCCG-3' (DhMiF2 position 4690 and micropia-Dm11 position 4925); these might represent functional islands (Fig. 4). However, we cannot exclude that these conserved blocks simply have not yet diverged to total randomization, just by chance.

[Fig. 4 graphic not reproduced: alignment of the 3' non-protein-coding regions of micropia-DhMiF8 (Mi8), -DhMiF2 (Mi2) and -Dm11, with tandem-1, tandem-2, slippages 1-4, the purine rich region (prr) and the start of the 3' LTR marked. The alignment ends with the (+) strand primer binding site region:
Mi2   GAGGACGTGTCAAGGTCAGGATGGCCG
Dm11  GAGGACGTG-AAATGTCAGAATGGCCG]

Fig. 4. Alignment of non-protein-coding and 3' tandem region. This alignment is based on alignments of the 3' end of long ORFs from the integrase-coding region. These alignments (compare Lankenau et al. 1988) define the places of the first tandem repeat unit within the non-coding region (Fig. 1). From here, the alignment can be extended towards the 3' LTRs. Good positional assignment of sequences can be achieved if we take into account that deletions of DNA sequences might often have occurred by slippage replication (Efstratiadis et al. 1980; Fig. 2 this paper). Therefore we should find shorter duplications close to many larger gaps within the alignment. Such duplications in turn argue for a correct alignment. Even though we have to account for modifications within the ancestral duplications that "catalysed" the deletion, we can indeed identify such duplications in every large gap of this alignment (slippages 1-4 in Figs. 2 and 4; duplications flanking the deletions are underlined). The primer binding site is marked.

Micropia 3' tandem repeats are conserved and therefore seem to be functional. Significant evidence for functional

sequences has only been obtained for the 3' tandem repeats (Fig. 5). While the micropia alignments in Figure 4 are only based on 4 sets of sequences, in the case of the 3' tandem repeats we can work with an extended set of data (25 repeat units, including unpublished cDNA sequences). An alignment of a representative number of tandem repeats is given (Fig. 5). From this we can appraise under what constraints evolution has "worked" on these sequences. Comparisons of the repeat segments are possible at four distinct levels which we shall consider in the following paragraphs: (level 1) within one micropia element, (level 2) between elements of one species, and (level 3) between the species D. melanogaster and D. hydei. The fourth level is a result of alignment (Fig. 5) and describes, for example, the distribution of variability within one segment. The major result of the sequence alignments is the identification of two highly conserved sequence blocks of 10 bp and 4 bp within each tandem unit. The 10 bp long sequence especially is 100% conserved within every single micropia element (level 1), in elements of the same species (level 2) and even between the two species D. melanogaster and D. hydei (level 3). In addition this sequence is found in transcripts of D. melanogaster (D.-H. Lankenau, unpublished results) (levels 1 and 2). The significance of the conservation pattern becomes clear if one compares this high degree of conservation with the high degree of variability in the flanking sequences within the tandem unit itself (Fig. 5, level 4) or with the other 3' non-coding sequences outside of the tandem repeat clusters (Fig. 4).

[Fig. 5 graphic not reproduced: positional alignment of tandem repeat segments (S-1 to S-n) from clusters (T1, T2) of micropia-Dm2, -Dm11, -DhMiF2 (Mi2) and -DhMiF8 (Mi8), arranged in highly conserved, variable, conserved and highly variable columns, with LINE1 (L1) and 1.672-453 sequences aligned at the bottom.]

Fig. 5. Positional alignment of micropia 3' tandem repeats. Tandem repeats are shown reversed complementary to the published micropia sequences. The analysis shows that there is a highly conserved region-1, a variable region-1, a conserved region-2 and a highly variable region-2. These different sequence blocks might have evolved in different fashions; it is most likely that the conserved regions reflect functional pressures acting on them. Short sequence blocks (here represented by differently marked blocks of nucleotides) may be well "conserved" within the variable regions. But the pattern of occurrence of these sequence blocks within one micropia element, within one species or between Drosophila hydei and Drosophila melanogaster resembles a random shuffling of playing cards. For further explanations see text. S segment; T1 tandem 1 in a cluster; Mi2, Mi8, Dm2, Dm11 tandem repeats from four micropia elements; L1 LINE1 (Wincker et al. 1987); 1.672-453 is a moderately repeated DNA adjacent to simple satellite DNA, namely (AACAATA)68... of D. melanogaster (Lohe and Brutlag 1987). A insertion within 1.672-453. * positions identical with at least one micropia element. Filled and outlined stars represent species-specific nucleotides

The high degree of conservation of the tandem repeat makes its functional importance very likely. This assumption is supported by the fact that the tandem repeats are conserved between the two Drosophila species (level 3) while not even the LTRs of micropia elements, with well-established functions, possess any interspecific similarity (level 3, Lankenau et al. 1989). Also, less conserved sequences within the tandem units possess interesting patterns of divergence which we call "positional nucleotide shuffling". The higher degree of conservation of the variable region-2 between the D. melanogaster micropia elements (level 2) could be the result of molecular drive mechanisms such as unequal crossing over, which is a typical homogenization mechanism within tandem arrays (Dover et al. 1982). A strong argument against unequal crossing over is, however, the high differential divergence of the two elements derived from D. hydei (micropia-DhMiF2, level 1; micropia-DhMiF2 and -DhMiF8, level 2).

The patterns of divergence are shown in Figure 5 as a co-ordinated alignment, taking into account the four levels of comparison. The vertical axis represents tandem segments (S-1 through S-n) belonging to tandem clusters Tn of micropia elements -Dm2, -Dm11, -DhMiF2, and -DhMiF8. The horizontal axis defines four regions: highly conserved-1, conserved-2, variable-1, and highly variable-2. There is more freedom for mutations in variable positions of the tandem repeats on level 3 as well as on level 4 compared with coding sequences. But certain nucleotide constellations are preferred and relatively stable (wavy underlining and boxes in Fig. 5). Sometimes sequence motifs may disappear, like 5'-CART-3' in Mi2T2 (wavy underlining in other clusters, Fig. 5), and consequently the tandem clusters become shorter. Another sequence motif may "arise" which can be aligned to another position and may represent a very small functional constraint (Fig. 5, boxed 5'-CTCA-3' in Mi8T1). The length of tandem units may indeed also play a functional role, since the number of nucleotide residues within one tandem repeat segment is always of the same order of magnitude as in the other segment members of the cluster (level 1). Additionally, only minor length variations are found between clusters of different species as well as within a species (levels 3 and 2, respectively). The only species-specific nucleotide (level 2/3) is located within the variable-1 region (marked by filled and outlined stars in Fig. 5).

Another indication of functional importance is the similarity of the micropia tandem repeats to the 66 nucleotide tandem repeats of the 3' part of LINE1 elements (Wincker et al. 1987). Twenty nucleotides from a LINE1 tandem repeat unit can be aligned to the micropia tandem repeats (Fig. 5, bottom) (Lankenau et al. 1988). Another, much less obvious similarity exists with the moderately repeated DNA 1.672-453 which was found adjacent to simple satellite DNA of D. melanogaster (Fig. 5, bottom) (Lohe and Brutlag 1987). An explanation of the observed conservation patterns is that specific protein-DNA interactions with the micropia non-coding region have played a role in creating selective constraints. One can speculate whether the conserved sequences have some function either in the regulation of the transposition activities of micropia or, alternatively, in protein-DNA interactions influencing chromatin regions outside the micropia element as described for other retroelements (cf. Parkhurst et al. 1988).
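How a conserved block is read off an alignment can be sketched with the (+) strand primer binding site region, one of the two "functional islands" of Fig. 4. The two aligned strings below are taken from the Fig. 4 alignment as it reads in this copy and are assumed to be transcribed correctly; marking every differing column with X follows the notation used in the text, while the plain percent identity computed here is a simple illustration and is not the K_min/K_max measure of Miyata (1982).

```python
# Aligned (+) strand primer binding site region, '-' marks an alignment gap
# (assumption: strings as read from the Fig. 4 alignment).
mi2 = "GAGGACGTGTCAAGGTCAGGATGGCCG"   # micropia-DhMiF2
dm11 = "GAGGACGTG-AAATGTCAGAATGGCCG"  # micropia-Dm11


def consensus(a, b, variable="X"):
    """Column-wise consensus of two aligned sequences; any column in
    which the sequences differ (including gaps) is written as X."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(x if x == y else variable for x, y in zip(a, b))


def percent_identity(a, b):
    """Identical, non-gap columns as a percentage of alignment length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    same = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * same / len(a)


print(consensus(mi2, dm11))
print(f"{percent_identity(mi2, dm11):.1f}% identical columns")
```

The consensus produced this way corresponds to the 3' part of the block quoted in the text (GAGGACGTGXXAAXGTCAGXATGGCCG), and the identity over these 27 columns is well above the 54-63% range reported for the region as a whole, which is what makes the block stand out.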

Discussion

Micropia elements were discovered as constituents of the D. hydei Y chromosomal wild-type fertility genes Threads and Pseudonucleolus. From ultrastructural data it is believed that these lampbrush loops represent large transcription units with transcript lengths of 500 to 1000 kb, and larger than 1000 kb, respectively (deLoos et al. 1984; Grond et al. 1983, 1984; Grond 1984). Micropia element sequences are found to be transcribed on the Threads and Pseudonucleolus (Huijser et al. 1988). There are two ways to interpret the hybridization reactions with transcripts in these lampbrush loops: (1) The initiation site of the transcription unit of the lampbrush loops is far away from the regulatory sites of any micropia element. The transcription process ignores the micropia regulatory sites and just reads through to a specific RNA termination signal at the end of the loops. A similar mechanism has been described from the transcription pattern of HBV (hepatitis B virus). During circular HBV proliferation the signals for cleavage and polyadenylation are ignored during the first transit past these sites but honoured on the second passage (Ganem and Varmus 1987). (2) The regulatory sites of transcription of micropia are functional and the radioactive signals observed by in situ hybridization are autonomous transcripts of micropia. This interpretation is favoured by the finding of small transcripts in Miller spreading experiments between the large transcripts of Pseudonucleolus or the two types (bush-like and fibrillar) of transcripts on the Threads. Both results may be indicative of secondary initiation sites of transcription within the loops (deLoos et al. 1984; Grond 1984). Both possibilities are likely to represent two simplified alternatives of a much more complex natural situation. It is shown in this paper, and by Huijser et al. (1988) and Lankenau et al. (1988, 1989), that micropia possesses the functional sequences of typical retroelements.
Even though it has been argued that transposable elements may not directly play a major role in cell differentiation processes (Finnegan et al. 1982; Potter et al. 1979), micropia might represent an example of the competition between regulatory and transcription factors of the retroelement itself and of the Threads and Pseudonucleolus transcription units.

A well-studied example is the retrotransposon gypsy. In the y2 mutant, this transposable element inserted 700 bp upstream from the transcription start of the yellow gene, giving rise to the temporal and tissue-specific y2 phenotype. The insertion does not affect the early transcription of this gene but alters expression in the pupa such that adult y2 flies have normal-coloured bristles, whereas the wings and the body cuticle are yellow. This altered differential expression of the yellow gene is not simply the result of insertion of gypsy into sequences necessary for tissue-specific expression or a distancing of these sequences from the yellow promoter but rather is caused by specific sequences located in the 5' untranslated region of gypsy. Revertants with one remaining solo LTR or those where the sequences between both gypsy LTRs have been replaced by another transposable element no longer show the y2 phenotype. The y2 phenotype can be altered (reverted) by the gene product of suppressor of Hairy-wing [su(Hw)], which is a protein with 12 repeats of the Zn finger domain. This su(Hw) protein interacts with 12 copies of a sequence motif of the gypsy elements (Parkhurst and Corces 1986; Kubli 1986; Geyer et al. 1988; Spana et al. 1988; Parkhurst et al. 1988). One might assume that the conserved micropia tandem repeats possess an analogous function. Recently DNA sequences homologous to the protease and reverse transcriptase of gypsy have been identified on the lampbrush loops "Nooses", another wild-type Y chromosomal fertility gene of D. hydei (R. de Graaf, D.-H. Lankenau, P. Vogt and W. Hennig unpublished results). Further research will show if a Zn finger binding site is conserved within this new retrotransposon. It is still unknown whether the transcripts from the Threads and Pseudonucleolus possess exons and if micropia elements are parts of Y chromosomal introns. 
If this holds true, their influence on the transcription chemistry of the flanking exons can be compared to that of the wa mutation where copia is inserted into the second white intron (Gehring and Paro 1980), or to the f~ mutation where gypsy is inserted into the RNA coding region of forked. On the other hand it is also known that gypsy does not affect the expression of other genes located in the vicinity of forked (Parkhurst and Corces 1985). Comparable to the regulative capacity of these insertions, micropia elements (and the recently found gypsy-related retrotransposon in the Nooses) may play a regulatory role within the Drosophila fertility genes. The testis-specific transcription of Y chromosomal micropia elements (Huijser et al. 1988) might compete with the transcription of other Y chromosomal lampbrush loop DNA sequences. Possible candidates for regulatory influences are sequences on the LTRs and the 3' tandem repeats of micropia, especially those regions highly conserved between all known elements. The putative function of these sequences is not necessarily disrupted by the large number of rearrangements described in this paper, as long as they are unaffected themselves. Therefore "defective" micropia elements like DhMiF8 may also contribute to wild-type lampbrush loop function.


Acknowledgements. We are grateful to S. Lankenau and Dr. D. Ribbert for critically reading the manuscript. We thank C.R. Lankenau for writing a package of DNA sequence analysis programs in Turbo-Pascal, and Dr. R. Brand, Dr. J. Hackstein, R. Hochstenbach, H. Kremer, and F. Wang for discussion. Excellent technical support was given by R. Dijkhof, R. de Graaf, D. ten Hacken and W. Janssen. One of us (D.-H.L.) was supported by a Ph.D. fellowship of the Studienstiftung des deutschen Volkes.

References

Dover GA, Brown S, Coen E, Dallas J, Strachan T, Trick M (1982) The dynamics of genome evolution and species differentiation. In: Dover GA, Flavell RB (eds) Genome evolution. Academic Press, London, pp 343-372
Efstratiadis A, Posakony JW, Maniatis T, Lawn RM, O'Connel C, Spritz RA, DeRiel JK, Forget BG, Weissman SM, Slightom JL, Blechl AE, Smithies O, Baralle FE, Shoulders CC, Proudfoot NJ (1980) The structure and evolution of the human beta-globin gene family. Cell 21:653-668
Eigen M, Lindemann B, Winkler-Oswatitsch R, Clarke CH (1985) Pattern analysis of 5S rRNA. Proc Natl Acad Sci USA 82:2432-2441
Emori Y, Shiba T, Kanaya S, Inouye S, Yuki S, Saigo K (1985) The nucleotide sequences of copia and copia-related RNA in Drosophila virus-like particles. Nature 315:773-776
Farabaugh PJ, Miller JH (1978) Genetic studies of the lac repressor VII. On the molecular nature of spontaneous hotspots in the lac I gene of Escherichia coli. J Mol Biol 126:847-863
Finnegan DJ, Will BH, Bayev AA, Bowcock AM, Brown L (1982) Transposable DNA sequences in eucaryotes. In: Dover GA, Flavell RB (eds) Genome evolution. Academic Press, London, pp 29-40
Ganem D, Varmus HE (1987) The molecular biology of the Hepatitis B virus. Annu Rev Biochem 56:651-693
Gehring WJ, Paro R (1980) Isolation of a hybrid plasmid with homologous sequences to a transposing element of D. melanogaster. Cell 19:892-904
Geyer PK, Green MM, Corces VG (1988) Reversion of a gypsy-induced mutation at the yellow (y) locus of Drosophila melanogaster is associated with the insertion of a newly defined transposable element. Proc Natl Acad Sci USA 85:3938-3942
Grond CJ (1984) Spermatogenesis in D. hydei. PhD thesis, University of Nijmegen
Grond CJ, Siegmund J, Hennig W (1983) Visualization of a lampbrush loop-forming fertility gene in Drosophila hydei. Chromosoma 88:50-56
Grond CJ, Rutten RGJ, Hennig W (1984) Ultrastructure of the Y chromosomal lampbrush loops in primary spermatocytes of Drosophila hydei. Chromosoma 89:85-95
Hennig W, Vogt P, Jacob G, Siegmund I (1982) Nucleolus organizer regions in Drosophila species of the repleta group. Chromosoma 87:279-292
Hennig W, Huijser P, Vogt P, Jäckle H, Edström J-E (1983) Molecular cloning of microdissected lampbrush loop DNA sequences of Drosophila hydei. EMBO J 2:1741-1746
Huijser P, Hennig W (1987) Ribosomal DNA-related sequences in a Y chromosomal lampbrush loop of Drosophila hydei. Mol Gen Genet 206:441-451
Huijser P, Kirchhoff C, Lankenau D-H, Hennig W (1988) Retrotransposon-like sequences are expressed in the Y chromosomal lampbrush loops of Drosophila hydei. J Mol Biol 203:689-697
Jagadeeswaran P, Forget BG, Weissman SM (1981) Short interspersed repetitive DNA elements in eucaryotes: transposable DNA elements generated by reverse transcription of RNA PolIII transcripts? Cell 26:141-142
Kubli E (1986) Molecular mechanisms of suppression in Drosophila. Trends Genet 2:204-209
Lankenau D-H, Huijser P, Jansen E, Miedema K, Hennig W (1988) Micropia: a retrotransposon of Drosophila combining structural features of DNA viruses, retroviruses and non-viral transposable elements. J Mol Biol 204:233-246
Lankenau D-H, Huijser P, Hennig W (1989) Characterization of the long terminal repeats of micropia elements microdissected from Y-chromosomal lampbrush loops "Threads" of D. hydei. J Mol Biol 209:493-497
Lipman DJ, Pearson WR (1985) Rapid and sensitive protein similarity searches. Science 227:1435-1441
Lohe AR, Brutlag DL (1987) Adjacent satellite DNA segments in Drosophila structure of junctions. J Mol Biol 194:171-179
deLoos F, Dijkhof R, Grond CJ, Hennig W (1984) Lampbrush chromosome loop-specificity of transcript morphology in spermatocyte nuclei of D. hydei. EMBO J 3:2845-2849
Maniatis T, Fritsch EF, Sambrook J (1982) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
McClintock B (1956) Controlling elements and the gene. Cold Spring Harbor Yearbook 21:197-216
Miyata T (1982) Evolutionary changes and functional constraints in DNA sequences. In: Kimura M (ed) Molecular evolution, protein polymorphism and the neutral theory. Japan Scientific Societies Press, Tokyo/Springer, Berlin Heidelberg New York, pp 233-260
Parkhurst SM, Corces VG (1985) Forked, gypsys, and suppressors in Drosophila. Cell 41:429-437
Parkhurst SM, Corces VG (1986) Interactions among the gypsy transposable element and the yellow and suppressor of Hairy-wing loci in D. melanogaster. Mol Cell Biol 6:47-53
Parkhurst SM, Harrison DA, Remington MP, Spana C, Kelley RL, Coyne RS, Corces VG (1988) The Drosophila su(Hw) gene, which controls the phenotypic effect of the gypsy transposable element, encodes a putative DNA-binding protein. Genes Dev 2:1205-1215
Potter SS, Brorein WJ, Dunsmuir P, Rubin GM (1979) Transposition of elements of the 412, copia and 297 dispersed repeated gene families in Drosophila. Cell 17:415-427
Pustell J, Kafatos FC (1982) A high speed, high capacity homology matrix: zooming through SV40 and polyoma. Nucleic Acids Res 10:4765-4782
Pustell J, Kafatos FC (1984) A convenient and adaptable package of computer programs for DNA and protein sequence management, analysis and homology determination. Nucleic Acids Res 12:643-655
Pustell J, Kafatos FC (1986) A convenient and adaptable microcomputer environment for DNA and protein sequence manipulation and analysis. Nucleic Acids Res 14:479-488
Rogers JH (1985) The origin and evolution of retroposons. Int Rev Cytol 93:187-279
Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-termination inhibitors. Proc Natl Acad Sci USA 74:5463-5467
Schwarz-Sommer Z, Saedler H (1987) Can plant transposable elements generate novel regulatory systems? Mol Gen Genet 209:207-209
Shepherd JCW (1981) Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc Natl Acad Sci USA 78:1596-1600
Spana C, Harrison DA, Corces VG (1988) The D. melanogaster suppressor of Hairy-wing protein binds to specific sequences of the gypsy retrotransposon. Genes Dev 2:1414-1423
Streisinger G, Okada Y, Emrich J, Newton J, Tsugita A, Terzaghi E, Inouye M (1966) Frameshift mutations and the genetic code. Cold Spring Harbor Symp Quant Biol 31:77-84
Wincker P, Jubier-Maurin V, Roizes G (1987) Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies. Nucleic Acids Res 15:8593-8606
