
Vol. 48, No. 1

JOURNAL OF VIROLOGY, Oct. 1983, p. 127-134

0022-538X/83/100127-08$02.00/0 Copyright © 1983, American Society for Microbiology

Polyadenylic Acid Addition Sites in the Adenovirus Type 2 Major Late Transcription Unit

J. M. LE MOULLEC,1 G. AKUSJARVI,2 P. STALHANDSKE,2 U. PETTERSSON,2* B. CHAMBRAUD,3 P. GILARDI,3 M. NASRI,1 AND M. PERRICAUDET1,3

Institut Pasteur, 75724 Paris Cedex 15, France,1 Departments of Medical Genetics and Microbiology, Uppsala University, Biomedical Center, S-751 23 Uppsala, Sweden,2 and Institut de Recherches Scientifiques sur le Cancer, Centre National de la Recherche Scientifique, 94800 Villejuif, France3

Received 11 April 1983/Accepted 20 June 1983

The cytoplasmic mRNAs which are transcribed from the major late adenovirus promoter can be arranged into five 3'-coterminal families, L1 to L5. We have defined the polyadenylation sites of the mRNAs that belong to the five families at the nucleotide level. From the results, the following conclusions can be made. (i) The hexanucleotide sequence AAUAAA is present at the 3' end of all late adenovirus type 2 mRNAs and precedes the site of polyadenylation by 12 to 30 nucleotides. (ii) Between one and three A residues are present in the genomic sequence at the polyadenylation site. (iii) A sequence with the composition (T)n (A)p (T)q (n, p, q ≥ 1) is found 4 to 24 nucleotides beyond all the adenovirus-specific polyadenylation sites except that of the 3'-coterminal family L4. This sequence is also found beyond many cellular polyadenylation sites. (iv) The L1 and L2 polyadenylation sites are very similar in structure. The other polyadenylation sites show no apparent sequence relationship, except for the hexanucleotide sequence.

The lytic replication cycle for adenovirus type 2 (Ad2) in HeLa cells can be divided into an early and a late phase, which are separated by the onset of viral DNA replication. During the early phase, five transcription units, E1A, E1B, E2, E3, and E4, are expressed (see Fig. 1; for a review, see references 11 and 33). Each transcription unit is transcribed into overlapping sets of mRNAs which in most cases share 5' as well as 3' ends (12). However, mRNAs initiated from the promoter in region E2 can be polyadenylated at two positions, thus defining regions E2A and E2B (39), and mRNAs initiated from the promoter in region E3 can be polyadenylated at three different positions (1, 12, 32, 37a).

Late after infection, transcription from the major late promoter, located at 16.5 map units, is considerably enhanced (15, 47). A 28-kilobase-pair-long nuclear RNA precursor, which is processed into at least 16 different mRNAs, is transcribed from this promoter (for a review, see reference 11). These mRNAs can be arranged into five 3'-coterminal families, L1 to L5, each family comprising mRNAs with common 3' ends. The five polyadenylate [poly(A)] addition sites have been mapped to coordinates 38.5, 49, 62, 78, and 92 on the Ad2 genome by electron microscopy (12), hybridization (28), S1 endonuclease cleavage (9), and RNA fingerprint analysis (14).

It is now well established that the major late adenovirus promoter is also active during the early phase of infection (12, 22, 37, 47). However, both the poly(A) addition site selection and the mode of RNA splicing are different at early and late times after infection (2, 27). Thus, it seems that polyadenylation may play a key role in adenovirus gene regulation. Polyadenylation normally occurs before the transcript is spliced (28, 45), and the poly(A) tail is thought to be essential for the stability of the mRNA (46).

To study the signals which regulate the polyadenylation of late adenovirus mRNAs, we cloned double-stranded cDNA copies of late Ad2 mRNAs and determined the nucleotide sequences at the junction between the poly(A) tail and the mRNA body. We also compared these sequences with previously determined sequences of polyadenylation sites for mRNAs which are synthesized at early and intermediate times after an adenovirus infection.

MATERIALS AND METHODS

Construction of the cDNA library. Methods for mRNA preparation and cDNA synthesis have been previously described (6, 29). In short, total cytoplasmic RNA, isolated 40 h postinfection, was fractionated by chromatography on oligodeoxythymidylate cellulose. Usually, 10 µg of poly(A)-containing RNA was mixed with 1 µg of oligodeoxythymidylate12-18 and used for the synthesis of double-stranded cDNA. The double-stranded cDNA was tailed with deoxycytidine and annealed to plasmid pBR322, which had been tailed with deoxyguanosine at the PstI cleavage site. The hybridization mixture was used to transform Escherichia coli DP50, and colonies were selected for tetracycline resistance.

Colony hybridization. The method of Grunstein and Hogness (16) was used. Restriction fragments of Ad2 DNA (38) were 32P labeled in vitro by nick translation (35). 32P-labeled polydeoxythymidylate, used to select cDNA clones containing a poly(A) tract, was obtained by enzymatic extension of oligodeoxythymidylate12-18 with terminal transferase and 32P-labeled thymidine triphosphate (104 mCi/mmol).

DNA sequence analysis. Plasmid DNA was purified as described by Alestrom et al. (6), and sequence analysis was carried out by the method of Maxam and Gilbert (26).

FIG. 1. Different mRNA species expressed during a lytic infection with Ad2. E1A, E1B, E2, E3, and E4 represent the different early transcription units; IX and IVa2 represent genes which are expressed at intermediate times after infection; and L1 to L5 represent the five families of late coterminal mRNAs. Positions of restriction fragments used for colony hybridization are also shown.
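
The map coordinates used in the text and in Fig. 1 are percentages of the genome length, so they can be converted to approximate nucleotide positions. The following is a minimal sketch of that conversion. The genome length of 35,937 bp is an assumption taken from the later complete Ad2 sequence, not from this paper, and the function name `map_units_to_bp` is purely illustrative.

```python
# Assumption: Ad2 genome length in base pairs (not stated in this paper).
AD2_GENOME_BP = 35_937

# Coordinates quoted in the Introduction, in map units (percent of genome).
MAJOR_LATE_PROMOTER = 16.5
LATE_POLY_A_SITES = {"L1": 38.5, "L2": 49, "L3": 62, "L4": 78, "L5": 92}


def map_units_to_bp(map_units: float, genome_bp: int = AD2_GENOME_BP) -> int:
    """Convert a map coordinate (0 to 100) to an approximate position in bp."""
    if not 0 <= map_units <= 100:
        raise ValueError("map units must lie between 0 and 100")
    return round(map_units / 100 * genome_bp)


# Approximate distance from the major late promoter to each poly(A) site.
promoter_bp = map_units_to_bp(MAJOR_LATE_PROMOTER)
for family, coordinate in LATE_POLY_A_SITES.items():
    site_bp = map_units_to_bp(coordinate)
    print(family, site_bp, site_bp - promoter_bp)
```

Because the mapped coordinates are themselves rounded, the resulting positions are only estimates and should not be compared directly with the nucleotide numbering of the individual sequence references.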

RESULTS

Construction and screening of the cDNA library. A cDNA library containing late Ad2 mRNA sequences was constructed, and cDNA clones corresponding to mRNAs from the different late 3'-cotermination families were identified by colony hybridization, with 32P-labeled Ad2 restriction fragments as DNA probes. Restriction fragments were chosen so that they would hybridize preferentially with the 3' parts of the L1 to L5 mRNAs (Fig. 1). Positive clones were subsequently screened with a 32P-labeled polydeoxythymidylate probe to select plasmids containing a poly(A) tail. Several clones giving positive signals with both probes were isolated and further characterized by restriction endonuclease cleavage and DNA sequence analysis. The PstI cleavage site which was regenerated by the tailing reaction was used for the DNA sequence analysis (Fig. 2). The poly(A) site of only one member within each 3'-cotermination family was determined, since we expected that all members within the respective families use the same polyadenylation site.

FIG. 2. Structures of the four cDNA clones used to define the poly(A) addition sites of the late L1 to L5 mRNAs: L1 (3d-1-3), HindIII-J; L3 (1d-4-5), HindIII-A; L4 (2d-5-5), HindIII-H; and L5 (11-8-4), HindIII-F. The length of the cDNA clones relative to the restriction fragments used for the colony hybridization is indicated by a thick arrow. A, Positions at which the poly(A) tracts are found in the clones. The cDNA inserts are flanked by PstI cleavage sites generated by the cloning procedure.

The poly(A) addition site for the L1 family. A clone, designated 3d-1-3, was selected by hybridization to restriction fragment HindIII-J (coordinates 37.3 to 41.0). It has a cDNA insert which is about 600 base pairs (bp) long and contains cleavage sites for restriction endonucleases HindIII and PstI (Fig. 2). Since the DNA sequence of this region of the Ad2 genome was unknown, we sequenced a 523-bp genomic DNA segment located immediately to the right of the HindIII cleavage site at position 37.3 (Fig. 3). The DNA sequence revealed one AATAAA hexanucleotide sequence, and a comparison between the cDNA sequence and the genomic DNA sequence showed that the poly(A) tract is added 15 to 16 nucleotides downstream from the hexanucleotide sequence (Fig. 3, positions 404 to 405). The precise position for the poly(A) addition cannot be identified, since one A residue is present at the junction between the mRNA body and the poly(A) tail. It is noteworthy that S1 endonuclease analysis reveals a second poly(A) addition site for the L1 mRNAs. Approximately 50% of the L1 mRNAs appear to terminate 5 bp downstream of the identified poly(A) site (Fig. 3, positions 409 to 410) (G. Akusjarvi, unpublished observation). We have not confirmed the existence of the second poly(A) addition site by sequence analysis, since no cDNA clones corresponding to this hypothetical poly(A) site have yet been encountered.

The poly(A) addition site for the L2 family. The polyadenylation site for mRNAs belonging to the L2 family has been previously reported (3). It is located 14 to 15 nucleotides downstream from the hexanucleotide sequence AATAAA (nucleotides 71 to 72 in the sequence of Akusjarvi and Persson [3]) and is followed 30 nucleotides further downstream by the splice acceptor for the mRNA encoding precursor polypeptide VI, which belongs to the L3 family (Fig. 4).

The poly(A) addition site for the L3 family. A clone, designated 1d-4-5, which was selected by hybridization to restriction fragment HindIII-A (coordinates 50.1 to 72.8) (Fig. 1), had an approximately 900-bp-long insert and contained a cleavage site for restriction endonuclease KpnI (Fig. 2). Sequence analysis of this clone revealed that the poly(A) addition site is located 19 to 20 nucleotides downstream of the AATAAA sequence (positions 870 to 871 in the sequence of Akusjarvi et al. [5]).
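
The spacing reported for the L1 site can be reproduced from the printed flanking sequence. The sketch below locates the hexanucleotide in the L1 sequence as printed in Fig. 5 and counts the nucleotides lying between it and the poly(A) tract. Two things are assumptions rather than statements from the paper: the junction A residue is taken to be at 0-based index 36 of the printed sequence (our reading of the "one A residue at the junction"), and "n nucleotides downstream" is interpreted as the number of intervening nucleotides. The helper names are illustrative only.

```python
# L1 flanking sequence as printed in Fig. 5 (79 nucleotides, DNA sense strand).
L1_FLANK = (
    "AAAAGCATGATGCAAAATAAAAAACTCACCAAGGCCATGG"
    "CACCGAGCGTTGGTTTTCTTGTATTCCCCTTAGTATGCA"
)

# Assumption: 0-based index of the single A residue at the mRNA/poly(A) junction.
JUNCTION_A = 36


def find_all(seq: str, motif: str) -> list[int]:
    """Return the 0-based start of every occurrence, overlapping ones included."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def intervening(upstream_end: int, downstream_start: int) -> int:
    """Count the nucleotides lying strictly between two 0-based positions."""
    return downstream_start - upstream_end - 1


hexamer_starts = find_all(L1_FLANK, "AATAAA")
hexamer_end = hexamer_starts[0] + len("AATAAA") - 1

# The tail may begin at the junction A or immediately after it, which is why
# the text quotes a range rather than a single value.
spacing_range = (
    intervening(hexamer_end, JUNCTION_A),
    intervening(hexamer_end, JUNCTION_A + 1),
)
print(hexamer_starts, spacing_range)
```

Under these assumptions the two possible tail start positions give the two ends of the quoted range, which illustrates why a genomic A at the junction prevents the site from being assigned to a single nucleotide.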

The poly(A) addition site for the L4 family. A clone, designated 2d-5-5, was selected by hybridization to restriction fragment HindIII-H (coordinates 72.8 to 79.9). The cDNA insert was approximately 800 bp long and contained a cleavage site for restriction endonuclease BglII (Fig. 2). Sequence analysis of this clone revealed that polyadenylation occurs 12 to 13 nucleotides downstream from the hexanucleotide sequence AATAAA (nucleotides 850 and 851 in the sequence of Herisse et al. [17]).

The poly(A) addition site for the L5 family. A clone, designated 11-8-4, which was selected by hybridization to restriction fragment HindIII-F (coordinates 89.5 to 97.1), had an approximately 900-bp-long cDNA insert and contained a cleavage site for endonuclease BglII. The polyadenylation site is located 18 to 19 nucleotides downstream from the hexanucleotide sequence AATAAA (nucleotides 5426 and 5427 in the sequence of Herisse et al. [19]).

FIG. 3. Nucleotide sequences surrounding the poly(A) addition site of the L1 family mRNAs. A potential splice acceptor for an L2 family mRNA is also shown.

5'-AGGGATGTGCCC GGCCCGCGCCCG CCCACCCGTCGT CAAAGGCACGAC CGTCAGCGGGGT CTGGTGTGGGAG GACGATGACTCG
   ArgAspValPro GlyProArgPro ProThrArgArg GlnArgHisAsp ArgGlnArgGly LeuValTrpGlu AspAspAspSer
   GCAGACGACAGC AGCGTCCTAGAT TTGGGAGGGAGT GGCAACCCGTTT GCGCACCTTCGC CCCAGGCTGGGG AGAATGTTTTAA
   AlaAspAspSer SerValLeuAsp LeuGlyGlySer GlyAsnProPhe AlaHisLeuArg ProArgLeuGly ArgMetPhe***
   AAAAAAAAAAAA AAAAAGCATGAT GCAAAATAAAAA ACTCACCAAGGC CATGGCACCGAG CGTTGGTTTTCT TGTATTCCCCTT
   AGTATGCAGCGC GCGGCGATGTAT GAGGAAGGTCCT CCTCCCTCCTAC GAGAGCGTGGTG AGCGCGGCGCCA GTGGCGGCGGCG-3'
      MetGlnArg AlaAlaMetTyr GluGluGlyPro ProProSerTyr GluSerValVal SerAlaAlaPro ValAlaAlaAla

DISCUSSION

The poly(A) addition sites in the major late transcription unit: structural and functional implications. The positions of the poly(A) addition sites for the late Ad2 mRNAs belonging to the 3'-cotermination families L1 to L5 were determined. The hexanucleotide sequence AATAAA precedes each poly(A) site by 12 to 30 nucleotides (Fig. 5). This sequence has been found to be present at the 3' termini of most cellular mRNAs (34) and is thought to play a critical role, either as a termination signal for transcription or as a signal for cleavage and polyadenylation of the primary transcript. The poly(A) addition sites which were determined in the present study correspond to mRNAs which are transcribed within a single transcription unit. Since the 3' ends of these mRNAs are generated by endonucleolytic cleavage of the nuclear precursor RNA (28), followed by polyadenylation and splicing, it seems unlikely that the hexanucleotide sequence serves as a signal for termination of transcription. Rather, the termination signal for transcription, if it exists, is located close to the right-hand end of the viral DNA (15). The hexanucleotide sequence must instead be considered as a possible signal for a concerted process by which the nuclear RNA is cleaved and polyadenylated.

(i) L1 family. The polyadenylation site for the L1 mRNAs is interesting with regard to the regulation of mRNA abundance. L1 mRNAs are preponderant in the cytoplasm early after infection despite equimolar transcription of each of the L1 to L3 families. Later after infection the relative quantities are reversed, and the L1 mRNAs are present in lower amounts than mRNA sequences from the other four families (2, 27). Thus, the choice of poly(A) site seems to be regulated during the course of the lytic adenovirus replication cycle. It seems likely that one or more cellular proteins might be involved in this control by interacting with signals in the DNA sequence. It is noteworthy that a stretch of 19 A residues precedes the hexanucleotide AATAAA in the L1 family (Fig. 3). This sequence might constitute the basis for such a signal.

(ii) L2 family. The polyadenylation site for mRNAs belonging to the L2 family is followed 30 nucleotides downstream by the splice acceptor site for the pVI mRNA belonging to the L3 cotermination family. A similar gene organization has been suggested for the boundary between the L1 and L2 families by electron microscope studies (Fig. 1). Therefore, we compared the nucleotide sequences which surround the poly(A) addition sites for the mRNAs belonging to the L1 and L2 families. Interestingly, the nucleotide sequences of these two regions were found to be closely related (Fig. 4), and we would like to propose that an acceptor site for splicing of an L2 mRNA follows the AG dinucleotide at position 442 (Fig. 3). The existence of an acceptor site for splicing at position 442 is supported by S1 endonuclease mapping studies (G. Akusjarvi, unpublished data). The dinucleotide AG, which is located 36 to 37 nucleotides downstream from the L1 poly(A) addition site, is followed by a tentative initiation codon for translation. This ATG triplet is located at the beginning of an open translational reading frame and is likely to specify the N terminus of virion polypeptide III, encoded by the L2 region (3) (Fig. 3). Since the homology between the L1 and L2 polyadenylation sites is strikingly high, particularly if substitutions between pyrimidine residues are disregarded (Fig. 4), it appears likely that the virus has condensed the processing signals at the L1/L2 and L2/L3 boundaries in a similar fashion, possibly by a duplication event.

FIG. 4. Comparison of the sequences which surround the poly(A) addition sites for the L1 and L2 mRNAs. P1, Poly(A) addition site for L1 mRNAs; P2, poly(A) addition site for L2 mRNAs; A1, proposed acceptor site for splicing of an L2 mRNA; A'1, acceptor site for splicing of the L3 mRNA encoding the pVI polypeptide; ATG, initiation codons in L2 and L3 mRNAs which are followed by long open reading frames.

   L1  CATGGCACCGAGCGTTGGTTTTCTTGTA-TTCCCCT-TAGTATGCAG
   L2  CA-CGCTC---GC-TTGGT---CCTGTAACTATTTTGTAGAATGGAA

FIG. 5. Comparison of DNA sequences which flank the poly(A) addition sites of 12 different Ad2 mRNAs and two Ad12 mRNAs. The hexanucleotide sequence AATAAA is underlined and the site for polyadenylation is shown by an arrow. A consensus sequence which is deduced from the comparison is also shown. Data from: a, Perricaudet et al. (29); b, van Ormondt et al. (42); c, van Ormondt et al. (43); d, Maat and van Ormondt (24); e, Alestrom et al. (6); f, Maat et al. (23); g, Bos et al. (10); h, Perricaudet et al. (30); i, Akusjarvi et al. (5); j, Stalhandske et al. (37a); k, Herisse and Galibert (18); l, Perricaudet et al. (31); m, Sugisaki et al. (40); n, Virtanen et al. (44); o, Kimura et al. (21); p, Akusjarvi and Persson (3); q, Herisse et al. (17); r, Herisse et al. (19).

   Ad2 E1A (a, b, c, d):     GTTGATGTAAGTTTAATAAAGGGTGAGATAATGTTTAACTTGCATGGCGTGTTAAATGGGGCGGGGCTTAAAGGGTATA
   Ad2 E1B/IX (e, f, g, h):  CATAAATAAAAACCAGACTCTGTTTGGATTTTGATCAAGCAAGTGTCTTGCTGTCTTTATTTAGGGGTTTTGCGCGCGC
   Ad2 E2 (i):               TCACCCGAGAGTGTACAAATAAAAACATTTGCCTTTATTGAAAGTGTCTCCTAGTACATTATTTTTACATGTTTTTCAA
   Ad2 E3-1 (j, k):          AGTATGATTAAATGAGACATGATTCCTCGAGTTCTTATATTATTGACCCTTGTTGCGCTTTTCTGTGCGTGCTCTACAT
   Ad2 E3-2 (j, k):          TAACATAAACACACAATAAATTACTTACTTAAAATCAGTCAGCAAATCTTTGTCCAGCTTATTCAGCATCACCTCCTTT
   Ad2 E4:                   ATTTTCTGCAATTGAAAAATAAACACGTTGAAACATAACATGCAACAGGTTCACGATTCTTTATTCCTGGGCAATGTAG
   Ad2 IVa2:                 AATAAAGACAGCAAGACACTTGCTTGATCAAAATCCAAACAGAGTCTGGTTTTTATTTATGTTTTAAACCGCATTGGGA
   Ad12 E1A (l, m):          TTTGTGAGTCATGTATAATAAAACTGGTTTCGGTTGAAGTGTCTTGTTAATGTTTGTTTGGGCGTGGTTAAACAGGGAT
   Ad12 E1B (g, n, o):       CCAACCTGTAACCCAATAAAGAAAAAACTTAAATTGAGATGGTGTTATGAATCTTTATTGATACTTGTTTTT ------
   Ad2 L1:                   AAAAGCATGATGCAAAATAAAAAACTCACCAAGGCCATGGCACCGAGCGTTGGTTTTCTTGTATTCCCCTTAGTATGCA
   Ad2 L2 (p):               CATGTGGAAAAATCAAAATAAAAAGTCTGGAGTCTCACGCTCGCTTGGTCCTGTAACTATTTTGTAGAATGGAAGACAT
   Ad2 L3 (i):               GAGACACTTTCAATAAAGGCAAATGTTTTTATTTGTACACTCTCGGGTGATTATTTACCCCCCACCCTTGCCGTCTGCG
   Ad2 L4 (q):               CATCTCTGTGCTGAGTATAATAAATACAGAAATTAGAATCTACTGGGGCTCCTGTCGCCATCCTGTGAACGCCACCGTT
   Ad2 L5 (r):               ACATTGCCCAGGAATAAAGAATCGTGAACCTGTTGCATGTTATGTTTCAACGTGTTTATTTTTCAATTGCAGAAAATTT

   Consensus sequence: AATAAA--(N)12-30--(A)m--(N)4-24--(T)n (A)p (T)q (m, n, p, q ≥ 1)

(iii) L3 family. The 3' end of the L3 mRNAs overlaps by 22 nucleotides the 3' end of the mRNAs transcribed from region E2A (5). This type of genetic organization could provide the basis for regulation of transcription termination from the major late adenovirus transcription unit early after infection. It is known that transcription from the major late promoter early after infection proceeds in the rightward direction, toward a position located around 60 map units. Thus, it is possible that the leftward transcripts from the E2 promoter early after infection prevent, by steric hindrance, transcripts from the major late promoter from extending beyond coordinate 60 of the Ad2 genome. Late after infection, when transcription from the major late promoter is enhanced relative to that from the E2 promoter, read-through beyond coordinate 60 is accomplished, and thus transcription of regions L4 and L5 occurs.

(iv) L4 family. The L4 poly(A) site is located inside the first intron of mRNAs belonging to region E3. Since this poly(A) site is not used by polymerases initiated at the E3 promoter, it seems likely that a viral or virally induced protein regulates its use. Alternatively, the L4 poly(A) addition site is also recognized early after infection, yielding short RNA sequences initiated at the E3 promoter, which, however, turn over quickly. From the position of the L4 poly(A) site, it is apparent that the promoter for the E3 region is transcribed into late mRNA. This situation is analogous to that of the polypeptide IX promoter region, which is present in transcripts originating from the E1B promoter (6, 12).

(v) L5 family. Electron microscopy has shown that the mRNAs belonging to the L5 and E4 families terminate very close to each other and form a so-called strand switch region, i.e., transcription of the E4 region proceeds in the direction opposite to that of the L5 region (Fig. 1). To obtain more precise information about the relative positions of these two poly(A) addition sites, we constructed and sequenced a cDNA clone containing the 3' end of an E4 mRNA (M. Perricaudet, A. Virtanen, A. Naslund, and U. Pettersson, manuscript in preparation). Sequencing studies revealed that the poly(A) addition site of the E4 mRNAs is located between nucleotides 5428 and 5430 in the sequence of Herisse et al. (19) and Gingeras et al. (15a). Thus, it appears that the 3' ends of the L5 and E4 mRNAs do not overlap, unlike the other mRNAs in the Ad2 genome with juxtaposed 3' ends, i.e., the polypeptide IX/IVa2 mRNAs (6) and the L3/E2 mRNAs (5). Another hexanucleotide sequence, AAUAAA, is found downstream of the L5 poly(A) site. This sequence does not, however, seem to be efficiently recognized as a signal for polyadenylation, as judged by S1 endonuclease analysis of late cytoplasmic mRNA (G. Akusjarvi, unpublished data).

A comparison among the sequences which flank the five poly(A) sites in the major late transcription unit does not reveal any longer homologous sequences, except for the L1 and L2 families. Thus, it seems difficult to predict from the primary sequence how the choice of poly(A) site is regulated.
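
The statement that the L1 and L2 sites are closely related, particularly if substitutions between pyrimidine residues are disregarded, can be made concrete by scoring the alignment of Fig. 4 column by column. The sketch below is only an illustration: the two gapped strings are our reconstruction of the printed alignment (gap placement as printed, bracketed characters read as C by comparison with the Fig. 5 sequences), and the scoring rule is an assumption about what "disregarding pyrimidine substitutions" means, not the authors' stated procedure.

```python
# Gapped L1 and L2 sequences, reconstructed from the alignment in Fig. 4.
L1_ALIGNED = "CATGGCACCGAGCGTTGGTTTTCTTGTA-TTCCCCT-TAGTATGCAG"
L2_ALIGNED = "CA-CGCTC---GC-TTGGT---CCTGTAACTATTTTGTAGAATGGAA"

PYRIMIDINES = {"C", "T"}


def score_alignment(top: str, bottom: str) -> tuple[int, int, int]:
    """Return (identical, pyrimidine_tolerant, gapped) column counts.

    A column counts as pyrimidine tolerant when the two residues are
    identical or when both are pyrimidines (C or T).
    """
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    identical = tolerant = gapped = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            gapped += 1
        elif a == b:
            identical += 1
            tolerant += 1
        elif a in PYRIMIDINES and b in PYRIMIDINES:
            tolerant += 1
    return identical, tolerant, gapped


identical, tolerant, gapped = score_alignment(L1_ALIGNED, L2_ALIGNED)
ungapped = len(L1_ALIGNED) - gapped
print(f"{identical}/{ungapped} identical, {tolerant}/{ungapped} allowing C/T exchange")
```

Counting gapped columns separately keeps the two ratios comparable; a different gap penalty or alignment would change the numbers, so they should be read as a rough measure of similarity only.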
It is possible that a combination of primary and secondary structures could modulate the selection of poly(A) addition sites. Hairpin structures can be formed around the L3 and L5 polyadenylation sites (Fig. 6).

FIG. 6. Possible secondary structures at the 3' ends of the L3 (A) and L5 (B) mRNAs. The polyadenylation sites are indicated by arrows.

Hexanucleotide AAUAAA as a signal for cleavage and polyadenylation of nuclear RNA. Previous genetic studies have shown that the hexanucleotide sequence AATAAA is part of the signals which govern the polyadenylation reaction. Studies of simian virus 40 deletion mutants which lack sequences between the hexanucleotide and the poly(A) addition site for the late mRNAs have shown that one function of the hexanucleotide is to predetermine the precise position at which the polyadenylation reaction is to take place (13). However, the hexanucleotide is not sufficient by itself to signal polyadenylation, since the AATAAA sequence can sometimes be found at positions at which no 3' ends have been mapped. Examples have been noticed in region E1A of Ad12 (31) and in the r-strand sequence of the L3 cotermination family (P. Alestrom, G. Akusjarvi, and U. Pettersson, manuscript in preparation).

In previous studies we have reported the positions of the poly(A) addition sites for Ad2 mRNAs from regions E1A, E1B, E2, E3, and IVa2, as well as those for the E1A and E1B mRNAs of Ad12 (see legend to Fig. 5). A catalog which includes all polyadenylation sites in the Ad2 genome, except for a minor one in the E3 region, is presented in Fig. 5. The hexanucleotide sequence AATAAA is found adjacent to all the Ad2 poly(A) addition sites, the only exceptions being the two minor polyadenylation sites in region E3, where in one case the related sequence ATTAAA appears to serve a similar function (1, 37a). The ATTAAA sequence has also been found in the rat genes encoding pancreatic amylase (25) and muscle actin (36) and in the anglerfish gene encoding pancreatic somatostatin (20). Both the ATTAAA and AATAAA sequences therefore appear to function as signals for polyadenylation in the nuclear precursor RNA.

Polyadenylation in Ad2 occurs 12 to 30 nucleotides downstream from the hexanucleotide sequence at a position at which one or more A residues are present in the sequence (Fig. 5). Since A residues are present in the genomic DNA sequence at the junction between the poly(A) tract and the transcribed RNA, it is not possible to determine precisely where polyadenylation takes place. This feature has been consistently encountered at the 3' ends of all cellular mRNAs studied so far. For the nucleotide preceding the A residues, there seems to be no preference for purine or pyrimidine residues; however, in many cases, stretches of pyrimidine residues follow the poly(A) site.

In a study comparing nine different poly(A) addition sites of both cellular and viral origin, Benoist et al. (7) proposed that the consensus sequence 5'-TTTTCACTGC-3' precedes most poly(A) addition sites. We investigated whether this consensus sequence or a related sequence is found adjacent to the poly(A) addition sites in the Ad2 genome. We found, however, no apparent homology with this proposed consensus sequence. Rather, we found the sequence (T)n (A)p (T)q (n, p, q ≥ 1) located 4 to 24 nucleotides downstream from the poly(A) addition sites. This consensus sequence is also found in early adenovirus mRNAs and many cellular mRNAs. Based on our results, we would like to propose the following sequence organization at poly(A) addition sites (n, p, q ≥ 1):

   AATAAA --(12 to 30 nucleotides)-- A(1-3) --(4 to 24 nucleotides)-- (T)n (A)p (T)q
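
The downstream element of the proposed organization can be searched for with a regular expression. The sketch below scans the L1 and L4 flanking sequences printed in Fig. 5 for a run of T, then A, then T residues whose first nucleotide lies 4 to 24 nucleotides beyond the poly(A) site. The site indices (0-based position 36 in both printed sequences) are our own reading of the reported hexanucleotide spacings, and counting the distance as the number of intervening nucleotides is likewise an assumption; the function name is illustrative.

```python
import re

# Flanking sequences as printed in Fig. 5 (DNA sense strand).
L1_FLANK = (
    "AAAAGCATGATGCAAAATAAAAAACTCACCAAGGCCATGG"
    "CACCGAGCGTTGGTTTTCTTGTATTCCCCTTAGTATGCA"
)
L4_FLANK = (
    "CATCTCTGTGCTGAGTATAATAAATACAGAAATTAGAATC"
    "TACTGGGGCTCCTGTCGCCATCCTGTGAACGCCACCGTT"
)

# Assumption: 0-based index of the junction A residue in each printed sequence.
L1_SITE = 36
L4_SITE = 36

# (T)n (A)p (T)q with n, p, q >= 1.
TAT_MOTIF = re.compile(r"T+A+T+")


def downstream_motifs(seq, site, min_gap=4, max_gap=24):
    """Return (gap, motif) for each T-A-T run starting min_gap..max_gap nt past site.

    The gap is the number of nucleotides between the site and the motif start.
    """
    hits = []
    for match in TAT_MOTIF.finditer(seq, site + 1):
        gap = match.start() - site - 1
        if min_gap <= gap <= max_gap:
            hits.append((gap, match.group()))
    return hits


print("L1:", downstream_motifs(L1_FLANK, L1_SITE))
print("L4:", downstream_motifs(L4_FLANK, L4_SITE))
```

With these assumptions the L1 sequence yields a motif at the far edge of the window, whereas no such run is found beyond the L4 site within the printed sequence, in line with the exception noted for the L4 family. Shifting the assumed site by one nucleotide moves the L1 motif just outside the window, so the window limits should be treated as approximate.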

LITERATURE CITED

1. Ahmed, C. M., R. Chanda, N. Stow, and B. S. Zain. 1982. The sequence of 3'-termini of mRNAs from early region III of adenovirus 2. Gene 19:297-301.
2. Akusjarvi, G., and H. Persson. 1981. Controls of RNA splicing and termination in the major late adenovirus transcription unit. Nature (London) 292:420-425.
3. Akusjarvi, G., and H. Persson. 1981. Gene and mRNA for precursor polypeptide VI from adenovirus type 2. J. Virol. 38:469-482.
4. Akusjarvi, G., and U. Pettersson. 1979. Sequence analysis of adenovirus DNA. VI. The genomic sequences encoding the common tripartite leader of late adenovirus messenger RNA. J. Mol. Biol. 134:143-158.
5. Akusjarvi, G., J. Zabielski, M. Perricaudet, and U. Pettersson. 1981. The sequence of the 3' non-coding region of the hexon mRNA discloses a novel adenovirus gene. Nucleic Acids Res. 9:1-17.
6. Alestrom, P., G. Akusjarvi, M. Perricaudet, M. Mathews, D. Klessig, and U. Pettersson. 1980. The gene for polypeptide IX of adenovirus type 2 and its unspliced messenger RNA. Cell 19:671-681.
7. Benoist, C., K. O'Hare, R. Breathnach, and P. Chambon. 1980. The ovalbumin gene-sequence of putative control regions. Nucleic Acids Res. 8:127-141.
8. Berget, S. M., C. Moore, and P. A. Sharp. 1977. Spliced segments at the 5'-terminus of adenovirus 2 late mRNA. Proc. Natl. Acad. Sci. U.S.A. 74:3171-3175.
9. Berget, S. M., and P. A. Sharp. 1979. Structure of late adenovirus-2 nuclear RNA. J. Mol. Biol. 129:547-565.
10. Bos, J. L., L. J. Polder, R. Bernards, P. Schrier, P. J. van den Elsen, A. J. van der Eb, and H. van Ormondt. 1981. The 2.2 kb E1b mRNA of human Ad12 and Ad5 codes for two tumor antigens starting at different AUG triplets. Cell 27:121-131.
11. Chow, L. T., and T. Broker. 1980. The complex transcription patterns of adenovirus-2, p. 175-209. In D. H. Dean, L. F. Johnson, and P. C. Kimball (ed.), Gene structure and expression. Ohio State University Press, Columbus.
12. Chow, L. T., T. Broker, and J. B. Lewis. 1979. Complex splicing patterns of RNAs from the early regions of adenovirus-2. J. Mol. Biol. 134:265-303.
13. Fitzgerald, M., and T. Shenk. 1981. The sequence of 5'-AAUAAA-3' forms part of the recognition site for polyadenylation of late SV40 mRNAs. Cell 24:252-260.
14. Fraser, N. W., C. C. Baker, M. A. Moore, and E. B. Ziff. 1982. Poly A (sites) of adenovirus serotype 2 transcription units. J. Mol. Biol. 155:207-233.
15. Fraser, N. W., J. R. Nevins, E. Ziff, and J. E. Darnell. 1979. The major late adenovirus type 2 transcription unit: termination is downstream from the last poly (A) site. J. Mol. Biol. 129:643-656.
15a. Gingeras, T. R., D. Sciaky, R. E. Gelinas, J. Bing-Dong, C. E. Yen, M. M. Kelly, P. A. Bullock, B. L. Parsons, K. E. O'Neill, and R. J. Roberts. 1982. Nucleotide sequences from the adenovirus-2 genome. J. Biol. Chem. 257:13475-13491.
16. Grunstein, M., and D. S. Hogness. 1975. Colony hybridization, a method for the isolation of cloned DNAs that contain a specific gene. Proc. Natl. Acad. Sci. U.S.A. 72:3961-3965.
17. Herisse, J., G. Courtois, and F. Galibert. 1980. Nucleotide sequences of the EcoRI-D fragment of the adenovirus 2 genome. Nucleic Acids Res. 8:2173-2191.
18. Herisse, J., and F. Galibert. 1981. Nucleotide sequence of the EcoRI E fragment of adenovirus 2 genome. Nucleic Acids Res. 9:1229-1249.
19. Herisse, J., M. Rigolet, S. Dupont de Dinechin, and F. Galibert. 1981. Nucleotide sequence of adenovirus 2 DNA fragment encoding the carboxylic region of the fiber protein and the entire E4 region. Nucleic Acids Res. 9:4023-4042.
20. Hobart, P., R. Crawford, L. P. Shen, R. Pictet, and W. J. Rutter. 1980. Cloning and sequence analysis of cDNAs encoding two distinct somatostatin precursors found in the endocrine pancreas of angler fish. Nature (London) 288:137-141.
21. Kimura, T., Y. Sawada, M. Shinagawa, Y. Shimizu, K. Shiroki, H. Shimojo, H. Sugisaki, M. Takanami, Y. Uemizu, and K. Fujinaga. 1981. Nucleotide sequence of the transforming early region E1b of adenovirus type 12 DNA: structure and gene organization, and comparison with those of adenovirus type 5 DNA. Nucleic Acids Res. 9:6571-6588.
22. Kitchingman, G. R., and H. Westphal. 1980. The structure of adenovirus 2 early nuclear and cytoplasmic RNAs. J. Mol. Biol. 137:23-48.
23. Maat, J., C. P. van Beveren, and H. van Ormondt. 1980. The nucleotide sequence of adenovirus type 5 early region E1: the region between map positions 8.0 (Hind III site) and 11.8 (Sma I site). Gene 10:27-38.
24. Maat, J., and H. van Ormondt. 1979. The nucleotide sequence of the transforming Hind III-G fragment of adenovirus type 5 DNA. The region between map position 4.5 (Hpa I site) and 8.0 (Hind III site). Gene 6:75-90.
25. MacDonald, R. J., M. M. Crerar, W. F. Swain, R. L. Pictet, G. Thomas, and W. Rutter. 1980. Structure of a family of rat amylase genes. Nature (London) 287:117-122.
26. Maxam, A. M., and W. Gilbert. 1980. Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65:499-560.
27. Nevins, J., and W. Wilson. 1981. Regulation of adenovirus-2 gene expression at the level of transcriptional termination and RNA processing. Nature (London) 290:113-118.
28. Nevins, J. R., and J. E. Darnell. 1978. Group of adenovirus type 2 mRNA's derived from a large primary transcript: probable nuclear origin and possible common 3' ends. J. Virol. 25:811-823.
29. Perricaudet, M., G. Akusjarvi, A. Virtanen, and U. Pettersson. 1979. Structure of two spliced mRNAs from the transforming region of human subgroup C adenoviruses. Nature (London) 281:694-696.
30. Perricaudet, M., J. M. Le Moullec, and U. Pettersson. 1980. Predicted structure of two adenovirus tumor antigens. Proc. Natl. Acad. Sci. U.S.A. 77:3778-3782.
31. Perricaudet, M., J. M. Le Moullec, P. Tiollais, and U. Pettersson. 1980. Structure of two adenovirus type 12 transforming polypeptides and their evolutionary implications. Nature (London) 288:174-175.
32. Persson, H., H. Jornvall, and J. Zabielski. 1980. Multiple mRNA species for the precursor to an adenovirus-encoded glycoprotein: identification and structure of the signal sequence. Proc. Natl. Acad. Sci. U.S.A. 77:6349-6353.
33. Persson, H., and L. Philipson. 1982. Regulation of adenovirus gene expression. Curr. Top. Microbiol. Immunol. 97:157-203.
34. Proudfoot, N. J., and G. C. Brownlee. 1976. 3' Non-coding region sequences in eukaryotic mRNA. Nature (London) 263:211-214.
35. Rigby, P. W. J., M. Dieckmann, C. Rhodes, and P. Berg. 1977. Labelling deoxyribonucleic acid to high specific activity in vitro by nick-translation with DNA polymerase. J. Mol. Biol. 113:237-252.
36. Shani, M., U. Nudel, D. Zevin-Sonkin, R. Zakut, D. Givol, F. Katcoff, Y. Carmon, J. Reiter, A. M. Frischauf, and D. Yaffe. 1981. Skeletal muscle actin mRNA. Characterization of the 3' untranslated region. Nucleic Acids Res. 9:579-589.
37. Shaw, A. R., and E. B. Ziff. 1980. Transcripts from the adenovirus-2 major late promoter yield a single early family of 3' coterminal mRNAs and five late families. Cell 22:905-916.
37a. Stalhandske, P., H. Persson, M. Perricaudet, L. Philipson, and U. Pettersson. 1983. Structure of three spliced mRNAs from region E3 of adenovirus type 2. Gene 22:157-165.
38. Stenlund, A., M. Perricaudet, P. Tiollais, and U. Pettersson. 1980. Construction of restriction enzyme fragment libraries containing DNA from human adenovirus types 2 and 5. Gene 10:47-52.
39. Stillman, B. W., J. B. Lewis, L. T. Chow, M. B. Mathews, and J. E. Smart. 1981. Identification of the gene and mRNA for the adenovirus terminal protein precursors. Cell 23:497-508.
40. Sugisaki, H., K. Sugimoto, M. Takanami, K. Shiroki, I. Saito, H. Shimojo, Y. Sawada, Y. Uemizu, S. Uesugi, and K. Fujinaga. 1980. Structure and gene organization in the transforming Hind III-G fragment of Ad12. Cell 20:777-786.
41. Thomas, G. P., and M. B. Mathews. 1981. DNA replication and the early to late transition in adenovirus infection. Cell 22:523-532.
42. Van Ormondt, H., J. Maat, A. de Waard, and A. J. van der Eb. 1978. The nucleotide sequence of the transforming Hpa I E fragment of adenovirus type 5 DNA. Gene 4:309-328.
43. Van Ormondt, H., J. Maat, and C. P. van Beveren. 1980. The nucleotide sequence of the transforming early region E1 of adenovirus type 5 DNA. Gene 11:299-309.
44. Virtanen, A., U. Pettersson, J. M. Le Moullec, P. Tiollais, and M. Perricaudet. 1982. Different mRNAs from the transforming region of highly oncogenic and nononcogenic human adenoviruses. Nature (London) 295:705-707.
45. Zeevi, M., J. R. Nevins, and J. E. Darnell. 1981. Nuclear RNA is spliced in the absence of poly (A) addition. Cell 26:39-46.
46. Zeevi, M., J. R. Nevins, and J. E. Darnell, Jr. 1982. Newly formed mRNA lacking polyadenylic acid enters the cytoplasm and the polyribosomes but has a shorter half-life in the absence of polyadenylic acid. Mol. Cell. Biol. 2:517-525.
47. Ziff, E. B., and R. M. Evans. 1978. The promoter and capped 5'-terminus of RNA from the adenovirus-2 major late transcription unit. Cell 15:1463-1475.