JOURNAL OF VIROLOGY, Apr. 2001, p. 3509–3519 0022-538X/01/$04.00+0 DOI: 10.1128/JVI.75.8.3509–3519.2001 Copyright © 2001, American Society for Microbiology. All Rights Reserved.

Vol. 75, No. 8

Sequence Requirements for Sindbis Virus Subgenomic mRNA Promoter Function in Cultured Cells

MATTHEW M. WIELGOSZ,1 RAMASWAMY RAJU,2 AND HENRY V. HUANG1*

Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri,1 and Department of Microbiology, School of Medicine, Meharry Medical College, Nashville, Tennessee2

Received 7 November 2000/Accepted 10 January 2001

The Sindbis virus minimal subgenomic mRNA promoter (spanning positions −19 to +5 relative to the subgenomic mRNA start site) is approximately three- to sixfold less active than the fully active −98 to +14 promoter region. We identified two elements flanking the −19 to +5 region which increase its transcription to levels comparable to the −98 to +14 region. These elements span positions −40 to −20 and +6 to +14 and act synergistically to enhance transcription. Nine different virus libraries were constructed containing blocks of five randomized nucleotides at various positions in the −40 to +14 region. On passaging these libraries in mosquito cells, a small subset of the viruses came to dominate the population. Sequence analysis at the population level and for individual clones revealed that in general, wild-type bases were preferred for positions −15 to +5 of the minimal promoter. Base mutagenesis experiments indicated that the selection of wild-type bases in this region was primarily due to requirements for subgenomic mRNA transcription. Outside of the minimal promoter, the −35 to −29 region contained four positions which also preferred wild-type bases. However, the remaining positions generally preferred non-wild-type bases. On passaging of the virus libraries on hamster cells, the −15 to +5 region again preferred the wild-type base but most of the remaining positions exhibited almost no base preference. The promoter thus consists of an essential central region from −15 to +5 and discrete flanking sites that render it fully active, depending on the host environment.

* Corresponding author. Mailing address: Campus Box 8230, Department of Molecular Microbiology, Washington University School of Medicine, 660 South Euclid Ave., Saint Louis, MO 63110-1093. Phone: (314) 362-2755. Fax: (314) 362-1232. E-mail: [email protected] .edu.

The alphaviruses have similar replication and transcription strategies in that all synthesize a minus-strand RNA that is complementary to the genomic plus-strand RNA (25, 27, 37, 41, 42). The minus strand is then used as template for genomic replication and subgenomic mRNA transcription (14, 19, 20, 28, 29, 37). With respect to transcription, Ou et al. identified a highly conserved region at the junction of the alphavirus nonstructural protein (nsP) and structural (STR) coding regions, 19 nucleotides (nt) upstream and 2 nt downstream of the subgenomic mRNA start site, that was hypothesized to serve as the promoter for alphavirus subgenomic mRNA transcription (26). Indeed, a minimal promoter region 19 nt upstream and 5 nt downstream of the subgenomic mRNA start site (−19/+5, encompassing the conserved sequence element identified by Ou et al. [26]) was sufficient for detectable levels of subgenomic mRNA transcription (19), and a 3-nt insertion after position −6 dramatically reduced transcription levels and virus growth (8). Heterologous alphavirus minimal promoter sequences can be recognized quite efficiently by Sindbis virus (SIN) for transcription, suggesting that transcriptional requirements might be a major contributor to the conservation of the junction region (11). While the junction region of SIN also encodes the C terminus of nsP4 (whose termination codon spans positions +2 to +4), its contribution to the conservation of the junction region may be smaller, at least in the minimal promoter: the nsP4 termination codon of the equine encephalitis viruses and the aquatic alphaviruses is at positions −8 to −6. Promoter sequences between −5 and +4 of these viruses could potentially change in the absence of nsP4 coding constraints, but they remain conserved. In Semliki Forest virus, the nsP4 termination codon is at positions +11 to +13, and so the length of the carboxyl terminus of alphavirus nsP4 does not appear to be strongly constrained. Additionally, the wobble positions for nsP4 amino acid codons should be able to change in the absence of other constraints, but they remain conserved.

The contribution of other functions of the region to sequence conservation remains to be determined. Positions +1 onward encode the 5′ end of the subgenomic mRNA, which might affect its capping, stability, or translation efficiency. The viral RNA II (believed to be a stalled replication product) terminates at position −4, and the amount of RNA II that is made appears to depend on the junction region (46). If RNA II does play a functional role during the virus life cycle, perhaps the regulation of the relative levels of viral replication versus transcription (46) might also account for some of the sequence conservation. In vivo evolution experiments suggested that the conserved nucleotides of the alphavirus junction region may in fact be optimized for promoter function (9, 10). The −13 to −9 region was randomized, in the absence of nsP4 coding constraints, to create a library of promoter variants. Viruses in the library were competed against each other to identify those that grew best. Even though some non-wild-type promoter sequences were found that could support transcription and virus growth, the wild-type promoter sequence was strongly preferred (9, 10). By itself, the −19/+5 promoter region is three- to sixfold less active than the −98/+14 promoter region (32), suggesting that regions upstream and/or downstream of this element may be


required for full promoter activity. In this study, we used deletion analysis of the −98/+14 promoter region to identify two regions flanking the SIN minimal promoter, at positions −40 to −20 and +6 to +14, that were sufficient to achieve promoter activity comparable to the −98/+14 promoter region. In addition, we used the in vivo evolution method (9, 10) to identify the sequence preference and the location of positions within the −40 to +14 promoter region required for promoter function and virus growth.

MATERIALS AND METHODS

Cell lines. Baby hamster kidney (BHK-21) (ATCC CCL10; between passages 9 and 20) cells were grown at 30 or 37°C in minimal essential medium with Earle's salts, supplemented with 10% heat-inactivated fetal calf serum. C7-10 cells (between passages 30 and 50) from Aedes albopictus larvae (34) were grown at 30°C in the same medium supplemented with 10% tryptone phosphate buffer.

Deletion analysis. The TCS clone (~12.6 kb) (9, 10) contains the nonstructural (nsP) and structural (STR) coding regions of SIN, separated by the chloramphenicol acetyltransferase (CAT) gene. It contains two −98/+14 subgenomic promoters (9). The first, designated the CAT promoter, drives the expression of the CAT gene. The second, designated the STR promoter, drives the expression of the STR genes and is downstream of the CAT promoter. The −19/+5 (9), −19/+14, −40/+14, −40/+5, and −98/+5 (46) clones are identical to TCS except that the STR promoter of TCS was replaced by the respective −19/+5 (9–11), −19/+14, −40/+14, −40/+5, or −98/+5 promoters.

Promoters with single point mutations. Two 48-nt oligonucleotides were synthesized that contained the SIN −19/+5 minimal promoter. During the synthesis of one oligonucleotide, each purine position of the minimal promoter was replaced by a mixture of ~90% wild-type nucleotide and ~3% each of the remaining nucleotides.
The second oligonucleotide was made similarly, except that each pyrimidine position of the minimal promoter was replaced by the mixture. In each, the minimal promoter sequence was flanked by 4-nt spacers and KpnI and XbaI restriction sites for cloning. Each oligonucleotide library was made double stranded via PCR, using primers complementary to the ends of each oligonucleotide library. The PCR products were digested with KpnI and XbaI and directionally cloned into the JUNCAT plasmid (11). Clones were screened using low-stringency colony hybridization with an end-labeled negative-sense oligonucleotide spanning the wild-type −19/+5 minimal promoter sequence, and 162 clones that contained mutant promoters were identified. Of these, 22 had point mutations. They were digested with ApaI and PvuI and directionally cloned into the DIC20e plasmid (32), replacing the 3′ promoter sequence with the mutant promoters. The DIC20e plasmid contains the cDNA sequence of a double subgenomic mRNA promoter construct of a defective interfering (DI) genome with all the cis-acting sequences required for replication and packaging (20). The 5′ promoter of the DIC20e plasmid consists of the wild-type SIN minimal promoter, which serves as an internal control. It is required for the synthesis of a 1.8-kb mRNA (0.35 kb of spacer sequence and the 3′ promoter, followed by the CAT sequence). The 3′ promoter is the test promoter used to transcribe a 1.45-kb CAT mRNA. The DIC20a clone is identical to the DIC20e derivatives, except that its 3′ promoter is the SIN wild-type minimal promoter sequence (−19/+5), used as a control to compare test promoters of interest.

In vitro transcription, transfection, isolation of total RNA, and RT-PCR. Virus clones or libraries were linearized with SstI and transcribed in vitro using the Epicentre (Madison, Wis.) SP6 DNA-dependent RNA polymerase (46). Transfections were performed as described previously (22, 23) unless indicated otherwise.
Approximately 1 μg of total RNA isolated from Trizol (Life Technologies, Inc.)-solubilized cell samples was subjected to reverse transcription (RT) with SuperScript II (Life Technologies, Inc.) in a 10-μl reaction volume as specified by the manufacturer. PCR used the MasterAmp Tth DNA Polymerase (Epicentre) as specified by the manufacturer.

Viral RNA labeling and analysis. Approximately 1 × 10^6 to 3 × 10^6 C7-10 cells or 7 × 10^5 BHK-21 cells were infected at a multiplicity of infection (MOI) of 1 to 5, and viral RNAs were labeled with [32P]orthophosphate, as described previously (46). Approximately one-fifth of the total RNA from each sample was denatured (3), electrophoresed through 1% agarose gels in 1× TBE (33), and exposed to preflashed film. Exposures were performed at −20°C for 2 to 5 days with (C7-10) or without (BHK-21) an intensifying screen. Radiolabeled RNA was quantitated on a phosphorimager (Bio-Rad) by integrating the area under each radioactive peak. Baselines were set at the valley between genomic and CAT subgenomic RNA. The relative molar ratio of STR subgenomic mRNA

versus CAT subgenomic mRNA was calculated as 1.2 × (STR mRNA counts/CAT mRNA counts), since the CAT mRNA (4,991 nt) is 1.2 times as long as the STR mRNA (4,156 nt).

To obtain DI stocks, BHK-21 cells in 35-mm dishes were first infected with SIN (MOI = 5) for 1 h at 30°C. The inoculum was removed, and the cells were washed twice with 3 ml of Ca2+- and Mg2+-free phosphate-buffered saline (PBS). The cells were then transfected with ca. 1 μg of in vitro transcripts of the DIC20a or DIC20e derivatives with point-mutagenized promoters in 200 μl of PBS and 12 μg of Lipofectin reagent (Life Technologies, Inc.). Samples were rotated for 10 min at room temperature, the transfection mixtures were removed, and the cells were washed twice with 3 ml of PBS. Cell samples then received 1 ml of BHK-21 medium and were incubated at 37°C for 20 h before the DI stocks were collected. For analysis of [3H]uridine-labeled DI RNAs, 100 μl of each DI stock was diluted with 100 μl of PBS and adsorbed onto 75% confluent BHK-21 cells at 37°C for 1 h. The cells were washed twice with 3 ml of PBS, 1 ml of BHK-21 medium was added, and the cells were incubated for 3 h. Dactinomycin was added to 1 μg/ml, and 200 μCi of [3H]uridine was added 5 to 10 min later. At the end of 4 h, the cells were washed with PBS and total RNA was extracted with RNAzol reagent (Cinna Biotecz). Approximately 7 μg of each RNA sample was denatured and electrophoresed through 1% agarose gels. The gels were processed for fluorography, and the autoradiographs were used as guides to excise the DI subgenomic mRNA bands for liquid scintillation counting. Background radioactivity was determined by counting the region immediately above or below each band of interest. The relative strength of a mutant promoter was determined by measuring the molar ratio of mRNA produced by it to the mRNA synthesized from the 5′ wild-type promoter (32).
Each ratio was then normalized by dividing by the ratio obtained for the DIC20a clone, which contains two wild-type minimal promoters (32).

Competition assays. Defined molar ratios of in vitro-transcribed RNA (~5 μg total) from TCS, STR promoter deletion clones, or individual promoter clones were mixed and transfected into C7-10 cells. The media were harvested at 26 h (individual promoter clones), 30 h (STR promoter deletion clones), or 48 h (for −19/+5 and −19/+14 competition) postelectroporation (p.e.). Virus stocks obtained after transfection were designated passage 1 (P1) virus, while viral RNA contained within transfected cells was designated P0 RNA. P1 virus mixtures were subjected to titer determination on C7-10 cells and used to infect fresh C7-10 cells at an MOI of 0.1. At 24 h (30 h for the −19/+5 and −19/+14 competition) postinfection (p.i.), the medium (P2 virus) and infected cells (P1 RNA) were harvested. Virus populations were passaged on fresh C7-10 cells until P2 or P3 RNA was obtained. Total RNA from each Trizol-solubilized sample was reverse transcribed using the 1926 primer (Table 1). PCR was performed on 10 to 15% of each cDNA sample using the 1941 and 1926 primers (Table 1) for 10 cycles and then held at 94°C for the addition of 2 pmol of [5′-32P]labeled 1941 primer (Table 1), and PCR was performed for 10 more cycles. The PCR products were resolved on 8% denaturing polyacrylamide gels (33), visualized by autoradiography, and quantitated on a Bio-Rad phosphorimager.

cDNA libraries. The previous cloning strategy (9) was used to make nine different full-length cDNA libraries named according to the positions of the randomized nucleotides within the −40/+14 promoter region.
Oligonucleotides containing a block of five random nucleotides at various positions within the −40/+14 promoter region were synthesized by Integrated DNA Technologies (Coralville, Iowa) for the −35, −25, −20, −15, −5, and +1 oligonucleotides or by Life Technologies, Inc., for the −30, −10, and +5 oligonucleotides (Table 1). The quality of the randomized oligonucleotides was assessed by sequencing them after PCR amplification using the appropriate 5′ or 3′ primers (Table 1). Based on previous sequence analyses, detection of a given base at any position in a mixture required that it be present at an abundance of ≥0.2 (9). PCR products which appeared random by this criterion were used to construct the libraries. Using the −35 library as an example, the −35 oligonucleotide, with positions −35 to −31 randomized, was used as the 5′ primer and the AatII oligonucleotide was used as the 3′ primer in a PCR with TCS as the template (Table 1). The −35 oligonucleotide hybridizes to positions 8387 to 8427 (within the STR promoter region), and the AatII oligonucleotide hybridizes to positions 8845 to 8825 (within the capsid coding region on the coding strand) of TCS. The 468-bp PCR product was isolated and digested with XhoI (located immediately upstream of the −40 position)-XbaI (located immediately downstream of the +14 position) to release a 66-bp fragment containing the SIN −40/+14 promoter region in which positions −35 to −31 were randomized. This fragment was directionally cloned into the PneoS vector (9), placing it immediately upstream of the remaining STR 5′ untranslated sequences and the entire STR coding region, to generate the −35 sublibrary, which still lacks the viral nsP region. The final −35 cDNA library, containing the full-length, double-promoter viral genome, was made by


TABLE 1. Oligonucleotides and templates used to generate the cDNA libraries

Oligonucleotide | Sequence (5′ to 3′) | Template | Hybridizes to^a
−35 N-Block | CCGCTCGAGCCATCNNNNNGGAAATAAAGCATCTCTA | TCS | (7558–7555)
−30 N-Block | CCGCTCGAGCCATCAGAGGNNNNNTAAAGCATCTCTACGGTGGTCCTAAA | TCS | (7558–7598)
−25 N-Block | CCGCTCGAGCCATCAGAGGGGAAANNNNNCATCTCTACGGTGGTCCTAA | TCS | (7558–7597)
−20 N-Block | CCGCTCGAGCCATCAGAGGGGAAATAAAGNNNNNCTACGGTGGTCCTAAATAGTCA | TCS | (7558–7604)
−15 N-Block | GCTCTAGAACTATGCTGACTATTTAGGACCACNNNNNAGATGCTTTATTTCCCCTCTGA | −40/+14 | 7611–7561
−10 N-Block | GCTCTAGAACTATGCTGCTATTTAGGNNNNNCGTAGAGATGCTTTA | −40/+14 | 7611–7573
−5 N-Block | GCTCTAGAACTATGCTGACTATNNNNNACCACCGTAGAGATGCTTTA | −40/+14 | 7611–7573
+1 N-Block | GCTCTAGAACTATGCTGNNNNNTTAGGACCACCGTAGAGATGCTTTA | −40/+14 | 7611–7573
+5 N-Block | GCTCTAGAACTATNNNNNCTATTTAGGACCACCGTA | −40/+14 | 7611–7584
AatII | CCTCGTTCTGACGTCGAACA | −40/+14 | 8010–7990
1926 | GAGCATGTTAAAGAATCCTCTATTCA | SIN | 7673–7648^b
1941 | GATTACAACAGTACTGCGATGA | TCS | (8275–8297)^c
1940 | CCCGTTTTCACCATGGGCAAATA | −40/+14 | (8161–8183)^c

a The position to which the oligonucleotide hybridizes during PCR. Nucleotide positions are those of SIN, unless otherwise specified. Hybridization to the minus strand is denoted by parentheses.
b In the capsid coding region.
c Oligonucleotides 1941 and 1940 hybridize to all cDNA clones and libraries on the minus strand, upstream of the CAT termination codon.
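As an illustration of what a five-nucleotide N-block implies, the following minimal Python sketch expands a randomized oligonucleotide into its 4^5 = 1,024 variants and flags variants that would gain a recognition site for one of the enzymes used during library construction. The function names are hypothetical, the recognition sequences are the standard ones for these enzymes, and only the oligonucleotide itself is scanned. Sites that arise from flanking vector sequence are therefore not considered, so the counts from this sketch should not be expected to reproduce the values reported for the libraries.

```python
from itertools import product

BASES = "ACGT"

# Standard recognition sequences for the enzymes named in the text.
# All four are palindromic, so scanning one strand is sufficient.
RESTRICTION_SITES = {
    "XhoI": "CTCGAG",
    "XbaI": "TCTAGA",
    "BssHII": "GCGCGC",
    "SstI": "GAGCTC",
}


def expand_n_block(template):
    """Yield every sequence obtained by replacing each N with A, C, G, or T."""
    n_positions = [i for i, base in enumerate(template) if base == "N"]
    for combo in product(BASES, repeat=len(n_positions)):
        seq = list(template)
        for pos, base in zip(n_positions, combo):
            seq[pos] = base
        yield "".join(seq)


def count_sites(seq, site):
    """Count occurrences of a recognition site, allowing overlaps."""
    width = len(site)
    return sum(seq[i:i + width] == site for i in range(len(seq) - width + 1))


def variants_gaining_sites(template, sites=RESTRICTION_SITES):
    """Return (variant, [enzymes]) pairs for variants that gain a site.

    A variant is flagged when it carries more copies of a site than the
    fixed part of the template does (an N never matches a site base).
    """
    baseline = {name: count_sites(template, site) for name, site in sites.items()}
    flagged = []
    for seq in expand_n_block(template):
        gained = [name for name, site in sites.items()
                  if count_sites(seq, site) > baseline[name]]
        if gained:
            flagged.append((seq, gained))
    return flagged


# The -35 N-Block oligonucleotide from Table 1 (five randomized positions).
MINUS_35_N_BLOCK = "CCGCTCGAGCCATCNNNNNGGAAATAAAGCATCTCTA"
library = list(expand_n_block(MINUS_35_N_BLOCK))
print(len(library))  # 1024 = 4**5 possible promoter variants
```

The baseline count is subtracted because each N-block oligonucleotide already carries its own cloning site (XhoI or XbaI), which should not be counted as a newly created site.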

digesting the −35 sublibrary with XhoI-BssHII and directionally cloning the appropriate fragment into TDV (a TCS derivative that does not contain an STR promoter) cut with the same restriction enzymes (9). The −30, −25, and −20 libraries were generated in an identical manner. The estimated number of times each promoter sequence is expected to be represented in the library at each step is listed in Table 2. Also listed are the number of sequences expected to be underrepresented or deleted in each library due to restriction digests during construction and runoff transcription of each library. The −15, −10, −5, +1, and +5 libraries were constructed similarly, except for the initial PCR step. Using the −15 library as an example, the 1940 oligonucleotide (Table 1) and the −15 oligonucleotide (with positions −15 to −11 randomized [Table 1]) were used as the 5′ and 3′ primer, respectively, in a PCR with the −40/+14 clone as template. Oligonucleotide 1940 hybridizes to positions 8161 to 8183 of the −40/+14 clone in the carboxyl terminus of CAT, while the −15 oligonucleotide hybridizes to positions −37 to −14 of the STR promoter. The 224-bp PCR product was digested with XhoI-XbaI, and the 66-bp fragment containing the SIN −40/+14 promoter region was isolated. The remaining cloning steps were as described for the −35 library.

Passaging of virus libraries in cultured cells. Approximately 3 μg (C7-10) or 5 μg (BHK-21) of in vitro transcripts of each library was transfected in duplicate into C7-10 cells or singly into BHK-21 cells. An aliquot from each transfected C7-10 or BHK-21 cell sample was serially diluted for an infectious-center assay to determine the number of successful transfection events, from which the minimal size of the library transfected into host cells may be estimated (9) (Table

2). The first set of C7-10 plates, used to estimate the diversity of each library shortly after transfection (see below), was incubated at 30°C until 1 to 2 h p.e. (−35, −30, −25, −20, −15, −10, −5, and +5 libraries) or 8 h p.e. (+1 library). The media were removed, and the transfected cells were solubilized with 3 ml of Trizol reagent for isolation of the P0 RNA. The second set of transfected C7-10 cells was incubated at 30°C until 30 h p.e. The transfected BHK-21 cells were incubated at 37°C until 21 h p.e. At these times, the media containing the P1 virus populations were harvested. For the next passage, approximately 9 × 10^5 C7-10 cells or 7 × 10^5 BHK-21 cells were seeded onto 35-mm-diameter wells and incubated for 1 day at 30°C for C7-10 or at both 30° and 37°C for BHK-21, at which time they were ~85% confluent. Cells were infected with the P1 virus population at an MOI of ≤0.1. At 1 h p.i., the inoculum was removed and replaced with 1 ml of the appropriate media. At 24 h p.i. for C7-10 cells, 5 h p.i. for BHK-21 cells (37°C), or 8 h p.i. for BHK-21 cells (30°C), the media containing the P2 virus were collected. Cells remaining in the plate were solubilized with 1 ml of Trizol reagent for isolation of the P1 RNA. Passaging continued in this manner until P3 RNA for C7-10 cells or P4 RNA for BHK-21 cells was obtained. The promoter region of the P0 RNA was reverse transcribed using the AatII primer, and 1/10 of each cDNA sample was PCR amplified using the 1940 and AatII primers (Table 1). The 621-bp PCR product containing the STR −40/+14 promoter region of each virus population was sequenced as described previously (46). The 621-bp product was also digested with XhoI-XbaI, and the promoter fragment was cloned into PneoS for sequence analysis of 16 or 17 isolates from each population for comparison with results from subsequent passages. The

TABLE 2. Number of times each promoter sequence is represented in the libraries

Library name | No. of sequences deleted or underrepresented^a | Library size/1,024^b: Sublibrary | Library size/1,024^b: Library | Library size/1,024^b: Transfection^c | Library size/1,024^b: Minimum^d | Estimated diversity^e at P0 | Estimated diversity^e at P3
−35 | 6 | 9 | 7 | 586 | 7 | 520 | 6
−30 | 2 | 17 | 304 | 527 | 17 | 670 | 49
−25 | 2 | 142 | 357 | 791 | 142 | 426 | 142
−20 | 4 | 12 | 28 | 371 | 12 | 571 | 46
−15 | 22 | 82 | 68 | 98 | 68 | 417 | 5
−10 | 4 | 180 | 11 | 98 | 11 | 433 | 1
−5 | 2 | 10 | 11 | 15 | 10 | 496 | 2
+1 | 2 | 50 | 49 | 27 | 27 | 688 | 11
+5 | 3 | 797 | 80 | 1270 | 80 | 589 | 71

a The number of sequences expected to be underrepresented or deleted, due to the use of XbaI, XhoI, BssHII, and SstI during construction and runoff transcription of the libraries.
b The observed number of clones or infective centers, divided by 1,024, the size of an ideal library with five randomized positions.
c The number of infectious centers obtained after transfection (7) was used as a minimal estimate of the library size in transfected cells.
d Smallest library size observed, up to and including the transfection step, equivalent to the minimum average number of times a promoter sequence is expected to be in the virus population at P0. In all cases, there is a 0.999 or better probability (by the Poisson distribution, $1 - e^{-\lambda}$, where $\lambda$ = the smallest library size observed) that all possible sequences in the randomized region are present in the P0 population. Subsequent passages, at an MOI of ≤0.1, used a minimum of 2,000 PFU.
e The diversity (8) of each population at P0, immediately after transfection, and passage 3 is estimated by analysis of the sequence of individual isolates (see Materials and Methods).
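The arithmetic behind footnotes b and d can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the code simply evaluates the Poisson expression quoted in footnote d, which, strictly speaking, gives the probability that any one given sequence is represented at least once when the mean representation is λ.

```python
import math

IDEAL_LIBRARY_SIZE = 4 ** 5  # 1,024 sequences for five randomized positions


def fold_representation(observed_clones):
    """Observed clones (or infectious centers) divided by the ideal library size."""
    return observed_clones / IDEAL_LIBRARY_SIZE


def prob_sequence_present(mean_copies):
    """Poisson probability of at least one copy, 1 - exp(-lambda)."""
    return 1.0 - math.exp(-mean_copies)


# Smallest value in the "Minimum" column of Table 2 (the -35 library).
smallest_minimum = 7
print(round(prob_sequence_present(smallest_minimum), 5))  # ~0.99909
```

Because the probability increases with λ, the library with the smallest minimum representation sets the lower bound for all nine libraries.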


FIG. 1. Subgenomic mRNA promoter region of alphaviruses. The numbering denotes positions relative to the subgenomic mRNA initiation site. Identity to the SIN sequence is denoted by dashes. The boxes enclose the two conserved regions. The nsP4 sequence of SIN is shown above the sequence alignment. Abbreviations: A86, GIR, YN8, OCK, XJ1, Sindbis-like S.A.AR86, Girdwood, YN87448, Ockelbo, and XJ-160 viruses, respectively; BFV, Barmah Forest virus; AUR, Aura virus; SFV, Semliki Forest virus; MBV, Middelburg virus; RRV, Ross River virus; SAG, Sagiyama virus; ONN, O'nyong nyong virus; IGB and SG6, O'nyong nyong virus-like Igbo Ora and SG650 viruses; WEE, western equine encephalitis virus; EEE, eastern equine encephalitis virus; VEE, Venezuelan equine encephalitis virus; SPD, salmon pancreas disease virus; SDV, rainbow trout sleeping disease virus (4, 13, 16, 17, 21, 24, 26, 35, 38, 40, 43, 45).

estimated diversity ($2^{E}$, where $E = \sum -b \log_2 b$ and $b$ is the frequency of each of the four bases, summed over the five randomized positions) (10) of the P0 populations ranged from 417 to 688 (Table 2). Computer simulations of sampling 16 clones from an ideal library predict that >90% of such samplings will yield estimated diversities from 400 to 900. (The complexity of a library with five random positions is 1,024 if the bases are equally represented at each position.) Some clones are likely to be underrepresented or lost because some sequences in the randomized positions, along with adjacent positions, constitute recognition sites for the restriction enzymes used during library construction and runoff transcription (Table 2). Of the 16 isolates from the P0 −20 population, 2 were wild type at positions −20 to −16 but had non-wild-type sequences between positions −5 and −1. Despite this, the −20 to −16 wild-type sequence was not observed at the population level or recovered as individual clones at P3. This indicated that if there were contaminating clones in the virus population, they were less fit and did not contribute materially to the −20 P3 population. We also sampled 6 to 17 clones from the C7-10 P3 populations to compare the estimated diversities of these populations to those observed at P0 (Table 2).
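The diversity estimate described above can be sketched as follows. This is a minimal Python illustration, not the authors' program: the function names, the random seed, and the number of trials are assumptions, and the simulation simply draws each base uniformly at random to mimic sampling clones from an ideal library.

```python
import math
import random
from collections import Counter


def estimated_diversity(clones):
    """Return 2**E, with E the sum over positions of -b*log2(b) per base."""
    n_clones = len(clones)
    total_entropy = 0.0
    for pos in range(len(clones[0])):
        counts = Counter(clone[pos] for clone in clones)
        for count in counts.values():
            b = count / n_clones  # observed frequency of this base
            total_entropy -= b * math.log2(b)
    return 2 ** total_entropy


def simulate_sampling(n_clones=16, n_positions=5, trials=1000, seed=0):
    """Estimated diversities for repeated small samples from an ideal library."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        sample = ["".join(rng.choice("ACGT") for _ in range(n_positions))
                  for _ in range(n_clones)]
        results.append(estimated_diversity(sample))
    return results


# A population fixed on a single sequence has a diversity of 1.
print(estimated_diversity(["ACGTA"] * 16))  # 1.0

# Fraction of simulated 16-clone samples falling in the 400 to 900 range.
diversities = simulate_sampling()
in_range = sum(400 <= d <= 900 for d in diversities) / len(diversities)
print(in_range)
```

With equal base frequencies at all five positions, E is 10 bits and the estimate reaches the ideal complexity of 1,024; small samples fall below this because not every base is seen equally often at every position.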

RESULTS

Sequence elements required for full promoter activity. The alphavirus junction region contains two regions of conservation (Fig. 1): one identified by Ou et al. (26), between positions −19 and +2 (with respect to the start site of alphavirus subgenomic mRNA) and a smaller one at positions −35 to −30. Previous studies demonstrated that the −19/+5 region of SIN (11, 19) was sufficient for directing mRNA transcription. However, it was approximately three- to sixfold less active in subgenomic mRNA synthesis than the −98/+14 region was (32, 46) and supported slower virus growth (10). Sequences required for the higher levels of transcription appear to be confined to the −40/−20 and +6/+14 regions immediately adjacent to the minimal promoter in mammalian cells (46) and also in mosquito cells (Fig. 2A [values under each lane designate promoter activity relative to the wild-type CAT promoter

serving as an internal control in each clone and normalized to the TCS wild-type clone] [32]). While the activity of the −40/+14 region is comparable to that of the −98/+14 region of TCS (Fig. 2A), absence of the +6/+14 region (clones −98/+5, −40/+5, and −19/+5) or the −40/−20 region (clone −19/+14) decreased STR mRNA transcription by about two- to fourfold, respectively. We used competition assays to determine if viral fitness is affected by the change in promoter activities. Viruses with weaker promoters transcribe less STR mRNA and thus have fewer structural proteins available for genomic RNA packaging and assembly of progeny virions. Viruses with excessively strong promoters may also be less fit because the cost of transcribing excessive amounts of subgenomic mRNA may come at the expense of decreased replication (10). Any difference in progeny virus yields is amplified upon passaging, such that viruses with better promoters should come to dominate the virus population. The competition assay used C7-10 mosquito cells, since growth in these cells imposes more stringent demands on promoter activity than in BHK-21 mammalian cells (10), thus providing better discrimination between viruses with similar promoter activities. The wild-type virus used was TCS, which contains two promoters. The first is the wild-type promoter, designated the CAT promoter, which maintains the integrity of the nsP4 coding region. It is used for transcription of the CAT mRNA. A second, −98/+14 promoter, designated the STR promoter, is placed downstream and independent of nsP4 coding requirements. It is used for transcription of the STR mRNA for the production of viral structural proteins. The other clones are identical to TCS, but their respective STR promoters are replaced with subsets of the −98/+14 region.


FIG. 2. Regions flanking the minimal SIN promoter which enhance transcription and virus fitness. (A) The TCS, −98/+5, −40/+14, −40/+5, −19/+14, and −19/+5 viruses were used to infect C7-10 cells at an MOI of 3. Viral RNA was labeled in vivo with [32P]orthophosphate between 21.5 and 24 h p.i. Total RNA was resolved on a 1% agarose gel for analysis via autoradiography. The relative STR promoter activities shown below each lane were calculated by dividing the abundance of CAT subgenomic mRNA by that of STR subgenomic mRNA for each virus, and normalizing these values to that obtained for TCS. *, RNA II terminating at the STR promoter; #, RNA II terminating at the CAT promoter. (B) Effect of the −40/−20 and +6/+14 regions on viral fitness. Mixtures of the in vitro transcripts listed in each panel (I = −98/+5, −40/+5, and −19/+5 mixture; II = −98/+14 [TCS], −40/+14, and −19/+14 mixture; III = −19/+14 and −19/+5 mixture; IV = −40/+14 and −40/+5 mixture) were transfected into C7-10 cells and passaged two or three times using an MOI of ≤0.1 (P0 = transfected cells). Total-cell RNA was isolated at each passage, and the viral STR promoter regions were 32P radiolabeled during RT-PCR. The PCR products were resolved on an 8% sequencing gel for analysis by autoradiography. The relative abundance of each virus in the mixes is listed below each lane, ranked by the size of their PCR products, in bases: TCS (233 bases), −98/+5 (224 bases), −40/+14 (175 bases), −40/+5 (166 bases), −19/+14 (154 bases), and −19/+5 (147 bases).
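The quantitation behind promoter activities of this kind can be illustrated with the length correction given in Materials and Methods, 1.2 × (STR mRNA counts/CAT mRNA counts), followed by normalization to a reference clone such as TCS. The sketch below is a hedged Python illustration only: the function names are hypothetical, and the count values in the usage lines are invented placeholders, not data from the study.

```python
CAT_MRNA_NT = 4991  # length of the CAT subgenomic mRNA, from Materials and Methods
STR_MRNA_NT = 4156  # length of the STR subgenomic mRNA, from Materials and Methods


def str_to_cat_molar_ratio(str_counts, cat_counts):
    """Length-corrected molar ratio of STR mRNA to CAT mRNA.

    Label incorporation scales with RNA length, so counts are converted
    to a molar ratio with the factor 4991/4156 (about 1.2).
    """
    length_factor = CAT_MRNA_NT / STR_MRNA_NT
    return length_factor * str_counts / cat_counts


def relative_promoter_activity(str_counts, cat_counts, ref_str_counts, ref_cat_counts):
    """Molar ratio for a test clone divided by that of a reference clone."""
    test_ratio = str_to_cat_molar_ratio(str_counts, cat_counts)
    ref_ratio = str_to_cat_molar_ratio(ref_str_counts, ref_cat_counts)
    return test_ratio / ref_ratio


# Hypothetical phosphorimager counts, for illustration only.
print(round(CAT_MRNA_NT / STR_MRNA_NT, 2))  # 1.2
print(relative_promoter_activity(500, 1000, 1000, 1000))  # 0.5
```

Because the same length factor applies to the test and reference clones, it cancels in the normalized value; it matters only when the unnormalized molar ratio itself is reported.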

Their names reflect the STR promoter regions they contain; e.g., the −40/+14 clone has a SIN STR promoter region consisting of 40 nt upstream and 14 nt downstream of the STR subgenomic mRNA start site. To test the contribution of the −40 to −20 region, we used the following mixtures: (i) the −98/+5, −40/+5, and −19/+5 clones (Fig. 2B, panel I) and (ii) the TCS (−98/+14), −40/+14, and −19/+14 clones (panel II). The relative abundance of each virus after each passage is represented by its RT-PCR product of a specific size (Fig. 2B). Although the −98/+5, −40/+5, and −19/+5 viruses have comparable STR promoter activities, the −98/+5 and −40/+5 viruses outcompeted the −19/+5 virus after just one passage (panel I). The same result was observed when the +6 to +14 region was also present, where the −19/+14 clone became barely detectable after one passage in competition with TCS and the −40/+14 clone (panel II). Thus, the −40 to −20 region does confer higher fitness in mosquito cells.

We then tested the contribution of the +6 to +14 region. In one context, as part of the −40/+14 clone, it clearly improved viral fitness over that of the −40/+5 clone (Fig. 2B, panel IV), consistent with the twofold difference in their promoter activities. In the other context, as part of the −19/+14 clone, it paradoxically decreased viral fitness, since the −19/+14 clone decreased in relative abundance from 33-fold to 2-fold that of the −19/+5 clone after three passages (panel III). Although the lower fitness of the −19/+14 clone is consistent with its possibly lower promoter activity than that of the −19/+5 clone (Fig. 2A), the difference is sufficiently subtle that some other, unidentified effect on viral growth cannot be excluded. Nonetheless, the results clearly show that both the −40 to −20 and the +6 to +14 regions are required for wild-type levels of transcription and higher viral fitness.

Sequence preference for promoter function. The importance of only a few positions in the minimal promoter has been identified to date (6, 8–10). Since the −40/+14 region appears

3514

WIELGOSZ ET AL.

to be functionally similar to the ⫺98/⫹14 region, we focused on it to identify the sequences required for promoter function. We used an in vivo evolution method (9) that efficiently samples the functions of thousands of different promoter sequences in parallel in the context of the normal infection cycle. Nine virus libraries were generated. Like the STR promoter deletion clones described above, viruses in the libraries contain two subgenomic mRNA promoters, the CAT and the STR promoters. Each library consists of a randomized sequence of five contiguous nucleotides located at various positions within the ⫺40/⫹14 STR promoter. The libraries are named according to the location of the randomized sequences (e.g., positions ⫺35 to ⫺31 are randomized in the ⫺35 library [Table 1]). In vitro-transcribed RNA from each library was transfected into C7-10 and BHK-21 cells and serially passaged three (C7-10) or four (BHK-21) times. At each passage, the sequence of the promoter region of the virus population as a whole was determined (Fig. 3). The promoter sequences shown are those of the minus strand. P0 represents viral RNAs in cells transfected with the libraries, before selection. As expected, all four bases were present at each of the randomized position (Fig. 3). When the libraries were passaged on C7-10 cells (Fig. 3A), specific bases at some positions (e.g., those from ⫺11 to ⫹5) were already selected for after a single passage (P1), as judged by their increased abundance relative to the alternate bases at these positions. The rapidity of selection suggest that these positions, and the particular bases selected for, are functionally the most important. Since the positions with rapid selection are primarily in the minimal promoter, we infer that some minimal level of promoter function is the primary selective force operating on the region. After two passages (P2), other positions, e.g., ⫺35 to ⫺31 and ⫺15 to ⫺12, began to exhibit preference for particular bases. 
By the third passage (P3), most positions demonstrated a clear base preference (with G being clearly depleted at positions ⫺18 and ⫺20). The only exception is position ⫺17, where no preference is seen. This suggests that most positions in the ⫺35/⫹9 region are required for full promoter activity. Figure 4A summarizes the results. Remarkably, the positions that evolved most rapidly, principally positions ⫺15 to ⫹5 in the minimal promoter, also predominantly converged to the wild-type sequence. This suggests that optimal promoter activity requires the wild-type sequence at these positions. In contrast, positions that evolved more slowly tended to converge to non-wild-type bases. Positions ⫺19 to ⫺16 of the minimal promoter behaved like this, suggesting that the sequence requirement at these positions is not very stringent. Similarly, of the 20 positions outside the minimal promoter, most either remained ambiguous (positions ⫺32, ⫺28, ⫺27, ⫺26, ⫺20, and ⫹6) or converged to a non-wild-type base (⫺34, ⫺33, ⫺31, ⫺25, ⫺23, ⫺22, and ⫹7 to ⫹9). Only five of them converged to a wild-type base (⫺35, ⫺30, ⫺29, ⫺24, and ⫺21) (Fig. 3A and 4A). We also sampled 6 to 17 isolates from each P3 virus population (Table 3). Figure 4B summarizes the results. Despite the small sample sizes, the consensus deduced from the individual clones (Table 3; Fig. 4B) correlated well with the consensus sequence of the population as a whole (Fig. 3A and 4A). All positions exhibited marked base preference, with the frequency of the predominant base being ⱖ50%. A total of 18 of

J. VIROL.

24 positions in the minimal promoter and 5 of 20 positions outside of the minimal promoter strongly prefer the wild-type base (Fig. 4B), in excellent agreement with results at the population level. Clones with the wild-type base at all five initially randomized positions were repeatedly isolated from libraries covering the minimal promoter (Table 3), confirming the strong preference for the wild-type base at most positions of the minimal promoter. Promoter preference in mammalian cells. Previous studies indicated that growth in mosquito cells poses more stringent demands on promoter activity than does growth in hamster cells (10). Consequently, it is possible that some positions in the ⫺40/⫹14 promoter region might exhibit different selection intensities or preferences for particular bases in alternate hosts. To test this, transcripts of the libraries were transfected into BHK-21 cells and passaged at 37°C (Fig. 3B and 4C). As observed in C7-10 cells, wild-type bases were selected for in BHK-21 cells in most of the ⫺15 to ⫹5 region. Thus, at most positions in the minimal promoter, the wild-type base appears to be optimal for subgenomic mRNA transcription in both insect and mammalian cells. Selection for promoter function does appear less stringent in hamster cells, since higher levels of non-wild-type bases were found, e.g., at positions ⫺15, ⫺12, ⫺6, and ⫹4, compared to that observed in C7-10 cells (Fig 3A and B and 4A and C). Furthermore, there was no obvious selection at many positions in the ⫺35 to ⫺16 and ⫹6 to ⫹9 regions in BHK-21 cells compared to what is seen in C7-10 cells (Fig. 3A and 4A). Where selection can be perceived, e.g., at positions ⫺30 to ⫺28, the bases selected for are generally the same as those selected for in C7-10 cells. The only exception is at position ⫺34, where U is preferred in BHK-21 cells and A is preferred in C7-10 cells. 
FIG. 3. Evolution of the −35/+9 promoter region during passaging in cultured cells. In vitro-transcribed RNA from each library was transfected into C7-10 cells (A) or BHK-21 cells (B). The viruses produced by the transfected cells were then passaged three or four times at an MOI ≤ 0.1 (see Materials and Methods). Intracellular viral RNA during each passage was isolated, and the STR promoter region was RT-PCR amplified. The RT-PCR product was purified and sequenced directly to obtain the consensus sequence of each population after each passage (P0 = transfected cells). The sequences shown are those of the minus strand.

One variable possibly affecting the stringency of selection could be passaging at 30°C (C7-10) versus 37°C (BHK-21), although previous results indicated that temperature had no effect on base preferences for positions −13 to −9 (9, 10). To determine if this was also true for positions outside the minimal promoter region, we passaged the −35 and +5 libraries on BHK-21 cells at 30°C (Fig. 3B and 4D). The results show that some positions do exhibit clearer base preferences at 30°C than at 37°C, since a readable consensus sequence for positions −35 to −31 appeared to be 3′AUNAU at 30°C but only U−34 seemed enhanced at 37°C. While no selection was detectable at positions +8 and +9 at 37°C, both converged to U at 30°C. Thus, passaging at 30°C appears to impose more stringent selection in these regions. The results also revealed additional differences between growth in mosquito and mammalian cells. The preference was for A−35, U−34, and A−32 in BHK-21 cells at 30°C but for U−35, A−34, and Y−32 in C7-10 cells (Fig. 3 and 4). Similarly, U+8 was preferred in BHK-21 cells, while A+8 was preferred in C7-10 cells. These results indicate that promoter sequence preference does seem to depend on both culture temperature and host cell.

FIG. 4. SIN promoter. (A) Consensus sequence of the virus populations after three passages on C7-10 cells at 30°C (Fig. 3A). Consensus bases are those that show obvious enrichment over the other bases (Fig. 3A). Bases that were found to be wild type are underlined. Positions that appeared to prefer more than one base are designated by the standard single-letter code (R = G or A; Y = U or C; S = G or C; W = A or U; K = G or U; H = A, U, or C; D = A, G, or U; N = A, C, G, or U). (B) Consensus sequence of individual clones sampled from the C7-10 P3 virus population (Table 3). Unambiguous bases are those found at a frequency of ≥0.6, except for positions −25 and −16, where the dominant base had a frequency of 0.5 but the other bases each had a frequency of ≤0.2. (C) Consensus sequence of the virus populations after four passages on BHK-21 cells at 37°C (Fig. 3B). (D) Consensus sequence of the −35 and +5 virus populations after four passages on BHK-21 cells at 30°C (Fig. 3B). Blanks indicate that populations were not tested.

Promoter activity and the fitness of P3 viruses. The in vivo evolution experiments showed that the wild-type base was generally preferred at positions −15 to +5. In contrast, positions −35 to −16 and +6 to +9 generally preferred non-wild-type bases or exhibited relaxed preferences. This suggests that some non-wild-type sequences function as well as the wild type does. If this is true, the activity of these promoters should be comparable to that of the wild type. Additionally, they should support virus growth as well as the wild-type promoter does. To test this, we cloned the most frequently sampled promoter (Table 3) from each P3 population into the wild-type background to ensure that the non-wild-type promoter sequences isolated were not accompanied by compensatory mutations outside of the −40/+14 promoter region which allowed the virus to grow well. The resulting clones are identical to the −40/+14 clone except at the indicated positions (Table 3). The recloned promoters supported high-titer virus growth after transfection, comparable to that of TCS and −40/+14 and better than that of the −19/+5 clone, especially in C7-10 cells (data not shown). This suggests that new mutations were not required for good growth. The activity of the promoters in these clones was then measured in C7-10 and BHK-21 cells (Fig. 5A). All had activity comparable to that of the −40/+14 and −98/+14 promoters and clearly higher than that of the −19/+5 promoter in both host cells. We also tested a randomly chosen promoter from the −20 P0 population. Its sequence is entirely non-wild type, but it had promoter activity comparable to that of the −98/+14 and −40/+14 promoters. This provides further evidence that the wild-type sequence at positions −20 to −16 is not required for promoter function. The competition assay is a more sensitive way to test whether one virus grows better than another. We therefore competed the clones against TCS (Fig. 5B). TCS was chosen as the wild-type control (versus the −40/+14 clone), since its promoter contains additional bases that make it easy to distinguish from the others. With the exception of the −19/+5 and −25 clones, all had fitness comparable to that of the −40/+14 clone, i.e., slightly more fit than TCS. (It is curious that the −40/+14 clone and the other clones, except −25 and −19/+5, are slightly more fit than TCS; the reason for this is unknown.) Thus, the presence of non-wild-type bases in the promoter of these clones was not deleterious. The much lower fitness of the −19/+5 clone relative to TCS was expected, since the activity of its promoter is approximately twofold lower than that of TCS. The slightly lower fitness of the −25 clone, a twofold decrease relative to TCS over four passages, suggests that its promoter is not the most fit in the −25 population, even though it was isolated four times among the 10 isolates characterized. In this regard, the estimated diversity of the −25 P3 population is by far the highest (Table 2), and it may be that a better promoter, more fit and possibly equally abundant in the population, was by chance not sampled.

The effect of point mutations on promoter activity. The in vivo evolution studies show that the wild-type base is strongly preferred at most positions in the minimal promoter (Fig. 3 and 4). The apparent superiority of the wild-type base was verified by measuring the activity of 22 minimal promoters with point mutations (Fig. 6) using double promoter constructs (32) derived from a SIN DI genome (18, 20) that include the wild-type SIN −19/+5 promoter as internal reference (32) (see Materials and Methods). None of the 14 point mutations resulting in promoter activities of ≤60% were present in the promoters sampled after three passages in C7-10 cells (Table 3), and most are clearly depleted in the P3 population as a whole (Fig. 3A). For example, mutations −6C and −11U result in promoter activities approximately 60% that of the wild type (Fig. 6). They were at best faintly present in the population after three passages (Fig. 3A). In comparison, five of the eight point mutations resulting in promoter activities ≥80% of the wild-type activity (Fig. 6) were found among the clones isolated after three passages in mosquito cells (Table 3). This suggests that the promoter must be ≥80% as active as the wild type for viral fitness to be high enough to survive three cycles of selection in mosquito cells. Comparable results were found in BHK-21 cells; for example, −11U persisted at relatively low frequencies to at least four passages (9, 10). Four point mutations (−16G, −2A, −1A, and +4U) led to promoters with higher activity than the wild type (Fig. 6). The −16G and −2A changes were abundant among the P3 clones (Table 3), but −1A and +4U were not sampled and did not appear enriched in the population as a whole (Fig. 3). Thus, levels of promoter activity greater than wild-type levels do not necessarily increase viral fitness and may even be deleterious.

TABLE 3. Sequence of random isolates from the P3 populations in C7-10 cells

No. and sequence of clones in library^a

Library   No.^b   Sequence
−35        0      UCUCC
           8      -AC-U
           1      -U--U
           1      A--UU
−30        0      CCUUU
           3      --CGC
           2      G-C-C
           2      --G--
           2      -AGG-
           1      A---A
−25        0      AUUUC
           4      U-CCA
           2      GG-CU
           1      --CC-
           1      CGACG
           1      -A--U
           1      U-AGA
−20        0      GUAGA
           5      AC-AG
           1      AC---
           1      A-CA-
           1      UC-AU
           1      CCU--
           1      AGCC-
−15        2      GAUGC
           5      -G---
           1      AG---
           1      AU---
           1      -C---
−10        0      CACCA
−5         1      GGAUU
           5      ---A-
+1         4      UAUCA
           4      ----U
           1      AUCAC
+5         1      AGUCG
          11      UCGAU
           2      -CA--
           1      -UAU-
           1      -AAU-
           1      UCG-C

^a Six or more clones were sequenced from each virus population after three passages in C7-10 mosquito cells. The wild-type sequence is at the top of each list and is underlined. For the other sequences, bases identical to the wild-type sequence are designated by a dash.
^b Number of times that particular sequence was found. The wild-type promoter sequence (underlined) for each 5-nucleotide block is the first listed. Numbers in bold indicate the sequences chosen for analysis of promoter activity and virus fitness.
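The dash notation of Table 3 and the ≥0.6 frequency rule quoted in the Fig. 4B legend can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, and writing a below-threshold base in lowercase is an assumption made here for display only. The input is the −35 library as listed in Table 3.

```python
from collections import Counter

def expand(wild_type, shorthand):
    """Replace each dash in a Table 3 style entry with the wild-type base."""
    return "".join(wt if s in "-\u2014" else s
                   for wt, s in zip(wild_type, shorthand))

def position_frequencies(wild_type, clones):
    """clones is a list of (count, shorthand) pairs.
    Returns one {base: frequency} dict per randomized position."""
    total = sum(count for count, _ in clones)
    columns = [Counter() for _ in wild_type]
    for count, shorthand in clones:
        for column, base in zip(columns, expand(wild_type, shorthand)):
            column[base] += count
    return [{base: n / total for base, n in column.items()}
            for column in columns]

def consensus(frequencies, threshold=0.6):
    """Most frequent base per position; lowercase if below the threshold
    (the lowercase convention is an assumption of this sketch)."""
    called = []
    for freq in frequencies:
        base, p = max(freq.items(), key=lambda item: item[1])
        called.append(base if p >= threshold else base.lower())
    return "".join(called)

# −35 library from Table 3: wild type UCUCC, then (count, sequence) pairs.
WILD_TYPE_35 = "UCUCC"
CLONES_35 = [(0, "UCUCC"), (8, "-AC-U"), (1, "-U--U"), (1, "A--UU")]

freqs_35 = position_frequencies(WILD_TYPE_35, CLONES_35)
consensus_35 = consensus(freqs_35)
print(consensus_35)
```

For these ten clones every position has a predominant base at a frequency of 0.8 or higher, so the threshold branch is not exercised; libraries with more diverse isolates, such as −25, would be.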


FIG. 5. Promoters of P3 clones. (A) Promoter activity. C7-10 or BHK-21 cells were infected with the most frequently isolated virus from each P3 population (Table 3) and an isolate from the −20 library at P0 with a promoter sequence of 3′AAGCU. Viral RNAs were radiolabeled in vivo with [32P]orthophosphate at 3 h (BHK-21) or 21.5 h (C7-10) p.i. and isolated at 5 or 24 h p.i., respectively. Total RNA was denatured and resolved on a 1% agarose gel. [32P]orthophosphate-radiolabeled RNA was examined via autoradiography and phosphorimager analyses. G denotes the genomic RNA of each clone; CAT denotes CAT mRNA transcribed by the −98/+14 CAT subgenomic mRNA promoter; and STR denotes STR mRNA transcribed by the STR promoter region of each virus. Relative promoter activity, listed below each lane, was determined by finding the ratio of STR to CAT mRNA for each sample and normalizing the values against that calculated for TCS. (B) Fitness of the viruses. The viruses were competed against TCS by transfecting approximately equal amounts of the respective in vitro transcripts into C7-10 cells. The resulting virus populations were passaged three more times. The promoter region of viruses in each mix was amplified by RT-PCR, labeled, and gel resolved as described in the legend to Fig. 2B. The abundance of each P3 virus relative to that of TCS is listed below each lane. The size of the PCR product from TCS is 233 bp, that from the P3 clones is 175 bp, and that from the −19/+5 virus is 147 bp.
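The normalization described in the Fig. 5A legend (ratio of STR to CAT mRNA for each sample, divided by the same ratio for TCS) can be written out as a short calculation. This is a sketch only: the function name is illustrative, and the band intensities are made-up numbers, not measurements from the paper.

```python
def relative_promoter_activity(str_signal, cat_signal,
                               tcs_str_signal, tcs_cat_signal):
    """STR/CAT ratio of a sample, normalized to the STR/CAT ratio of TCS.

    The CAT mRNA comes from the same -98/+14 promoter in every virus, so it
    serves as the internal reference within each lane.
    """
    sample_ratio = str_signal / cat_signal
    tcs_ratio = tcs_str_signal / tcs_cat_signal
    return sample_ratio / tcs_ratio

# Hypothetical phosphorimager counts (arbitrary units).
tcs_str, tcs_cat = 1200.0, 1000.0
mutant_str, mutant_cat = 300.0, 1000.0

tcs_activity = relative_promoter_activity(tcs_str, tcs_cat, tcs_str, tcs_cat)
mutant_activity = relative_promoter_activity(mutant_str, mutant_cat,
                                             tcs_str, tcs_cat)
print(tcs_activity, mutant_activity)
```

Dividing by the CAT signal first corrects for lane-to-lane differences in infection and labeling; dividing by the TCS ratio then sets the fully active promoter to 1.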

FIG. 6. Effect of point mutations in the −19/+5 region on promoter activity. DI constructs with double promoters were used to measure the activity of each mutant promoter relative to the wild-type internal control (see Materials and Methods). The values obtained were then normalized against that obtained for the DIC20a construct, both of whose promoters were the wild-type −19/+5 promoter, and expressed as a percentage of wild-type activity. The top line shows the wild-type SIN sequence. Each value in subsequent lines corresponds to the promoter activity due to the base change at the indicated nucleotide position. For example, an A-to-G mutation at position −18 has 90% of wild-type activity.

DISCUSSION

The complete SIN promoter was mapped to within the −40/+14 region encompassing the subgenomic mRNA start site. In vivo evolution was used to identify positions in the −35/+9 region, as well as the specific bases at each position that are preferred for promoter function (Fig. 4). We reasoned that positions where specific bases were most rapidly selected for are likely to be essential for promoter function while those that evolve more slowly probably contribute to achieving full promoter activity. By these criteria, the promoter consists of an essential region from −15 to +5 required for promoter function and a number of sites flanking it that contribute to full promoter activity in mosquito cells (Fig. 3). The essential region is the same in mammalian cells, but few of the flanking sites are required, some of which also depend on the culture temperature. These conclusions are consistent with previous studies showing that the −19/+5 region is the smallest region still giving detectable promoter activity and that full promoter activity could be obtained with the −98/+14 region (19, 32). For example, deletion of downstream sequences, between +6 and +14, should diminish promoter activity, but deletions into the essential region, beyond position +5, should abolish promoter function. Similarly, deletion of upstream sequences, in the −40 to −16 interval, should progressively reduce promoter activity, eventually rendering promoter activity undetectable even if the deletions did not extend into the essential region. This model of the promoter is verified by examining the properties of mutant promoters. Point mutagenesis experiments showed that base changes at many sites in the essential region result in defective promoter function (Fig. 6). The available data suggest that even a small decrease in promoter activity, ca. 40%, significantly decreases viral fitness. Complementary information is provided by characterizing promoters from the P3 populations, which are essentially mutants with one to five base changes. The results identify the mutations at specific positions that are consistent with normal promoter function (Table 3; Fig. 5). Notably, all but three of these positions are outside of the −15 to +5 essential region. Examination of the consensus sequence (Fig. 4A) and the sequence of individual isolates (Table 3) did not reveal any obvious secondary structure in either the essential or the flanking regions. In this respect, the SIN promoter resembles the brome mosaic virus subgenomic promoter (5) and is unlike the promoters of cucumber mosaic virus (5) or rubella virus (7), which are members of the alphavirus-like supergroup (39).
In theory, positions unrelated to promoter function should exhibit no base preference. However, most positions outside of the essential region do exhibit some base preference, but they typically prefer the non-wild-type base. In addition to contributing to promoter function, they may have other roles. The 5′ end of the subgenomic mRNA is encoded by the region studied, and selection for expression of optimal amounts of the structural proteins may also operate on the quality of the mRNA. Selection might also operate on the viral RNA II (46), which terminates at position −4, and the amount of RNA II that is made appears to depend on the −40 to +20 region (Fig. 2A). Further studies are needed to disentangle the possibly overlapping sequence requirements of these functions from those required just for promoter function. The junction region also encodes the carboxyl terminus of nsP4. The genome of the viruses used in this study was designed to remove this constraint; therefore, the observed base preference is not due to selection for nsP4 function. The fact that positions in the −35 to −16 region generally prefer non-wild-type bases shows that nsP4 coding constraints in the normal genome context play a major role in limiting the spectrum of allowable changes, even when the changes were preferred for promoter function in their absence. Thus, it is likely that the wild-type sequence outside the essential region reflects a compromise that best satisfies the potentially conflicting demands of the several functions of the region.

When provided with a choice of all four bases, the wild-type base is strongly preferred in both mosquito and mammalian cells, especially within the −15/+5 essential region (Fig. 4), suggesting that the wild-type sequence is optimal for promoter function and that the optimum is the same in both types of hosts. Similarly, point mutagenesis studies of the 3′ conserved sequence of alphaviruses, presumably required for initiation of minus-strand synthesis, showed that all mutations but one examined were deleterious (14), suggesting that this sequence too had been optimized. For the positions that are conserved among the alphaviruses (Fig. 1), sequence optimization probably preceded the divergence of the viruses. For the positions that are different among the alphaviruses, optimization probably occurred in parallel with sequence divergence. For example, the point mutagenesis results (Fig. 6) show that −10A of salmon pancreas disease virus; −7U of salmon pancreas disease virus, rainbow trout sleeping disease virus, and Venezuelan equine encephalitis virus; and +3A of Semliki Forest virus are individually deleterious in the SIN context. All these viruses probably have high mutation rates, and their promoters are likely to be as important for their life cycles as the SIN promoter is for SIN.
We therefore predict that promoter sequence changes during the divergence of the alphaviruses were accompanied by compensatory changes elsewhere in the promoter or the cognate viral transcription factor, since any host factors involved would evolve much too slowly. While the compensatory changes remain to be identified, it is likely that promoter recognition was optimized rapidly. Variants generated by mutation during replication have an initial abundance of ca. 10⁻⁴ to 10⁻⁵, about 10 to 100 times lower than the abundance of the wild-type base in our libraries. Since only a few passages were sufficient for the wild-type base to be selected for in the in vivo selection experiments, optimization in nature probably requires only a modest number of replication cycles. The cis-acting sequences of many other RNA virus families are also very well conserved (see reference 12 for a review).
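The abundance comparison above can be read as simple arithmetic on library composition. The sketch below rests on two assumptions that the text does not state: that all four bases are equally represented at each of the five randomized positions, and that the comparison refers to the complete five-nucleotide wild-type block rather than to a single base. The constant names are illustrative.

```python
# Assumption: each randomized position carries A, C, G and U in equal
# proportion, so every five-nucleotide block is equally likely at P0.
BASES = 4
RANDOMIZED_POSITIONS = 5
LIBRARIES = 9

variants_per_library = BASES ** RANDOMIZED_POSITIONS
total_variants = LIBRARIES * variants_per_library

# Expected starting fraction of the fully wild-type block in one library.
wild_type_block_fraction = 1 / variants_per_library

# Initial abundance of a variant arising by mutation, as quoted in the text.
mutant_fractions = (1e-4, 1e-5)
fold_excess = [wild_type_block_fraction / f for f in mutant_fractions]

print(variants_per_library, total_variants)
print(wild_type_block_fraction, fold_excess)
```

Under these assumptions each library holds 1,024 possible sequences (9,216 across the nine libraries), and the wild-type block starts at roughly 10⁻³, which is about 10 to 100 times the quoted abundance of a newly arising variant.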


The available evidence suggests that these sequences too have been optimized. For example, point mutagenesis studies of the influenza A virus vRNA promoter (30), the transcription signals of vesicular stomatitis virus (1, 2) and human respiratory syncytial virus (15), the 3′ transcription signal of Rift Valley fever virus (31), and the replication signal of a group A rotavirus (44) show that most if not all point mutations disrupt normal functioning. Sequence optimization might not be confined to viral cis-acting sequences. Steinhauer and Holland (36) proposed that the sequence of RNA virus populations can remain stable during extended periods of environmental stability despite their high mutation rates, because none of the variants is more fit than the wild type; i.e., the wild-type sequence is optimal. Indeed, the high mutation rates of RNA viruses might actually expedite genome optimization whenever the environment changes, by generating an abundant diversity of variants to be tested by selection. It will be interesting to determine if many other parts of the virus genome are also optimized.

ACKNOWLEDGMENT

This work was supported by Public Health Service grant AI26763.

REFERENCES

1. Barr, J. N., S. P. Whelan, and G. W. Wertz. 1997. cis-Acting signals involved in termination of vesicular stomatitis virus mRNA synthesis include the conserved AUAC and the U7 signal for polyadenylation. J. Virol. 71:8718–8725.
2. Barr, J. N., S. P. Whelan, and G. W. Wertz. 1997. Role of the intergenic dinucleotide in vesicular stomatitis virus RNA transcription. J. Virol. 71:1794–1801.
3. Carmichael, G. G., and G. K. McMaster. 1980. The analysis of nucleic acids in gels using glyoxal and acridine orange. Methods Enzymol. 65:380–391.
4. Chang, G. J., and D. W. Trent. 1987. Nucleotide sequence of the genome region encoding the 26S mRNA of eastern equine encephalomyelitis virus and the deduced amino acid sequence of the viral structural proteins. J. Gen. Virol. 68:2129–2142.
5. Chen, M. H., M. J. Roossinck, and C. C. Kao. 2000. Efficient and specific initiation of subgenomic RNA synthesis by cucumber mosaic virus replicase in vitro requires an upstream RNA stem-loop. J. Virol. 74:11201–11209.
6. Durbin, R., A. Kane, and V. Stollar. 1991. A mutant of Sindbis virus with altered plaque morphology and a decreased ratio of 26S:49S RNA synthesis in mosquito cells. Virology 183:306–312.
7. Frey, T. K. 1994. Molecular biology of rubella virus. Adv. Virus Res. 44:69–160.
8. Grakoui, A., R. Levis, R. Raju, H. V. Huang, and C. M. Rice. 1989. A cis-acting mutation in the Sindbis virus junction region which affects subgenomic RNA synthesis. J. Virol. 63:5216–5227.
9. Hertz, J. M., and H. V. Huang. 1995. Evolution of the Sindbis virus subgenomic mRNA promoter in cultured cells. J. Virol. 69:7768–7774.
10. Hertz, J. M., and H. V. Huang. 1995. Host-dependent evolution of the Sindbis virus promoter for subgenomic mRNA synthesis. J. Virol. 69:7775–7781.
11. Hertz, J. M., and H. V. Huang. 1992. Utilization of heterologous alphavirus junction sequences as promoters by Sindbis virus. J. Virol. 66:857–864.
12. Huang, H. 1997. Evolution of the alphavirus promoter and the cis-acting sequences of RNA viruses, p. 65–79. In J.-F. Saluzzo and B. Dodet (ed.), Factors in the emergence of arbovirus diseases. Elsevier, Paris, France.
13. Kinney, R. M., B. J. Johnson, V. L. Brown, and D. W. Trent. 1986. Nucleotide sequence of the 26 S mRNA of the virulent Trinidad donkey strain of Venezuelan equine encephalitis virus and deduced sequence of the encoded structural proteins. Virology 152:400–413.
14. Kuhn, R. J., Z. Hong, and J. H. Strauss. 1990. Mutagenesis of the 3′ nontranslated region of Sindbis virus RNA. J. Virol. 64:1465–1476.
15. Kuo, L., R. Fearns, and P. L. Collins. 1997. Analysis of the gene start and gene end signals of human respiratory syncytial virus: quasi-templated initiation at position 1 of the encoded mRNA. J. Virol. 71:4944–4953.
16. Lanciotti, R. S., M. L. Ludwig, E. B. Rwaguma, J. J. Lutwama, T. M. Kram, N. Karabatsos, B. C. Cropp, and B. R. Miller. 1998. Emergence of epidemic O’nyong-nyong fever in Uganda after a 35-year absence: genetic characterization of the virus. Virology 252:258–268.
17. Levinson, R. S., J. H. Strauss, and E. G. Strauss. 1990. Complete sequence of the genomic RNA of O’nyong-nyong virus and its use in the construction of alphavirus phylogenetic trees. Virology 175:110–123.
18. Levis, R., H. Huang, and S. Schlesinger. 1987. Engineered defective interfering RNAs of Sindbis virus express bacterial chloramphenicol acetyltransferase in avian cells. Proc. Natl. Acad. Sci. USA 84:4811–4815.
19. Levis, R., S. Schlesinger, and H. V. Huang. 1990. Promoter for Sindbis virus RNA-dependent subgenomic RNA transcription. J. Virol. 64:1726–1733.
20. Levis, R., B. G. Weiss, M. Tsiang, H. Huang, and S. Schlesinger. 1986. Deletion mapping of Sindbis virus DI RNAs derived from cDNAs defines the sequences essential for replication and packaging. Cell 44:137–145.
21. Liang, G. D., L. Li, G. L. Zhou, S. H. Fu, Q. P. Li, F. S. Li, H. H. He, Q. Jin, Y. He, B. Q. Chen, and Y. D. Hou. 2000. Isolation and complete nucleotide sequence of a Chinese Sindbis-like virus. J. Gen. Virol. 81:1347–1351.
22. Liljestrom, P., and H. Garoff. 1991. A new generation of animal cell expression vectors based on the Semliki Forest virus replicon. Bio/Technology 9:1356–1361.
23. Liljestrom, P., S. Lusa, D. Huylebroeck, and H. Garoff. 1991. In vitro mutagenesis of a full-length cDNA clone of Semliki Forest virus: the small 6,000-molecular-weight membrane protein modulates virus release. J. Virol. 65:4107–4113.
24. Netolitzky, D. J., F. L. Schmaltz, M. D. Parker, G. A. Rayner, G. R. Fisher, D. W. Trent, D. E. Bader, and L. P. Nagata. 2000. Complete genomic RNA sequence of western equine encephalitis virus and expression of the structural genes. J. Gen. Virol. 81:151–159.
25. Niesters, H. G., and J. H. Strauss. 1990. Defined mutations in the 5′ nontranslated sequence of Sindbis virus RNA. J. Virol. 64:4162–4168.
26. Ou, J. H., C. M. Rice, L. Dalgarno, E. G. Strauss, and J. H. Strauss. 1982. Sequence studies of several alphavirus genomic RNAs in the region containing the start of the subgenomic RNA. Proc. Natl. Acad. Sci. USA 79:5235–5239.
27. Ou, J. H., E. G. Strauss, and J. H. Strauss. 1983. The 5′-terminal sequences of the genomic RNAs of several alphaviruses. J. Mol. Biol. 168:1–15.
28. Ou, J. H., E. G. Strauss, and J. H. Strauss. 1981. Comparative studies of the 3′-terminal sequences of several alphavirus RNAs. Virology 109:281–289.
29. Ou, J. H., D. W. Trent, and J. H. Strauss. 1982. The 3′-non-coding regions of alphavirus RNAs contain repeating sequences. J. Mol. Biol. 156:719–730.
30. Piccone, M. E., A. Fernandez-Sesma, and P. Palese. 1993. Mutational analysis of the influenza virus vRNA promoter. Virus Res. 28:99–112.
31. Prehaud, C., N. Lopez, M. J. Blok, V. Obry, and M. Bouloy. 1997. Analysis of the 3′ terminal sequence recognized by the Rift Valley fever virus transcription complex in its ambisense S segment. Virology 227:189–197.
32. Raju, R., and H. V. Huang. 1991. Analysis of Sindbis virus promoter recognition in vivo, using novel vectors with two subgenomic mRNA promoters. J. Virol. 65:2501–2510.
33. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
34. Sarver, N., and V. Stollar. 1977. Sindbis virus-induced cytopathic effect in clones of Aedes albopictus (Singh) cells. Virology 80:390–400.
35. Simpson, D. A., N. L. Davis, S. C. Lin, D. Russell, and R. E. Johnston. 1996. Complete nucleotide sequence and full-length cDNA clone of S.A.AR86 a South African alphavirus related to Sindbis. Virology 222:464–469.
36. Steinhauer, D. A., and J. J. Holland. 1987. Rapid evolution of RNA viruses. Annu. Rev. Microbiol. 41:409–433.
37. Strauss, E. G., and J. H. Strauss. 1986. Structure and replication of the alphavirus genome, p. 35–90. In M. J. Schlesinger and S. Schlesinger (ed.), The Togaviridae and Flaviviridae. Plenum Publishing Corp., New York, N.Y.
38. Strauss, E. G., R. Levinson, C. M. Rice, J. Dalrymple, and J. H. Strauss. 1988. Nonstructural proteins nsP3 and nsP4 of Ross River and O’Nyong-nyong viruses: sequence and comparison with those of other alphaviruses. Virology 164:265–274.
39. Strauss, J. H., and E. G. Strauss. 1994. The alphaviruses: gene expression, replication, and evolution. Microbiol. Rev. 58:491–562.
40. Takkinen, K. 1986. Complete nucleotide sequence of the nonstructural protein genes of Semliki Forest virus. Nucleic Acids Res. 14:5667–5682.
41. Tsiang, M., S. S. Monroe, and S. Schlesinger. 1985. Studies of defective interfering RNAs of Sindbis virus with and without tRNA(Asp) sequences at their 5′ termini. J. Virol. 54:38–44.
42. Tsiang, M., B. G. Weiss, and S. Schlesinger. 1988. Effects of 5′-terminal modifications on the biological activity of defective interfering RNAs of Sindbis virus. J. Virol. 62:47–53.
43. Villoing, S., M. Bearzotti, S. Chilmonczyk, J. Castric, and M. Bremont. 2000. Rainbow trout sleeping disease virus is an atypical alphavirus. J. Virol. 74:173–183.
44. Wentz, M. J., J. T. Patton, and R. F. Ramig. 1996. The 3′-terminal consensus sequence of rotavirus mRNA is the minimal promoter of negative-strand RNA synthesis. J. Virol. 70:7833–7841.
45. Weston, J. H., M. D. Welsh, M. F. McLoughlin, and D. Todd. 1999. Salmon pancreas disease virus, an alphavirus infecting farmed Atlantic salmon, Salmo salar L. Virology 256:188–195.
46. Wielgosz, M. M., and H. V. Huang. 1997. A novel viral RNA species in Sindbis virus-infected cells. J. Virol. 71:9108–9117.