Solution structure of the general transcription factor 2I domain in ...

PROTEIN STRUCTURE REPORT

Solution structure of the general transcription factor 2I domain in mouse TFII-I protein

YUKIKO DOI-KATAYAMA,1,4 FUMIAKI HAYASHI,1 MAKOTO INOUE,1 TAKASHI YABUKI,1 MASAAKI AOKI,1 EIKO SEKI,1 TAKAYOSHI MATSUDA,1 TAKANORI KIGAWA,1 MAYUMI YOSHIDA,1 MIKAKO SHIROUZU,1 TAKAHO TERADA,1 YOSHIHIDE HAYASHIZAKI,1 SHIGEYUKI YOKOYAMA,1,2 AND HIROSHI HIROTA1,3

1 RIKEN Genomic Sciences Center, Tsurumi-ku, Yokohama 230-0045, Japan
2 Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan
3 Graduate School, Yokohama City University, Tsurumi-ku, Yokohama 230-0045, Japan

(RECEIVED January 25, 2007; FINAL REVISION May 1, 2007; ACCEPTED May 3, 2007)

Abstract

The general transcription factor TFII-I, with the corresponding gene name GTF2I, is an unusual transcriptional regulator that associates with both basal and signal-induced transcription factors. TFII-I consists of six GTF2I repeat domains, called I-repeats R1–R6. The structure and function of the GTF2I domain are not clearly understood, even though it contains a helix-loop-helix motif, which is considered to be the protein–protein interaction area, based on biochemical analyses. Here, we report the solution structure of the fifth repeat of the six GTF2I repeat domains from murine TFII-I, which was determined by heteronuclear multidimensional NMR spectroscopy (PDB code 1Q60). The three-dimensional structure of the GTF2I domain is classified as a new fold, consisting of four helices (residues 8–24, 34–39, 63–71, and 83–91), two antiparallel beta strands (residues 44–47 and 77–80), and a well-defined loop containing two β-turns between sheet 1 and helix 3. All of the repeats probably have similar folds to that of repeat 5, because the conserved residues in the GTF2I repeat domains are assembled on the hydrophobic core, turns, and secondary structure elements, as revealed by a comparison of the sequences of the first through the sixth GTF2I repeats in TFII-I.

Keywords: GTF2I domain; transcription factor; nuclear magnetic resonance; protein structure

Supplemental material: see www.proteinscience.org

4 Present address: Nikon Corporation, Shinagawa, Tokyo, Japan.
Reprint requests to: Hiroshi Hirota, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; e-mail: [email protected]; fax: 81-45-503-9210.
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.072792007.
Protein Science (2007), 16:1788–1792. Published by Cold Spring Harbor Laboratory Press. Copyright © 2007 The Protein Society

Transcription factor II-I (TFII-I), which consists of six GTF2I repeat domains, is a multifunctional transcription factor that binds to both a core promoter element and various upstream elements, thereby assisting in the communication between the basal machinery and upstream activators (Roy 2001). These transcription functions are regulated through its phosphorylation by tyrosine or serine kinases, such as Bruton’s tyrosine kinase (Btk) (Yang and Desiderio 1997), c-Src (Cheriyath et al. 2002), JAK2 (Kim and Cochran 2001), ERK/MAPK (Kim and Cochran 2000), and cGMP-dependent protein kinase (Casteel et al. 2002). The GTF2I and GTF2IRD1 genes, which encode the TFII-I family of proteins, are located within a critical region of the Williams-Beuren syndrome (WBS, also known as Williams syndrome) deletion. WBS is a neurodevelopmental disorder caused by a 1.55- to 1.84-megabase deletion at chromosomal subband 7q11.23 (Korenberg et al. 2000; Mervis 2003; Perez Jurado 2003; Tassabehji 2003). Recently, it was reported that TFII-I down-regulates estrogen-responsive genes containing Inr elements, by recruiting ERα and corepressors to these promoters (Ogura et al. 2005). Thus, TFII-I is viewed as an important positive and negative regulator of transcription and cell proliferation.

Genome analyses revealed that TFII-I consists of six GTF2I repeat domains, called I-repeats R1–R6, and an electrostatically basic region, which contains a sequence-specific DNA-binding domain, located between R1 and R2 (Ferre-D’Amare et al. 1993; Grueneberg et al. 1997; Roy et al. 1997; Cheriyath et al. 2002). The structure of the GTF2I domain has not been reported, and its function is not well understood. It has been suggested that the GTF2I domain contains a helix-loop-helix (HLH) motif,

which may function in DNA binding or protein–protein interactions, on the basis of genetic and biological analyses (Roy et al. 1997; Cheriyath and Roy 2001). Here, we describe the first three-dimensional structure of the fifth GTF2I domain (R5) from the murine TFII-I protein, which is also known as Btk-associated protein-135 (BAP-135), determined using multidimensional NMR spectroscopy. The three-dimensional structure of the GTF2I domain is classified as a new fold.

Results and Discussion

The superposition of 20 conformers, depicted in Figure 1A, shows that the structure is well defined. The three-dimensional structure of the GTF2I domain is composed of four helices, which include residues 8–24 (helix 1), 34–39 (helix 2), 63–71 (helix 3), and 87–91 (helix 4), and a two-stranded antiparallel beta sheet composed of residues 44–47 (sheet 1) and 77–80 (sheet 2). The four helices are on one side, while the β-strands are on the other side, with an ααβαβα topology. Helix 1, a four-and-a-half-turn helix, is the longest helix in this molecule and forms an HLH motif with helix 2, which is arranged at an angle of about 150° relative to helix 1. Helix 3 lies by the N-terminal side of helix 1, at a dihedral angle of 44°. Helix 4, which is found only in R5 of this isoform of TFII-I (Fig. 2B), faces helix 2. The two-stranded antiparallel beta sheet, which is supported by an internal hydrogen bond network between sheet 1 and sheet 2, faces the HLH motif consisting of helices 1 and 2. The downfield-shifted resonances of the amide protons of Lys81, Ile79, Lys77, Glu47, and Tyr45 suggest the existence of interactions between the two antiparallel beta strands; that is, those of Lys81 HN–Asp43 CO, Ile79 HN–Tyr45 CO, Lys77 HN–Glu47 CO, and Glu47 HN–Lys77 CO. There is a long, well-defined loop between sheet 1 and helix 3, where the Pro50–Val53 (PEGV) and Pro58–Phe61 (PSTF) regions can be regarded as unique type II and type I β-turns (turns 1 and 2), respectively, anchored by Leu49, Val53, Phe55, and Phe61. Nineteen residues, Val12, Leu15, Phe16, Leu24, Val32, Phe34, Phe37, Phe44, Val46, Leu49, Val53, Phe55, Phe61, Leu66, Ile69, Ile76, Phe78, Ile80, and Phe86, form the hydrophobic core of this protein. The distribution of the electrostatic surface potential reveals that the surface is negative overall. There are some broad positive patches on the surface of the long loop, helix 3, the two antiparallel beta strands, and the N terminus of helix 1 (Fig. 1C).

Figure 1. Overall structure of the GTF2I domain. (A) Stereoview of a trace of the backbone atoms for the ensemble of the 20 lowest energy structures (residues 8–93). (B) Ribbon diagram of the GTF2I domain. The leftmost view is in the same orientation as in A; each successive view to the right is rotated by 90° around the vertical axis. The side-chain heavy atoms of the residues forming the core of the protein are shown. (C) Molecular surface representations of the electrostatic potential (blue, positive; red, negative) of the GTF2I domain, calculated by MOLMOL. The view is in the same orientation as those in B and C.

A protein structurally related to this repeat 5 of GTF2I

could not be found in the DALI database (Holm and Sander 1993). Figure 2 shows the alignment of the first through sixth GTF2I repeats in TFII-I by CLUSTAL X (Thompson et al. 1997). A comparison of the sequences of the GTF2I domains in this protein suggests that all of the repeats have similar folds to that of repeat 5. However, each domain has distinguishing characteristics in its surface electrostatic charge distribution: many residues corresponding to the hydrophobic core, turns, and secondary structure elements are conserved, but the charged residues are not absolutely conserved throughout these GTF2I repeats. This structure elucidation shows that the Pro83–Lys91 segment (PEMFETAIK), which is unique to repeat 5 of BAP-135 in the TFII-I family, forms helix 4. The Ser59 residue corresponds to Ser784 in repeat 5 of the BAP-135 protein, which is phosphorylated by cGMP-dependent protein kinase Iβ, as is Ser412 in GTF2I repeat 2 (Casteel et al. 2002). This structural analysis revealed that Ser59 is located on turn 2, and that the charged residues are clustered in distinctly polarized patches surrounding the residue. The sequence homology and the characteristic distribution of the electrostatic surface potential suggest that the type I β-turn might have some functional role by interacting with the active site of G-kinase Iβ.

Figure 2. (A) Schematic diagram of the mouse BAP-135 protein. The dotted boxes represent the GTF2I domains (denoted as R1–R6). (B) Sequence alignment of each GTF2I repeat domain (R1–R6). The "buried" and "accessible" annotations were generated by PROCHECK. The sequences were aligned by ClustalX and displayed by ESPript. The colors are chosen according to the similarity (red box and white character, strict identity; red character, similarity in a group; blue frame, similarity across groups). cGMP-dependent protein kinase Iβ-phosphorylated serine residues are marked by green boxes.

Materials and Methods

Sample preparation and NMR experiments

The DNA encoding the GTF2I domain of mouse TFII-I was subcloned by PCR from the mouse full-length cDNA clone (RIKEN cDNA ID 2810454O07) (Kawai et al. 2001). This DNA fragment was cloned into the expression vector pCR2.1 (Invitrogen) as a fusion protein with an N-terminal 6-His affinity tag and a TEV protease cleavage site. The 13C/15N-labeled amino acids were used in the Escherichia coli cell-free protein expression system (Kigawa et al. 1999). The synthetic mixture was first adsorbed to a HisTrap column (Amersham Biosciences), which was washed with buffer A (20 mM sodium phosphate buffer at pH 7.2, containing 1 M sodium chloride and 20 mM imidazole), and was eluted with buffer B (20 mM sodium phosphate buffer at pH 7.2, containing 500 mM imidazole). The eluted protein was incubated at 30°C for 1 h with the TEV protease to cleave the His-tag. The sample was desalted by dialysis against buffer C (20 mM sodium phosphate buffer at pH 7.2, containing 1 mM dithiothreitol). The sample was applied to a HiTrap SP column, and the flow-through fraction was fractionated on a HiTrap Q column by a concentration gradient of buffer C and buffer D (20 mM sodium phosphate buffer at pH 7.2, containing 1 M sodium chloride and 1 mM dithiothreitol). Fractions containing the purified labeled GTF2I domain (4.5 mg) were obtained. The uniformly 15N- and 13C-labeled GTF2I domain includes cloning artifacts from the vector: seven amino acids, "GSSGSSG," at the N terminus and six amino acids, "SGPSSG," at the C terminus.

All NMR spectra were measured at 25°C, with 0.9 mM protein dissolved in 1H2O/2H2O (9/1) 20 mM sodium phosphate buffer (pH 7.2), containing 100 mM NaCl and 1 mM d-DTT, on Varian INOVA 600 and 800 spectrometers equipped with a 5-mm F, triple resonance, three-axes pulsed-field gradient probe. A series of double and triple resonance experiments (Bax 1994; Cavanagh et al. 1996) was recorded for the assignment of the resonances of the GTF2I repeat domain. HNCO, HNCA, HNCACB, CBCA(CO)NH, and (HCA)CO(CA)NH experiments were performed for sequential assignment of the peptide backbone. For the side-chain assignments, HBHA(CO)NH, HBHANH, CC(CO)NH, HCCH-TOCSY, and CCH-TOCSY experiments, as well as two-dimensional 1H-15N HSQC and constant-time 1H-13C HSQC experiments, were performed.
NOE data for structure determination were extracted from three-dimensional 15N- and 13C-edited NOESY spectra, recorded with a mixing time of 75 msec. All data were processed with NMRPipe, version 20020425 (Delaglio et al. 1995). The processed data were analyzed by the program Kujira, version 0.816 (N. Kobayashi, pers. comm.), created on the basis of NMRView (Johnson and Blevins 1994).

Structure determination

The calculation and structural analysis of the obtained NMR data were performed with the program CYANA, version 1.0.7 (Güntert 2003), starting from 100 random conformers. The 20 conformers with the lowest residual restraint violations were selected from 100 calculated structures. For the final structure calculation, 1818 distance constraints were used. Backbone dihedral angle restraints (φ and ψ angles) from TALOS (Cornilescu et al. 1999) were used in the region with secondary structure. The calculation statistics are summarized in Table 1. The atomic coordinates have been deposited in the Protein Data Bank (accession code 1Q60). The Ramachandran plot was analyzed using PROCHECK-NMR (Laskowski et al. 1993). Secondary structure elements were calculated using the program MOLMOL (Koradi et al. 1996), which was also used to create figures of the structures.
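As a quick consistency check on the restraint statistics, the sketch below tallies the NOE distance-restraint categories listed in Table 1 and compares the sum with the 1818 distance constraints quoted in the text; it also sums the Ramachandran percentages. The dictionary keys and variable names are illustrative choices and are not part of the original analysis. The Ramachandran total may differ slightly from 100% because the reported values are rounded.

```python
# Values transcribed from Table 1; the key names are illustrative only.
noe_restraints = {
    "intraresidue": 932,
    "medium_range": 369,  # 1 < |i - j| < 5
    "long_range": 517,    # |i - j| >= 5
}
reported_total = 1818  # distance constraints used in the final calculation

ramachandran_percent = {
    "favored": 78.3,
    "additional_allowed": 19.5,
    "generously_allowed": 2.2,
    "disallowed": 0.1,
}

# Sum the NOE categories and compare with the reported total.
noe_total = sum(noe_restraints.values())

# Sum of the rounded Ramachandran percentages, to one decimal place.
rama_total = round(sum(ramachandran_percent.values()), 1)

# Share of long-range restraints among all NOE distance restraints.
long_range_fraction = noe_restraints["long_range"] / noe_total

print(noe_total == reported_total)
print(rama_total)
print(f"{long_range_fraction:.1%}")
```

The three categories add up to the reported total, so the counts in the table and in the text agree with each other.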

Table 1. Structural statistics for the superposition of the 20 structures of the GTF2I domain with the fewest violations from NMR restraints

Number of experimental restraints
  Distance restraints from NOEs
    Intraresidue                               932
    Medium range (1 < |i−j| < 5)               369
    Long range (|i−j| ≥ 5)                     517
    Total                                      1818
  Dihedral angle restraints (φ and ψ)          90
CYANA target function value (Å²)               0.37 ± 0.09
Ramachandran analysis [a]
  Residues in favored regions                  78.3%
  Residues in additional allowed regions       19.5%
  Residues in generously allowed regions       2.2%
  Residues in disallowed regions               0.1%
Coordinate precision (Å) [a]
  Average backbone RMSD to mean                0.38 ± 0.08 Å
  Average heavy atom RMSD to mean              0.81 ± 0.07 Å

[a] For residues 8–93.

Sequence alignment

CLUSTAL X 1.81 was used for the sequence alignment. A structure-based sequence alignment was performed by modifying the gap penalties for the secondary structure regions (four times larger penalties within the secondary structures and two times larger penalties at the boundaries).
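The gap-penalty scaling used for the structure-based alignment can be pictured as a per-residue multiplier profile. The following is a minimal sketch, not the CLUSTAL X implementation. The secondary structure ranges are those reported for repeat 5 in the Results. The function name `gap_penalty_multipliers`, the profile length of 93 residues, and the choice to treat the first and last residue of each element as its "boundary" are assumptions made for this example.

```python
# Secondary structure elements of repeat 5 (1-based, inclusive residue ranges)
# as given in the Results: helices 1-4 and the two beta strands.
SECONDARY_STRUCTURE = [
    (8, 24),   # helix 1
    (34, 39),  # helix 2
    (44, 47),  # strand 1
    (63, 71),  # helix 3
    (77, 80),  # strand 2
    (87, 91),  # helix 4
]


def gap_penalty_multipliers(length, elements, inside=4.0, boundary=2.0):
    """Return one gap-penalty multiplier per residue (index 0 = residue 1).

    Residues outside any element keep a multiplier of 1.0.
    Assumption: the "boundary" of an element is its first and last residue.
    """
    multipliers = [1.0] * length
    for start, end in elements:
        for residue in range(start, end + 1):
            is_edge = residue in (start, end)
            multipliers[residue - 1] = boundary if is_edge else inside
    return multipliers


# Hypothetical profile length: residues 1-93 of the numbering used in the text.
profile = gap_penalty_multipliers(93, SECONDARY_STRUCTURE)
print(profile[6:10])  # residues 7-10 -> [1.0, 2.0, 4.0, 4.0]
```

In an alignment program, such a profile would multiply the base gap-opening penalty at each position, so that gaps are discouraged inside helices and strands and fall preferentially in loops.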

Acknowledgment

This work was supported by the RIKEN Structural Genomics/Proteomics Initiative (RSGI), the National Project on Protein Structural and Functional Analyses, Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

Bax, A. 1994. Multidimensional nuclear magnetic resonance methods for protein studies. Curr. Opin. Struct. Biol. 4: 738–744.
Casteel, D.E., Zhuang, S., Gudi, T., Tang, J., Vuica, M., Desiderio, S., and Pilz, R.S. 2002. cGMP-dependent protein kinase Iβ physically and functionally interacts with the transcriptional regulator TFII-I. J. Biol. Chem. 277: 32003–32014.
Cavanagh, J., Fairbrother, W.J., Palmer III, A.G., and Skelton, N.J. 1996. Protein NMR spectroscopy, principles and practice. Academic Press, San Diego, CA.
Cheriyath, V. and Roy, A.L. 2001. Structure–function analysis of TFII-I. Roles of the N-terminal end, basic region, and I-repeat. J. Biol. Chem. 276: 8377–8383.
Cheriyath, V., Desgranges, Z.P., and Roy, A.L. 2002. c-Src-dependent transcriptional activation of TFII-I. J. Biol. Chem. 277: 22798–22805.
Cornilescu, G., Delaglio, F., and Bax, A. 1999. Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13: 289–302.
Delaglio, F., Grzesiek, S., Vuister, G.W., Zhu, G., Pfeifer, J., and Bax, A. 1995. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6: 277–293.
Ferre-D’Amare, A.R., Prendergast, G.C., Ziff, E.B., and Burley, S.K. 1993. Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain. Nature 363: 38–45.
Grueneberg, D.A., Henry, R.W., Brauer, A., Novina, C.D., Cheriyath, V., Roy, A.L., and Gilman, M. 1997. A multifunctional DNA-binding protein that promotes the formation of serum response factor/homeodomain complexes: Identity to TFII-I. Genes & Dev. 11: 2482–2493.
Güntert, P. 2003. Automated NMR protein structure calculation. Prog. NMR Spectrosc. 43: 105–125.
Holm, L. and Sander, C. 1993. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233: 123–138.
Johnson, B.A. and Blevins, R.A. 1994. NMRView: A computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4: 603–614.
Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409: 685–690.
Kigawa, T., Yabuki, T., Yoshida, Y., Tsutsui, M., Ito, Y., Shibata, T., and Yokoyama, S. 1999. Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Lett. 442: 15–19.
Kim, D.-W. and Cochran, B.H. 2000. Extracellular signal-regulated kinase binds to TFII-I and regulates its activation of the c-fos promoter. Mol. Cell. Biol. 20: 1140–1148.
Kim, D.-W. and Cochran, B.H. 2001. JAK2 activates TFII-I and regulates its interaction with extracellular signal-regulated kinase. Mol. Cell. Biol. 21: 3387–3397.
Koradi, R., Billeter, M., and Wüthrich, K. 1996. MOLMOL: A program for display and analysis of macromolecular structures. J. Mol. Graph. 14: 51–55.
Korenberg, J.R., Chen, X.N., Hirota, H., Lai, Z., Bellugi, U., Burian, D., Roe, B., and Matsuoka, R. 2000. VI. Genome structure and cognitive map of Williams syndrome. J. Cogn. Neurosci. 12: 89–107.
Laskowski, R.A., Rullmann, J.A., MacArthur, M.W., Kaptein, R., and Thornton, J.M. 1993. AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8: 477–486.
Mervis, C.B. 2003. Williams syndrome: 15 years of psychological research. Dev. Neuropsychol. 23: 1–12.
Ogura, Y., Azuma, M., Tsuboi, Y., Kabe, Y., Yamaguchi, Y., Wada, T., Watanabe, H., and Handa, H. 2005. TFII-I down-regulates a subset of estrogen-responsive genes through its interaction with an initiator element and estrogen receptor α. Genes Cells 11: 373–381.
Perez Jurado, L.A. 2003. Williams-Beuren syndrome: A model of recurrent genomic mutation. Horm. Res. 59: 106–113.
Roy, A.L. 2001. Biochemistry and biology of the inducible multifunctional transcription factor TFII-I. Gene 274: 1–13.
Roy, A.L., Du, H., Gregor, P.D., Novina, C.D., Martinez, E., and Roeder, R.G. 1997. Cloning of an Inr- and E-box-binding protein, TFII-I, that interacts physically and functionally with USF1. EMBO J. 16: 7091–7104.
Tassabehji, M. 2003. Williams-Beuren syndrome: A challenge for genotype-phenotype correlations. Hum. Mol. Genet. 12: R229–R237.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882.
Yang, W. and Desiderio, S. 1997. BAP-135, a target for Bruton’s tyrosine kinase in response to B cell receptor engagement. Proc. Natl. Acad. Sci. 94: 604–609.