Biou, V., Gibrat, J. F., Levin, J. M., Robson, B. and Garnier, J. (1988). Secondary structure prediction: Combination of three different methods. Protein Eng. 2 ...

Journal of Cell Science 107,3133-3144 (1994) Printed in Great Britain © The Company of Biologists Limited 1994

An unusual intermediate filament subunit from the cytoskeletal biopolymer released extracellularly into seawater by the primitive hagfish (Eptatretus stouti)

Elizabeth A. Koch1,*, Robert H. Spitzer1, Ron B. Pithawalla1 and David A. D. Parry2

1Department of Biological Chemistry, University of Health Sciences, The Chicago Medical School, North Chicago, IL 60064, USA
2Department of Physics and Biophysics, Massey University, Palmerston North, New Zealand

*Author for correspondence

SUMMARY

Each slime gland thread cell from the primitive Pacific hagfish (Eptatretus stouti) contains a massive, conical, intermediate filament (IF)-rich biopolymer (‘thread,’ ~60 cm length, ~3 µm width). In view of the unusual ultrastructure of the thread, its extracellular role in modulation of the viscoelastic properties of mucus, and the ancient lineage of this primitive vertebrate, we report the nucleotide and deduced amino acid sequences of one major thread IF subunit, α (pI 7.5), which is coexpressed with a second polypeptide, γ (pI 5.3). These two polypeptides coassemble in vitro into ~10 nm filaments. The α-thread chain, a 66.6 kDa polypeptide, has an unusual central rod domain containing 318 residues flanked by N- and C-terminal domains of 192 and 133 residues, respectively. Each peripheral region exhibits some epidermal keratin-like features including peptide repeats and a high total content of glycine and serine residues. The terminal domains, however, lack the H1 and H2 subdomains characteristic of known keratins. Moreover, when the central rod is aligned either in relation to established homology profiles (J. F. Conway and D. A. D. Parry (1988) Int. J. Biol. Macromol. 10, 79-98) of other IF subunits (types I-V, nestin, non-neuronal invertebrate), or by computer-based homology searches of the GenBank™/EMBL Data Bank, a low identity (Ser (~10%)>Thr (~9%). The molecular mass of α, based on its deduced sequence (66.6 kDa), is near the value estimated by SDS-PAGE (64 kDa; Spitzer et al., 1988).

Alignment with other IF chains indicates that α is also an IF chain

Secondary structure algorithms served to delineate a qualitative tripartite structure of N- and C-terminal domains that flank a mainly α-helical central rod (Fig. 2, legend).
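The domain sizes quoted for the α chain can be cross-checked with simple arithmetic. The sketch below is only an illustrative check, not part of the authors' analysis: it sums the rod subdomain lengths assigned in the Results, adds the terminal domains, and derives a mean residue mass from the reported 66.6 kDa. All variable and function names are hypothetical.

```python
# Illustrative consistency check of the reported domain sizes (residues).
# Names are hypothetical; the numbers are the ones quoted in the text.
N_TERMINAL = 192
C_TERMINAL = 133
REPORTED_ROD = 318
REPORTED_MASS_KDA = 66.6

# Rod subdomain lengths as assigned in the Results.
ROD_SUBDOMAINS = {
    "1A": 35, "L1": 13, "1B": 101, "L12": 21,
    "2A": 19, "L2": 8, "2B": 121,
}


def rod_length(subdomains):
    """Total rod length as the sum of its helical and linker segments."""
    return sum(subdomains.values())


def chain_length(n_term, rod, c_term):
    """Full chain length from the tripartite domain structure."""
    return n_term + rod + c_term


def mean_residue_mass(mass_kda, n_residues):
    """Average mass per residue in daltons."""
    return mass_kda * 1000.0 / n_residues


rod = rod_length(ROD_SUBDOMAINS)
total = chain_length(N_TERMINAL, rod, C_TERMINAL)
print(rod, total, round(mean_residue_mass(REPORTED_MASS_KDA, total), 1))
```

The subdomain lengths sum to the reported 318 residues, the full chain comes to 643 residues, and the mean residue mass is about 104 Da, which is on the low side for a protein and fits the high glycine and serine content.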
On this basis, the sequence of α was amenable to alignment with other IF chains in relation to homology profiles described by Conway and Parry (1988), and assignment within the central rod (Figs 2, 3) of subdomains as follows (number of residues in parentheses): 1A (35), L1 (13), 1B (101), L12 (21), 2A (19), L2 (8), 2B (121). The 1B subdomain of α does not contain the 42 residue insert that is characteristic of lamins (Fisher et al., 1986; McKeon et al., 1986; Doring and Stick, 1990) and the non-neuronal IF from the invertebrate, Helix aspersa (Weber et al., 1988, 1989; Dodemont et al., 1990). Heptad repeats with apolar residues in the first (a) and fourth (d) position of α show correspondence to apolar sites in other chains (Fig. 3) (Steinert et al., 1983; Weber and Geisler, 1985; Franz and Franke, 1986). A discontinuity in heptad periodicity (ST) noted near the middle of 2B in α corresponds to the ‘canonical’ stutter feature of other IF subunits (Steinert and Roop, 1988; Parry, 1990). Among the apolar sites (a,d) in helical regions of α, about 82% correspond to apolar residues (a, 82.1%; d, 82.5%), a value within the expected range of >75% established for other IFs (Steinert and Roop, 1988; Parry and Fraser, 1985; Conway and Parry, 1988). Alignment of α with the central rods representing 11 examples of the major types of IF subunits from various species (human, mouse, hamster, frog, goldfish and snail) indicates that only 12 residues (~3.8%) are conserved absolutely (Fig. 3). These sites reside mainly within conserved consensus regions at the N- and C-terminal ends of the rod (Parry, 1990), are among those cited as highly conserved in IF polypeptides (Conway and Parry, 1988) and support in part the overall alignment profile. Further justification of IF alignment is based on over 70 sites along the rod that are identical in six or more of the 12 sequences including α (Fig. 3). Although α shows 56% identity to a conserved consensus region (LNDRFASYI(D/E)KVRFLE) near the N-terminal end within rod 1A (Parry, 1990), this value is lower than those for other IF types included in Fig. 3: II (75-87.5%); I (81.5%); III (93-100%); IV (75%); V (75%), invertebrate

Fig. 2. Nucleotide sequence and deduced amino acid sequence of hagfish gland thread cell cDNA encoding a 66.6 kDa polypeptide. Stop codon, triple asterisk; polyadenylation signal, double-underlined; location of five sequenced tryptic fragments (Tf 1-5), delineated by broken lines; sequence of central rod, between arrows; single dot (•) beneath residues 1 and 4 of heptad repeats; some tandem and near-tandem repeats above half-brackets. PHD algorithm (Rost et al., 1993) estimates central rod to include residues 191-503 while the Combine method (Biou et al., 1985) includes 192-507. Notation and size of α-helical segments (1A, 1B, 2A and 2B below bold lines) and linker segments (L1, L12, L2) in the manner of Conway and Parry (1988) and Parry (1990). ST in 2B is region of discontinuity in heptad repeats. Threonine residues are in bold type. Amino acid content (%) deduced by sequence analysis compared to content after acid hydrolysis (parentheses; Spitzer et al., 1984): Ala, 7.5 (7.7); Val, 7.6 (7.0); Leu, 6.1 (6.2); Ile, 5.6 (4.8); Pro, 3.3 (3.7); Met, 2.0 (1.7); Phe, 2.2 (2.2); Trp, 0.3 (0.7); Gly, 15.9 (15.8); Ser, 10.3 (9.9); Thr, 9.0 (9.3); Tyr, 3.9 (3.4); Lys, 3.6 (3.3); Arg, 4.7 (4.9); His, 2.0 (2.1); Asp, 3.7; Glu, 4.1; Asn, 3.3; Gln, 4.8; Asx, 7.0 (7.7); Glx, 8.9 (9.4); Cys, 0.0 (0.2). (A single cysteine residue was identified at position 87 in one clone, but the site was assigned as tyrosine for reasons described in Materials and Methods.)
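The composition figures in the legend to Fig. 2 lend themselves to a quick internal check. The following sketch, with hypothetical names and not part of the original methods, pools Asp + Asn into Asx and Glu + Gln into Glx (acid hydrolysis cannot distinguish the amide from the acid), confirms that the deduced percentages sum to roughly 100%, and finds the residue with the largest gap between the deduced and hydrolysis values.

```python
# Amino acid content (%) from the Fig. 2 legend. Names are illustrative only.
DEDUCED = {
    "Ala": 7.5, "Val": 7.6, "Leu": 6.1, "Ile": 5.6, "Pro": 3.3,
    "Met": 2.0, "Phe": 2.2, "Trp": 0.3, "Gly": 15.9, "Ser": 10.3,
    "Thr": 9.0, "Tyr": 3.9, "Lys": 3.6, "Arg": 4.7, "His": 2.0,
    "Asp": 3.7, "Glu": 4.1, "Asn": 3.3, "Gln": 4.8, "Cys": 0.0,
}
HYDROLYSIS = {
    "Ala": 7.7, "Val": 7.0, "Leu": 6.2, "Ile": 4.8, "Pro": 3.7,
    "Met": 1.7, "Phe": 2.2, "Trp": 0.7, "Gly": 15.8, "Ser": 9.9,
    "Thr": 9.3, "Tyr": 3.4, "Lys": 3.3, "Arg": 4.9, "His": 2.1,
    "Asx": 7.7, "Glx": 9.4, "Cys": 0.2,
}


def pool_amides(deduced):
    """Merge Asp+Asn into Asx and Glu+Gln into Glx, as hydrolysis reports them."""
    pooled = {k: v for k, v in deduced.items()
              if k not in ("Asp", "Asn", "Glu", "Gln")}
    pooled["Asx"] = round(deduced["Asp"] + deduced["Asn"], 1)
    pooled["Glx"] = round(deduced["Glu"] + deduced["Gln"], 1)
    return pooled


pooled = pool_amides(DEDUCED)
total_percent = round(sum(DEDUCED.values()), 1)
gly_plus_ser = round(DEDUCED["Gly"] + DEDUCED["Ser"], 1)

# Residue with the largest absolute deduced-vs-hydrolysis difference.
differences = {k: round(abs(pooled[k] - HYDROLYSIS[k]), 1) for k in HYDROLYSIS}
largest_gap = max(differences, key=differences.get)

print(pooled["Asx"], pooled["Glx"], total_percent, gly_plus_ser, largest_gap)
```

The pooled values reproduce the Asx (7.0) and Glx (8.9) entries in the legend, and no residue differs from the hydrolysis value by more than 0.8 percentage points.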

Block 1 (segments 1A, L1, 1B; block ends at hagfish α residue 255)
hag α: IDPATLPSPDTVQHTRIREKQDLQTLNTKFANLVDQVRTLEQHNAILKAQISMITSP--SDTPE-----GPV-NTAVVASTVTATYNAQI
hu II: LQPLNVEIDPEIQKVKSREREQIKSLNNQFASFIDKVRFMEQQNKVLQTKWELL-QQ-V-DTS-T--RT---HNLEPYFESFINNSRRRV
mu II: LQPLNVKVDPQIQKVKSQEREQIKSLNDKFASFIDKVRFLEQQNKVLQTKWELL-QQ-V-DTT-T--RTQ---NLDPFFENYISILRRKV
X II: LAPLNLEIDPSIQQVRTEEKEQIKTLNNKFASFIDKVRFLEQQNKMLETKWNLL-QN-Q---K-T-TRS----NMDGMFEAYISNLRRQL
gof II: LAPLNLEIDPNIQVVRTQEKEQMKSLNNRFASFIDKVRFLEQQNKMLETKWSLL-QN-Q--TA-T--RS----NIDAMFEAYINNLRRQL
hu I: GGGFGGGFAGGDGLLVGSEKVTMQNLNDRLASYLDKVRALEEANADLEVKIRDWYQ------RQRP--AE-IKD-YSPYFKTIEDLRNKI
mu I: GFGGGGFGGDGGGLLSGNGRVTMRNLNDRLASYMDKVRALEESNYELEGKIKEVY--R--EARQLKPR-EP-RD-YSKYYKTIEDLKGQI
vim: DFSLADAINTEFKNTRTNEKVELQELNDRFANYIDKVRFLEQQNKILLAELEQL---K-GQGK--------SR-LGDLYEEEMRELRRQV
GFAP: DFSLAGALNAGFKETRASERAEMMELNDRFASYIEKVRFLEQQNKALAAELNQL---R---AKE------PTK-LADVYQAELRELRLRL
NFL: DVSQVAAISNDLKSIRIQEKAQLQDLNDRFASFIERVHELEQQNKVLEAGLLVL---RQ---K----HSGPSR-FRALYEQEIRDLRLAA
X lam 3: HASAAQSPGSPTRISRMQEKEDLRHLNDRLAAYIERVRSLEADKSLLKIQLEER-EE-V-SSRE--V-T----NLRQLYETELADARKLL
He inv: YQQLSSSGITDFRGTREKEKREMQNLNERLASYIEKVHFLDAQVKKLEAENEAL-RNR-K-S-ES-L--QP---IRDAYENELAQARKVI

Block 2 (segments 1B, L12; block ends at hagfish α residue 345)
hag α: EDLRTTNTALHSEIDHLTTIINDITTKYEEQVEVTRTLETDWNTNKDNIDNTYLTIVDLQTKVQGLDEQINTTKQIYNARVREVQAAVTG
hu II: DQLKSDQSRLDSELKNMQDMVEDYRNKYEDEINKRTNAENEFVTIKKDVDGAYMTKVDLQAKLDNLQQEIDFLTALYQAELSQMQTQIS-
mu II: DSLKSDQSRMESELKNMQDLVEEYRTKYEDEINKRTNAENEFVTIKKDVDSAYMTKVELQAKRDALQQDINFFSTLYQMEMSQMQTQIS-
X II: DGLGQDKMRLESELGNMQGLVEDFKNKYEDEINRRTELENEFVLLKKDVDEAYMNKVQLEARLEALTDEINFLRQLYEEELREMQSQIS-
gof II: DSLGNDKMKLEADLHNMQGLVEDFKNKYEDEINKRTECENDFVLIKKDVDEAYMNKVELEAKLESLSDEINFLRQIFEEEIRELQSQIK-
hu I: LTATVDNANVLLQIDNARLAADDFRTKYETELNLRMSVEADINGLRRVLDELTLARADLEMQIESLKEELAYLKKNHEEEMNALRGQVGG
mu I: LTLTTDNANVLLQIDNARLAADDFRLKYENELTLRNSYEADINGLRRVLDELTLSQSVLELQIESLNEELAYLKKNLEEEMRDLQNVSTG
vim: DQLTNDKARVEVERDNLAEDIMRLREKLQEEMLQREEAESTLQSFRQDVDNASLARLDLERKVESLQEEIAFLKKLHDEEIQELQAQIQ-
GFAP: DQLTANSARLEVERDNFAQDLGTLRQKLQDETNLRLEAENNLAAYRQEADEATLARVDLERKVESLEEEIQFLRKIYEEEVRDLREQLA-
NFL: EDATNEKQALEGEREGLEETLRNLQARYEEEVLSREDAEGRLMEARKGADEAALARAELEKRIDSLMDEIAFLKKVHEEEIAELQAQIQI
X lam 3: DQTANERARLQVELGKVREEYRQLQARLQEQRAQIAGLESSLRDTTKQLHDEMLWRVDLENKMQTIREQLDFQKNIHTQEVKEIKKRHD-
He inv: DELSSTKGVSEAKVAGLQDEIASLRELNAKVRELLDKIQEQNRRLRADLDTETAAHIEADCLAQTKTEEAEFYKDLLDQLELLKPEPIQI

Block 3 (segments 2A, L2, 2B; block ends at hagfish α residue 434)
hag α: GPTAAYSIRVDNTHQ-AIDLTTSLQEMKTHYEVLATKSREEAFTQVQPRIQEMAVTVQAGPQAIIQAKEQIHVFKLQIDSVHREIDRLHR
hu II: ETNVIL-SMDNN-RQ--FDLDSIIAEVKAQYEDIAQKSKAEAESLYQSKYEELQITAGRHGDSVRNSKIEISELNRMIQRLRSEIDNVKK
mu II: ETNVVL-SMDNN-RQ--FDLDGIISEVKAQYDSICQRSKAEAETFYQSKYEELQITAGKHGDSVRNTKMEISELNRMIQRLRSEIDGCKK
X II: DTSVVL-SMDNN-RS--LDLDGIIAEVRAQYEDVANKSRLEVENMYQVKYQELQTSAGRYGDDLKNTKTEISELTRYTTRLQSEIDALKA
gof II: DTSVVV-EMDNS-RN--LDMDAIVAEVRAQYEDIANRSRAEAEMWYKSKYEEMQTSATKYGDDLRSTKTEIADLNRMIQRLQSEIDAVKG
hu I: D--VNVE-MDAA---PGVDLSRILNEMRDQYEKMAEKNRKDAEEWFFTKTEELNREVATNSELVQSGKSEISELRRTMQNLEIELNSQLS
mu I: D--VNVE-MNAA---PGVDLTQLLNNMRNQYEQLAEKNRKDAEEWFNQKSKELTTEIDSNIAQMSSHKSEITELRRTVQGLEIELQSQLA
vim: EQHVQID-VDV-SK-P--DLTAALRDVRQQYESVAAKNLQEAEEWYKSKFADLSEAANRNNDALRQAKQESNEYRRQVQSLTCEVDALKG
GFAP: QQQVHVE-MDV-AK-P--DLTAALREIRTQYEAVATSNMQETEEWYRSKFADLTDAASRNAELLRQAKHEANDYRRQLQALTCDLESLRG
NFL: A-QISVE-MDVSSK-P--DLSAALKDIRAQYEKLAAKNMQNAEEWFKSRFTVLTESAAKNTDAVRAAKDEVSESRRLLKAKTLEIEACRG
X lam 3: T-RI-VEIDSGRRVEFESKLAEALQELRRDHEQQILEYKEHLEKNFSAKLENAQLAAAKNSDYASATREEIMATKLRVDTLSSQLNHYQK
He inv: --K-G---MDYAE-FWKSELSKCVREIQSAYDEKIDMIQQDTEAKYSAQLNSLRSGNVKDGMQLQHVQEEVKKLRTQAGEKNAMYAELAA

Block 4 (segment 2B, including the stutter ST; block ends at hagfish α residue 524)
hag α: KNTDVEREITVIETNIHTQSDEWTNNINSLKVDLEVIKKQITQYARDYQDLLATKMSLDVEIAAYKKLLDSEETRISHGGGITITTNAGT
hu II: QISNLQQSISDAEQRGENALKDAKNKLNDMEDALQQAKEDLARLLCDYHELMNTKLALDMEIATYRTLLEGEESRMSGECAPNVSVTVST
mu II: QISQIQQNINDAEQRGEKALKDAQNKLNEIEDALSQCKEDCARLLCDFQELMNTKLALDMEIGTYKKLLEGEEIRMSGECTPNVSVSVST
X II: QRANLEAQIAEAEERGELALKDARNKLAELEAALQKAKQDMSRQLRDYQELMNVKLALDIEIATYRKLLEGEESRLESGFQNLSIQTKTV
gof II: QRSNLENQIAEAEERGELAVRDAKARIKDLEDALQRAKQDMARQIREYQELMNVKLALDIEIATYRKLLEGEEDRLLSGIKSVNISKQST
hu I: MKASLENSLEETKGRYCMQLAQIQEMIGSVEEQLAQLRCEMEQQNQEYKILLDVKTRLEQEIATYRRLLEGEDAHLSSSQFSSGSQSSRD
mu I: LKQSLEASLAETVESLLRQLSQIQSQISALEEQLQQIRAETECQNAEYQQLLDIKTKLENEIQTYRSLLEGEGSSSGGGGGRRGGSGGGS
vim: TNESLERQMREMEENFALEAANYQDTIGRLQDEIQNKKEEMARHLREYQDLLNVKMALDIEIATYRKLLEGEESRISLPLPNFSSLNLRE
GFAP: TNESLERQMREQEERHARESASYQEALARLEEEGQSLKEEMARHLQEYQDLLNVKLALDIEIATYRKLLEGEENRITIPVQTFSNLQIRE
NFL: MNEALEKQLQELEDKQNADISAMQDTINKLENELRSTKSEMARYLKEYQDLLNVKMALDIEIAAYRKLLEGEETRLSFTSVGSITSGYSQ
X lam 3: QNSALEAKVRDLQDMLDRAHDMHRRQMTEKDREVTEIRHTLQGQLEEYEQLLDVKLALDMEINAYRKMLEGEEQRLKLSPSPSQRSTVSR
He inv: KFASLQAERDSIGRQCSELERELEELRIKYNQDIGDLSNELSAVLAQLQILTDAKITMELEIACYRKLLEGEESRVGLRSLVEQAIGVQG

Fig. 3. Comparison of central rod of hagfish α chain with other types of IF chains. Central rod, between arrows. Alignment based mainly on homology profiles for types I-V (Conway and Parry, 1988) but also includes a relationship of Xenopus laevis lamin 3 with H. aspersa and vimentin (Doring and Stick, 1990). Origin of sequences: hu II (human, epidermal, 67 kDa, type II; Steinert et al., 1985); mu II (mouse, epidermal, 67 kDa, type II; Steinert et al., 1985); X II (X. laevis oocyte, non-epidermal, 55.7 kDa, type II; Franz and Franke, 1986); gof II (goldfish, Carassius auratus, glial cells, 58 kDa, type II; Giordano et al., 1989); hu I (human, epidermal cells, 50 kDa, type I; Marchuk et al., 1985); mu I (mouse, epidermal, 59 kDa, type I; Steinert et al., 1983); vim (hamster, vimentin, 53.5 kDa, type III; Quax et al., 1983); GFAP (mouse, brain, 50 kDa, type III; Lewis et al., 1984); NFL (mouse, neurofilament, 68 kDa, type IV; Lewis and Cowan, 1986); X lam 3 (X. laevis, lamin 3, type V; Dodemont et al., 1990; Doring and Stick, 1990); He inv (snail, H. aspersa; Dodemont et al., 1990). Residue numbering refers to hagfish sequence in Fig. 2. The numbering system refers only to the sequence of α and does not include spacer sites in L1 and L12. Apolar (a,d) residues in heptad repeats denoted by (•) correspond to hagfish but are representative of those reported in other chains. The 42 residue insert in 1B of X. laevis lamin and H. aspersa is not included but region of deletion noted (▲). Four underlined residues in α (F,Q,D,S) represent deviation from residues conserved in all other IF chains in this figure. Identical sequences among six of the 12 IFs including α are delineated within boxes.
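The percentage identity to the conserved rod 1A consensus region, LNDRFASYI(D/E)KVRFLE, can be illustrated with a short calculation. This is a minimal sketch with hypothetical names, not the authors' procedure: the three 16-residue segments are read off the aligned sequences of Fig. 3, and a position counts as identical if the residue is any of the alternatives allowed by the consensus.

```python
# Consensus region in rod 1A (Parry, 1990); position 10 allows D or E.
CONSENSUS_1A = ["L", "N", "D", "R", "F", "A", "S", "Y", "I",
                "DE", "K", "V", "R", "F", "L", "E"]

# Corresponding 16-residue segments taken from the Fig. 3 alignment.
SEGMENTS = {
    "hag alpha": "LNTKFANLVDQVRTLE",
    "vim": "LNDRFANYIDKVRFLE",
    "GFAP": "LNDRFASYIEKVRFLE",
}


def percent_identity(segment, consensus):
    """Percentage of positions where the residue is allowed by the consensus."""
    if len(segment) != len(consensus):
        raise ValueError("segment and consensus must have the same length")
    matches = sum(1 for residue, allowed in zip(segment, consensus)
                  if residue in allowed)
    return 100.0 * matches / len(consensus)


for name, segment in SEGMENTS.items():
    print(name, percent_identity(segment, CONSENSUS_1A))
```

For hagfish α this gives 9 matches out of 16 (56.25%), in line with the 56% quoted in the text, while the two type III chains give 93.75% and 100%, matching the reported 93-100% range.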

(75%). Another highly conserved consensus region (EYQ•LL (D/N)VK•(R/A)L(E/D)•EIATYR(K/R)LLEGE(E/D)•R(I/M/L)) where • represents a non-conserved residue, is located in the C-terminal end of rod 2B (Parry, 1990). The α polypeptide has 75% identity with this region, a value at the lower end of the percentage identity found for other IF types: II (77-96%); I (69-92%); III (100%); IV (96%); V (81%); invertebrate (73%).

Hagfish α exhibits low sequence identity with central rods of other IF chains

The sequence qualities described before together with localization of tripartite domains and subdomains are required to categorize α as an intermediate filament polypeptide (Steinert and Roop, 1988), but assignment of α to an established rod type is not possible by current standards. The major criterion to date for classification is based on percentage identity in the rod region. It is generally observed that chains of the same type show 50-95% identity (Steinert and Roop, 1988; Eckert, 1988; Ferretti et al., 1991; Albers and Fuchs, 1992), but for chains of different types the identity in the rod region is 30-35% or less (Fuchs et al., 1985; Steinert and Parry, 1985; Weber and Geisler, 1985; Steinert and Roop, 1988; Albers and Fuchs, 1992; Hoffman and Hermann, 1992). Analyses of identity between α and other IF chains as aligned in Fig. 3 are summarized in Table 1, and show low total rod identity (acidic (Parry, 1990), a feature also seen in α (Table 3A). Fast Fourier transform analyses of α to evaluate the periodicity (if any) of charged residues in segments 1B and 2 indicate some significant differences with data derived from other IF types (Table 3B). The mean periodicities (acidic or basic) in 1B and 2 are approximately 9.58 and 9.84 residues among all IF types (Parry, 1990). Notably, periodicities of this magnitude were not maintained among the basic residues in segment 1B and acidic residues in segment 2 (Table 3B). In 1B the periodicity of 9.18 for acidic residues is lower than the mean for all keratins, but is nearest to the 9.3 residue mean for segment 1A in type I keratins. No period for threonine was found in α, and no common period is known in other IF chains (Table 3B).

The keratin nature of α resides mainly in the terminal domains

Despite the low central rod sequence identity of α with all types of IF subunits, the amino acid content and peptide sequences of both flanking regions are mainly keratin-like. The N- and C-terminal domains each have a high total content of Gly + Ser, 42% and 53%, respectively. These domains (Fig. 2) also contain four tandem repeats of FGGP (nos 153-168), a near tandem octad repeat of (GGGGVGYG)A, (nos 564-580) and other random repeats such as SRVLG (nos 49-53 and nos 76-80). The absence of negatively charged residues, D and E, in the entire C-terminal flanking domain of α is often a feature of a type II keratin as judged by similar observations within the last 85 residues of other polypeptides of this type (Hoffman et al., 1985). In α, the number of basic residues exceeds the acidic residues in both the N-terminal (+14) and C-terminal domains (+8), another characteristic of type II keratins (Parry, 1990). In segment 1A of α, the percentage of basic residues (14.3%) is greater than acidic (8.6%), a characteristic of type II but not types I, III and V IFs (Table 3A) (Parry, 1990). Type II chains have a neutral to basic pI, as does α: 7.6 by isoelectric focusing of the purified polypeptide (Spitzer et al., 1984). The low cysteine content in α (50%) with other established keratins; linear periodicity of acidic and basic residues in segments 1B and 2; L12 of 16 (type I a,b) or 17 (type II a,b) residues; H1 and H2 terminal domains.
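The periodicity analysis of charged residues mentioned above can be illustrated with a small Fourier scan. This is only a sketch under stated assumptions: it evaluates a direct Fourier sum over a grid of trial periods rather than a fast Fourier transform, it is not the authors' or Parry's (1990) implementation, and the input is a synthetic toy sequence with a basic residue every 10 positions, not the hagfish α sequence. All function names are hypothetical.

```python
import cmath
import math


def residue_positions(sequence, residues):
    """Indices at which any of the given residue types occur."""
    return [i for i, aa in enumerate(sequence) if aa in residues]


def fourier_intensity(positions, period):
    """Scaled intensity |sum exp(2*pi*i*x/P)|^2 / N for a trial period P."""
    n = len(positions)
    if n == 0:
        return 0.0
    total = sum(cmath.exp(2j * math.pi * x / period) for x in positions)
    return abs(total) ** 2 / n


def best_period(sequence, residues, periods):
    """Trial period with the highest scaled Fourier intensity."""
    positions = residue_positions(sequence, residues)
    return max(periods, key=lambda p: fourier_intensity(positions, p))


# Synthetic example (assumption): lysine at every 10th position of 100 residues.
toy_sequence = "".join("K" if i % 10 == 0 else "A" for i in range(100))

# Scan 7.0 to 14.0 residues in steps of 0.1; the window excludes the
# harmonic at 5 residues, which would score equally for this toy input.
trial_periods = [p / 10 for p in range(70, 141)]

period = best_period(toy_sequence, "KR", trial_periods)
print(period)
```

For a real analysis the same scan would be run separately on the acidic (D, E) and basic (K, R) residues of segments 1B and 2, and the peak intensity would need a significance estimate before a period such as 9.18 residues is accepted.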
Keratin-like features for α reside mainly in the N- and C-terminal domains (192 and 133 residues, respectively), wherein each has a high total content of glycine and serine residues (42% and 53%, respectively), a frequent characteristic of keratins from mammals, reptiles and amphibians (Conway and Parry, 1988; Steinert and Roop, 1988; Parry, 1990; Hoffman et al., 1985; Giordano et al., 1989). In addition, the terminal regions of α have tandem, near-tandem and other peptide repeats, representing a well-documented property of many keratins (Parry, 1990; Parry and Fraser, 1985; Steinert and Parry, 1985). Possible categorization of α as a type II keratin based on the net basic residues in each terminal domain (see Results) is corroborated by its neutral-to-basic pI of 7.6 (Spitzer et al., 1984) and also by its coassembly with a more acidic subunit γ, of pI 5.5, from the thread (Spitzer et al., 1984, 1988). By contrast, α lacks H1 and H2 subdomains of 36 and 20 residues, respectively, which are homologous to established keratins. When the central rod of 318 residues is aligned in relation to homology profiles delineated by Conway and Parry (1988) and in relation to apolar heptad repeats, location of subdomains and highly conserved IF sequences (Fig. 3), and then evaluated for identity with other IF types (Table 1), α exhibits relatively low identity (