Cloning of several cDNA segments coding for human liver ... - NCBI

The EMBO Journal Vol.2 No.1 pp.57-61, 1983

Cloning of several cDNA segments coding for human liver proteins

F. Costanzo1, L. Castagnoli1, L. Dente1, P. Arcari2, M. Smith1,3, P. Costanzo2, G. Raugei1, P. Izzo2, T.C. Pietropaolo1,2, L. Bougueleret1, F. Cimino2, F. Salvatore2 and R. Cortese1*

1European Molecular Biology Laboratory, Postfach 10.2209, D-6900 Heidelberg, FRG, and 2Istituto di Chimica Biologica, II Facoltà di Medicina, Università di Napoli, Naples, Italy

Communicated by R. Cortese
Received on 15 November 1982

A human cDNA library was constructed using M13 derivative vectors. The simple and rapid procedures for sequencing single-stranded DNA by the dideoxy chain termination method allowed screening of individual clones directly by DNA sequence analysis. Some of these clones were identified as coding for: serum albumin, α1-antitrypsin, retinol-binding protein, prothrombin, haptoglobin, and metallothionein. Furthermore, a clone coding for aldolase B was tentatively identified on the basis of high sequence homology with rabbit muscle aldolase.

Key words: α1-antitrypsin/haptoglobin/human liver cDNA library/prothrombin/retinol-binding protein

Introduction

In the hepatocyte, the characteristic repertoire of gene expression leads to the synthesis of a large number of plasma proteins and some liver-specific enzymes. The liver acquires the capacity to synthesize several liver-specific products at various stages during embryonal and fetal development (Gitlin and Gitlin, 1975), and switches between fetal-specific and adult-specific patterns of transcription have been described (Tilghman and Belayew, 1982). In addition to the general mechanisms responsible for tissue-specific expression, which are probably common to all genes transcribed in the liver, there are phenomena of gene regulation involving a subset of liver-specific genes. For instance, during acute infection, there is a considerable increase in the synthesis of some plasma proteins, the so-called acute phase proteins, which include α1-antitrypsin, α2-macroglobulin, haptoglobin, and others (Ricca et al., 1982; Kushner, 1982). Retinol-binding protein synthesis is induced by vitamin A (Smith et al., 1978). Some of the liver-specific genes appear to be under hormonal control (Kurtz, 1981; Mayo and Palmiter, 1982).
Comparative studies on the structure and the pattern of expression of liver-specific genes should provide useful information towards the understanding of the molecular basis of both the general and the gene-specific mechanisms of regulation. For this program, human liver is particularly suited for many reasons. The biochemistry of several human liver gene products is well characterized, to the point that for many proteins the amino acid sequence is known; for many plasma proteins an enormous amount of genetic information has been accumulated, largely due to the medical interest: in many research centers and hospitals, routine analyses of plasma proteins are carried out on blood samples, providing information on the corresponding genes. Furthermore, for several proteins synthesized in the human liver, it is possible to envisage a variety of applications in medicine and pharmacology. We decided to direct our effort towards the purification of several liver-specific genes. In this paper we describe our general strategy and report the cloning of several cDNA segments, including those coding for serum albumin, prothrombin, α1-antitrypsin, retinol-binding protein, metallothionein, haptoglobin β chain and aldolase B.

3Present address: MRC Laboratory of Molecular Biology, University Medical School, Hills Road, Cambridge CB2 2QH, UK
*To whom reprint requests should be sent.
IRL Press Limited, Oxford, England.

Results

The intracellular concentration of mRNA coding for several human plasma proteins can be approximately estimated on the basis of the available information about the serum concentrations of the corresponding proteins. Human serum albumin can be used as a standard for which both the intracellular mRNA concentration (~10% of total mRNA in the hepatocyte) and the amount in the plasma (3.5 g/100 ml) are known. On this basis it is probable that the intracellular concentration of the mRNA coding for several other plasma proteins is of the order of 1% of the total: for instance, mRNA coding for α2-macroglobulin; the α, β and γ chains of fibrinogen; α1-antitrypsin; transferrin; and the haptoglobin α and β chains. For several others, the relative abundance could be between 0.1 and 0.5%, for instance mRNA for the ApoA1 and ApoA2 lipid-binding proteins, complement C3, α1-acid glycoprotein, etc. Obviously these calculations do not take into account several factors such as protein turnover or differential efficiency of mRNA translation, but they can be a useful indication of the probability of finding a certain clone in a human liver cDNA library.

The speed and accuracy of the dideoxy DNA sequencing method (Sanger et al., 1977) are certainly sufficient to set up a strategy for the identification of the clones of a human liver cDNA library, simply based on DNA sequencing and DNA-protein sequence comparison. This can be achieved without unreasonable effort, by sequencing only ~100 bases per clone (sufficient to provide specific information on the amino acid sequence of the corresponding protein). We have constructed a human liver cDNA library with the aim of characterizing clones by direct DNA sequence determination.

Characterization of human liver mRNA

mRNA was prepared from human fetal liver according to the procedure described in Materials and methods. To establish its suitability as a starting material for a cDNA library, we used it for a series of in vitro translation experiments.
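The abundance estimate described in the Results can be written out as a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes that mRNA abundance scales linearly with plasma concentration (the text itself notes that turnover and translation efficiency are ignored) and that clones are drawn independently from the mRNA population. The function names and the example concentration of 0.35 g/100 ml are hypothetical; only the albumin figures come from the text.

```python
# Albumin reference values quoted in the text.
ALBUMIN_MRNA_FRACTION = 0.10       # ~10% of total hepatocyte mRNA
ALBUMIN_PLASMA_G_PER_100ML = 3.5   # plasma concentration of serum albumin


def estimated_mrna_fraction(plasma_g_per_100ml):
    """Scale the albumin mRNA fraction by the ratio of plasma concentrations.

    Assumption: mRNA abundance is proportional to plasma concentration.
    """
    return ALBUMIN_MRNA_FRACTION * plasma_g_per_100ml / ALBUMIN_PLASMA_G_PER_100ML


def prob_at_least_one_clone(mrna_fraction, clones_screened):
    """Chance that at least one of N independently drawn clones derives
    from an mRNA present at the given fraction (idealized sampling model)."""
    return 1.0 - (1.0 - mrna_fraction) ** clones_screened


# Hypothetical plasma protein at one tenth of the albumin concentration.
fraction = estimated_mrna_fraction(0.35)
chance = prob_at_least_one_clone(fraction, 236)  # 236 clones were sequenced
```

Under these assumptions an mRNA at 1% of the total has roughly a 0.9 chance of being represented at least once among 236 clones. The real library consists of restriction fragments and was partly pre-screened, so the figure is only indicative.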
The pattern of the overall in vitro protein synthesis is shown in Figure 1, slot 1; immunoprecipitations with commercially available antibodies against α2-macroglobulin (slot 4), serum albumin (slots 2 and 6) and ApoA1 lipid-binding protein (slot 7) are shown; slot 5 shows the immunoprecipitation with anti-IgG antibodies, used here as a negative control. On the basis of the good quality of the in vitro translation pattern, and the demonstration, by immunoprecipitation


Table I. Human liver cDNA clones, identified by DNA-protein sequence comparison

Sequences screened: 236
Identified sequences: 103

Number of clones corresponding to (counts legible in the source, in order of appearance: 34, 3, 3, 4, 26, 23; their row assignment is not preserved):

Alu repeated sequences
Mixed Alu sequences (a)
Mitochondrial cytochrome oxidase I
Mitochondrial 16S rRNA
Mitochondrial URF4
Mitochondrial URF2
Mitochondrial cytochrome B
Serum albumin
γ-Globin
α-Globin
Prothrombin
Retinol-binding protein
α1-Antitrypsin
Aldolase
Metallothionein
Haptoglobin β chain

(a) Mixed Alu sequences are DNA fragments where, in addition to regions showing strong homology to the Alu family of repeated sequences, there are also long stretches of sequence without any homology to Alu or any other known sequence.


Fig. 1. In vitro translation and immunoprecipitation of human liver poly(A)+ mRNA. 0.3 µg of total human liver poly(A)+ mRNA was translated in a rabbit reticulocyte lysate in the presence of [35S]methionine as radioactive precursor and the translation products analyzed on a 7.5% SDS-polyacrylamide gel (slot 1). 25 µl of the translation mix were immunoprecipitated with a specific antiserum against human serum albumin (slots 2 and 6), human α2-macroglobulin (slot 4) and human Apo AI lipid-binding protein (slot 7). Anti-IgG antibodies were used as a negative control (slot 5). The immunoprecipitation was carried out overnight at 4°C using 20 µl of commercial antiserum for 25 µl of translation mix. 100 µl of a 1:1 suspension of protein A-Sepharose in phosphate-buffered saline were used to precipitate the antigen-antibody complex, 1 h at 4°C. The immunoprecipitate was analyzed on a 7.5% SDS-polyacrylamide gel.

with specific antisera, of the synthesis of some liver-specific proteins, we judged our preparation of human liver mRNA suitable as a source for a cDNA library.

Construction of a human liver cDNA library in Mp9 phage vector

We chose to use as vector the M13 phage derivative Mp9. cDNA segments can be inserted at a variety of sites in this vector, and the resulting recombinant clones can be easily identified because they form white plaques (Gronenborn and Messing, 1978). Moreover, by using a simple and rapid procedure, it is possible to prepare sufficient amounts of single-stranded DNA for several sequencing reactions. Human liver double-stranded DNA was prepared as described in Materials and methods, and separately digested with one of several restriction endonucleases, to yield relatively short DNA segments. The resulting fragments were ligated into Mp9 double-stranded DNA cut with a suitable restriction endonuclease. In this way, we constructed several cDNA libraries of segments cut with HaeIII, Sau3A, AluI, Taq and HapII.

Individual clones were grown in 5 ml cultures and, after centrifugation, the clear supernatant containing viral particles was dot-spotted (100 µl/spot) onto Millipore filters, in an ordered matrix array. To immobilize single-stranded DNA on the paper, the filters were treated with 0.1 M NaOH, 1.5 M NaCl, neutralized with 0.2 M Tris-HCl pH 7.5, 2 x SSC and baked for 2 h at 80°C in vacuum. We estimate that each dot contained ~0.3 µg of DNA. Following hybridization with 32P-labelled cDNA or mRNA from human liver, the positive clones were used for the DNA sequence analysis.

DNA sequence analysis of positive clones

Sequence analysis was performed on 236 clones. Initial results revealed that several clones corresponded to serum albumin or to globin γ chain. Consequently, all remaining clones were screened with albumin and γ-globin 32P-labelled probes, so as to eliminate these from our pool. Each DNA sequence was converted into an amino acid sequence in all possible frames and compared with a collection of protein sequences comprising the Washington Medical Center Atlas (Dayhoff, 1972, 1978), and additional sequences collected by us from the literature of the last four years. This search yielded the results shown in Table I. The presence of γ- and α-globin clones was to be expected, it being known that these proteins are synthesized in the human fetal liver. Other identified clones correspond to proteins synthesized by liver; their sequence and DNA-protein se-


CLONE R33 AND HUMAN α1-ANTITRYPSIN
Residue numbers marked: 238, 248, 258, 268, 278

Protein: W V L L M K Y L G N A T A I F F L P D E G K L Q H L E N E L T H D I
cDNA:    ---TGGGTGCTGCTGATGAAATACCTGGGCAATGCCACCGCCATCTTCTTCCTGCCAGATGAGGGGAAACTACAGCACCTGGAAAATGAACTCACCCACGATATC
Clone:   W V L L M K Y L G N A T A I F F L P D E G K L Q H L E N E L T H D I

Protein: I T K F L E N E D R R S
cDNA:    ATCACCAAGTTCCTGGAAAATGAAGACAGAAGGTCT---
Clone:   I T K F L E N E D R R S

CLONE T36 AND HUMAN METALLOTHIONEIN
Residue numbers marked: 32, 42, 52

Protein: S C C S C C P V G C A K C A   G C I C K G A S D K
cDNA:    ---AGCTGCTGCTCCTGC 4CCTGTGGGCTGTGCCAAGTGTGCCCAGGGCTGCATCTGCAAAGGGGCH CGGA9AAG---
Clone:   S C C S C Y P V G C A K C A Q G C I C K G A S E K

CLONE P2 AND HUMAN RETINOL-BINDING PROTEIN
Residue numbers marked: 159, 169, 179

Protein: L C L A R Q Y R L I V H N G Y C D G R S E R N L
cDNA:    ---CTGTGCCTGGCCAGGCAGTACAGGCTGATCGTCCACAACGGTTACTGCGATGGCAGATCAGAAAGAAACCTT---
Clone:   L C L A R Q Y R L I V H N G Y C D G R S E R N L

CLONE Q5 AND HUMAN PROTHROMBIN
Residue numbers marked: 428, 438

Protein: K K P V A F S D Y I H P V C L P N R E
cDNA:    ---AAGAAGCCTGTTGCCTTCAGTGACTATATTCACCCTGTGTGTCTGCC 4AEGGGAG---
Clone:   K K P V A F S D Y I H P V C L P D R E

CLONE T40 AND HUMAN HAPTOGLOBIN β CHAIN
Residue numbers marked: 25, 35, 45, 55, 65

Protein: T T G A T L I N E Q W L L T T A K N L F L N H S E N A T A K D I A P
cDNA:    ---ACCACAGGTGCCACGCTGATCAATGAACAATGGCTGCTGACCACGGCCAAAAATCTCTTCCTGAACCATTCAGAAAATGCAACAGCGAAAGACATTGCCCCT
Clone:   T T G A T L I N E Q W L L T T A K N L F L N H S E N A T A K D I A P

Protein: T L T L Y V   K
cDNA:    ACTTTAACACTCTATGTGGGGAAA---
Clone:   T L T L Y V G K

Fig. 2. Comparison between the nucleotide sequence of human liver cDNA clones and the corresponding protein sequence. Nucleotide sequences of cDNA clones were translated into amino acids (lower sequence) and compared with the known protein sequence (upper sequence). Amino acid differences are underlined. Predicted nucleotide differences are boxed.
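The screening step summarized in this caption (translate each read in every frame, then look for the result in a collection of known protein sequences) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' 1983 procedure: it uses the standard genetic code, exact substring matching instead of a tolerant homology search, and the clone P2 read and retinol-binding protein segment (residues 159-182) as printed in Figure 2. All function and variable names are hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate from the given offset; incomplete trailing codons are dropped
    and codons with unknown bases become 'X'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame(dna):
    """Translate a read in the three forward and three reverse frames."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {f"+{f + 1}": translate(dna, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames


def find_matches(dna, known_proteins):
    """List (frame, protein name) pairs where a whole translated frame
    occurs exactly inside a known protein sequence."""
    return [
        (frame, name)
        for frame, peptide in six_frame(dna).items()
        for name, protein in known_proteins.items()
        if peptide and peptide in protein
    ]


# Clone P2 read and the matching protein segment, as printed in Figure 2.
P2_CDNA = ("CTGTGCCTGGCCAGGCAGTACAGGCTGATCGTCCACAACGGT"
           "TACTGCGATGGCAGATCAGAAAGAAACCTT")
KNOWN = {"retinol-binding protein (159-182)": "LCLARQYRLIVHNGYCDGRSERNL"}

hits = find_matches(P2_CDNA, KNOWN)
```

Because single-stranded M13 clones can carry either strand of the insert, the reverse frames matter: the same read entered as its reverse complement would be reported in frame -1. A practical search would also have to tolerate mismatches and partial overlaps, which exact substring matching does not.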

CLONE R38 AND RABBIT ALDOLASE
Residue numbers marked: 16, 26, 36, 46

Rabbit:  S D I A H R I V A P G K G I L A A D Q S
cDNA:    A&TTGCCCA AGJATTGTTGCCCATGGAAAGGGGATCCTGGCTGCAGAThAATCT
Clone:   S E I A Q S I V A ?G K G I L A A D E S

Rabbit:  T G S I A K R L Q S I T E N T E
cDNA:    GT[AGGTA CCAT GGGA CCGCCTGCAGACGATAAGGGT GGAAACACTGAA...
Clone:   V G T M G N R L Q R I K G E N T E

Fig. 3. Comparison of nucleotide sequence of a human liver cDNA clone and rabbit aldolase amino acid sequence. As in Figure 2 the lower amino acid sequence is the one expected from our nucleotide sequence, the upper one is the amino acid sequence of the rabbit muscle aldolase. Differences are underlined; predicted nucleotide differences are boxed.
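The residue-by-residue comparison marked in Figures 2 and 3 (differences underlined) amounts to walking two aligned sequences in parallel. The sketch below illustrates this with the prothrombin example from Figure 2. It assumes the two sequences are already aligned and of equal length, with no gaps. The published sequence string is reconstructed from the Figure 2 panel together with the statement in the text that residue 444 is the only difference across residues 428-446. The function names are hypothetical.

```python
def aligned_differences(reference, clone, first_residue):
    """Return (residue number, reference residue, clone residue) for every
    mismatch between two pre-aligned, gap-free sequences."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (first_residue + i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, clone))
        if ref != obs
    ]


def percent_identity(reference, clone):
    """Percentage of aligned positions carrying the same residue."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(ref == obs for ref, obs in zip(reference, clone))
    return 100.0 * matches / len(reference)


# Prothrombin residues 428-446: published sequence (assumed reconstruction)
# and the translation of clone Q5.
PUBLISHED = "KKPVAFSDYIHPVCLPNRE"
CLONE_Q5 = "KKPVAFSDYIHPVCLPDRE"

differences = aligned_differences(PUBLISHED, CLONE_Q5, first_residue=428)
identity = percent_identity(PUBLISHED, CLONE_Q5)
```

With these inputs the only difference reported is at residue 444 (asparagine in the published sequence, aspartic acid in the clone), which is 18 of 19 positions identical. The aldolase comparison in Figure 3 would need a proper alignment step first, since the human and rabbit sequences are homologous rather than identical.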

quence comparison are shown in Figure 2 and Figure 3.

Clone R33 and human α1-antitrypsin: the sequence determined from this clone corresponds perfectly to a region of the human α1-antitrypsin sequence, from amino acid residue 238 to 283 (Carrell et al., 1982).

Clone T36 and human metallothionein: the sequence determined from this clone is highly homologous to that of human metallothionein IIa recently reported by Karin and Richards (1982).

Clone P2 and human retinol-binding protein: the sequence determined from this clone (Figure 2) corresponds perfectly to a region of the human retinol-binding protein sequence, from amino acid residue 159 to 182 (Rask et al., 1979).

Clone Q5 and human prothrombin: the sequence determined for this clone corresponds to a region of the human prothrombin sequence, from amino acid residue 428 to 446, with only one difference: residue 444 in our sequence is an aspartic acid, whereas in the sequence published by Mann and Elion (1980) it is an asparagine.

Clone T40 and human haptoglobin β chain: the se-


Table II. Length in nucleotides of mRNA coding for various proteins synthesized by human liver inferred by Northern analysis
