Supplementary Information for - Nature

Supplementary Information for Conformational landscapes of DNA polymerase I and mutator derivatives establish fidelity checkpoints for nucleotide insertion Hohlbein et al.


Supplementary Figure S1. Wt KF forms an intermediate FRET species at 1 mM dTTP. (a) E* histograms for the binary complex (row 1) and the ternary complex (1 mM dTTP, A-dTTP) fitted with either a fixed (E* = 0.45, row 2) or a floating (row 3) peak position of the low-FRET Gaussian (data from Figure 2a). In row 2, the residuals show systematic deviations along the E* axis, diagnostic of a poor fit for the fixed Gaussian. Specifically, the fit overestimates the number of molecules in the open conformation. Fitting the low-FRET Gaussian of the ternary complex without fixing the peak position resulted in E*1 = 0.51 (row 3), improving the fit and removing the systematic deviations in the residuals. (b) PDA for the ternary complex of panel a, row 2. Assuming a dynamic equilibrium between two states results in a poor fit (red line) with χ² = 4.6 for a fixed peak position of the low-FRET Gaussian E*1 = 0.45 (E*2 = 0.66, σ1 = 0.21, σ2 = 0.22, k1 = 180 s⁻¹, and k-1 = 26 s⁻¹). (c) PDA for the ternary complex of panel a, row 3. PDA using a fixed peak position of E*1 = 0.51 results in an improved fit (red line) compared to panel b (χ² = 1.5 for E*2 = 0.66, σ1 = 0.22, σ2 = 0.22, k1 = 150 s⁻¹, and k-1 = 25 s⁻¹).
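The residual argument in panel (a) can be illustrated with a minimal sketch. It does not reproduce the authors' fitting procedure: it evaluates a double-Gaussian model at two candidate low-FRET peak positions (0.45 and 0.51, taken from the caption) against a synthetic, noise-free histogram rather than running an optimizer. The amplitudes, widths and binning are invented for illustration only.

```python
import math

def gaussian(x, amplitude, mean, sigma):
    """Single Gaussian evaluated at x."""
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def double_gaussian(x, a1, mean1, s1, a2, mean2, s2):
    """Sum of a low-FRET and a high-FRET Gaussian."""
    return gaussian(x, a1, mean1, s1) + gaussian(x, a2, mean2, s2)

def residuals(bin_centres, counts, params):
    """Data minus model for each histogram bin."""
    return [c - double_gaussian(x, *params) for x, c in zip(bin_centres, counts)]

def sum_squared(values):
    return sum(v * v for v in values)

# Synthetic histogram (invented values): low-FRET peak at E* = 0.51.
bin_centres = [i / 50 + 0.01 for i in range(50)]
true_params = (120.0, 0.51, 0.07, 80.0, 0.66, 0.07)
counts = [double_gaussian(x, *true_params) for x in bin_centres]

# Same model, but with the low-FRET peak held at the binary-complex value 0.45.
fixed_params = (120.0, 0.45, 0.07, 80.0, 0.66, 0.07)

res_fixed = residuals(bin_centres, counts, fixed_params)
res_float = residuals(bin_centres, counts, true_params)

print("sum of squared residuals, peak fixed at 0.45:", sum_squared(res_fixed))
print("sum of squared residuals, peak at 0.51:", sum_squared(res_float))
# The fixed-peak residuals change sign along the E* axis instead of scattering
# around zero: negative on the low-E* side (model too high), positive above.
print("residual at E* = 0.41:", res_fixed[20])
print("residual at E* = 0.57:", res_fixed[28])
```

With the peak held at 0.45 the model sits above the data on the low-E* side, which is the pattern the caption describes as overestimating the open conformation.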

Supplementary Figure S2. Examples of nucleotide titrations. As described in Figure 2, all FRET histograms were fitted to a double-Gaussian function (black lines, sum of Gaussians; grey lines, individual Gaussians). The DNA concentration was 100 nM for all experiments with binary and ternary complexes. (a) Titration of the Pol-DNA binary complex (DNA1, templating base A, Fig. 1d) with the complementary nucleotide, dTTP. As the concentration of the correct nucleotide is increased, there is a gradual increase in occupancy of the high-FRET species and a concomitant decrease in the occupancy of the low-FRET species. The two dotted vertical lines mark the mean E* values of the FRET species in the binary complex (open and closed). The titration was completed in a single day. (b) Titration of the Pol-DNA binary complex (DNA1, templating base A, Fig. 1d) with the mismatched nucleotide, dGTP. As the concentration of the incorrect nucleotide is increased, the peak of the low-FRET species shifts. Since the titrations in panels a and b were performed on the same day, rows 1 and 2 are identical in (a) and (b). The dashed line marks the mean E* value of the intermediate-FRET species as populated in the ternary complexes.


(c) Titration of the Pol-DNA binary complex (DNA2, templating base G, Fig. 1d) with the complementary nucleotide, dCTP. We plotted the fraction of the molecules in the high-FRET species (black circles) and the peak shift of the mean E* of the low-FRET species (blue triangles) as a function of nucleotide concentration. The peak shift was normalized relative to the E* difference between the means of the open and closed conformations. The data were globally fitted with the four-state model as described (Fig. 3c), which allows for simultaneous fitting of the two FRET observables. (d) Titration of the Pol-DNA binary complex (DNA1, templating base A, Fig. 1d) with the mispaired ribonucleotide, rGTP. (e) Titration of the unliganded polymerase with dGTP and rGTP. For fitting the data, we used a four-state model as described (Fig. 3c), but without DNA present.


Supplementary Figure S3. Nucleotide titrations for E710Q and the Y766 derivatives. Ternary complexes were formed after addition of various nucleotides to E710Q (a), Y766A (b) and Y766F (c) in the presence of DNA1. Titrations and style as in Figures 3a, 3b and Supplementary Figures S2c, S2d, S2e, but with KF mutants instead of wt KF. Data points with error bars are represented as mean ± s.e.m., derived from three independent measurements.


Supplementary Figure S4. PDA analysis for the binary and ternary complex of wt KF. (a) E* histogram and PDA-generated fit for the binary complex of wt KF and DNA1. PDA analysis and figure style as in Supplementary Figures S1b and S1c. The best fit (2 kIN  O). The maximum FRET broadening seen in the simulation is ~10%, similar to what is observed in the analysis of the experimental data in panel c.


Supplementary Figure S6. Error rates for mispair formation by wt KF and derivatives. The error rate for each mispair corresponds to the number of errors per detectable nucleotide incorporation. The data for wt KF, E710A, and E710Q were taken from ref. 13, and those for Y766A from ref. 11.


Supplementary Table S1. Fluorophore labelling of proteins used in this study.

Pol I(KF) (a)    Protein (µM)    Cy3B (µM)    ATTO647N (µM)    550R-744G (%) (b)
WT               6.7             5.1          5.4              99.3
E710A            7.5             6.4          5.5              99.9
E710Q            11.6            11.7         9.8              99.9
Y766A            14.6            11.5         12.4             100
Y766F            2.3             1.6          2.1              99.7

(a) All proteins have the genotype N-His6, D424A, K550C, L744C, C907S in addition to the listed mutation.
(b) 550R-744G as a percentage of the molecules carrying one donor and one acceptor dye label.


Supplementary Table S2. Single-turnover kinetic data for complementary dNTP incorporation (a).

Pol I(KF) (b)    Reaction (c)    Kd(dNTP) (µM)    kpol (s⁻¹)
WT               A-dTTP          17 ± 1           40 ± 2
E710A            A-dTTP          110 ± 10         0.26 ± 0.01
E710Q            A-dTTP          320 ± 20         0.23 ± 0.01
Y766A            A-dTTP          150 ± 40         5.0 ± 0.6
Y766F            A-dTTP          41 ± 4           50 ± 3
WT               G-dCTP          6.6              30
Y766A            G-dCTP          8.2              25

(a) Data reported as mean ± s.e.m. are average values from at least 2 experiments; the others are single measurements. The data for wild-type Pol I(KF) are in good agreement with previous measurements (ref. 23).
(b) All the proteins had the genotype N-His6, D424A, L744C, C907S in addition to the listed mutations.
(c) A-dTTP incorporation was measured using a 32P-labeled DNA substrate. G-dCTP incorporation was measured using a 5'-Cy5-labeled substrate, which gives ~2-fold lower kpol compared with the corresponding 32P-labeled DNA.


Supplementary Methods. Five-state model for data analysis. The five-state model including the presence of the partially-closed state in the binary complex is defined as follows:

[S1]

where the five states are represented as CB (closed binary), PCB (partially-closed binary), OB (open binary), PCT (partially-closed ternary) and CT (closed ternary), with N representing dNTP. The equations describing the two FRET observables are then given by:

Normalized peak shift of the low-FRET Gaussian:

(E*obs – E*initial) / (E*closed – E*initial) = ymax [N] / ((1 + K0') Kd1 + [N])    [S2]

Fraction of high-FRET species:

([C] + [CN]) / [Total] = (K0' K3 + K1 K2 [N]) / (1 + K0' + K0' K3 + K1 [N] + K1 K2 [N])    [S3]

where K0' represents the equilibrium constant between the open binary and partially-closed binary states, and K3 represents the equilibrium constant between partially-closed binary and closed binary states.

Four-state, off-pathway model for data analysis. The data analysis using the four-state model assumes that the intermediate-FRET species, IN, is on-pathway between open and closed conformations (Fig. 3c), as seems most reasonable. The alternative off-pathway model leads to a set of equations identical to the above, except that [CN] = K2[O][N] and, as a consequence, Kd2 = 1 / K2.
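The five-state expressions S2 and S3 can be evaluated numerically as in the following sketch. It is an illustration of the two formulas only, not the authors' global-fitting code. The function and argument names are hypothetical, and the parameter values in the example are invented; concentrations are assumed to be in µM, so K1 is in µM⁻¹ and Kd1 in µM.

```python
def normalized_peak_shift(n, k0_prime, kd1, ymax=1.0):
    """Equation S2: normalized peak shift of the low-FRET Gaussian.

    n        -- nucleotide concentration [N]
    k0_prime -- K0', open binary <-> partially-closed binary
    kd1      -- Kd1, same concentration units as n
    ymax     -- amplitude of the shift at saturating [N]
    """
    return ymax * n / ((1.0 + k0_prime) * kd1 + n)

def high_fret_fraction(n, k0_prime, k1, k2, k3):
    """Equation S3: fraction of molecules in the high-FRET species."""
    numerator = k0_prime * k3 + k1 * k2 * n
    denominator = 1.0 + k0_prime + k0_prime * k3 + k1 * n + k1 * k2 * n
    return numerator / denominator

# Invented parameter values, for illustration only.
params = dict(k0_prime=0.2, k1=0.01, k2=3.0, k3=0.5)
for conc in (0.0, 10.0, 100.0, 1000.0):
    shift = normalized_peak_shift(conc, params["k0_prime"], kd1=100.0)
    frac = high_fret_fraction(conc, **params)
    print(f"[N] = {conc:7.1f} uM  peak shift = {shift:.3f}  high-FRET = {frac:.3f}")
```

Two limits follow directly from the formulas and make a quick sanity check: at [N] = 0 the high-FRET fraction reduces to K0' K3 / (1 + K0' + K0' K3), and at saturating [N] it approaches K2 / (1 + K2). The peak shift reaches half of ymax at [N] = (1 + K0') Kd1.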


DNA preparation. DNA oligonucleotides (hairpin DNAs, shown in Fig. 1d) were synthesized by the Keck Biotechnology Resource Laboratory at Yale Medical School and purified using denaturing polyacrylamide gel electrophoresis as described (ref. 15). The 3' terminus was a dideoxynucleotide, allowing formation of ternary complexes with incoming nucleotides, but preventing phosphoryl transfer and restricting our observations to pre-chemistry species. DNA molecules of this type bind to wt KF with a Kd < 1 nM (ref. 33).

Protein expression and purification. KF derivatives were prepared using either our previously described expression plasmid, with transcriptional and translational signals from bacteriophage λ (refs 23, 34), or a pET-derived construct in which the protein is expressed from a bacteriophage T7 promoter (ref. 35). Wild-type and mutant constructs were based on a KF genotype of N-His6, D424A, K550C, L744C, C907S to provide for double-labelling with fluorophores, as described below. The listed changes had a negligible effect on polymerase activity (ref. 7). For simplicity, the N-His6, D424A, K550C, L744C, C907S protein is referred to as wild-type (wt KF). The expressed proteins were purified by affinity chromatography on Ni-NTA agarose (Qiagen) (ref. 23).

Site-specific labeling of KF. Double-labelling of wild-type and mutant KF proteins using maleimides of Cy3B (GE Healthcare) and ATTO647N (ATTO-TEC) was modified from our previous labelling procedure (ref. 7): by including DNA and dNTP substrates during labelling, we improved the labelling bias to ~99% 550-ATTO647N, 744-Cy3B for all the proteins in this study (Supplementary Table S1). Before labelling, the double-Cys KF derivative was reduced in 5 mM DTT and dialyzed into the nonsulfhydryl reducing agent, TCEP (Invitrogen). To the dialyzed protein, in 50 mM Tris-HCl, pH 7.5, 120 µM TCEP, was added an equimolar amount of a duplex DNA primer-template (having C as the next templating base) and the complementary dGTP (1 mM final concentration). To promote substrate binding but prevent nucleotide addition, the final mix also contained 1 mM EDTA and 5 mM CaCl2. The protein was labelled by sequential addition of the two maleimides: first, ATTO647N maleimide was added at 1.2-fold molar excess relative to the protein, and allowed to react for 1 h at 22 °C; then, Cy3B maleimide was added at 3.4-fold molar excess, and the mixture was incubated for a further 16 h at 4 °C. The reaction was stopped by addition of 1 mM DTT, and the labelled protein was purified from excess dyes, DNA and dGTP using chromatography on heparin-agarose (Sigma-Aldrich). The reaction mix was loaded onto the column, and washed extensively with 20 mM Tris-HCl, pH 7.5, 1 mM EDTA, 2% (vol/vol) glycerol, 1 mM 2-mercaptoethanol, followed by the same buffer containing 50 mM NaCl. The bound protein was then eluted with the same buffer containing 0.4 M NaCl. Labelled proteins were stored at −20 °C in 50 mM Tris-HCl, pH 7.5, 1 mM DTT, 40% (vol/vol) glycerol.

The molarities of protein and dye labels were calculated from absorbance spectra. The relative amounts of Cy3B and ATTO647N at positions 550 and 744 were measured using partial digestion with chymotrypsin, as described (ref. 7). After fractionation on SDS-PAGE, the Cy3B and ATTO647N fluorescence in appropriate peptides was quantitated and used to estimate the fraction of the donor-acceptor molecules that had the 550R,744G labelling pattern (Supplementary Table S1).

Two assays were carried out on the labelled proteins to assess the effectiveness of removal of DNA and dGTP by the heparin column. Contamination by DNA was measured using T4 polynucleotide kinase and [γ-32P]ATP. Addition of a primer-template with C as the templating base was used to detect residual dGTP by extension of the 32P-labeled primer. Quantitation of the labelled products in these assays indicated that DNA contamination of the labelled proteins was ≤ 5% and dGTP contamination ≤ 1% on a molar basis.
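The statement that protein and dye molarities were calculated from absorbance spectra can be illustrated with a short Beer-Lambert sketch. The source does not give the extinction coefficients, wavelengths or any correction for dye absorbance at the protein wavelength, so everything numeric below is a placeholder and the function names are hypothetical; this is presumably the kind of calculation involved, not the authors' exact procedure.

```python
def concentration_uM(absorbance, epsilon_per_M_cm, path_cm=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar."""
    return absorbance / (epsilon_per_M_cm * path_cm) * 1e6

def dye_to_protein_ratio(dye_uM, protein_uM):
    """Moles of dye label per mole of protein."""
    return dye_uM / protein_uM

# Placeholder inputs (not values from the source): an absorbance reading at the
# dye maximum and an assumed extinction coefficient for that dye.
dye_conc = concentration_uM(absorbance=0.65, epsilon_per_M_cm=130000.0)
protein_conc = 10.0  # hypothetical protein concentration in micromolar
print(f"dye concentration: {dye_conc:.2f} uM")
print(f"dye per protein:   {dye_to_protein_ratio(dye_conc, protein_conc):.2f}")
```

A real calculation would also subtract the dyes' contribution to the absorbance at the wavelength used for the protein, which this sketch leaves out.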
The enzymatic activity of KF derivatives used in this study (Supplementary Table S2) was assessed by measuring the single-turnover rate of nucleotide addition to a DNA primer terminus by chemical quench methods (refs 23, 36); stopped-flow fluorescence studies of these proteins will be published elsewhere (O.B., N.D.F.G., C.M.J., unpublished).

Probability Distribution Analysis (PDA) fitting parameters. PDA fitting parameters for Figure 5:

wt KF, static model fit: E*1 = 0.49, E*2 = 0.68, σ1 = 0.27, σ2 = 0.27, N1 = 0.58, N2 = 0.42.
wt KF, dynamic model fit: E*1 = 0.46, E*2 = 0.72, σ1 = 0.17, σ2 = 0.16, k1 = 150 s⁻¹, and k-1 = 320 s⁻¹.
Y766A, static model fit: E*1 = 0.51, E*2 = 0.68, σ1 = 0.23, σ2 = 0.24, N1 = 0.84, N2 = 0.16.
Y766A, dynamic model fit: E*1 = 0.49, E*2 = 0.71, σ1 = 0.19, σ2 = 0.20, k1 = 9 s⁻¹, and k-1 = 128 s⁻¹.
Y766F, static model fit: E*1 = 0.50, E*2 = 0.69, σ1 = 0.28, σ2 = 0.27, N1 = 0.25, N2 = 0.75.
Y766F, dynamic model fit: E*1 = 0.49, E*2 = 0.71, σ1 = 0.23, σ2 = 0.23, k1 = 435 s⁻¹, and k-1 = 108 s⁻¹.
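For a two-state dynamic model, the rate constants listed above for Figure 5 determine an equilibrium occupancy and a relaxation time. The sketch below computes both. It rests on an assumption the text does not state: that k1 is the rate of entering the high-FRET state (E*2) and k-1 the rate of leaving it. If the authors use the opposite convention, the occupancy computed here refers to the low-FRET state instead. Function names are hypothetical.

```python
def equilibrium_fraction_state2(k1, k_minus1):
    """Equilibrium occupancy of state 2 for 1 <-> 2, ASSUMING k1 is 1 -> 2."""
    return k1 / (k1 + k_minus1)

def relaxation_time_ms(k1, k_minus1):
    """Two-state relaxation time 1 / (k1 + k-1), in milliseconds (rates in s^-1)."""
    return 1000.0 / (k1 + k_minus1)

# Rate constants (s^-1) from the dynamic model fits listed for Figure 5.
dynamic_fits = {
    "wt KF": (150.0, 320.0),
    "Y766A": (9.0, 128.0),
    "Y766F": (435.0, 108.0),
}
# N2 values from the corresponding static model fits, for side-by-side comparison.
static_n2 = {"wt KF": 0.42, "Y766A": 0.16, "Y766F": 0.75}

for protein, (k1, k_minus1) in dynamic_fits.items():
    fraction = equilibrium_fraction_state2(k1, k_minus1)
    tau = relaxation_time_ms(k1, k_minus1)
    print(f"{protein}: dynamic-model fraction = {fraction:.2f}, "
          f"static N2 = {static_n2[protein]:.2f}, relaxation time = {tau:.1f} ms")
```

The static and dynamic fits use different peak positions and widths, so the two occupancy estimates are not expected to agree exactly; the comparison is only a rough consistency check.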


Supplementary references

33. Turner, R. M., Grindley, N. D. F. & Joyce, C. M. Interaction of DNA polymerase I (Klenow fragment) with the single-stranded template beyond the site of synthesis. Biochemistry 42, 2373–2385 (2003).
34. Joyce, C. M. & Derbyshire, V. Purification of Escherichia coli DNA polymerase I and Klenow fragment. Methods Enzymol. 262, 3–13 (1995).
35. Studier, F. W., Rosenberg, A. H., Dunn, J. J. & Dubendorff, J. W. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185, 60–89 (1990).
36. Johnson, K. A. Rapid quench kinetic-analysis of polymerases, adenosine-triphosphatases, and enzyme intermediates. Methods Enzymol. 249, 38–61 (1995).
